Thread: distributed database for postgresql
Hi all,

I know MySQL has a distributed database feature in the form of master-slave replication. Does PostgreSQL have one? If so, how can I set it up so that it works?
--Devi Munandar
On Thu, Aug 12, 2004 at 08:56:37 +0700, Devi Munandar <devi@informatika.lipi.go.id> wrote:
> Hi all,
>
> I know MySQL has a distributed database feature in the form of master-slave
> replication. Does PostgreSQL have one? If so, how can I set it up so that
> it works?

Slony-I is an external replication project that does master-slave replication for recent versions of Postgres.
I'd not call it a "distributed database". It's simply replication of data from one database to another, and Slony-I will do that for you in Postgres. Download it and the documentation from the Postgres web site.
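For readers wondering what "setting it up" looks like in practice: Slony-I is configured with a small script fed to its slonik tool. The sketch below is illustrative only -- the cluster name, connection strings, and table name are made up, and the exact command syntax varies between Slony-I releases -- but it shows the general shape of a two-node master-slave setup:

```
# Hypothetical two-node cluster: node 1 is the master, node 2 the slave.
cluster name = my_cluster;
node 1 admin conninfo = 'dbname=app host=master-host user=slony';
node 2 admin conninfo = 'dbname=app host=slave-host user=slony';

init cluster (id = 1, comment = 'master node');
store node (id = 2, comment = 'slave node');
store path (server = 1, client = 2, conninfo = 'dbname=app host=master-host user=slony');
store path (server = 2, client = 1, conninfo = 'dbname=app host=slave-host user=slony');

# Group the tables to replicate into a set; every replicated table
# needs a primary key (or a declared candidate key).
create set (id = 1, origin = 1, comment = 'app tables');
set add table (set id = 1, origin = 1, id = 1, fully qualified name = 'public.accounts');

# Start replication: node 2 subscribes to the set from node 1.
subscribe set (id = 1, provider = 1, receiver = 2);
```

The Slony-I documentation that ships with the download walks through a complete, current version of this procedure.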
Dick Goulet
Senior Oracle DBA
Oracle Certified 8i DBA
-----Original Message-----
From: Devi Munandar [mailto:devi@informatika.lipi.go.id]
Sent: Wednesday, August 11, 2004 9:57 PM
To: pgsql-admin@postgresql.org
Subject: [ADMIN] distributed database for postgresql

Hi all,

I know MySQL has a distributed database feature in the form of master-slave replication. Does PostgreSQL have one? If so, how can I set it up so that it works?

--Devi Munandar
Hello all:

I am upgrading a web application from postgres 7.1.2 to 7.4.3 (too long in coming to that point, I know). I have sifted through the history files and identified a restricted number of changes that potentially impact the app, a few of which I don't understand.

Here's the currently most alarming, a change that is recorded for 7.2:

    CHAR(), VARCHAR() now reject strings that are too long (Peter E)

What does this mean? Or how do I find out what it means? My understanding was that varchar fields had no text limit. But these are written like functions. Does this refer to coercion functions that now reject strings that are longer than the specified size of the underlying column? What if the column was specified as type 'text'? And what was the old behavior?

Thanks for any insight. This is the only change that's really concerning us deeply right now.

-- sgl

=======================================================
Steve Lane
Vice President
Soliant Consulting
14 North Peoria St Suite 2H
Chicago, IL 60607
voice: (312) 850-3930
email: slane@soliantconsulting.com
fax: (312) 850-3930
web: http://www.soliantconsulting.com
=======================================================
slane <slane@moyergroup.com> writes:
> Here's the currently most alarming, a change that is recorded for 7.2:
> CHAR(), VARCHAR() now reject strings that are too long (Peter E)
> What does this mean? Or how do I find out what it means?

If in doubt, experiment.

In 7.1:

regression=# create table foo(f1 varchar(5));
CREATE
regression=# insert into foo values('1234567890');
INSERT 2913462 1
regression=# select * from foo;
  f1
-------
 12345
(1 row)

In 7.4:

regression=# create table foo(f1 varchar(5));
CREATE TABLE
regression=# insert into foo values('1234567890');
ERROR:  value too long for type character varying(5)

7.4 follows the SQL spec (or our current understanding of it anyway): strings will not be silently truncated. If you want to force truncation to a particular length you can cast the string explicitly:

regression=# select 'way too long'::varchar(5);
 varchar
---------
 way t
(1 row)

That works the same in either version.

			regards, tom lane
Quoth slane@moyergroup.com (slane):
> Hello all:
>
> I am upgrading a web application from postgres 7.1.2 to 7.4.3 (too long in
> coming to that point, I know).
>
> I have sifted through the history files and identified a restricted number
> of changes that potentially impact the app, a few of which I don't
> understand.
>
> Here's the currently most alarming, a change that is recorded for 7.2:
>
> CHAR(), VARCHAR() now reject strings that are too long (Peter E)

Consider this example:

tutorial=# create table test_string (f1 varchar, f2 varchar(10));
CREATE TABLE
tutorial=# insert into test_string (f1, f2) values ('abc', 'def');
INSERT 2623360 1
tutorial=# insert into test_string (f1, f2) values ('abcdefghijklmnopqrstuvwxyz', 'def');
INSERT 2623361 1
tutorial=# insert into test_string (f1, f2) values ('abcdefghijklmnopqrstuvwxyz', 'abcdefghiasdfasdfa');
ERROR:  value too long for type character varying(10)

If no maximum length is specified, PostgreSQL is free to stick a goodly amount of data in the field. But supposing you decide that a particular column is VARCHAR(10), trying to stuff more than 10 characters into it will fail, as you see above.

Doing similar with char:

tutorial=# create table test_chars (f1 char, f2 char(10));
CREATE TABLE
tutorial=# insert into test_chars (f1, f2) values ('abc', 'def');
ERROR:  value too long for type character(1)

Does that illustrate what's going on? I hope so...
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://www3.sympatico.ca/cbbrowne/spreadsheets.html
"Fashion is a form of ugliness so intolerable that we have to alter it
every six months." -- Oscar Wilde
Thanks, Tom, that's very helpful. One more clarification: if the underlying column has no express length limit (for example, it was defined as type 'text'), is this issue moot?

-- sgl

> From: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Thu, 12 Aug 2004 23:57:11 -0400
> To: slane <slane@moyergroup.com>
> Cc: pgsql-admin@postgresql.org, Jamie Thomas <jthomas@soliantconsulting.com>
> Subject: Re: [ADMIN] PG 7.2 varchar change
>
> slane <slane@moyergroup.com> writes:
>> Here's the currently most alarming, a change that is recorded for 7.2:
>> CHAR(), VARCHAR() now reject strings that are too long (Peter E)
>> What does this mean? Or how do I find out what it means?
>
> If in doubt, experiment.
>
> In 7.1:
>
> regression=# create table foo(f1 varchar(5));
> CREATE
> regression=# insert into foo values('1234567890');
> INSERT 2913462 1
> regression=# select * from foo;
>   f1
> -------
>  12345
> (1 row)
>
> In 7.4:
>
> regression=# create table foo(f1 varchar(5));
> CREATE TABLE
> regression=# insert into foo values('1234567890');
> ERROR:  value too long for type character varying(5)
>
> 7.4 follows the SQL spec (or our current understanding of it anyway):
> strings will not be silently truncated. If you want to force truncation
> to a particular length you can cast the string explicitly:
>
> regression=# select 'way too long'::varchar(5);
>  varchar
> ---------
>  way t
> (1 row)
>
> That works the same in either version.
>
> 			regards, tom lane
Steve Lane <slane@soliantconsulting.com> writes:
> Thanks, Tom, that's very helpful. One more clarification: if the underlying
> column has no express length limit (for example, it was defined as type
> 'text'), is this issue moot?

Right, it only applies to specifically-declared length limits.

			regards, tom lane
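Putting the whole thread's answer in one place, a quick experiment in the spirit of Tom's earlier ones (the table and column names here are made up for illustration):

```sql
-- A text column carries no declared limit; a varchar(5) column does.
create table len_test (unlimited text, limited varchar(5));

-- Accepted: text takes arbitrarily long strings.
insert into len_test values (repeat('x', 100000), 'abcde');

-- Rejected in 7.2 and later: the string exceeds the declared limit.
insert into len_test values ('short', 'way too long');
-- ERROR:  value too long for type character varying(5)

-- Accepted: an explicit cast truncates to the declared length ('way t').
insert into len_test values ('short', 'way too long'::varchar(5));
```

So for an app moving from 7.1, only columns declared with an explicit length need auditing; text (and bare varchar) columns behave as before.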
Hello all:

I'm seeing some very odd query behavior on postgres 7.1.3. I know that's way out of date, and I do have a plan in place to upgrade, but any immediate help will be, well, very helpful.

I have a server running two instances of postgres: 7.1.3 and 7.4.3. Each supports a different web application -- the web apps are served from a separate middleware server. Formerly both web apps ran on a single all-in-one server, on postgres 7.1.3. Since reconfiguring the servers (one web, one database) and deploying on the two-postgres config, performance of the web app running on the older pg is terrible, despite the new db-only machine being superficially "better" (in processor and RAM) than the original.

I have enabled pretty extensive query logging on the 7.1.3 instance. What I see is that certain queries take hideously long. What is odd is that these are perfectly well optimized queries: they represent updates to a table of web application sessions. Each user's session is written once per web page view, so the query gets run frequently.

Now, this session update is not, I think, the only query that's taking long. But there are so many of these queries that they are heavily represented in the slow queries (which I'm determining somewhat laboriously by running the huge logs through an awk script).
Here's an example EXPLAIN:

datapoint=# explain UPDATE nsse_session SET expiration = 1093541296, value = 'sessIdUser|s:5:\"31991\";sessPriv|s:1:\"u\";se\
datapoint'# ssCustId|s:5:\"14688\";sessSchoolId|s:5:\"14781\";sessUserName|s:6:\"gcruse\";sessImgVersion|i:2;sessAccessPrivs|N;sessAccessSchools|a:1:{i:14781;s:\
datapoint'# 5:\"14781\";}sessSort|a:0:{}!sessBack|sessDPVersion|s:4:\"full\";sessDataSource|s:4:\"live\";sessUserMeasures|s:14:\"1948,1913,1703\";sessUserGroups\
datapoint'# |s:84:\"3074,3113,3144,3179,3223,3263,5968,5997,6011,6027,6035,6046,6063,6076,6087,6105,6116\";!sessGroupFilter|' WHERE id_session = '955af0272896ba\
datapoint'# a67956494dcb30a5fe' AND expiration > 1093441296;
NOTICE:  QUERY PLAN:

Index Scan using nsse_session_pkey on nsse_session  (cost=0.00..8.16 rows=3 width=50)

EXPLAIN

So, an index scan, as it should be. When I check the postgres log for the query stats, I find this:

QUERY STATISTICS
! system usage stats:
!	0.012477 elapsed 0.000000 user 0.000000 system sec
!	[0.070000 user 0.000000 sys total]
!	0/0 [0/0] filesystem blocks in/out
!	3/1 [1044/309] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!	0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks:     0 read,     0 written, buffer hit rate = 100.00%
!	Local  blocks:     0 read,     0 written, buffer hit rate = 0.00%
!	Direct blocks:     0 read,     0 written

QUERY STATISTICS
! system usage stats:
!	1878.708691 elapsed 194.000000 user 1.180000 system sec
!	[194.020000 user 1.190000 sys total]
!	0/0 [0/0] filesystem blocks in/out
!	36441/3233 [37108/3496] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!	0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks: 16863 read,   125 written, buffer hit rate = 99.99%
!	Local  blocks:     0 read,     0 written, buffer hit rate = 0.00%
!	Direct blocks:     0 read,     0 written

Two sets of stats, of which the first looks OK, and the second is terrible. Virtually all of the long queries I've seen so far display this behavior: a double set of stats, with the second one looking like bad news. The second set seems to show a very elevated level of page faults. (What are these page faults against? They seem not to be against the pg shared buffers -- are they against the OS disk cache?)

I did hear a suggestion that it's possible that a few very expensive queries are blocking others (if they're writing) and that I see the effects mostly on these session updates, because they're so frequent. I guess that's possible, but I would like to know more about what the above stats might mean. Some of these session writes are fairly heavy -- we carry a lot of data in the session, often multiple tens of K of text going into one column. Is this a stats problem, a symptom of insufficient vacuuming?

Relevant params: postgres 7.1.3 running on RedHat ES 2.1. SHMMAX set to 800000000. (Physical RAM is 3 gig.) Shared buffers for this install = 20000, sort mem is 8 meg, max connections = 50. (I don't think it's pure concurrent load; it happened during a user training session with only about 30 users.) The other pg install on the box is similarly configured.
Just to complete the data dump, here's the output of ipcs:

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e6a9 3538944    postgres   600        167649280  2
0x0052e2c1 3473409    postgres   600        170762240  2

------ Semaphore Arrays --------
key        semid      owner      perms      nsems      status
0x0052e6a9 11567104   postgres   600        17
0x0052e6aa 11599873   postgres   600        17
0x0052e6ab 11632642   postgres   600        17
0x0052e6ac 11665411   postgres   600        17
0x0052e2c1 10878980   postgres   600        17
0x0052e2c2 10911749   postgres   600        17
0x0052e2c3 10944518   postgres   600        17
0x0052e2c4 10977295   postgres   600        17
0x0052e2c5 11010064   postgres   600        17
0x0052e2c6 11042835   postgres   600        17
0x0052e2c7 11075604   postgres   600        17
0x0052e2c8 11108373   postgres   600        17
0x0052e2c9 11141142   postgres   600        17
0x0052e2ca 11173911   postgres   600        17
0x0052e2cb 11206680   postgres   600        17
0x0052e2cc 11239449   postgres   600        17
0x0052e2cd 11272218   postgres   600        17
0x0052e2ce 11304987   postgres   600        17
0x0052e2cf 11337756   postgres   600        17
0x0052e2d0 11370525   postgres   600        17
0x0052e2d1 11403294   postgres   600        17

------ Message Queues --------
key        msqid      owner      perms      used-bytes  messages

Sorry, I know that's a ton of data, but the problem is urgent and I'm very perplexed.

For comparison, on the original unified server, I had max connections of 256 (as opposed to 50 on the new one -- could this be the problem?), sort_mem of 6000, shared buffers of 20000. (Overall, shared buffers have doubled on the new machine because I've devoted that much to each instance.)

-- sgl
Steve Lane <slane@soliantconsulting.com> writes:
> I have enabled pretty extensive query logging on the 7.1.3 instance. What I
> see is that certain queries take hideously long.

Tell us about foreign keys associated with the table being updated? An UPDATE would fire triggers for both referencing and referenced keys ...

I'm suspecting an unindexed or datatype-incompatible foreign key column.

			regards, tom lane
> From: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Wed, 25 Aug 2004 15:16:32 -0400
> To: Steve Lane <slane@soliantconsulting.com>
> Cc: pgsql-admin@postgresql.org, Jamie Thomas <jthomas@soliantconsulting.com>
> Subject: Re: [ADMIN] Odd query behavior [urgent, but long]
>
> Steve Lane <slane@soliantconsulting.com> writes:
>> I have enabled pretty extensive query logging on the 7.1.3 instance. What I
>> see is that certain queries take hideously long.
>
> Tell us about foreign keys associated with the table being updated?
> An UPDATE would fire triggers for both referencing and referenced keys ...
>
> I'm suspecting an unindexed or datatype-incompatible foreign key column.

Hi Tom:

Thanks. There are, unfortunately, no foreign keys or triggers anywhere in the db that I know of. And this query is not always slow, just occasionally.

-- sgl
OK, well I may have some more useful information. The queries that blow up seem, by and large, to be updates to the session table. Again, some of these updates are big: I measured one at 50K today. I suspected that insufficient vacuuming might be involved. I did a full vacuum and got this for the session table:

NOTICE:  --Relation nsse_session--
NOTICE:  Pages 3544: Changed 11, reaped 696, Empty 0, New 0; Tup 26121: Vac 37327, Keep/VTL 0/0, Crash 0, UnUsed 107, MinLen 132, MaxLen 2032; Re-using: Free/Avail. Space 5345328/5336916; EndEmpty/Avail. Pages 0/684. CPU 0.08s/0.03u sec.
NOTICE:  Index nsse_session_pkey: Pages 1435; Tuples 26121: Deleted 36871. CPU 0.07s/0.13u sec.
NOTICE:  Rel nsse_session: Pages: 3544 --> 2873; Tuple(s) moved: 13. CPU 0.02s/0.06u sec.
NOTICE:  Index nsse_session_pkey: Pages 1435; Tuples 26121: Deleted 13. CPU 0.00s/0.01u sec.
NOTICE:  --Relation pg_toast_3521195--
NOTICE:  Pages 83872: Changed 6, reaped 81999, Empty 0, New 0; Tup 8100: Vac 327763, Keep/VTL 0/0, Crash 0, UnUsed 88, MinLen 45, MaxLen 2034; Re-using: Free/Avail. Space 668306776/668306512; EndEmpty/Avail. Pages 0/81954. CPU 3.22s/0.30u sec.
NOTICE:  Index pg_toast_3521195_idx: Pages 1668; Tuples 8100: Deleted 327763. CPU 0.33s/1.07u sec.
NOTICE:  Rel pg_toast_3521195: Pages: 83872 --> 1933; Tuple(s) moved: 15. CPU 10.16s/4.87u sec.
NOTICE:  Index pg_toast_3521195_idx: Pages 1668; Tuples 8100: Deleted 15. CPU 0.05s/0.00u sec.

Now granted, this was after I had written scripts to perform that 50K update about 10-15K times -- I did this just to check the degradation in insert performance on the unvacuumed table, and as expected I saw a slow, steady degradation in insert time, but nothing catastrophic. Clearly the toast table for nsse_session is big, and the table is very frequently accessed.
This seems like a recipe for page faults of some kind, but it seems this could lead to two opposite conclusions:

1) lower shared buffers, in case shared buffers are starving the OS disk-caching buffers
2) raise shared buffers, so as to get the whole session/toast table in memory

??

-- sgl

> From: Steve Lane <slane@soliantconsulting.com>
> Date: Wed, 25 Aug 2004 14:26:52 -0500
> To: Tom Lane <tgl@sss.pgh.pa.us>
> Cc: <pgsql-admin@postgresql.org>, Jamie Thomas <jthomas@soliantconsulting.com>
> Subject: Re: [ADMIN] Odd query behavior [urgent, but long]
>
>> From: Tom Lane <tgl@sss.pgh.pa.us>
>> Date: Wed, 25 Aug 2004 15:16:32 -0400
>> To: Steve Lane <slane@soliantconsulting.com>
>> Cc: pgsql-admin@postgresql.org, Jamie Thomas <jthomas@soliantconsulting.com>
>> Subject: Re: [ADMIN] Odd query behavior [urgent, but long]
>>
>> Steve Lane <slane@soliantconsulting.com> writes:
>>> I have enabled pretty extensive query logging on the 7.1.3 instance. What I
>>> see is that certain queries take hideously long.
>>
>> Tell us about foreign keys associated with the table being updated?
>> An UPDATE would fire triggers for both referencing and referenced keys ...
>>
>> I'm suspecting an unindexed or datatype-incompatible foreign key column.
>
> Hi Tom:
>
> Thanks. There are, unfortunately, no foreign keys or triggers anywhere in
> the db that I know of. And this query is not always slow, just occasionally.
>
> -- sgl
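One footnote on the vacuuming angle: 7.1 has no autovacuum (that arrived much later) and, to my understanding, no lazy vacuum either -- concurrent, non-blocking VACUUM only appeared in 7.2. So a high-churn table like this session table generally needs its own frequent, targeted vacuum scheduled from outside (e.g. cron), run against just the hot table during quiet moments rather than as a database-wide pass. A minimal sketch, using the table name from this thread (the schedule itself is an assumption, not something from the thread):

```sql
-- Run frequently (e.g. hourly via cron + psql) against only the
-- high-churn table; this also vacuums its TOAST table and refreshes
-- planner statistics. On 7.1 this blocks writers while it runs.
VACUUM ANALYZE nsse_session;
```

Keeping the dead-tuple count low between runs is what prevents the toast table from ballooning to the 83872 pages seen in the vacuum output above.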
My odd "double" queries continue. On the theory that I had some kind of page-faulting issue tied into large, frequent updates of a table, I vacuumed the whole database and began to watch it closely this morning. Already, after very little activity, I get this in the log:

2004-08-26 07:01:26 [22056] DEBUG:  query: INSERT INTO nsse_session (id_user, id_session, expiration, value) VALUES (<values>)
2004-08-26 07:01:26 [22056] ERROR:  Cannot insert a duplicate key into unique index nsse_session_pkey
2004-08-26 07:01:26 [22056] DEBUG:  query: UPDATE nsse_session SET <values> WHERE id_session = '32845b87350846bf51176079c19f2fb4'

EXECUTOR STATISTICS
! system usage stats:
!	0.000183 elapsed 0.000000 user 0.000000 system sec
!	[0.000000 user 0.030000 sys total]
!	0/0 [0/0] filesystem blocks in/out
!	2/4 [414/257] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!	0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks:     0 read,     0 written, buffer hit rate = 100.00%
!	Local  blocks:     0 read,     0 written, buffer hit rate = 0.00%
!	Direct blocks:     0 read,     0 written

EXECUTOR STATISTICS
! system usage stats:
!	10.626398 elapsed 10.490000 user 0.010000 system sec
!	[10.510000 user 0.010000 sys total]
!	0/0 [0/0] filesystem blocks in/out
!	261/10 [691/281] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!	0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks:     0 read,     0 written, buffer hit rate = 100.00%
!	Local  blocks:     0 read,     0 written, buffer hit rate = 0.00%
!	Direct blocks:     0 read,     0 written

Same pattern as before -- the same query seems to get run twice, the second time ridiculously long, with a high number of page faults. (Again, this is a straightforward query involving a single index scan, and the planner says as much.)
I realize no one may have any insight into the whole problem, but can anyone at least tell me the significance of each of these numbers from the stats:

!	261/10 [691/281] page faults/reclaims, 0 [0] swaps

That might give me an idea of where to begin.

-- sgl
Steve Lane <slane@soliantconsulting.com> writes:
> I realize no one may have any insight into the whole problem, but can anyone
> at least tell me the significance of each of these numbers from the stats:
> 261/10 [691/281] page faults/reclaims, 0 [0] swaps

It's just repeating what getrusage() told it. See the docs for your operating system.

It looks like the numbers inside the square brackets are totals since process start, while the ones before are the delta since query start.

			regards, tom lane
Okay, that's helpful. I found the relevant code in /backend/tcop/postgres.c and it's as you say, process start vs. query start ...

OK, so these numbers are OS-level page faults. Can you help me understand what a page fault on an update would mean? Is it that the disk page containing some or all of the row to be updated is not found in the OS cache?

Lastly, any idea why I would see executor stats twice (or, in one case this morning, three times) in a row? Is it actually re-running the query?

Here's one last oddness: the vast bulk of these problem queries occur when we're updating a session table, where our logic is:

1) Send the query as an INSERT
2) See if it fails
3) If so, resend as an UPDATE

The initial INSERTs never show this kind of problem, only the subsequent UPDATEs. Is there something in our pattern that somehow increases the possibility of a cache miss? The example of sequential reads on a non-interleaved hard drive comes to mind, but I can't imagine that that's useful except in some vague metaphorical way ...

-- sgl

> From: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Thu, 26 Aug 2004 10:51:21 -0400
> To: Steve Lane <slane@soliantconsulting.com>
> Cc: pgsql-admin@postgresql.org, Jamie Thomas <jthomas@soliantconsulting.com>
> Subject: Re: [ADMIN] Odd double queries continues
>
> Steve Lane <slane@soliantconsulting.com> writes:
>> I realize no one may have any insight into the whole problem, but can anyone
>> at least tell me the significance of each of these numbers from the stats:
>> 261/10 [691/281] page faults/reclaims, 0 [0] swaps
>
> It's just repeating what getrusage() told it. See the docs for your
> operating system.
>
> It looks like the numbers inside the square brackets are totals since
> process start, while the ones before are the delta since query start.
>
> 			regards, tom lane
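An incidental observation on the insert-then-update logic described above: since most page views touch a session that already exists, the INSERT is guaranteed to fail and the error path becomes the common case. Reversing the order avoids paying for a doomed INSERT (and the resulting log noise) on every page view. A sketch, with the table and column names taken from this thread and placeholder literals standing in for the real values -- this doesn't explain the page faults, it's only a tidier version of the same pattern:

```sql
-- Try the common case first: update the existing session row.
UPDATE nsse_session
   SET expiration = 1093541296, value = '<serialized session>'
 WHERE id_session = '<session id>';

-- Only if the UPDATE reports 0 rows affected does the session not
-- exist yet; fall back to creating it.
INSERT INTO nsse_session (id_user, id_session, expiration, value)
VALUES ('<user id>', '<session id>', 1093541296, '<serialized session>');
```

(PostgreSQL of this vintage has no built-in atomic upsert, so under concurrency the fallback INSERT can still occasionally collide; the application has to be prepared to retry.)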