Re: performance modality in 7.1 for large text attributes? - Mailing list pgsql-hackers
From | Alex Pilosov |
---|---|
Subject | Re: performance modality in 7.1 for large text attributes? |
Date | |
Msg-id | Pine.BSO.4.10.10012190920331.559-100000@spider.pilosoft.com Whole thread Raw |
In response to | performance modality in 7.1 for large text attributes? (Paul A Vixie <vixie@mfnx.net>) |
Responses |
Re: performance modality in 7.1 for large text attributes?
|
List | pgsql-hackers |
Paul, 1) Have you ran vacuum analyze after all these inserts to update database statistics? :) Without vacuum, pgsql will opt to table scan even when there's an index. 2) I'm not sure if you are executing pgcat 70k times or executing inner loop in pgcat 70k times. Postgres connection establishment is expensive. 3) Postgres INSERT is not very efficient if you are doing a bulk load of data (it has to reparse the statement every time). If you want to delete everything and load new data, use "COPY", which is about 5 times faster. Also, there's a patch by someone to do following: INSERT INTO (fields...) VALUES (...), (...), (...), which results in parsing the statement only once. Oh...And since I have your attention, could you please resolve long-standing discussion between me and Tom Lane? :) Question is whether proper (standard/most-commonly-used) format for printing CIDR network address is 10/8 or 10.0.0.0/8 (i.e. should all octets be printed even if they are 0). After search of RFCs, there's nothing that specifies the standard, but 10.0.0.0/8 is used more often in examples than 10/8 form. Postgres uses 10/8 form, and I'm saying that 10.0.0.0/8 is more accepted by everyone else. (I.E. all software can deal with that, but not all software accepts 10/8). -alex On Mon, 18 Dec 2000, Paul A Vixie wrote: > (plz cc me on your replies, i'm not on pgsql-hackers for some reason.) > > http://www.vix.com/~vixie/results-psql.png shows a gnuplot of the wall time > of 70K executions of "pgcat" (shown below) using a CIDR key and TEXT value. > (this is for storing the MAPS RSS, which we presently have in flat files.) > > i've benchmarked this against a flat directory with IP addresses as filenames, > and against a deep directory with squid/netnews style hashing (127/0/0/1.txt) > and while it's way more predictable than either of those, there's nothing in > my test framework which explains the 1.5s mode shown in the above *.png file. > > anybody know what i could be doing wrong? (i'm also wondering why SELECT > takes ~250ms whereas INSERT takes ~70ms... seems counterintuitive, unless > TOAST is doing a LOT better than i think.) > > furthermore, are there any plans to offer a better libpq interface to INSERT? > the things i'm doing now to quote the text, and the extra copy i'm maintaining, > are painful. arbitrary-sized "text" attributes are a huge boon -- we would > never have considered using postgres for MAPS RSS (or RBL) with "large > objects". (kudos to all who were involved, with both WAL and TOAST!) > > here's the test jig -- please don't redistribute it yet since there's no man > page and i want to try binary cursors and other things to try to speed it up > or clean it up or both. but if someone can look at my code (which i'm running > against the 7.1 bits at the head of the pgsql cvs tree) and at the *.png file > and help me enumerate the sources of my stupidity, i will be forever grateful.
pgsql-hackers by date: