Thread: SELECTs take a long time compared to other DBMSs
I have a table with 102,384 records in it; each record is 934 bytes.

Using the following select statement:
    SELECT * FROM <table>

PG Info: version 7.3.4 under cygwin on Windows 2000
ODBC: version 7.3.100
Machine: 500 MHz / 512 MB RAM / IDE HDD

Under PG:         Data is returned in 26 secs!!
Under SQL Server: Data is returned in 5 secs.
Under SQLBase:    Data is returned in 6 secs.
Under SAPDB:      Data is returned in 7 secs.

This is the ONLY table in the database and there is only 1 user. And yes, I did a vacuum.

Is this normal behavior for PG?

Thanks
Hi,

> And yes I did a vacuum.

Did you 'Analyze' too?

Cheers
Rudi.
> Under PG:         Data is returned in 26 secs!!
> Under SQL Server: Data is returned in 5 secs.
> Under SQLBase:    Data is returned in 6 secs.
> Under SAPDB:      Data is returned in 7 secs.

What did you use as the client? Do those times include ALL resulting data or
simply the first few lines?

PostgreSQL performance on Windows (via Cygwin) is known to be poor. Do you
receive similar results with 7.4 beta 2?
Yes, I Analyzed also, but there was no need to because it was a fresh,
brand-new database.

"Rudi Starcevic" <rudi@oasis.net.au> wrote:
> Did you 'Analyze' too?
Hi,

> Yes, I Analyzed also, but there was no need to because it was a fresh,
> brand-new database.

Hmm ... sorry, I'm not sure then. I only use Linux with PG.

Even though it's 'brand new' you still need to Analyze so that any Indexes
etc. are built.

I'll keep an eye on this thread - good luck.

Regards
Rudi.
All queries were run on the SERVER for all of the databases I tested.

This is all resulting data for all of the databases that I tested.

"Rod Taylor" <rbt@rbt.ca> wrote:
> What did you use as the client? Do those times include ALL resulting data
> or simply the first few lines?
On Wed, 2003-09-03 at 21:32, Rudi Starcevic wrote:
> Hmm ... sorry, I'm not sure then. I only use Linux with PG.
> Even though it's 'brand new' you still need to Analyze so that any
> Indexes etc. are built.

ANALYZE doesn't build indexes; it only updates the statistics used by the
query optimizer (and in any case, "select * from <foo>" has only one
reasonable query plan anyway).

-Neil
Quoth "Relaxin" <me@yourhouse.com>: > Yes I Analyze also, but there was no need to because it was a fresh > brand new database. That is _absolutely not true_. It is not true with any DBMS that uses a cost-based optimizer. Cost-based optimizers need some equivalent to ANALYZE in order to collect statistics to allow them to pick any path other than a sequential scan. In this particular case, a seq scan is pretty likely to be the best answer when there is no WHERE clause on the query. Actually, it doesn't make all that much sense that the other systems would be terribly much faster, because they obviously need to do some processing on 102,384 records. Can you tell us what you were *actually* doing? Somehow it sounds as though the other databases were throwing away the data whereas PostgreSQL was returning it all "kawhump!" in one batch. What programs were you using to submit the queries? -- let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;; http://cbbrowne.com/info/oses.html "Computers let you make more mistakes faster than any other invention in human history, with the possible exception of handguns and tequila." -- Mitch Radcliffe
In the last exciting episode, "Relaxin" <me@yourhouse.com> wrote:
> All queries were run on the SERVER for all of the databases I tested.

Queries obviously run "on the server." That's kind of the point of the
database system being a "client/server" system. The question is what client
program(s) you used to process the result sets.

I'd be surprised to see any client process 100K records in any meaningful
way in much less than 30 seconds. Rendering that much data into a console
will take some time. Drawing it into cells on a GUI window will take a lot
more time.

Supposing you were using a graphical client, it would be unsurprising for it
to have submitted something equivalent to "limit 30 rows" (or whatever you
can display on screen), and deferred further processing 'til later. If that
were the case, then 26s to process the whole thing would be a lot more
efficient than 5-6s to process a mere 30 rows...

> This is all resulting data for all of the databases that I tested.

You seem to have omitted "all resulting data."
--
http://www.ntlug.org/~cbbrowne/sap.html
"Women who seek to be equal to men lack ambition." -- Timothy Leary
> Yes, I Analyzed also, but there was no need to because it was a fresh,
> brand-new database.

This apparently wasn't the source of the problem since he did an analyze
anyway, but my impression was that a fresh, brand-new database is exactly
the situation where an analyze is needed - i.e. a batch of data has just
been loaded and stats haven't been collected yet.

Am I mistaken?

-Nick
"Nick Fankhauser" <nickf@ontko.com> writes: > This apparently wasn't the source of problem since he did an analyze anyway, > but my impression was that a fresh brand new database is exactly the > situation where an analyze is needed- ie: a batch of data has just been > loaded and stats haven't been collected yet. Indeed. But as someone else already pointed out, a seqscan is the only reasonable plan for an unqualified "SELECT whatever FROM table" query; lack of stats wouldn't deter the planner from arriving at that conclusion. My guess is that the OP is failing to account for some client-side inefficiency in absorbing a large query result. regards, tom lane
> Can you tell us what you were *actually* doing? Somehow it sounds as
> though the other databases were throwing away the data whereas
> PostgreSQL was returning it all "kawhump!" in one batch.

All of the databases that I tested the query against gave me immediate
access to ANY row of the resultset once the data had been returned.

Ex. If I'm currently at the first row and then want to go to the 100,000th
row, I'm there immediately, and if I then want to go to the 5th row...same
thing, I have the record immediately!

The other databases I tested against stored the entire resultset on the
Server; I'm not sure what PG does...it seems that it brings the entire
resultset client side.

If that is the case, how can I have PG store the resultset on the Server AND
still allow me immediate access to ANY row in the resultset?

> What programs were you using to submit the queries?

I used the same program for all of the databases. I was using ODBC for
connectivity.
On 4 Sep 2003 at 0:48, Relaxin wrote:
> The other databases I tested against stored the entire resultset on the
> Server; I'm not sure what PG does...it seems that it brings the entire
> resultset client side.
> If that is the case, how can I have PG store the resultset on the Server
> AND still allow me immediate access to ANY row in the resultset?

You can use a cursor and get only the required rows.

Bye
Shridhar

--
Nick the Greek's Law of Life: All things considered, life is 9 to 5 against.
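(A minimal sketch of the cursor approach Shridhar is describing, assuming a
hypothetical table named "bigtable"; inside a transaction you declare the
cursor and then pull rows in batches instead of all at once:)

    BEGIN;
    -- the result set stays on the server; only fetched rows cross the wire
    DECLARE big_cur CURSOR FOR SELECT * FROM bigtable;

    FETCH 100 FROM big_cur;   -- first 100 rows
    FETCH 100 FROM big_cur;   -- next 100 rows

    CLOSE big_cur;
    COMMIT;

The client sees the first rows almost immediately and only ever pays for the
rows it actually asks for.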
All rows are required.

"Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> wrote:
> You can use a cursor and get only the required rows.
Relaxin wrote on Thu, 04.09.2003 at 03:28:
> I have a table with 102,384 records in it; each record is 934 bytes.

I created a test database on my Linux (RH9) laptop with a 30GB/4200RPM IDE
drive and a P3-1133MHz, 768MB, populated it with 128000 rows of 930 bytes
each and did:

    [hannu@fuji hannu]$ time psql test100k -c 'select * from test' > /dev/null

    real    0m3.970s
    user    0m0.980s
    sys     0m0.570s

so it seems definitely not a problem with postgres as such, but perhaps with
Cygwin and/or the ODBC driver.

I also ran the same query using the "standard" pg adapter:

    >>> import pg, time
    >>>
    >>> con = pg.connect('test100k')
    >>>
    >>> def getall():
    ...     t1 = time.time()
    ...     res = con.query('select * from test')
    ...     t2 = time.time()
    ...     list = res.getresult()
    ...     t3 = time.time()
    ...     print t2 - t1, t3 - t2
    ...
    >>> getall()
    3.27637195587 1.10105705261
    >>> getall()
    3.07413101196 0.996125936508
    >>> getall()
    3.03377199173 1.07322502136

which gave similar results.

------------------------------
Hannu
So after you did that, were you able to position to ANY record within the
resultset?

Ex. Position 100,000; then to position 5; then to position 50,000, etc...

If you are able to do that and have your positioned row available to you
immediately, then I'll believe that it's the ODBC driver.

"Hannu Krosing" <hannu@tm.ee> wrote:
> so it seems definitely not a problem with postgres as such, but perhaps
> with Cygwin and/or the ODBC driver
You forgot that the original poster's query was:
    SELECT * from <table>

This should require a simple table scan. NO need for stats.
Either the table has not been properly vacuumed or he's got seq_scan off...

JLL

Nick Fankhauser wrote:
> > Yes, I Analyzed also, but there was no need to because it was a fresh,
> > brand-new database.
>
> This apparently wasn't the source of the problem since he did an analyze
> anyway, but my impression was that a fresh, brand-new database is exactly
> the situation where an analyze is needed - i.e. a batch of data has just
> been loaded and stats haven't been collected yet.
>
> Am I mistaken?
>
> -Nick
The table has been Vacuumed and seq_scan is turned on.

"Jean-Luc Lachance" <jllachan@nsd.ca> wrote:
> This should require a simple table scan. NO need for stats.
> Either the table has not been properly vacuumed or he's got seq_scan off...
Relaxin wrote on Thu, 04.09.2003 at 17:35:
> So after you did that, were you able to position to ANY record within the
> resultset?
>
> Ex. Position 100,000; then to position 5; then to position 50,000, etc...

Not in the case of:
    time psql test100k -c 'select * from test' > /dev/null
as the whole result would be written to /dev/null (i.e. discarded).

Yes in the case of python: after doing

    res = con.query('select * from test')   # 3 sec - perform query
    list = res.getresult()                  # 1 sec - construct list of tuples

the whole 128k records are in a python list, so that I can immediately
access any record by python list syntax, i.e. list[5], list[50000] etc.

> If you are able to do that and have your positioned row available to you
> immediately, then I'll believe that it's the ODBC driver.

It can also be the Cygwin port, which is known to have several problems, and
if you run both your client and server on the same machine, then it can also
be an interaction of the two processes (cygwin/pgsql server and native win32
ODBC client) not playing together very well.

-------------------
Hannu
I had these same issues with the PeerDirect version also.

"Hannu Krosing" <hannu@tm.ee> wrote:
> It can also be the Cygwin port, which is known to have several problems,
> and if you run both your client and server on the same machine, then it
> can also be an interaction of the two processes (cygwin/pgsql server and
> native win32 ODBC client) not playing together very well.
Relaxin wrote:
> I have a table with 102,384 records in it; each record is 934 bytes.
>
> Using the following select statement:
>     SELECT * FROM <table>
>
> PG Info: version 7.3.4 under cygwin on Windows 2000
> ODBC: version 7.3.100
> Machine: 500 MHz / 512 MB RAM / IDE HDD
>
> Under PG:         Data is returned in 26 secs!!
> Under SQL Server: Data is returned in 5 secs.
> Under SQLBase:    Data is returned in 6 secs.
> Under SAPDB:      Data is returned in 7 secs.

I created a similar table (934 bytes, 102K records) on a slightly faster
machine: P3/800 + 512MB RAM + IDE HD. The server OS is Solaris 8 x86 and the
version is 7.3.3.

On the server (via psql client): 7.5 seconds
Using ODBC under VFPW:           10.5 seconds

How that translates to what you should see, I'm not sure. Assuming it was
just the CPU difference, you should see numbers of roughly 13 seconds. But
the documentation says PG under Cygwin is significantly slower than PG under
Unix, so your mileage may vary...

Have you changed any of the settings yet in postgresql.conf, specifically
the shared_buffers setting?
On Wed, 3 Sep 2003, Relaxin wrote:
> PG Info: version 7.3.4 under cygwin on Windows 2000
> Machine: 500 MHz / 512 MB RAM / IDE HDD
>
> Under PG:         Data is returned in 26 secs!!
> Under SQL Server: Data is returned in 5 secs.
> Under SQLBase:    Data is returned in 6 secs.
> Under SAPDB:      Data is returned in 7 secs.

This is typical of postgresql under cygwin; it's much faster under a Unix OS
like Linux or BSD. That said, you CAN do some things to help speed it up,
the biggest being tuning shared_buffers to be something large enough to hold
a fair bit of data. Set shared_buffers to 1000, restart, and see if things
get better.

Running Postgresql in a unix emulation layer is guaranteed to make it slow.
If you've got a spare P100 with 128 Meg of RAM you can throw RedHat 9 or
FreeBSD 4.7 on and run Postgresql on, it will likely outrun your 500 MHz
cygwin box, and might even keep up with the other databases on that machine
as well.
You would "get" all rows, but they'd be stored server side until your client asked for them. I.e. a cursor would level the field here, since you say that the other test cases stored the entire result set on the server. Or did I misunderstand what you meant there? On Thu, 4 Sep 2003, Relaxin wrote: > All rows are required. > > ""Shridhar Daithankar"" <shridhar_daithankar@persistent.co.in> wrote in > message news:3F573E8B.31916.A1063F8@localhost... > > On 4 Sep 2003 at 0:48, Relaxin wrote: > > > All of the databases that I tested the query against gave me immediate > > > access to ANY row of the resultset once the data had been returned. > > > Ex. If I'm currently at the first row and then wanted to goto the > 100,000 > > > row, I would be there immediately, and if I wanted to then goto the 5 > > > row...same thing, I have the record immediately! > > > > > > The other databases I tested against stored the entire resultset on the > > > Server, I'm not sure what PG does...It seems that brings the entire > > > resultset client side. > > > If that is the case, how can I have PG store the resultset on the Server > AND > > > still allow me immediate access to ANY row in the resultset? > > > > You can use a cursor and get only required rows. > > > > > > Bye > > Shridhar > > > > -- > > Nick the Greek's Law of Life: All things considered, life is 9 to 5 > against. > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html >
> Have you changed any of the settings yet in postgresql.conf,
> specifically the shared_buffers setting?

fsync = false
tcpip_socket = true
shared_buffers = 128
I reset the shared_buffers to 1000 from 128, but it made no difference.

"scott.marlowe" <scott.marlowe@ihs.com> wrote:
> Set shared_buffers to 1000, restart, and see if things get better.
A long time ago, in a galaxy far, far away, "Relaxin" <noname@spam.com> wrote:
>> Have you changed any of the settings yet in postgresql.conf,
>> specifically the shared_buffers setting?
>
> fsync = false
> tcpip_socket = true
> shared_buffers = 128

Change fsync to true (you want your data to survive, right?) and increase
shared buffers to something that represents ~10% of your system memory, in
blocks of 8K.

So, if you have 512MB of RAM, then the total blocks is 65536, and it would
likely be reasonable to increase shared_buffers to 1/10 of that, or about
6500.

What is the value of effective_cache_size? That should probably be increased
a whole lot, too. If you are mainly just running the database on your
system, then it would be reasonable to set it to most of memory, or
    (* 1/2 (/ (* 512 1024 1024) 8192))
    32768

None of this is likely to substantially change the result of that one query,
however, and it seems quite likely that it is because PostgreSQL is honestly
returning the whole result set of ~100K rows at once, whereas the other
DBMSes are probably using cursors to return only the few rows of the result
that you actually looked at.
--
"cbbrowne","@","cbbrowne.com"
http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
Rules of the Evil Overlord #14. "The hero is not entitled to a last kiss, a
last cigarette, or any other form of last request."
<http://www.eviloverlord.com/>
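(Putting those suggestions together, the relevant postgresql.conf lines for a
512MB machine might look roughly like the following - a sketch based on the
numbers above, not tuned values for any particular workload:)

    fsync = true                     # keep the data crash-safe
    shared_buffers = 6500            # ~10% of RAM, in 8K blocks
    effective_cache_size = 32768     # ~half of RAM, in 8K blocks

A server restart is needed for the shared_buffers change to take effect.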
Thank you Christopher.

> Change fsync to true (you want your data to survive, right?) and
> increase shared buffers to something that represents ~10% of your
> system memory, in blocks of 8K.

I turned it off just in the hope that things would run faster.

> None of this is likely to substantially change the result of that one
> query, however, and it seems quite likely that it is because
> PostgreSQL is honestly returning the whole result set of ~100K rows at
> once, whereas the other DBMSes are probably using cursors to return
> only the few rows of the result that you actually looked at.

Finally, someone who will actually assume/admit that it is returning the
entire result set to the client, whereas other DBMSs manage the records at
the server.

I hope PG can fix/enhance this issue.

There are several issues that are stopping our company from going with PG
(with paid support, if available), but this seems to be the big one at the
top of the list.

The next one is the handling of BLOBs. PG handles them like no other system
I have ever come across.

After that is a native Windows port, but we would deal with cygwin (for a
very little while) if these other issues were handled.

Thanks
On Thu, 2003-09-04 at 22:13, Relaxin wrote:
> Finally, someone who will actually assume/admit that it is returning the
> entire result set to the client, whereas other DBMSs manage the records
> at the server.

Is there a reason you can't use cursors (explicitly, or via ODBC if it
provides some glue on top of them) to keep the result set on the server?

http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-declare.html
http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-fetch.html

> The next one is the handling of BLOBs. PG handles them like no other
> system I have ever come across.

Just FYI, you can use both the lo_*() functions, as well as simple
bytea/text columns (which can be very large in PostgreSQL).

-Neil
> Is there a reason you can't use cursors (explicitly, or via ODBC if it
> provides some glue on top of them) to keep the result set on the server?
>
> http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-declare.html
> http://candle.pha.pa.us/main/writings/pgsql/sgml/sql-fetch.html

I can only use generally accepted forms of connectivity (i.e. ODBC, ADO or
OLEDB). This is what many of the people on the Windows side are going to
need, because most of us are going to be converting from an existing,
already established system, such as Oracle, SQL Server or DB2, all of which
have 1 or more of the 3 mentioned above.

> > The next one is the handling of BLOBs. PG handles them like no other
> > system I have ever come across.
>
> Just FYI, you can use both the lo_*() functions, as well as simple
> bytea/text columns (which can be very large in PostgreSQL).
>
> -Neil

I know PG has an ODBC driver (that's all I've been using), but it or PG just
doesn't handle BLOBs the way people on the Windows side (don't know about
Unix) are used to.

There is this conversion to octets that must be performed on the data. I
don't understand why; I guess there was a reason for it long ago, but it
seems that it could now be modified to just accept ANY byte you give it and
then store it without any manipulation of the data.

This would make Postgresql much more portable for Windows developers...no
need for any special handling for a data type that all large RDBMSs support.

Thanks
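(For reference, this is the "conversion to octets" being described: in the
7.x releases, binary data going into a bytea column through a plain SQL
literal has to be escaped byte by byte - a sketch with a hypothetical table
named "images":)

    CREATE TABLE images (id int, data bytea);

    -- each problem byte becomes a backslash plus three octal digits;
    -- the backslash is doubled because the string parser also consumes one
    INSERT INTO images VALUES (1, 'GIF89a\\000\\377\\377');

whereas the client APIs of most other databases let you bind a raw binary
value to a parameter without any such escaping.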
Hi,

I think the problem is the ODBC driver NOT using cursors properly, even if
it should. The database itself is not doing anything it shouldn't do; in
fact it has all the needed functionality to handle this request in a fast
and effective way - just like any other respectable RDBMS.

I don't know what ODBC driver you are using, and how it is configured - and
I never actually used PostgreSQL with ODBC myself. However, in the
applications I have developed we DO use 'standardized' DB access libraries,
which work on just about any DBMS you throw them at. In our development
system, which is running on a low-end dual P2-433MHz box with IDE drives, we
routinely test both simple queries like yours and more complex ones, which
at times return several hundred thousand (or sometimes even millions) of
rows. And processing time is, generally speaking, in range with what you are
seeing on the other DBMSes you have.

So if PG is indeed returning ALL the rows, it is because it is explicitly
told to by the ODBC driver, so you need to look there to find the problem.
Could there be some kind of connection parameter you are overlooking, or is
the driver too old? Just throwing out ideas here; most likely you have
already thought about it :)

Just thought I'd point out that this is NOT expected behaviour from PG
itself.

/Eirik

On Thu, 4 Sep 2003 21:59:01 -0700, "Relaxin" <noname@spam.com> wrote:
> I can only use generally accepted forms of connectivity (i.e. ODBC, ADO
> or OLEDB).
Relaxin,

I can't remember during this thread if you said you were using ODBC or not.
If you are, then your problem is with the ODBC driver. You will need to
check the Declare/Fetch box or you will definitely bring back the entire
recordset. For a small recordset this is not a problem, but the larger the
recordset, the slower the data is returned to the client. I played around
with the cache size on the driver and found a value between 100 and 200
provided good results.

HTH
Patrick Hatcher

"Relaxin" <noname@spam.com> wrote:
> Finally, someone who will actually assume/admit that it is returning the
> entire result set to the client, whereas other DBMSs manage the records
> at the server.
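(For what it's worth, the Declare/Fetch checkbox and the cache size Patrick
mentions can, if I recall correctly, also be set through the ODBC connection
string rather than the DSN dialog - the keyword names below are from memory,
so verify them against your driver's documentation:)

    DRIVER={PostgreSQL};SERVER=localhost;DATABASE=test;UID=user;PWD=secret;
    UseDeclareFetch=1;Fetch=100

With Declare/Fetch turned on, the driver wraps the query in a server-side
cursor and pulls rows down in batches of the given size instead of
materializing the whole result set on the client.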
Except that the Declare/Fetch option only creates a forward-only cursor; you
can't go backwards through the result set.

"Patrick Hatcher" <PHatcher@macys.com> wrote:
> If you are, then your problem is with the ODBC driver. You will need to
> check the Declare/Fetch box or you will definitely bring back the entire
> recordset.
On Fri, 2003-09-05 at 14:18, Relaxin wrote:
> Except that the Declare/Fetch option only creates a forward-only cursor;
> you can't go backwards through the result set.

No, DECLARE can create scrollable cursors; read the ref page again. This
functionality is much improved in PostgreSQL 7.4, though.

-Neil
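(To illustrate Neil's point, at the SQL level a scrollable cursor lets you
jump around the result set while it stays on the server - a minimal sketch
against a hypothetical table "bigtable", using the 7.4 syntax he refers to:)

    BEGIN;
    DECLARE big_cur SCROLL CURSOR FOR SELECT * FROM bigtable;

    FETCH ABSOLUTE 100000 FROM big_cur;   -- jump straight to row 100,000
    FETCH ABSOLUTE 5 FROM big_cur;        -- then back to row 5
    FETCH BACKWARD 10 FROM big_cur;       -- walk backwards from there

    CLOSE big_cur;
    COMMIT;

Whether that capability is actually exposed through a given ODBC driver is a
separate question, which is where the disagreement here comes from.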
It is forward-only in the ODBC driver.

"Neil Conway" <neilc@samurai.com> wrote:
> No, DECLARE can create scrollable cursors; read the ref page again. This
> functionality is much improved in PostgreSQL 7.4, though.