Thread: Large # of rows in query extremely slow, not using index
Does postgres cache the entire result set before it begins returning data to the client?

I have a table with ~8 million rows and I am executing a query which should return about ~800,000 rows. The problem is that as soon as I execute the query it absolutely kills my machine and begins swapping for 5 or 6 minutes before it begins returning results. Is postgres trying to load the whole query into memory before returning anything? Also, why would it choose not to use the index? It is properly estimating the # of rows returned. If I set enable_seqscan to off it is just as slow.

Running postgres 8.0 beta2 dev2

explain select * from island_history where date='2004-09-07' and stock='QQQ';
                                QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on island_history  (cost=0.00..266711.23 rows=896150 width=83)
   Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'QQQ'::text))
(2 rows)

Any help would be appreciated

--Stephen

             Table "public.island_history"
      Column      |          Type          | Modifiers
------------------+------------------------+-----------
 date             | date                   | not null
 stock            | character varying(6)   |
 time             | time without time zone | not null
 reference_number | numeric(9,0)           | not null
 message_type     | character(1)           | not null
 buy_sell_ind     | character(1)           |
 shares           | numeric(6,0)           |
 remaining_shares | numeric(6,0)           |
 price            | numeric(10,4)          |
 display          | character(1)           |
 match_number     | numeric(9,0)           | not null
Indexes:
    "island_history_pkey" PRIMARY KEY, btree (date, reference_number, message_type, "time", match_number)
    "island_history_date_stock_time" btree (date, stock, "time")
    "island_history_oid" btree (oid)
On Mon, 2004-09-13 at 20:51, Stephen Crowley wrote:
> Does postgres cache the entire result set before it begins returning
> data to the client?

Sometimes you need to be careful about how the client treats the data. For example, psql will size its column widths based on the width of the data returned, and PHP and Perl will retrieve and cache all of the rows if you request a row count ($sth->rows() or pg_num_rows($rset)).

You may find that using a cursor will help you out.

-- 
Rod Taylor <rbt [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc
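A minimal sketch of the cursor approach Rod suggests, using the table and query from Stephen's original message (the batch size of 1000 is only an illustrative value):

    BEGIN;
    DECLARE hist_cur CURSOR FOR
        SELECT * FROM island_history
        WHERE date = '2004-09-07' AND stock = 'QQQ';
    FETCH FORWARD 1000 FROM hist_cur;   -- repeat until no rows are returned
    CLOSE hist_cur;
    COMMIT;

Each FETCH pulls only a batch of rows across the wire, so neither the client library nor the application has to hold the full ~800,000-row result in memory at once.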
Stephen Crowley <stephen.crowley@gmail.com> writes:
> Does postgres cache the entire result set before it begins returning
> data to the client?

The backend doesn't, but libpq does, and I think JDBC does too.

I'd recommend using a cursor so you can FETCH a reasonable number of rows at a time.

> Also, why would it choose not to use the index?

Selecting 1/10th of a table is almost always a poor candidate for an index scan. You've got about 100 rows per page (assuming the planner's width estimate is credible) and so on average every page of the table has about ten rows that need to be picked up and returned. You might as well just seqscan and be sure you don't read any page more than once.

			regards, tom lane
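Tom's rows-per-page figure can be sanity-checked from the catalog. This is just an assumed way to eyeball it, using the table name from the thread; it needs a recent VACUUM or ANALYZE so relpages/reltuples are populated (and relpages must be non-zero):

    SELECT relpages, reltuples,
           round(reltuples / relpages) AS rows_per_page
      FROM pg_class
     WHERE relname = 'island_history';

With roughly 100 rows per page and ~10% of the rows wanted, almost every page of the table contains matching rows, which is why the planner prefers a single sequential pass over jumping around via the index.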
On Mon, 13 Sep 2004 21:11:07 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Crowley <stephen.crowley@gmail.com> writes:
> > Does postgres cache the entire result set before it begins returning
> > data to the client?
>
> The backend doesn't, but libpq does, and I think JDBC does too.
>
> I'd recommend using a cursor so you can FETCH a reasonable number of
> rows at a time.

That is incredible. Why would libpq do such a thing? JDBC as well? I know Oracle doesn't do anything like that; not sure about MySQL. Is there any way to turn it off? In this case I was just using psql, but I will be using JDBC for the app. About cursors, I thought a JDBC ResultSet WAS a cursor; am I mistaken?

Thanks,
Stephen
Stephen Crowley <stephen.crowley@gmail.com> writes:
> On Mon, 13 Sep 2004 21:11:07 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Stephen Crowley <stephen.crowley@gmail.com> writes:
>>> Does postgres cache the entire result set before it begins returning
>>> data to the client?
>>
>> The backend doesn't, but libpq does, and I think JDBC does too.

> That is incredible. Why would libpq do such a thing?

Because the API it presents doesn't allow for the possibility of query failure after having given you back a PGresult: either you have the whole result available with no further worries, or you don't. If you think it's "incredible", let's see you design an equally easy-to-use API that doesn't make this assumption.

(Now having said that, I would have no objection to someone extending libpq to offer an alternative streaming API for query results. It hasn't got to the top of anyone's to-do list though ... and I'm unconvinced that psql could use it if it did exist.)

			regards, tom lane
Problem solved.. I set the fetchSize to a reasonable value instead of the default of unlimited in the PreparedStatement and now the query behaves as expected. After some searching it seems this is a common problem; would it make sense to change the default value to something other than 0 in the JDBC driver?

If I get some extra time I'll look into libpq and see what is required to fix the API. Most third-party programs and existing JDBC apps won't work with the current paradigm when returning large result sets.

Thanks,
Stephen

On Mon, 13 Sep 2004 21:49:14 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Because the API it presents doesn't allow for the possibility of query
> failure after having given you back a PGresult: either you have the
> whole result available with no further worries, or you don't.
Hi, Stephen,

On Mon, 13 Sep 2004 19:51:22 -0500
Stephen Crowley <stephen.crowley@gmail.com> wrote:

> Does postgres cache the entire result set before it begins returning
> data to the client?
>
> I have a table with ~8 million rows and I am executing a query which
> should return about ~800,000 rows. The problem is that as soon as I
> execute the query it absolutely kills my machine and begins swapping
> for 5 or 6 minutes before it begins returning results. Is postgres
> trying to load the whole query into memory before returning anything?
> Also, why would it choose not to use the index? It is properly
> estimating the # of rows returned. If I set enable_seqscan to off it
> is just as slow.

As you get about 10% of all rows in the table, the query will hit every page of the table.

Maybe it helps to CLUSTER the table using the index on your query parameters, and then set enable_seqscan to off. But beware that you have to re-CLUSTER after modifications.

HTH,
Markus

-- 
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
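A rough sketch of the CLUSTER step Markus describes, using the existing three-column index from the thread and the "CLUSTER indexname ON tablename" form that the 8.0 server accepts (note that CLUSTER takes an exclusive lock and rewrites the whole table, so run it during a quiet period):

    CLUSTER island_history_date_stock_time ON island_history;
    ANALYZE island_history;

Clustering physically reorders the heap by (date, stock, "time"), so the rows for a given date and stock end up adjacent and an index scan reads mostly contiguous blocks instead of hopping around the table.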
>> I have a table with ~8 million rows and I am executing a query which
>> should return about ~800,000 rows. The problem is that as soon as I
>> execute the query it absolutely kills my machine and begins swapping
>> for 5 or 6 minutes before it begins returning results. Is postgres
>> trying to load the whole query into memory before returning anything?
>> Also, why would it choose not to use the index? It is properly
>> estimating the # of rows returned. If I set enable_seqscan to off it
>> is just as slow.

1) EXPLAIN ANALYZE.

Note the time it takes. It should not swap, just read data from the disk (and not kill the machine).

2) Run the query in your software.

Note the time it takes. Watch RAM usage. If it's vastly longer and you're swimming in virtual memory, postgres is not the culprit... rather, use a cursor to fetch a huge resultset bit by bit.

Tell us what you find?

Regards.
Here are some results of explain analyze. I've included the LIMIT 10 because otherwise the resultset would exhaust all available memory.

explain analyze select * from history where date='2004-09-07' and stock='ORCL' LIMIT 10;

"Limit  (cost=0.00..17.92 rows=10 width=83) (actual time=1612.000..1702.000 rows=10 loops=1)"
"  ->  Index Scan using island_history_date_stock_time on island_history  (cost=0.00..183099.72 rows=102166 width=83) (actual time=1612.000..1702.000 rows=10 loops=1)"
"        Index Cond: ((date = '2004-09-07'::date) AND ((stock)::text = 'ORCL'::text))"
"Total runtime: 1702.000 ms"

Ok, so for 100,000 rows it decides to use the index and returns very quickly. Now for:

explain analyze select * from history where date='2004-09-07' and stock='MSFT' LIMIT 10;

"Limit  (cost=0.00..14.30 rows=10 width=83) (actual time=346759.000..346759.000 rows=10 loops=1)"
"  ->  Seq Scan on island_history  (cost=0.00..417867.13 rows=292274 width=83) (actual time=346759.000..346759.000 rows=10 loops=1)"
"        Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'MSFT'::text))"
"Total runtime: 346759.000 ms"

Nearly six minutes.. Why would it take this long? Is there anything else I can do to debug this?

When I set enable_seqscan to OFF and force everything to use the index, every stock I query returns within 100ms, but turn seqscan back ON and it's back up to taking several minutes for non-index-using plans.

Any ideas?
--Stephen

On Tue, 14 Sep 2004 21:27:55 +0200, Pierre-Frédéric Caillaud <lists@boutiquenumerique.com> wrote:
> 1) EXPLAIN ANALYZE.
>
> Note the time it takes. It should not swap, just read data from the disk
> (and not kill the machine).
> When I set enable_seqscan to OFF and force everything to use the index,
> every stock I query returns within 100ms, but turn seqscan back ON and
> it's back up to taking several minutes for non-index-using plans.
>
> Any ideas?
> --Stephen

Try increasing your statistics target and re-running ANALYZE. Try, say, 100?

Sincerely,

Joshua D. Drake

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
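One assumed way to apply Joshua's suggestion per column rather than globally, using the target of 100 he mentions (follow it with ANALYZE so the larger statistics are actually gathered):

    ALTER TABLE island_history ALTER COLUMN stock SET STATISTICS 100;
    ALTER TABLE island_history ALTER COLUMN date  SET STATISTICS 100;
    ANALYZE island_history;

A larger statistics target gives the planner more most-common-values entries and histogram buckets for "stock", so its row estimates for heavily traded symbols like MSFT should land closer to reality.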
On Thu, 16 Sep 2004 20:51:11 -0500, Stephen Crowley <stephen.crowley@gmail.com> wrote:
> explain analyze select * from history where date='2004-09-07' and
> stock='ORCL' LIMIT 10;

> "  ->  Index Scan using island_history_date_stock_time on
> island_history  (cost=0.00..183099.72 rows=102166 width=83) (actual
> time=1612.000..1702.000 rows=10 loops=1)"
                               ^^
LIMIT 10 hides what would be the most interesting info here. I don't believe that

	EXPLAIN ANALYSE SELECT * FROM history WHERE ...

consumes lots of memory. Please try it.

And when you post the results please include your Postgres version, some info about hardware and OS, and your non-default settings, especially random_page_cost and effective_cache_size.

May I guess that the correlation of the physical order of tuples in your table to the contents of the date column is pretty good (examine correlation in pg_stats) and that island_history_date_stock_time is a 3-column index?

It is well known that the optimizer overestimates the cost of index scans in those situations. This can be compensated to a certain degree by increasing effective_cache_size and/or decreasing random_page_cost (which might harm other planner decisions).

You could also try

	CREATE INDEX history_date_stock ON history("date", stock);

This will slow down INSERTs and UPDATEs, though.

Servus
 Manfred
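A sketch of the pg_stats check Manfred mentions, for the two filtered columns (table and column names taken from the thread; correlation values near 1.0 or -1.0 mean the rows are stored in nearly the same order as that column):

    SELECT attname, n_distinct, correlation
      FROM pg_stats
     WHERE tablename = 'island_history'
       AND attname IN ('date', 'stock');

If the correlation for "date" is close to 1.0, that matches the situation Manfred describes where the optimizer overprices index scans, and the effective_cache_size / random_page_cost tuning he suggests is meant to offset exactly that.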
Ok.. now I ran "VACUUM FULL" and things seem to be working as they should..

explain analyze select * from history where date='2004-09-07' and stock='MSFT';

Seq Scan on island_history  (cost=0.00..275359.13 rows=292274 width=83) (actual time=50.000..411683.000 rows=265632 loops=1)
  Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'MSFT'::text))
Total runtime: 412703.000 ms

random_page_cost and effective_cache_size are both default, 8 and 1000

explain analyze select * from history where date='2004-09-07' and stock='ORCL';

"Index Scan using island_history_date_stock_time on island_history  (cost=0.00..181540.07 rows=102166 width=83) (actual time=551.000..200268.000 rows=159618 loops=1)"
"  Index Cond: ((date = '2004-09-07'::date) AND ((stock)::text = 'ORCL'::text))"
"Total runtime: 201009.000 ms"

So now this is all in proportion and works as expected.. the question is, why would the fact that it needs to be vacuumed cause such a huge hit in performance? When I vacuumed it did free up nearly 25% of the space.

--Stephen

On Fri, 17 Sep 2004 22:44:05 +0200, Manfred Koizar <mkoi-pg@aon.at> wrote:
> LIMIT 10 hides what would be the most interesting info here. I don't
> believe that EXPLAIN ANALYSE SELECT * FROM history WHERE ... consumes
> lots of memory. Please try it.
>
> And when you post the results please include your Postgres version, some
> info about hardware and OS, and your non-default settings, especially
> random_page_cost and effective_cache_size.
Stephen,

> "  ->  Seq Scan on island_history  (cost=0.00..417867.13 rows=292274
> width=83) (actual time=346759.000..346759.000 rows=10 loops=1)"

Take a look at your row comparisons. When was the last time you ran ANALYZE?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco
On Fri, 17 Sep 2004 19:23:44 -0500, Stephen Crowley <stephen.crowley@gmail.com> wrote:
> Seq Scan [...] rows=265632
>   Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'MSFT'::text))
> Total runtime: 412703.000 ms
>
> random_page_cost and effective_cache_size are both default, 8 and 1000

Usually random_page_cost is 4.0 by default. And your effective_cache_size setting is far too low for a modern machine.

> "Index Scan [...] rows=159618
> "  Index Cond: ((date = '2004-09-07'::date) AND ((stock)::text = 'ORCL'::text))"
> "Total runtime: 201009.000 ms"

Extrapolating this to 265000 rows you should be able to get the MSFT result in ca. 330 seconds, if you can persuade the planner to choose an index scan. Fiddling with random_page_cost and effective_cache_size might do the trick.

> So now this is all in proportion and works as expected.. the question
> is, why would the fact that it needs to be vacuumed cause such a huge
> hit in performance? When I vacuumed it did free up nearly 25% of the
> space.

So before the VACUUM a seq scan would have taken ca. 550 seconds. Your MSFT query with LIMIT 10 took ca. 350 seconds. It's not implausible to assume that more than half of the table had to be scanned to find the first ten rows matching the filter condition.

Servus
 Manfred
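A sketch of the tuning Manfred suggests; the specific numbers below are only assumptions to experiment with (in 8.0 effective_cache_size is expressed in 8 kB pages, so 50000 is roughly 400 MB of expected cache):

    SET random_page_cost = 2;
    SET effective_cache_size = 50000;
    EXPLAIN ANALYZE SELECT * FROM island_history
     WHERE date = '2004-09-07' AND stock = 'MSFT';

Using SET only affects the current session, which makes it easy to compare plans before committing the values to postgresql.conf.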
On Tue, 14 Sep 2004, Stephen Crowley wrote:

> Problem solved.. I set the fetchSize to a reasonable value instead of
> the default of unlimited in the PreparedStatement and now the query
> behaves as expected. After some searching it seems this is a common
> problem; would it make sense to change the default value to something
> other than 0 in the JDBC driver?

In the JDBC driver, setting the fetch size to a non-zero value means that the query will be run using what the frontend/backend protocol calls a named statement. What this means on the backend is that the planner will not be able to use the values from the query parameters to generate the optimum query plan and must use generic placeholders and create a generic plan. For this reason we have decided not to default to a non-zero fetch size. This is something whose default value could be set by a URL parameter if you think that is something that is really required.

Kris Jurka
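The generic-plan effect Kris describes can be seen from plain SQL as well; this is only an assumed illustration using PREPARE, which plans against parameter placeholders much like a named protocol-level statement does:

    PREPARE hist_q (date, text) AS
        SELECT * FROM island_history WHERE date = $1 AND stock = $2;
    EXPLAIN EXECUTE hist_q ('2004-09-07', 'MSFT');

Because the stored plan has to work for any date and stock, the planner cannot use the selectivity of the particular values, and its choice between index scan and seq scan may differ from what a one-off query with literal values would get.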
Thanks for the explanation. So what sort of changes need to be made to the client/server protocol to fix this problem?

On Thu, 23 Sep 2004 18:22:15 -0500 (EST), Kris Jurka <books@ejurka.com> wrote:
> In the JDBC driver, setting the fetch size to a non-zero value means that
> the query will be run using what the frontend/backend protocol calls a
> named statement. What this means on the backend is that the planner will
> not be able to use the values from the query parameters to generate the
> optimum query plan and must use generic placeholders and create a generic
> plan. For this reason we have decided not to default to a non-zero
> fetch size.
On Thu, 23 Sep 2004, Stephen Crowley wrote:

> Thanks for the explanation. So what sort of changes need to be made to
> the client/server protocol to fix this problem?

The problem is that there is no way to indicate why you are using a particular statement in the extended query protocol. For the JDBC driver there are two potential reasons: streaming a ResultSet and using a server-prepared statement. For the streaming-as-default case you desire, there needs to be a way to indicate that you don't want to create a generic server-prepared statement and that this query is really just for one-time use, so it can generate the best plan possible.

Additionally, you can only stream ResultSets that are of type FORWARD_ONLY. It would also be nice to be able to specify scrollability and holdability when creating a statement and the offset/direction when streaming data from a scrollable one.

Kris Jurka