Thread: extremly bad select performance on huge table
Hi newsgroup, I have a very huge table (70 mio rows ) with a key (text length about 30 characters each key). A select on this indexed column "myprimkey" (index on column mycolumn) took more than 30 mins. Here is the explain (analyze,buffers) select mycolumn from myhugetable "Index Only Scan using myprimkey on myhugetable (cost=0.00..8224444.82 rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 loops=1)" " Heap Fetches: 356861" "Total runtime: 2503009.611 ms" Even repeating the query does not show a performance improvement. I assume that the index itself is too large for my db cache. What can I do to gain performance? Which parameters can I adapt? Having a huge Linux machine with 72 GB RAM. Note: This select is just for testing. My final statement will be a join on this table via the "mycolumn" column. Thanks for your help Björn
Sorry forget to copy the buffer information: " Heap Fetches: 356861" " Buffers: shared hit=71799472 read=613813" > Hi newsgroup, > > I have a very huge table (70 mio rows ) with a key (text length about > 30 characters each key). A select on this indexed column "myprimkey" > (index on column mycolumn) took more than 30 mins. > > Here is the explain (analyze,buffers) select mycolumn from myhugetable > > "Index Only Scan using myprimkey on myhugetable (cost=0.00..8224444.82 > rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 > loops=1)" > > " Heap Fetches: 356861" > > "Total runtime: 2503009.611 ms" > > > Even repeating the query does not show a performance improvement. I > assume that the index itself is too large for my db cache. What can I > do to gain performance? Which parameters can I adapt? Having a huge > Linux machine with 72 GB RAM. > > Note: This select is just for testing. My final statement will be a > join on this table via the "mycolumn" column. > > Thanks for your help > Björn > > > >
-----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Björn Wittich Sent: Tuesday, October 21, 2014 1:35 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] extremly bad select performance on huge table Sorry forget to copy the buffer information: " Heap Fetches: 356861" " Buffers: shared hit=71799472 read=613813" > Hi newsgroup, > > I have a very huge table (70 mio rows ) with a key (text length about > 30 characters each key). A select on this indexed column "myprimkey" > (index on column mycolumn) took more than 30 mins. > > Here is the explain (analyze,buffers) select mycolumn from myhugetable > > "Index Only Scan using myprimkey on myhugetable (cost=0.00..8224444.82 > rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 > loops=1)" > > " Heap Fetches: 356861" > > "Total runtime: 2503009.611 ms" > > > Even repeating the query does not show a performance improvement. I > assume that the index itself is too large for my db cache. What can I > do to gain performance? Which parameters can I adapt? Having a huge > Linux machine with 72 GB RAM. > > Note: This select is just for testing. My final statement will be a > join on this table via the "mycolumn" column. > > Thanks for your help > Björn > > > > Did you check the bloat in your myprimkey index? Regards, Igor Neyman
=?ISO-8859-15?Q?Bj=F6rn_Wittich?= <Bjoern_Wittich@gmx.de> writes: > Here is the explain (analyze,buffers) select mycolumn from myhugetable > "Index Only Scan using myprimkey on myhugetable (cost=0.00..8224444.82 > rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 > loops=1)" > " Heap Fetches: 356861" > " Buffers: shared hit=71799472 read=613813" > "Total runtime: 2503009.611 ms" So that works out to about 4 msec per page fetched considering only I/O costs, which is about as good as you're likely to get if the data is sitting on spinning rust. You could potentially make it faster with a VACUUM (to mark all pages all-visible and eliminate the "heap fetches" costs), or a REINDEX (so that the index scan becomes more nearly sequential instead of random access). However, unless the data is nearly static those will just be temporary fixes: the time will degrade again as you update the table. > Note: This select is just for testing. My final statement will be a join > on this table via the "mycolumn" column. In that case it's probably a waste of time to worry about the performance of this query as such. In the first place, a join is not likely to use the index at all unless it's fetching a relatively small number of rows, and in the second place it seems unlikely that the join query can use an IndexOnlyScan on this index --- I imagine that the purpose of the join will require fetching additional columns. regards, tom lane
Hi Tom and Igor, thanks for your help. With the reindex the select query running time was reduced from 5200 sec to 130 sec. Impressive! Even a join on this table is now fast. Unfortunately, there is now another problem: The table in my example has 500 columns which I want to retrieve with my join command. Example which is fast "select value from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn" Example which is slow "select value,c1,c2,c3,...,c10 from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn" Which is the number of columns to fetch so bad ? Which action is done in the db system when querying this via pgadmin? I think that there is no real retrieval included, why is the number of additional columns so bad for the join performance? > =?ISO-8859-15?Q?Bj=F6rn_Wittich?= <Bjoern_Wittich@gmx.de> writes: >> Here is the explain (analyze,buffers) select mycolumn from myhugetable >> "Index Only Scan using myprimkey on myhugetable (cost=0.00..8224444.82 >> rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 >> loops=1)" >> " Heap Fetches: 356861" >> " Buffers: shared hit=71799472 read=613813" >> "Total runtime: 2503009.611 ms" > So that works out to about 4 msec per page fetched considering only I/O > costs, which is about as good as you're likely to get if the data is > sitting on spinning rust. > > You could potentially make it faster with a VACUUM (to mark all pages > all-visible and eliminate the "heap fetches" costs), or a REINDEX > (so that the index scan becomes more nearly sequential instead of random > access). However, unless the data is nearly static those will just be > temporary fixes: the time will degrade again as you update the table. > >> Note: This select is just for testing. My final statement will be a join >> on this table via the "mycolumn" column. > In that case it's probably a waste of time to worry about the performance > of this query as such. In the first place, a join is not likely to use > the index at all unless it's fetching a relatively small number of rows, > and in the second place it seems unlikely that the join query can use > an IndexOnlyScan on this index --- I imagine that the purpose of the join > will require fetching additional columns. > > regards, tom lane > >
-----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Björn Wittich Sent: Tuesday, October 21, 2014 3:32 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] extremly bad select performance on huge table Hi Tom and Igor, thanks for your help. With the reindex the select query running time was reduced from 5200 sec to 130 sec. Impressive! Even a join on this table is now fast. Unfortunately, there is now another problem: The table in my example has 500 columns which I want to retrieve with my join command. Example which is fast "select value from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn" Example which is slow "select value,c1,c2,c3,...,c10 from smallertable inner join myhugetable ON smallertable.mycolumn =myhugetable.mycolumn" Which is the number of columns to fetch so bad ? Which action is done in the db system when querying this via pgadmin? I think that there is no real retrieval included, why is the number of additional columns so bad for the join performance? > =?ISO-8859-15?Q?Bj=F6rn_Wittich?= <Bjoern_Wittich@gmx.de> writes: >> Here is the explain (analyze,buffers) select mycolumn from myhugetable >> "Index Only Scan using myprimkey on myhugetable (cost=0.00..8224444.82 >> rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 >> loops=1)" >> " Heap Fetches: 356861" >> " Buffers: shared hit=71799472 read=613813" >> "Total runtime: 2503009.611 ms" > So that works out to about 4 msec per page fetched considering only I/O > costs, which is about as good as you're likely to get if the data is > sitting on spinning rust. > > You could potentially make it faster with a VACUUM (to mark all pages > all-visible and eliminate the "heap fetches" costs), or a REINDEX > (so that the index scan becomes more nearly sequential instead of random > access). However, unless the data is nearly static those will just be > temporary fixes: the time will degrade again as you update the table. > >> Note: This select is just for testing. My final statement will be a join >> on this table via the "mycolumn" column. > In that case it's probably a waste of time to worry about the performance > of this query as such. In the first place, a join is not likely to use > the index at all unless it's fetching a relatively small number of rows, > and in the second place it seems unlikely that the join query can use > an IndexOnlyScan on this index --- I imagine that the purpose of the join > will require fetching additional columns. > > regards, tom lane > > Björn, I think, the timing difference you see between 2 queries is caused by delivering to the front-end (PgAdmin) and displayingall additional columns that you include in the second query (much bigger amount of data to pass from the db tothe client). Pretty sure, if you do explain analyze on both queries, you'll see the same timing, because it'll reflect only db time withoutwhat's spent on delivering data to the client. Regards, Igor Neyman
Hi Igor, that was also my assumption, but unfortunately this isn't true. I am using the explain analyze. Example which is fast "explain analyze select value from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn" 130 - 140 sec Example which is fast "explain analyze select value,c1 from smallertable inner join myhugetable ON smallertable.mycolumn= myhugetable.mycolumn" does not complete after several hours although the c1 coulmn should only be relevant for retrieval. Comparing the explain comparison of both statements gave me a hint: adding the c1 column changes the query planner to make a sequential scan on myhugetable as well as on smallertable. This is much slower. When I set enable_seqscan=false the queryplanner shows the same query plan for both statements but the statement including the c1 column does not complete after several hours. How can this be explained? I do not want the db server to prepare the whole query result at once, my intention is that the asynchronous retrieval starts as fast as possible. Thanks Björn > > -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Björn Wittich > Sent: Tuesday, October 21, 2014 3:32 PM > To: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] extremly bad select performance on huge table > > Hi Tom and Igor, > > thanks for your help. With the reindex the select query running time was reduced from 5200 sec to 130 sec. Impressive! > > Even a join on this table is now fast. > > Unfortunately, there is now another problem: The table in my example has > 500 columns which I want to retrieve with my join command. > > Example which is fast "select value from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn" > > Example which is slow "select value,c1,c2,c3,...,c10 from smallertable inner join myhugetable ON smallertable.mycolumn= myhugetable.mycolumn" > > > Which is the number of columns to fetch so bad ? Which action is done in > the db system when querying this via pgadmin? I think that there is no > real retrieval included, why is the number of additional columns so bad > for the join performance? > >> =?ISO-8859-15?Q?Bj=F6rn_Wittich?= <Bjoern_Wittich@gmx.de> writes: >>> Here is the explain (analyze,buffers) select mycolumn from myhugetable >>> "Index Only Scan using myprimkey on myhugetable (cost=0.00..8224444.82 >>> rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 >>> loops=1)" >>> " Heap Fetches: 356861" >>> " Buffers: shared hit=71799472 read=613813" >>> "Total runtime: 2503009.611 ms" >> So that works out to about 4 msec per page fetched considering only I/O >> costs, which is about as good as you're likely to get if the data is >> sitting on spinning rust. >> >> You could potentially make it faster with a VACUUM (to mark all pages >> all-visible and eliminate the "heap fetches" costs), or a REINDEX >> (so that the index scan becomes more nearly sequential instead of random >> access). However, unless the data is nearly static those will just be >> temporary fixes: the time will degrade again as you update the table. >> >>> Note: This select is just for testing. My final statement will be a join >>> on this table via the "mycolumn" column. >> In that case it's probably a waste of time to worry about the performance >> of this query as such. In the first place, a join is not likely to use >> the index at all unless it's fetching a relatively small number of rows, >> and in the second place it seems unlikely that the join query can use >> an IndexOnlyScan on this index --- I imagine that the purpose of the join >> will require fetching additional columns. >> >> regards, tom lane >> >> > Björn, > > I think, the timing difference you see between 2 queries is caused by delivering to the front-end (PgAdmin) and displayingall additional columns that you include in the second query (much bigger amount of data to pass from the db tothe client). > Pretty sure, if you do explain analyze on both queries, you'll see the same timing, because it'll reflect only db timewithout what's spent on delivering data to the client. > > Regards, > Igor Neyman > > >
-----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Björn Wittich Sent: Wednesday, October 22, 2014 1:06 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] extremly bad select performance on huge table Hi Igor, that was also my assumption, but unfortunately this isn't true. I am using the explain analyze. Example which is fast "explain analyze select value from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn" 130 - 140 sec Example which is fast "explain analyze select value,c1 from smallertable inner join myhugetable ON smallertable.mycolumn= myhugetable.mycolumn" does not complete after several hours although the c1 coulmn should only be relevant for retrieval. Comparing the explain comparison of both statements gave me a hint: adding the c1 column changes the query planner to make a sequential scan on myhugetable as well as on smallertable. Thisis much slower. When I set enable_seqscan=false the queryplanner shows the same query plan for both statements but the statement includingthe c1 column does not complete after several hours. How can this be explained? I do not want the db server to prepare the whole query result at once, my intention is that the asynchronous retrieval startsas fast as possible. Thanks Björn > > -----Original Message----- > From: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Björn > Wittich > Sent: Tuesday, October 21, 2014 3:32 PM > To: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] extremly bad select performance on huge table > > Hi Tom and Igor, > > thanks for your help. With the reindex the select query running time was reduced from 5200 sec to 130 sec. Impressive! > > Even a join on this table is now fast. > > Unfortunately, there is now another problem: The table in my example > has > 500 columns which I want to retrieve with my join command. > > Example which is fast "select value from smallertable inner join myhugetable ON smallertable.mycolumn = myhugetable.mycolumn" > > Example which is slow "select value,c1,c2,c3,...,c10 from smallertable inner join myhugetable ON smallertable.mycolumn= myhugetable.mycolumn" > > > Which is the number of columns to fetch so bad ? Which action is done > in the db system when querying this via pgadmin? I think that there is > no real retrieval included, why is the number of additional columns so > bad for the join performance? > >> =?ISO-8859-15?Q?Bj=F6rn_Wittich?= <Bjoern_Wittich@gmx.de> writes: >>> Here is the explain (analyze,buffers) select mycolumn from >>> myhugetable "Index Only Scan using myprimkey on myhugetable >>> (cost=0.00..8224444.82 >>> rows=71768080 width=33) (actual time=16.722..2456300.778 >>> rows=71825999 loops=1)" >>> " Heap Fetches: 356861" >>> " Buffers: shared hit=71799472 read=613813" >>> "Total runtime: 2503009.611 ms" >> So that works out to about 4 msec per page fetched considering only >> I/O costs, which is about as good as you're likely to get if the data >> is sitting on spinning rust. >> >> You could potentially make it faster with a VACUUM (to mark all pages >> all-visible and eliminate the "heap fetches" costs), or a REINDEX (so >> that the index scan becomes more nearly sequential instead of random >> access). However, unless the data is nearly static those will just >> be temporary fixes: the time will degrade again as you update the table. >> >>> Note: This select is just for testing. My final statement will be a >>> join on this table via the "mycolumn" column. >> In that case it's probably a waste of time to worry about the >> performance of this query as such. In the first place, a join is not >> likely to use the index at all unless it's fetching a relatively >> small number of rows, and in the second place it seems unlikely that >> the join query can use an IndexOnlyScan on this index --- I imagine >> that the purpose of the join will require fetching additional columns. >> >> regards, tom lane >> >> > Björn, > > I think, the timing difference you see between 2 queries is caused by delivering to the front-end (PgAdmin) and displayingall additional columns that you include in the second query (much bigger amount of data to pass from the db tothe client). > Pretty sure, if you do explain analyze on both queries, you'll see the same timing, because it'll reflect only db timewithout what's spent on delivering data to the client. > > Regards, > Igor Neyman > > > Okay, So, REINDEX helped with original query, which execution plan used Index Only Scan, if I remember correctly, since you askedonly for the column in PK index. Now, when you add some other column which is not in the index, it switches to Sequential Scan. So, check the bloat on the table. May be performance could be improved if you VACUUM bloated table. Regards, Igor Neyman
Björn Wittich <Bjoern_Wittich@gmx.de> wrote: > I do not want the db server to prepare the whole query result at > once, my intention is that the asynchronous retrieval starts as > fast as possible. Then you probably should be using a cursor. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi Kevin, this is what I need (I think). Hopefully a cursor can operate on a join. Will read docu now. Thanks! Björn Am 22.10.2014 16:53, schrieb Kevin Grittner: > Björn Wittich <Bjoern_Wittich@gmx.de> wrote: > >> I do not want the db server to prepare the whole query result at >> once, my intention is that the asynchronous retrieval starts as >> fast as possible. > Then you probably should be using a cursor. > > -- > Kevin Grittner > EDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > >
Hi, with a cursor the behaviour is the same. So I would like to ask a more general question: My client needs to receive data from a huge join. The time the client waits for being able to fetch the first row is very long. When the retrieval starts after about 10 mins, the client itself is I/O bound so it is not able to catch up the elapsed time. My workaround was to build a queue of small joins (assuming the huge join delivers 10 mio rows I now have 10000 joins delivering 1000 rows ). So the general question is: Is there a better solution then my crude workaround? Thank you > Hi Kevin, > > > this is what I need (I think). Hopefully a cursor can operate on a > join. Will read docu now. > > Thanks! > > > Björn > > Am 22.10.2014 16:53, schrieb Kevin Grittner: >> Björn Wittich <Bjoern_Wittich@gmx.de> wrote: >> >>> I do not want the db server to prepare the whole query result at >>> once, my intention is that the asynchronous retrieval starts as >>> fast as possible. >> Then you probably should be using a cursor. >> >> -- >> Kevin Grittner >> EDB: http://www.enterprisedb.com >> The Enterprise PostgreSQL Company >> >> > > >