Thread: Planning large IN lists
When planning queries with a large IN expression in the WHERE clause, the planner transforms the IN list into a scalar array expression. In clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr by calling scalararraysel(), which in turn estimates the selectivity of *each* array element in order to determine the selectivity of the array expression as a whole.

This is quite inefficient when the IN list is large. In a test case that someone sent me privately, a simple query involving two cheap joins and a ~1800-element IN list in the WHERE clause requires about 100 ms to plan but only ~10 ms to execute -- about 85% of the total runtime is spent in scalararraysel(). (I'd include the profiling data, but KCacheGrind seems stubbornly opposed to providing a textual summary of its results...)

Clearly, the current approach is fine when the array is small -- perhaps for arrays above a certain number of elements, we could switch to randomly sampling array elements, estimating their selectivities, and then using that information to infer the estimated selectivity of the entire array expression. That seems fairly crude, though: does anyone have any better ideas?

-Neil
Neil Conway wrote:
> Clearly, the current approach is fine when the array is small -- perhaps
> for arrays above a certain number of elements, we could switch to
> randomly sampling array elements, estimating their selectivities, and
> then using that information to infer the estimated selectivity of the
> entire array expression. That seems fairly crude, though: does anyone
> have any better ideas?

Optimizer hints in SQL

/me ducks and runs for cover.

regards, Lukas
Neil Conway <neilc@samurai.com> writes:
> When planning queries with a large IN expression in the WHERE clause,
> the planner transforms the IN list into a scalar array expression. In
> clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> by calling scalararraysel(), which in turn estimates the selectivity of
> *each* array element in order to determine the selectivity of the array
> expression as a whole.

> This is quite inefficient when the IN list is large.

That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since often repeated indexscans aren't the best implementation.

			regards, tom lane
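For concreteness, the transformation Tom is describing amounts to rewriting roughly this (table and values are invented for illustration, not taken from the thread):

    -- Plain IN list: planned as a scalar array expression, with per-element
    -- selectivity estimation in scalararraysel().
    SELECT * FROM orders WHERE customer_id IN (101, 102, 103 /* ... many more ... */);

into this:

    -- IN (VALUES ...) form: the VALUES list becomes a separate relation that the
    -- planner joins against, so it is no longer restricted to repeated indexscans.
    SELECT * FROM orders WHERE customer_id IN (VALUES (101), (102), (103) /* ... many more ... */);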
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Thursday, May 10, 2007 11:53 AM
> To: Neil Conway
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Planning large IN lists
>
> Neil Conway <neilc@samurai.com> writes:
> > When planning queries with a large IN expression in the WHERE clause,
> > the planner transforms the IN list into a scalar array expression. In
> > clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> > by calling scalararraysel(), which in turn estimates the selectivity of
> > *each* array element in order to determine the selectivity of the array
> > expression as a whole.
>
> > This is quite inefficient when the IN list is large.
>
> That's the least of the problems. We really ought to convert such cases
> into an IN (VALUES(...)) type of query, since often repeated indexscans
> aren't the best implementation.

It seems to me that if you have a unique index on the IN-list column, the problem is simplified. In that case, you just have to estimate whether that many index seeks cost more than a table scan. Usually the crossover is around 5-10% of the table size for the average database; not sure how it works out in PostgreSQL.

So in the special case of an IN list on a uniquely indexed column, compare the cardinality of the table with the number of IN-list items and decide to table scan or index seek based on that.

For arbitrary queries, it seems it would be necessary to keep histograms for the columns in question. Perhaps they could be collected with an advanced ANALYZE option.
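As a rough sketch of the kind of comparison such a heuristic would make (the table name is hypothetical; reltuples is the planner's stored row-count estimate for a table):

    -- How big is the table, according to the statistics the planner already keeps?
    SELECT reltuples AS estimated_rows FROM pg_class WHERE relname = 'orders';

    -- With a unique index on the IN-list column, an ~1800-entry list means roughly
    -- 1800 index probes; the heuristic would weigh that against one sequential scan
    -- over estimated_rows rows (the 5-10% crossover mentioned above).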
Is this a TODO?

---------------------------------------------------------------------------

Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > When planning queries with a large IN expression in the WHERE clause,
> > the planner transforms the IN list into a scalar array expression. In
> > clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> > by calling scalararraysel(), which in turn estimates the selectivity of
> > *each* array element in order to determine the selectivity of the array
> > expression as a whole.
>
> > This is quite inefficient when the IN list is large.
>
> That's the least of the problems. We really ought to convert such cases
> into an IN (VALUES(...)) type of query, since often repeated indexscans
> aren't the best implementation.
>
> 			regards, tom lane

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Hi,

Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
>> When planning queries with a large IN expression in the WHERE clause,
>> the planner transforms the IN list into a scalar array expression. In
>> clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
>> by calling scalararraysel(), which in turn estimates the selectivity of
>> *each* array element in order to determine the selectivity of the array
>> expression as a whole.
>
>> This is quite inefficient when the IN list is large.
>
> That's the least of the problems. We really ought to convert such cases
> into an IN (VALUES(...)) type of query, since often repeated indexscans
> aren't the best implementation.

I thought of giving this a shot and while I was working on it, it occurred to me that we need to decide on a threshold value of the IN list size above which such transformation should take place. For small IN lists, scalararraysel() on the IN list wins over the hash join involved in IN (VALUES(...)), but for larger lists IN (VALUES(...)) comes out a clear winner.

I would like to know what the community thinks would be a good heuristic value for the IN-list size beyond which this transformation should take place. I was thinking of a GUC variable (or a hard-coded value) defaulting to, say, 30. This is based on numbers from the following test:

postgres=# create table w (w text);
CREATE TABLE

postgres=# \copy w from '/usr/share/dict/words'

And run the following query with different IN-list sizes:

explain analyze select * from w where w in ('one', 'two', ...);

I got the following runtimes:

 IN-list size | IN (VALUES(...)) |    IN
--------------+------------------+----------
          150 |         ~2000 ms | ~5500 ms
          100 |         ~1500 ms | ~4000 ms
           80 |         ~1400 ms | ~3000 ms
           50 |         ~1400 ms | ~2500 ms
           30 |         ~1500 ms | ~1500 ms
           20 |         ~1400 ms | ~1200 ms
           10 |         ~1400 ms | ~1200 ms

IN (VALUES(...)) shows almost steady-state behavior, while the plain IN runtimes deteriorate as the list grows.

There would obviously be different conditions on which to base this value. I seek community opinion on this.

--
Atul

EnterpriseDB
www.enterprisedb.com
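For reference, the IN (VALUES(...)) counterpart of that test query looks roughly like this (literal list trimmed for brevity):

    -- Same test query rewritten to the form being compared above; PostgreSQL plans
    -- the VALUES list as a separate relation and typically hash-joins against it
    -- when the list is long.
    explain analyze select * from w where w in (VALUES ('one'), ('two'), ('three') /* , ... */);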
"Atul Deopujari" <atul.deopujari@enterprisedb.com> writes: > Hi, > Tom Lane wrote: >> That's the least of the problems. We really ought to convert such cases >> into an IN (VALUES(...)) type of query, since often repeated indexscans >> aren't the best implementation. >> > I thought of giving this a shot and while I was working on it, it > occurred to me that we need to decide on a threshold value of the IN > list size above which such transformation should take place. I see no good reason to suppose that there is/should be a constant threshold --- most likely it depends on size of table, availability of indexes, etc. Having the planner try it both ways and compare costs would be best. regards, tom lane
Hi,

Tom Lane wrote:
> "Atul Deopujari" <atul.deopujari@enterprisedb.com> writes:
>> Hi,
>> Tom Lane wrote:
>>> That's the least of the problems. We really ought to convert such cases
>>> into an IN (VALUES(...)) type of query, since often repeated indexscans
>>> aren't the best implementation.
>>>
>> I thought of giving this a shot and while I was working on it, it
>> occurred to me that we need to decide on a threshold value of the IN
>> list size above which such transformation should take place.
>
> I see no good reason to suppose that there is/should be a constant
> threshold --- most likely it depends on size of table, availability of
> indexes, etc. Having the planner try it both ways and compare costs
> would be best.

Yes, letting the planner make its own decision would seem best (in accordance with what we do for different join paths). But for large IN lists, a substantial part of the planning time is spent estimating the selectivity of the ScalarArrayExpr by calling scalararraysel(). If we don't eliminate that step when processing the IN list, we haven't really optimized anything; asking the planner to run scalararraysel() and also cost the other approach, then choose between the two, is asking the planner to do more work.

Factors such as table size, availability of indexes, etc. would affect both approaches similarly. So if we see a gain in the execution of the IN list due to an external factor, we will also see a similar gain in the execution of the transformed IN (VALUES(...)) clause.

I agree that one value would not fit all cases. The problem with this approach is that in some cases a large IN list would perform better than the transformed IN (VALUES(...)) clause. But we know that the transformed IN (VALUES(...)) clause has almost steady-state behavior, so it would not throw the planner's estimates far off; the error would be only marginal.

--
Atul

EnterpriseDB
www.enterprisedb.com
"Atul Deopujari" <atuld@enterprisedb.com> writes: > Yes, letting the planner make its own decision would seem best (in > accordance with what we do for different join paths). But for large IN > lists, a substantial part of the planner is spent in estimating the > selectivity of the ScalarArrayExpr by calling scalararraysel. If we are > not eliminating this step in processing the IN list then we are not > doing any optimization. Asking the planner to do scalararraysel and also > compute cost of any other way and choose between the two is asking > planner to do more work. So? Better planning usually involves more work. In any case the above argument seems irrelevant, because making scalararraysel more approximate and less expensive for long lists could be done independently of anything else. > Factors such as size of table, availability of index etc. would affect > both the ways similarly. So, if we see a gain in the execution of the IN > list due to an external factor then we will also see a similar gain in > the execution of the transformed IN (VALUES(...)) clause. Incorrect. There is more than one way to do a join, and the above argument only applies if the VALUES case is planned as a nestloop with inner indexscan, which indeed is isomorphic to the scalararrayop implementation ... except that it has higher per-tuple overhead, and therefore will consistently lose, disregarding artifacts of planning costs such as how hard we try to estimate the result size. The case where VALUES is actually a better plan is where the planner switches to merge or hash join because there are too many values. In the current implementation, the planner is incapable of generating those plan shapes from a scalararrayop, and that's what I'd like to see fixed. regards, tom lane
Tom Lane wrote:
> "Atul Deopujari" <atuld@enterprisedb.com> writes:
>> Yes, letting the planner make its own decision would seem best (in
>> accordance with what we do for different join paths). But for large IN
>> lists, a substantial part of the planning time is spent estimating the
>> selectivity of the ScalarArrayExpr by calling scalararraysel(). If we
>> don't eliminate that step when processing the IN list, we haven't really
>> optimized anything; asking the planner to run scalararraysel() and also
>> cost the other approach, then choose between the two, is asking the
>> planner to do more work.
>
> So? Better planning usually involves more work. In any case the above
> argument seems irrelevant, because making scalararraysel more
> approximate and less expensive for long lists could be done
> independently of anything else.

Got you, and I agree with you.

--
Atul

EnterpriseDB
www.enterprisedb.com
Added to TODO:

* Consider using a hash for joining to a large IN (VALUES ...) list

  http://archives.postgresql.org/pgsql-hackers/2007-05/msg00450.php

---------------------------------------------------------------------------

Atul Deopujari wrote:
> Hi,
>
> Tom Lane wrote:
> > Neil Conway <neilc@samurai.com> writes:
> >> When planning queries with a large IN expression in the WHERE clause,
> >> the planner transforms the IN list into a scalar array expression. In
> >> clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> >> by calling scalararraysel(), which in turn estimates the selectivity of
> >> *each* array element in order to determine the selectivity of the array
> >> expression as a whole.
> >
> >> This is quite inefficient when the IN list is large.
> >
> > That's the least of the problems. We really ought to convert such cases
> > into an IN (VALUES(...)) type of query, since often repeated indexscans
> > aren't the best implementation.
>
> I thought of giving this a shot and while I was working on it, it
> occurred to me that we need to decide on a threshold value of the IN
> list size above which such transformation should take place. For small
> IN lists, scalararraysel() on the IN list wins over the hash join
> involved in IN (VALUES(...)), but for larger lists IN (VALUES(...))
> comes out a clear winner.
>
> I would like to know what the community thinks would be a good heuristic
> value for the IN-list size beyond which this transformation should take
> place. I was thinking of a GUC variable (or a hard-coded value)
> defaulting to, say, 30. This is based on numbers from the following test:
>
> postgres=# create table w (w text);
> CREATE TABLE
>
> postgres=# \copy w from '/usr/share/dict/words'
>
> And run the following query with different IN-list sizes:
>
> explain analyze select * from w where w in ('one', 'two', ...);
>
> I got the following runtimes:
>
>  IN-list size | IN (VALUES(...)) |    IN
> --------------+------------------+----------
>           150 |         ~2000 ms | ~5500 ms
>           100 |         ~1500 ms | ~4000 ms
>            80 |         ~1400 ms | ~3000 ms
>            50 |         ~1400 ms | ~2500 ms
>            30 |         ~1500 ms | ~1500 ms
>            20 |         ~1400 ms | ~1200 ms
>            10 |         ~1400 ms | ~1200 ms
>
> IN (VALUES(...)) shows almost steady-state behavior, while the plain IN
> runtimes deteriorate as the list grows.
>
> There would obviously be different conditions on which to base this
> value. I seek community opinion on this.
>
> --
> Atul
>
> EnterpriseDB
> www.enterprisedb.com

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +