Thread: Why does a simple query not use an obvious index?
The query:
select count(*) from billing where timestamp > now()-60
should obviously use the index

    CREATE INDEX billing_timestamp_idx ON billing USING btree ("timestamp" timestamp_ops);

on a table with 1400000 rows. But it uses a Seq Scan. If I set enable_seqscan=no, it indicates a query plan could not be calculated.

Why does this simple query not use the timestamp index, and how can I get it to?

Thanks, Jack
Strangely enough, I don't find that result surprising.

If the vast bulk of the data is in the past and now()-60 represents a very small slice of the data, we might expect that using an index is optimal, but there could be many reasons why it doesn't get used.

AFAIK postgres doesn't peek at values used in a query when optimizing, so any query with a ">" type condition is gonna have a seq scan as the plan, since the best guess is that you are gonna match 50% of the table. That's one possible explanation.

Another is that if the condition data types don't match then an index won't be used. You could try:

    select count(*) from billing where timestamp > (now()-60)::timestamp

Might make a difference, I dunno, it's a case of testing and seeing what happens.

You could try lowering the random page cost, it might help, but I don't like your chances.

If your problem is that you want to access the most recent data from a large table with fast response, then you could consider:

1. A "recent" index. If the data is within the "recent" time frame, set a flag to true, otherwise null. Reset the flags periodically. Nulls aren't indexed, so the selectivity of such an index is much higher. Can work wonders.

2. Duplicate recent data in another table that is purged when data passes the age limit. This is basic archiving.

Something like that. Hopefully someone with more knowledge of the optimizer will have a brighter suggestion for you.

What version are you using by the way?

Regards
Mr Pink

--- Jack Kerkhof <jack.kerkhof@guest-tek.com> wrote:
> The query:
>     select count(*) from billing where timestamp > now()-60
>
> should obviously use the index
>
>     CREATE INDEX billing_timestamp_idx ON billing USING btree ("timestamp" timestamp_ops);
>
> on a table with 1400000 rows.
>
> But it uses a Seq Scan. If I set enable_seqscan=no, it indicates a queryplan
> could not be calculated.
>
> Why does this simple query not use the timestamp index, and how can I get it
> to?
>
> Thanks, Jack
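[For illustration, here is a minimal sketch of the "recent" flag idea from point 1; the column and index names are invented. In PostgreSQL the usual way to get the selectivity effect Mr Pink describes is a partial index:]

    -- Hypothetical sketch of the "recent" flag trick; column and index
    -- names are invented. The partial index covers only flagged rows.
    ALTER TABLE billing ADD COLUMN recent boolean;
    CREATE INDEX billing_recent_idx ON billing ("timestamp") WHERE recent;

    -- Reset the flags periodically (e.g. from cron):
    UPDATE billing SET recent = true
        WHERE "timestamp" > now() - interval '1 hour' AND recent IS NOT TRUE;
    UPDATE billing SET recent = NULL
        WHERE "timestamp" <= now() - interval '1 hour' AND recent;

    -- Queries that test the flag can then use the small index:
    SELECT count(*) FROM billing
        WHERE recent AND "timestamp" > now() - interval '60 seconds';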
On Fri, 2004-08-27 at 11:12, Jack Kerkhof wrote:
> The query:
>
>     select count(*) from billing where timestamp > now()-60
>
> should obviously use the index
>
>     CREATE INDEX billing_timestamp_idx ON billing USING btree
>     ("timestamp" timestamp_ops);
>
> on a table with 1400000 rows.
>
> But it uses a Seq Scan. If I set enable_seqscan=no, it indicates a
> queryplan could not be calculated.

Have you tried this:

    marlowe=> select now()-60;
    ERROR:  operator does not exist: timestamp with time zone - integer
    HINT:  No operator matches the given name and argument type(s). You may need to add explicit type casts.

You likely need:

    smarlowe=> select now()-'60 seconds'::interval;
               ?column?
    -------------------------------
     2004-08-29 12:25:38.249564-06

inside there.

Also, count(*) is likely to always generate a seq scan due to the way aggregates are implemented currently in pgsql. You might want to try:

    select somefield from sometable where timestampfield > now()-'60 seconds'::interval

and count the number of returned rows. If there's a lot, it won't be any faster; if there's a few, it should be a win.
On Sun, Aug 29, 2004 at 11:04:48AM -0700, Mr Pink wrote:
> Another is that if the condition data types don't match then an index won't be used. You could try:
>
>     select count(*) from billing where timestamp > (now()-60)::timestamp

In fact, I've had success with code like

    select count(*) from billing where timestamp > ( select now() - interval '1 minute' )

At least in my case (PostgreSQL 7.2, though), it made PostgreSQL magically do an index scan. :-)

/* Steinar */
--
Homepage: http://www.sesse.net/
Mr Pink <mr_pink_is_the_only_pro@yahoo.com> writes:
> AFAIK postgres doesn't peek at values used in a query when optimizing

Of course it does. However sometimes things don't work perfectly. To get good answers rather than just guesses we'll need two things:

. What version of postgres are you using?
. The output of EXPLAIN ANALYZE select ...

-- greg
"Scott Marlowe" <smarlowe@qwest.net> writes: > Also, count(*) is likely to always generate a seq scan due to the way > aggregates are implemented currently in pgsql. you might want to try: Huh? I'm curious to know what you're talking about here. > select somefield from sometable where timestampfield > now()-'60 > seconds'::interval > > and count the number of returned rows. If there's a lot, it won't be > any faster, if there's a few, it should be a win. Why would this ever be faster? And how could postgres ever calculate that without doing a sequential scan when count(*) would force it to do a sequential scan? -- greg
On Sun, 2004-08-29 at 15:12, Greg Stark wrote:
> "Scott Marlowe" <smarlowe@qwest.net> writes:
>
> > Also, count(*) is likely to always generate a seq scan due to the way
> > aggregates are implemented currently in pgsql. you might want to try:
>
> Huh? I'm curious to know what you're talking about here.

This has been discussed ad infinitum on the lists in the past, and explained by better minds than mine, but I'll give it a go.

PostgreSQL has a "generic" aggregate method. Imagine instead doing a

    select count(id1+id2-id3) from table where ...

In that instance, it's not a simple shortcut to just grab the number of rows anymore. Since PostgreSQL uses a generic aggregate method that can be extended by the user with custom aggregates et al., it has no optimizations to make simple count(*) fast, like many other databases.

Add to that the fact that even when postgresql uses an index it still has to hit the data store to get the actual value of the tuple, and you've got very few instances in which an index scan of more than some small percentage of the table is worthwhile. I.e. a sequential scan tends to "win" over an index scan sooner in postgresql than in other databases like Oracle, where the data store is serialized and the indexes carry the needed information along with the application of the transaction / rollback segments.

> > select somefield from sometable where timestampfield > now()-'60
> > seconds'::interval
> >
> > and count the number of returned rows. If there's a lot, it won't be
> > any faster, if there's a few, it should be a win.
>
> Why would this ever be faster? And how could postgres ever calculate that
> without doing a sequential scan when count(*) would force it to do a
> sequential scan?

Because count(*) CANNOT use an index. So, if you're hitting, say, 0.01% of the table (let's say 20 out of 20,000,000 rows or something like that) then the second should be MUCH faster.
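[To make the "generic aggregate method" concrete, here is a hedged sketch of a user-defined aggregate; all names are invented. count(*) runs through this same row-at-a-time machinery, which is why there is no special-cased shortcut for it:]

    -- Sketch of a user-defined aggregate (invented names), showing the
    -- generic state-transition machinery aggregates run through:
    CREATE FUNCTION my_sum_sfunc(bigint, int4) RETURNS bigint AS '
        SELECT $1 + $2
    ' LANGUAGE sql IMMUTABLE STRICT;

    CREATE AGGREGATE my_sum (
        basetype = int4,          -- input type
        sfunc    = my_sum_sfunc,  -- called once per input row
        stype    = bigint,        -- running state type
        initcond = '0'            -- initial state
    );

    -- Usage (the "amount" column is hypothetical):
    SELECT my_sum(amount) FROM billing;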
On Sun, 2004-08-29 at 15:38, Scott Marlowe wrote:
> This has been discussed ad infinitum on the lists in the past, and
> explained by better minds than mine, but I'll give it a go.
>
> PostgreSQL has a "generic" aggregate method. Imagine instead doing a
> select count(id1+id2-id3) from table where ...

That should be avg(id1+id2-id3)... doh
>> select somefield from sometable where timestampfield > now()-'60
>> seconds'::interval

This is a FAQ, but since the archives don't seem to be up at the moment, here's the answer once again:

The expression "now() - something" is not a constant, so the planner is faced with "timestampfield > unknownvalue". Its default assumption about the number of rows that will match is much too high to make an indexscan look profitable (from memory, I think it guesses that about a third of the table will match...).

There are a couple of hacks you can use to deal with this.

Plan A is just "set enable_seqscan = false" for this query. This is ugly and not really recommended, but you should try it first to verify that you do get an indexscan that way, just to be sure that lack of statistics is the problem and not something else.

Plan B is to add an extra WHERE clause to make the problem look like a range query, eg

    where timestampfield > now() - ... AND timestampfield <= now();

The planner still doesn't know the exact values involved, so it still can't make use of any statistics, but it can see that this is a range constraint on timestampfield. The default guess about the selectivity will be a lot smaller than in the case of the one-sided inequality, and in most cases you should get an indexscan out of it. This isn't completely guaranteed though. Also, it's got a severe problem in that if you sometimes do queries with a large interval, it'll still do an indexscan even though that may be quite inappropriate.

Plan C is to fix things so that the compared-to value *does* look like a constant; then the planner will correctly observe that only a small part of the table is to be scanned, and do the right thing (given reasonably up-to-date ANALYZE statistics, anyway). The most trustworthy way of doing that is to compute the "now() - interval" value on the client side and send over a timestamp constant. If that's not convenient for some reason, people frequently use a hack like this:

    create function ago(interval) returns timestamptz
    as 'select now() - $1' language sql strict immutable;

    select ... where timestampfield > ago('60 seconds');

This is a kluge because you are lying when you say that the result of ago() is immutable; it obviously isn't. But the planner will believe you, fold the function call to a constant during planning, and use the result.

CAUTION: this works nicely for interactively-issued SQL queries, but it will come back to bite you if you try to use ago() in prepared queries or plpgsql functions, because the premature collapsing of the now() result will become significant.

We have speculated about ways to get the planner to treat expressions involving now() and similar functions as pseudo-constants, so that it would do the right thing in this sort of situation without any kluges. It's not been done yet though.

BTW, the above discussion applies to PG 7.3 and later; if you're dealing with an old version then there are some different considerations.

			regards, tom lane
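[To make Plan C concrete against the original poster's table, a sketch of the client-side variant; the literal below stands in for whatever cutoff value the application computes:]

    -- Plan C, client-side variant (sketch): the application computes
    -- "now() - 60 seconds" itself and sends a timestamp literal, which
    -- the planner can compare against the ANALYZE statistics.
    SELECT count(*) FROM billing
        WHERE "timestamp" > '2004-08-29 12:24:38-06';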
On Sun, Aug 29, 2004 at 03:38:00PM -0600, Scott Marlowe wrote:
>>> select somefield from sometable where timestampfield > now()-'60
>>> seconds'::interval
>>>
>>> and count the number of returned rows. If there's a lot, it won't be
>>> any faster, if there's a few, it should be a win.
>> Why would this ever be faster? And how could postgres ever calculate that
>> without doing a sequential scan when count(*) would force it to do a
>> sequential scan?
> Because, count(*) CANNOT use an index. So, if you're hitting, say,
> 0.01% of the table (let's say 20 out of 20,000,000 rows or something
> like that) then the second should be MUCH faster.

Of course count(*) can use an index:

    images=# explain analyze select count(*) from images where event='test';
                                                                 QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------------------------
     Aggregate  (cost=168.97..168.97 rows=1 width=0) (actual time=68.211..68.215 rows=1 loops=1)
       ->  Index Scan using unique_filenames on images  (cost=0.00..168.81 rows=63 width=0) (actual time=68.094..68.149 rows=8 loops=1)
             Index Cond: ((event)::text = 'test'::text)
     Total runtime: 68.369 ms
    (4 rows)

However, it cannot rely on an index _alone_; it has to go fetch the relevant pages, but of course, so must "select somefield from" etc..

/* Steinar */
--
Homepage: http://www.sesse.net/
"Scott Marlowe" <smarlowe@qwest.net> writes: > PostgreSQL has a "generic" aggregate method. Imagine instead doing a > select count(id1+id2-id3) from table where ... In that instance, it's > not a simple shortcut to just grab the number of rows anymore. Since > PostgreSQL uses a generic aggregate method that can be expanded by the > user with custom aggregates et. al., it has no optimizations to make > simple count(*) fast, like many other databases. People expect count(*) _without a where clause_ to be cached in a single global variable. Postgres can't do this, but the reason has everything to do with MVCC, not with postgres's generalized aggregates. Because of MVCC Postgres can't just store a single cached value, because there is no single cached value. It would have to store a complete history back to the oldest extant transaction. However in this case the user has a where clause. No database is going to cache values of count(*) for random where clauses. But that doesn't stop Postgres from using an index to fetch the records. > > > select somefield from sometable where timestampfield > now()-'60 > > > seconds'::interval > > > > > > and count the number of returned rows. If there's a lot, it won't be > > > any faster, if there's a few, it should be a win. > > > > Why would this ever be faster? And how could postgres ever calculate that > > without doing a sequential scan when count(*) would force it to do a > > sequential scan? > > Because, count(*) CANNOT use an index. So, if you're hitting, say, > 0.01% of the table (let's say 20 out of 20,000,000 rows or something > like that) then the second should be MUCH faster. I think you've applied these past discussions and come up with some bogus conclusions. The problem here has nothing to do with the count(*) and everything to do with the WHERE clause. To fetch the records satisfying that where clause postgres has to do exactly the same thing regardless of whether it's going to feed the data to count(*) or return some or all of it to the client. If postgres decides the where clause isn't selective enough it'll choose to use a sequential scan. However it would do that regardless of whether you're calling count(*) or not. If the number is records is substantial then you would get the overhead of the scan plus the time it takes to transfer all that unnecessary data to the user. What you're probably thinking of when you talk about general purpose aggregate interfaces is the difficulty of making min()/max() use indexes. That's a whole other case entirely. That's where postgres's generalized aggregates leaves it without enough information about what records the aggregate functions are interested in and what index scans might make them faster. None of these common cases end up making it a good idea to read the records into the clients and do the work in the client. The only cases where that would make sense would be if the function requires doing some manipulation of the data that's awkward to express in sql. The "top n" type of query is the usual culprit, but with postgres's new array functions even that becomes tractable. -- greg
> People expect count(*) _without a where clause_ to be cached in a single
> global variable. Postgres can't do this, but the reason has everything to do

Someone should write an approx_count('table') function that reads reltuples from pg_class, and tell them to use it in combination with autovac. I've yet to see someone use count(*) across a table and not round the result themselves ("approx 9 million clients").
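[Such a helper is easy to sketch; this is the poster's suggestion, not an existing built-in, and reltuples is only an estimate refreshed by VACUUM/ANALYZE -- hence the advice to pair it with autovac:]

    -- Sketch of the suggested approx_count() helper (not a built-in):
    -- pg_class.reltuples is a planner estimate, not an exact count.
    CREATE FUNCTION approx_count(text) RETURNS bigint AS '
        SELECT reltuples::bigint FROM pg_class WHERE relname = $1
    ' LANGUAGE sql STABLE;

    SELECT approx_count('billing');  -- roughly 1400000, per the original post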
--- Greg Stark <gsstark@mit.edu> wrote:
>
> Mr Pink <mr_pink_is_the_only_pro@yahoo.com> writes:
>
> > AFAIK postgres doesn't peek at values used in a query when optimizing
>
> Of course it does.

But not ones returned by a function such as now(), or when you use bind variables, as Tom aptly explained. That's what I meant by 'peek'. Interestingly enough, Oracle does that; it's in line with their policy of recommending the use of bind variables. Perhaps postgres could use such a feature some day.
Mr Pink <mr_pink_is_the_only_pro@yahoo.com> writes:
>>> AFAIK postgres doesn't peek at values used in a query when optimizing
>>
>> Of course it does.
>
> But not ones returned by a function such as now(), or when you use
> bind variables, as Tom aptly explained.

FWIW, 8.0 does have the ability to use the values of bind variables for planning estimation (Oliver Jowett did the work for this).

The main issue in the way of dealing with now() is that whatever we did to now() would apply to all functions marked "stable", and it's a bit nervous-making to assume that they should all be treated this way. Or we could introduce more function volatility categories, but that's not much fun either.

			regards, tom lane
Yeah! Bind variable peeking is great news. I did actually read the guff, but forgot about that. Version 8 is looking great on paper; I hope I'll get a chance to play with it soon.

I can kind of appreciate your point about peeking at stable functions. However, I would have thought that if it was possible to do for bind variables (which could change many times in a transaction) then it would make even more sense for a stable function, which doesn't change for the life of the transaction. No doubt this is an oversimplification of the situation.

regards
Mr Pink
On Sun, Aug 29, 2004 at 06:03:43PM -0400, Tom Lane wrote:
> The expression "now() - something" is not a constant, so the planner
> is faced with "timestampfield > unknownvalue". Its default assumption
> about the number of rows that will match is much too high to make an
> indexscan look profitable (from memory, I think it guesses that about
> a third of the table will match...).

Ok; this explains some really weird stuff I've been seeing. However, I'm seeing breakage of the form mentioned by the original poster even when the query uses a _constant_ timestamp: [Postgres 7.4.3]

    ntais# \d detect.stats
                  Table "detect.stats"
        Column    |           Type           |                          Modifiers
    --------------+--------------------------+-------------------------------------------------------------
     anomaly_id   | integer                  | not null
     at           | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone
     resolution   | real                     | default 1.0
     values       | real[]                   |
     stat_type_id | integer                  | not null
    Indexes:
        "stats_pkey" primary key, btree (anomaly_id, stat_type_id, "at")
        "stats__ends_at" btree (stats__ends_at("at", resolution, "values"))
    Foreign-key constraints:
        "$1" FOREIGN KEY (anomaly_id) REFERENCES anomalies(anomaly_id) ON DELETE CASCADE
        "$2" FOREIGN KEY (stat_type_id) REFERENCES stat_types(stat_type_id)

    ntais=# SET enable_seqscan = on;
    SET
    ntais=# EXPLAIN ANALYZE
        SELECT anomaly_id, stat_type_id
        FROM detect.stats
        WHERE detect.stats__ends_at(at, resolution, values) > '2004-08-30 16:21:09+12'::timestamptz
        ORDER BY anomaly_id, stat_type_id;
                                                               QUERY PLAN
    -----------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=602473.59..608576.72 rows=2441254 width=8) (actual time=198577.407..198579.136 rows=6152 loops=1)
       Sort Key: anomaly_id, stat_type_id
       ->  Seq Scan on stats  (cost=0.00..248096.42 rows=2441254 width=8) (actual time=198299.685..198551.460 rows=6152 loops=1)
             Filter: (stats__ends_at("at", resolution, "values") > '2004-08-30 16:21:09+12'::timestamp with time zone)
     Total runtime: 198641.649 ms
    (5 rows)

    ntais=# SET enable_seqscan = off;
    SET
    ntais=# EXPLAIN ANALYZE
        SELECT anomaly_id, stat_type_id
        FROM detect.stats
        WHERE detect.stats__ends_at(at, resolution, values) > '2004-08-30 16:21:09+12'::timestamptz
        ORDER BY anomaly_id, stat_type_id;
                                                                   QUERY PLAN
    --------------------------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=10166043.26..10172146.40 rows=2441254 width=8) (actual time=44.710..46.661 rows=6934 loops=1)
       Sort Key: anomaly_id, stat_type_id
       ->  Index Scan using stats__ends_at on stats  (cost=0.00..9811666.09 rows=2441254 width=8) (actual time=0.075..24.702 rows=6934 loops=1)
             Index Cond: (stats__ends_at("at", resolution, "values") > '2004-08-30 16:21:09+12'::timestamp with time zone)
     Total runtime: 50.354 ms
    (5 rows)

    ntais=# SELECT count(*) FROM detect.stats;
      count
    ---------
     7326151
    (1 row)

I've done repeated ANALYZEs, both table-specific and database-wide, and get the same result every time.

For us, a global 'enable_seqscan = off' in postgresql.conf is the way to go. You occasionally see an odd plan while developing a query (eg: scanning an index with no constraint to simply get ORDER BY). Usually that's a broken query/index, and I simply fix it.

Guy Thornley
Guy Thornley <guy@esphion.com> writes:
> However, I'm seeing breakage of the form mentioned by the original poster
> even when the query uses a _constant_ timestamp: [Postgres 7.4.3]
>
> Indexes:
>     "stats_pkey" primary key, btree (anomaly_id, stat_type_id, "at")
>     "stats__ends_at" btree (stats__ends_at("at", resolution, "values"))
>
> ntais=# EXPLAIN ANALYZE
>     SELECT anomaly_id, stat_type_id
>     FROM detect.stats
>     WHERE detect.stats__ends_at(at, resolution, values) > '2004-08-30 16:21:09+12'::timestamptz
>     ORDER BY anomaly_id, stat_type_id;

Here I'm afraid you're just stuck until 8.0 comes out (or you're feeling brave enough to use a beta). Releases before 8.0 do not maintain any statistics about the contents of functional indexes, so the planner is flying blind here in any case, and you end up with the very same 1/3rd default assumption no matter what the right-hand side looks like.

You'll have to fall back to Plan A or Plan B to get this case to work in 7.4.

			regards, tom lane
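[For reference, a sketch of what Plan B from earlier in the thread might look like applied to this query; the upper-bound literal is invented and would have to be chosen loosely enough not to exclude wanted rows:]

    -- Plan B (sketch): a two-sided range constraint lowers the planner's
    -- default selectivity guess even without functional-index statistics.
    SELECT anomaly_id, stat_type_id
    FROM detect.stats
    WHERE detect.stats__ends_at(at, resolution, values) > '2004-08-30 16:21:09+12'::timestamptz
      AND detect.stats__ends_at(at, resolution, values) <= '2004-08-31 16:21:09+12'::timestamptz
    ORDER BY anomaly_id, stat_type_id;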
Guy Thornley <guy@esphion.com> writes:
> "stats__ends_at" btree (stats__ends_at("at", resolution, "values"))

Postgres 7.4 doesn't have any stats on functional indexes, so it's back to just guessing at the selectivity of this. 8.0 does gather stats for functional indexes, so it should be better off.

-- greg
On Sun, Aug 29, 2004 at 06:03:43PM -0400, Tom Lane wrote:
> The expression "now() - something" is not a constant, so the planner
> is faced with "timestampfield > unknownvalue". Its default assumption
> about the number of rows that will match is much too high to make an
> indexscan look profitable (from memory, I think it guesses that about
> a third of the table will match...).

Out of curiosity, does the subselect query I presented earlier in the thread count as "a constant"? It gives the correct query plan, but this could of course just be correct by accident...

/* Steinar */
--
Homepage: http://www.sesse.net/
"Steinar H. Gunderson" <sgunderson@bigfoot.com> writes: > On Sun, Aug 29, 2004 at 06:03:43PM -0400, Tom Lane wrote: >> The expression "now() - something" is not a constant, so the planner >> is faced with "timestampfield > unknownvalue". > Out of curiosity, does the subselect query I presented earlier in the thread > count as "a constant"? It gives the correct query plan, but this could of > course just be correct by accident... That was on 7.2, wasn't it? I don't remember any longer exactly how 7.2 does this stuff, but it's different from 7.3 and later (and certainly not any more "right"). You did at one time need to hide now() in a subselect to get the planner to consider an indexscan at all --- that was before we made the distinction between immutable and stable functions, and so now() had to be treated as unsafe to index against (just as random() still is). I think 7.2 behaved that way but I'm not totally sure. regards, tom lane
Most likely your table has a SERIAL PRIMARY KEY in it. In this case, do the following:

    my_limit = select primary_key_field from billing
               where timestamp > (now()-60)::timestamp
               ORDER BY timestamp ASC LIMIT 1;

then

    SELECT count(*) FROM billing WHERE primary_key_field >= my_limit;

I don't know if it'll work better, but you can try. When you insert records in the table, they are appended at the end, so this type of recent-records query only requires reading the tail of the table. It should be fast if planned correctly.

> Strangely enough, I don't find that result surprising.
>
> If the vast bulk of the data is in the past and now()-60 represents a
> very small slice of the data, we might expect that using an index is
> optimal, but there could be many reasons why it doesn't get used.
> [...]
>> Also, count(*) is likely to always generate a seq scan due to the way
>> aggregates are implemented currently in pgsql. you might want to try:

By the way, in an ideal world, count(*) should only read the index on the timestamp column, not the rows. I guess this is not the case. Would this be a useful optimization?
Another primary key trick:

If you insert records with a serial primary key, and rarely delete them or update the timestamp, you can use the primary key to compute an approximate number of rows.

    a := SELECT pkey FROM table WHERE timestamp > threshold ORDER BY timestamp ASC LIMIT 1;
    b := SELECT pkey FROM table ORDER BY pkey DESC LIMIT 1;

(b-a) is an approximate count.

Performance is great because you only fetch two rows. Index scan is guaranteed (LIMIT 1). On the downside, you get an approximation, and this only works for tables where timestamp is a date of INSERT (timestamp correlated with pkey), not when timestamp is a date of UPDATE (uncorrelated with pkey).
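[A concrete, hedged sketch of the same trick against the original billing table; the serial column name "id" is an assumption, since the original post never shows the full schema:]

    -- Runnable version of the (b - a) approximation, assuming a serial
    -- primary key column named "id" on billing:
    SELECT (SELECT id FROM billing ORDER BY id DESC LIMIT 1)
         - (SELECT id FROM billing
            WHERE "timestamp" > now() - interval '60 seconds'
            ORDER BY "timestamp" ASC LIMIT 1) AS approx_recent_count;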
On Mon, Aug 30, 2004 at 21:21:26 +0200, Pierre-Frédéric Caillaud <lists@boutiquenumerique.com> wrote:
> By the way, in an ideal world, count(*) should only read the index
> on the timestamp column, not the rows. I guess this is not the case.
> Would this be a useful optimization?

It's in the archives. The short answer is that no, postgres has to check the heap to check tuple visibility to the current transaction.
[I'm actually responding to the previous post from Tom Lane, but I've deleted it and the archives seem to be down again.]

The assumption being made is that the first provided result is representative of all future results. I don't see any reason that making this assumption of all stable functions should be less scary than making the assumption about user-provided parameters.

However I have the complementary reaction. I find peeking at the first bind parameter to be scary as hell. Functions seem slightly less scary.

On Oracle, peeking at bind parameters is a feature explicitly intended for DSS data warehouse type systems. The use of placeholders there was purely for security and programming ease, not efficiency, since the queries are only planned and executed a small number of times per plan. These are systems that suffered enormously without the parameter values. They often involved full table scans or bitmap index scans, and without the statistics produced awful plans.

For OLTP systems peeking at placeholders is more a danger than a benefit. The query will be executed thousands of times, and if it's planned based on a single unusual value initially, the entire system could fail.

Consider the following scenario, which isn't farfetched at all. In fact I think it well describes my current project:

I have a table with a few million records. 99% of the time users are working with only a few hundred records at most. There's an index on the column they're keying off of. 1% of the key values have an unusually large number of records.

Without peeking at placeholders the system should see that virtually all the key values are well under the threshold for an index scan to be best. So it always uses an index scan. 1% of the time it takes longer than it would have with a sequential scan, but only by a small factor. (On the whole we're probably still better off avoiding the cache pollution anyways.)

With peeking at placeholders 99% of the backends would perform the same way. However 1 backend in 100 sees one of these unusual values for its first query. This backend will use a sequential scan for *every* request. Executing a sequential table scan of this big table once a second, this backend will drive the entire system into the ground.

This means every time I start the system up I stand a small but significant chance of it just completely failing to perform properly. Worse, apache is designed to periodically start new processes, so at any given time the system could just randomly fall over and die.

I would rather incur a 10% penalty on every query than have a 1% chance of it keeling over and dying. Given this, when I upgrade to 8.0 I would have to ensure that my application driver is either not using placeholders at all (at the protocol level -- I always prefer them at the api level) or ensure that postgres is *not* peeking at the value.

I like the feature but I just want to be sure that it's optional.

-- greg
Greg Stark <gsstark@mit.edu> writes:
> However I have the complementary reaction. I find peeking at the first
> bind parameter to be scary as hell. Functions seem slightly less scary.

FWIW, we only do it in the context of unnamed parameterized queries. As the protocol docs say, those are optimized on the assumption that they will be executed only once. It seems entirely legitimate to me to use the parameter values in such a case.

We might in future get braver about using sample parameter values, but 8.0 is conservative about it.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > However I have the complementary reaction. I find peeking at the first
> > bind parameter to be scary as hell. Functions seem slightly less scary.
>
> FWIW, we only do it in the context of unnamed parameterized queries.

I knew that. That's why I hadn't been jumping up and down screaming. I was watching, though, ready to insist on an option to disable it if it became more widespread.

> As the protocol docs say, those are optimized on the assumption that
> they will be executed only once. It seems entirely legitimate to me
> to use the parameter values in such a case.

Sure. It's a great feature to have; it means people can be more aggressive about using placeholders for other reasons without worrying about performance impacts.

> We might in future get braver about using sample parameter values,
> but 8.0 is conservative about it.

If they're used for named parameters I would strongly recommend a guc variable to control the default on a server-wide basis. It could be a variable that individual sessions could override, since there's no security or resource implications. It's purely a protocol interface issue.

For that matter, would it be possible for the default selectivity estimates to be a guc variable? It's something that the DBA -- or even a programmer on a per-session basis -- might be able to provide a better value for his applications than any hard-coded default.

Or perhaps it would be one valid use of hints to provide selectivity estimates for blind placeholders. It would be nice to be able to say, for example:

    select * from foo where col > $0 /*+ 5% */ AND col2 > $1 /*+ 10% */

Would there be any hope of convincing you that this is a justifiable use of hints; providing information that the optimizer has absolutely no possibility of ever being able to calculate on its own?

-- greg
Hi Greg, Tom, et al.

It's true that oracle only peeks during a hard parse, and this can have good or bad results depending on the situation. Basically, the first value used in that query will determine the plan until that query is bumped from the sql cache or the server is restarted. As far as I know, there is no option to disable that feature in Oracle; I don't know about postgres.

Overall, I think it's a good feature because it helps us in the goal of reducing hard parsing (that was its real purpose in oracle). The trick, as with all good features, is to use it cleverly. For example, you could run scripts on server startup that run such queries with optimal values before anyone gets back on. If your application has optimal use of bind variables allowing re-use of query plans, and the sql cache has enough memory, then the query plans you created at server startup could be expected to be current for the life of that instance.

I write all this from my knowledge of Oracle, but I can't be sure how it applies to postgres. Come to think about it, I don't think I've seen a good discussion of plan caching, hard parsing and such like specifically related to pg. I'd really like to know more about how pg treats that stuff.

regards
Mr Pink

--- Greg Stark <gsstark@mit.edu> wrote:
> However I have the complementary reaction. I find peeking at the first
> bind parameter to be scary as hell. Functions seem slightly less scary.
> [...]