Thread: [PERFORM] Querying with multicolumn index
Hi all, I have a query that I *think* should use a multicolumn index, but sometimes isn't, resulting in slow queries. We have a DB that records GPS coordinates for vehicles: Table "public.updates" Column | Type | Modifiers ------------+--------------------------+------------------------------------------------------ id | integer | not null default nextval('updates_id_seq'::regclass) driver_id | integer | not null latitude | double precision | not null longitude | double precision | not null time | timestamp with time zone | not null default now() vehicle_id | integer | Indexes: "updates_pkey" PRIMARY KEY, btree (id) "ix_updates_time" btree ("time") "updates_driver_id_time_idx" btree (driver_id, "time") "updates_vehicle_id_time_idx" btree (vehicle_id, "time") Table has about 15M records across 100 distinct driver_id. I want to get the last record for a specific driver: SELECT * FROM updates WHERE driver_id=123 ORDER BY "time" DESC LIMIT 1; For some values of driver_id, it does what I expect and uses updates_driver_id_time_idx to fetch the records in 2 ms or less. For other values of driver_id, it does an index scan backwards on ix_updates_time, taking upwards of 2 minutes. Good plan: Limit (cost=0.11..1.38 rows=1 width=56) (actual time=2.710..2.710 rows=1 loops=1) -> Index Scan Backward using updates_driver_id_time_idx on updates (cost=0.11..139278.28 rows=110051 width=56) (actual time=2.709..2.709 rows=1 loops=1) Index Cond: (driver_id = 17127) Total runtime: 2.732 ms (4 rows) Bad plan: Limit (cost=0.09..0.69 rows=1 width=56) (actual time=216769.111..216769.112 rows=1 loops=1) -> Index Scan Backward using ix_updates_time on updates (cost=0.09..272339.04 rows=448679 width=56) (actual time=216769.110..216769.110 rows=1 loops=1) Filter: (driver_id = 30132) Rows Removed by Filter: 5132087 Total runtime: 216769.174 ms From cursory testing, the difference seems to be based on how many total rows there are for a particular driver. The above query uses updates_driver_id_time_idx for drivers with less than about 300K rows, but uses ix_updates_time for drivers with more than about 300K rows. Anything we can do to make it do the "right" thing? We are also considering denormalizing the data and keeping a "cache" of the same data in another table. pgsql version: 9.3.14 and 9.5.3, already tried vacuum analyze. Thanks, Eric -- Eric Jiang, DoubleMap eric@doublemap.com | www.doublemap.com
På fredag 09. desember 2016 kl. 18:00:16, skrev Eric Jiang <eric@doublemap.com>:
Hi all,
I have a query that I *think* should use a multicolumn index, but
sometimes isn't, resulting in slow queries.
We have a DB that records GPS coordinates for vehicles:
Table "public.updates"
Column | Type | Modifiers
------------+--------------------------+------------------------------------------------------
id | integer | not null default
nextval('updates_id_seq'::regclass)
driver_id | integer | not null
latitude | double precision | not null
longitude | double precision | not null
time | timestamp with time zone | not null default now()
vehicle_id | integer |
Indexes:
"updates_pkey" PRIMARY KEY, btree (id)
"ix_updates_time" btree ("time")
"updates_driver_id_time_idx" btree (driver_id, "time")
"updates_vehicle_id_time_idx" btree (vehicle_id, "time")
Table has about 15M records across 100 distinct driver_id.
I want to get the last record for a specific driver:
SELECT * FROM updates WHERE driver_id=123 ORDER BY "time" DESC LIMIT 1;
For some values of driver_id, it does what I expect and uses
updates_driver_id_time_idx to fetch the records in 2 ms or less. For
other values of driver_id, it does an index scan backwards on
ix_updates_time, taking upwards of 2 minutes.
Good plan:
Limit (cost=0.11..1.38 rows=1 width=56) (actual time=2.710..2.710
rows=1 loops=1)
-> Index Scan Backward using updates_driver_id_time_idx on updates
(cost=0.11..139278.28 rows=110051 width=56) (actual time=2.709..2.709
rows=1 loops=1)
Index Cond: (driver_id = 17127)
Total runtime: 2.732 ms
(4 rows)
Bad plan:
Limit (cost=0.09..0.69 rows=1 width=56) (actual
time=216769.111..216769.112 rows=1 loops=1)
-> Index Scan Backward using ix_updates_time on updates
(cost=0.09..272339.04 rows=448679 width=56) (actual
time=216769.110..216769.110 rows=1 loops=1)
Filter: (driver_id = 30132)
Rows Removed by Filter: 5132087
Total runtime: 216769.174 ms
From cursory testing, the difference seems to be based on how many
total rows there are for a particular driver. The above query uses
updates_driver_id_time_idx for drivers with less than about 300K rows,
but uses ix_updates_time for drivers with more than about 300K rows.
Anything we can do to make it do the "right" thing? We are also
considering denormalizing the data and keeping a "cache" of the same
data in another table.
pgsql version: 9.3.14 and 9.5.3, already tried vacuum analyze.
Thanks,
Eric
You should be having this index:
create index updates_driver_time_idx ON updates(driver_id, "time" DESC);
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Attachment
On Fri, Dec 9, 2016 at 9:56 AM, Andreas Joseph Krogh <andreas@visena.com> wrote:
You should be having this index:create index updates_driver_time_idx ON updates(driver_id, "time" DESC);
I'm not sure I understand the intent of this fix - are you saying that btree indexes only work in a certain direction?
I created this index and the query plans did not change.
--
Eric Jiang <eric@doublemap.com> writes: > I have a query that I *think* should use a multicolumn index, but > sometimes isn't, resulting in slow queries. I tried to duplicate this behavior, without success. Are you running with nondefault planner parameters? regards, tom lane
Hi As a quick fix: Have you considered dropping ix_updates_time index? I’ve been able to reproduce the issue, but with bigger numbers than you. When I dropped ix_updates_time it went much muchfaster. It uses updates_driver_id_time_idx index instead. For some reason the planner does not make a good estimation in this case. Can you show us EXPLAIN (ANALYZE, BUFFERS) beforeand after dropping ix_updates_time index? Can you show us too the output of SHOW shared_buffers; ? I suspect the issue has to do with low shared_buffers configuration and cache misses, and maybe some costs suboptimal configurationI’ll try to find it out, if anyone can enlighten us it will be very welcomed. P.S. Meanwhile If you still need 'time' index, you can create an index using ‘time' and ‘customer_id' in this order. Cheers, Daniel Blanch. > El 9 dic 2016, a las 18:00, Eric Jiang <eric@doublemap.com> escribió: > > Hi all, > I have a query that I *think* should use a multicolumn index, but > sometimes isn't, resulting in slow queries. > > We have a DB that records GPS coordinates for vehicles: > > Table "public.updates" > Column | Type | Modifiers > ------------+--------------------------+------------------------------------------------------ > id | integer | not null default > nextval('updates_id_seq'::regclass) > driver_id | integer | not null > latitude | double precision | not null > longitude | double precision | not null > time | timestamp with time zone | not null default now() > vehicle_id | integer | > Indexes: > "updates_pkey" PRIMARY KEY, btree (id) > "ix_updates_time" btree ("time") > "updates_driver_id_time_idx" btree (driver_id, "time") > "updates_vehicle_id_time_idx" btree (vehicle_id, "time") > > Table has about 15M records across 100 distinct driver_id. > > I want to get the last record for a specific driver: > > SELECT * FROM updates WHERE driver_id=123 ORDER BY "time" DESC LIMIT 1; > > For some values of driver_id, it does what I expect and uses > updates_driver_id_time_idx to fetch the records in 2 ms or less. For > other values of driver_id, it does an index scan backwards on > ix_updates_time, taking upwards of 2 minutes. > > Good plan: > > Limit (cost=0.11..1.38 rows=1 width=56) (actual time=2.710..2.710 > rows=1 loops=1) > -> Index Scan Backward using updates_driver_id_time_idx on updates > (cost=0.11..139278.28 rows=110051 width=56) (actual time=2.709..2.709 > rows=1 loops=1) > Index Cond: (driver_id = 17127) > Total runtime: 2.732 ms > (4 rows) > > Bad plan: > > Limit (cost=0.09..0.69 rows=1 width=56) (actual > time=216769.111..216769.112 rows=1 loops=1) > -> Index Scan Backward using ix_updates_time on updates > (cost=0.09..272339.04 rows=448679 width=56) (actual > time=216769.110..216769.110 rows=1 loops=1) > Filter: (driver_id = 30132) > Rows Removed by Filter: 5132087 > Total runtime: 216769.174 ms > > > From cursory testing, the difference seems to be based on how many > total rows there are for a particular driver. The above query uses > updates_driver_id_time_idx for drivers with less than about 300K rows, > but uses ix_updates_time for drivers with more than about 300K rows. > > Anything we can do to make it do the "right" thing? We are also > considering denormalizing the data and keeping a "cache" of the same > data in another table. > > pgsql version: 9.3.14 and 9.5.3, already tried vacuum analyze. > > Thanks, > Eric > > > -- > Eric Jiang, DoubleMap > eric@doublemap.com | www.doublemap.com > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
We aren't using any special planner settings - all enable_* options are "on". On Sat, Dec 10, 2016 at 12:06 AM, Daniel Blanch Bataller <daniel.blanch.bataller@gmail.com> wrote: > As a quick fix: Have you considered dropping ix_updates_time index? We do occasionally want to use ix_updates_time, although not very often. > I’ve been able to reproduce the issue, but with bigger numbers than you. When I dropped ix_updates_time it went much muchfaster. It uses updates_driver_id_time_idx index instead. > > For some reason the planner does not make a good estimation in this case. Can you show us EXPLAIN (ANALYZE, BUFFERS) beforeand after dropping ix_updates_time index? Can you show us too the output of SHOW shared_buffers; ? Here's EXPLAIN (ANALYZE, BUFFERS) with the above bad query on a cold cache: Limit (cost=0.09..0.70 rows=1 width=56) (actual time=244846.915..244846.915 rows=1 loops=1) Buffers: shared hit=3999254 read=57831 I/O Timings: read=242139.661 -> Index Scan Backward using ix_updates_time on updates (cost=0.09..271696.74 rows=442550 width=56) (actual time=244846.913..244846.913 rows=1 loops=1) Filter: (driver_id = 30132) Rows Removed by Filter: 5316811 Buffers: shared hit=3999254 read=57831 I/O Timings: read=242139.661 Total runtime: 244846.946 ms and after dropping ix_updates_time: Limit (cost=0.11..0.98 rows=1 width=56) (actual time=2.270..2.271 rows=1 loops=1) Buffers: shared hit=1 read=4 I/O Timings: read=2.230 -> Index Scan Backward using updates_driver_id_time_idx on updates (cost=0.11..382307.69 rows=442550 width=56) (actual time=2.270..2.270 rows=1 loops=1) Index Cond: (driver_id = 30132) Buffers: shared hit=1 read=4 I/O Timings: read=2.230 Total runtime: 2.305 ms and `SHOW shared_buffers;` shared_buffers ---------------- 244MB > I suspect the issue has to do with low shared_buffers configuration and cache misses, and maybe some costs suboptimal configurationI’ll try to find it out, if anyone can enlighten us it will be very welcomed. > > > P.S. Meanwhile If you still need 'time' index, you can create an index using ‘time' and ‘customer_id' in this order. Did you mean an index on (time, driver_id)? I did: CREATE INDEX CONCURRENTLY ix_updates_time_driver_id ON updates ("time", driver_id) but seems like the planner will use it for driver_id having more than ~300k rows: Limit (cost=0.11..0.79 rows=1 width=56) (actual time=115.051..115.052 rows=1 loops=1) Buffers: shared hit=20376 -> Index Scan Backward using ix_updates_time_driver_id on updates (cost=0.11..302189.90 rows=443924 width=56) (actual time=115.048..115.048 rows=1 loops=1) Index Cond: (driver_id = 30132) Buffers: shared hit=20376 Total runtime: 115.091 ms It does seem faster than when having an index on just "time", but still not optimal. Really appreciate everyone's help! -- Eric Jiang, DoubleMap eric@doublemap.com | www.doublemap.com
Hi, On 12/10/2016 12:51 AM, Tom Lane wrote: > Eric Jiang <eric@doublemap.com> writes: >> I have a query that I *think* should use a multicolumn index, but >> sometimes isn't, resulting in slow queries. > > I tried to duplicate this behavior, without success. Are you running > with nondefault planner parameters? > My guess is this is a case of LIMIT the matching rows are uniformly distributed in the input data. The planner likely concludes that for a driver with a lot of data we'll find the first row using ix_updates_time very quickly, and that it will be cheaper than inspecting the larger multi-column index. But imagine a driver with a lots of data long time ago. That breaks the LIMIT fairly quickly. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Eric Jiang <eric@doublemap.com> writes: > We aren't using any special planner settings - all enable_* options are "on". No, I'm asking about the cost settings (random_page_cost etc). The cost estimates you're showing seem impossible with the default settings. regards, tom lane
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 12/10/2016 12:51 AM, Tom Lane wrote: >> I tried to duplicate this behavior, without success. Are you running >> with nondefault planner parameters? > My guess is this is a case of LIMIT the matching rows are uniformly > distributed in the input data. The planner likely concludes that for a > driver with a lot of data we'll find the first row using ix_updates_time > very quickly, and that it will be cheaper than inspecting the larger > multi-column index. But imagine a driver with a lots of data long time > ago. That breaks the LIMIT fairly quickly. The fact that it's slow enough to be a problem is doubtless related to that effect. But AFAICS, the planner should never prefer that index for this query, because even with a uniform-density assumption, the index that really matches the query ought to look better. regards, tom lane
On Sat, Dec 10, 2016 at 4:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> We aren't using any special planner settings - all enable_* options are "on". > > No, I'm asking about the cost settings (random_page_cost etc). The cost > estimates you're showing seem impossible with the default settings. Tom, really appreciate your pointers. This problem was occurring on Heroku Postgres databases, and they seem to have set different cost constants. I tried using SET LOCAL to set them back to the default settings before running EXPLAIN. My testing here shows that resetting all of random_page_cost, cpu_tuple_cost, cpu_index_tuple_cost, and cpu_operator_cost does not change the plan (but does change the cost estimates), while setting effective_cache_size alone will change the plan. Specifically, changing only effective_cache_size from '900000kB' to '4GB' caused the planner to prefer the optimal index updates_driver_id_time_idx. Is increasing the DB's RAM the correct fix for this problem? It seems to me that no matter how much cache is available, looking at the (driver_id, time) index is always the optimal choice for this query. Thanks, Eric -- Eric Jiang, DoubleMap eric@doublemap.com | www.doublemap.com
Hi all, Thomas is absolutely right, the distribution I synthetically made, had 6M records but very old, 9M old, as you can see ithad to skip 9M records before finding a suitable record using time index. EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM updates WHERE driver_id = 100 ORDER BY "time" DESC LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=0.44..0.65 rows=1 width=36) (actual time=3827.807..3827.807 rows=1 loops=1) Buffers: shared hit=24592 read=99594 written=659 -> Index Scan Backward using updates_time_idx on updates (cost=0.44..1284780.53 rows=6064800 width=36) (actual time=3827.805..3827.805rows=1 loops=1) Filter: (driver_id = 100) Rows Removed by Filter: 9000000 Buffers: shared hit=24592 read=99594 written=659 Planning time: 0.159 ms Execution time: 3827.846 ms (8 rows) Here you have my tests where I was able to reproduce the problem using default settings on 9.6, 9.5 and 9.3. 9.6 and 9.5choose the wrong index, while 9.3 didn’t. (update: 9.5 didn’t fail last time) However when I tried to add more than one value with this strange distribution ~ 30% of distribution to one value the indexbad choice problem didn’t happen again in none of the different versions. I Hope this helps. Regards, Daniel Blanch. > El 10 dic 2016, a las 21:34, Tomas Vondra <tomas.vondra@2ndquadrant.com> escribió: > > Hi, > > On 12/10/2016 12:51 AM, Tom Lane wrote: >> Eric Jiang <eric@doublemap.com> writes: >>> I have a query that I *think* should use a multicolumn index, but >>> sometimes isn't, resulting in slow queries. >> >> I tried to duplicate this behavior, without success. Are you running >> with nondefault planner parameters? >> > > My guess is this is a case of LIMIT the matching rows are uniformly distributed in the input data. The planner likely concludesthat for a driver with a lot of data we'll find the first row using ix_updates_time very quickly, and that it willbe cheaper than inspecting the larger multi-column index. But imagine a driver with a lots of data long time ago. Thatbreaks the LIMIT fairly quickly. > > regards > > -- > Tomas Vondra http://www.2ndQuadrant.com > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Attachment
Hi all,
If anyone still interested in the issue I think I have a very plausible explanation of Eric’s postgresql bad index choice that is: bloated updates_driver_id_time_idx index.
Though it’s possible to fool postgresql planner, as I’ve shown in previous tests, this happens with a very concrete data distribution ~ 100 evenly distributed keys over 15M records and ~ 6M records under one single key, if you play a bit with figures it doesn’t happen anymore.
Eric’s data distribution wasn’t that extreme, as far as I know he had ~ 100K vs 500K distributions … Well, I’ve been able to reproduce the problem with a close data distribution to Erik’s. I made it creating a ‘bloated index’. If optimal index is too big, postgres tries with another suboptimal index, in this case index ’time’.
See this excerpt of my tests results:
(..)
-- populate table with 99 homogeneus distributed values
INSERT INTO updates SELECT q, q % 99, q, q, to_timestamp(q), q % 99 FROM generate_series(1, 15000000) q;
INSERT 0 15000000
Time: 65686,547 ms
-- populate table with 1 value with 500K rows, simmilar distribution you posted.
INSERT INTO updates SELECT q + 15000000, 100, q, q, to_timestamp(q), -- timestamp will start at 1 at end at 6M
100 FROM generate_series(1, 500000) q;
INSERT 0 500000
Time: 2463,073 ms
-- add constraints and indexes
(…)
-- create 'bloated' driver_id, time index.
CREATE INDEX ON updates (driver_id, "time") WITH (fillfactor = 10);
CREATE INDEX
Time: 41234,091 ms
-- check index sizes, updates_driver_id_idx is huge.
SELECT relname, relpages FROM pg_class WHERE relname LIKE 'updates%';
relname | relpages
-----------------------------+----------
updates | 129167
updates_driver_id_time_idx | 576919
updates_id_seq | 1
updates_pkey | 42502
updates_time_idx | 42502
updates_vehicle_id_time_idx | 59684
(6 rows)
Time: 16,810 ms
-- check behavior with bloated index
ANALYZE updates;
ANALYZE
Time: 254,917 ms
(..)
Time: 4,635 ms
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM updates WHERE driver_id = 100 ORDER BY "time" DESC LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..1.91 rows=1 width=36) (actual time=21486.015..21486.015 rows=1 loops=1)
Buffers: shared hit=39618 read=160454 written=592
-> Index Scan Backward using updates_time_idx on updates (cost=0.43..691283.45 rows=469134 width=36) (actual time=21486.014..21486.014 rows=1 loops=1)
Filter: (driver_id = 100)
Rows Removed by Filter: 14500000
Buffers: shared hit=39618 read=160454 written=592
Planning time: 0.171 ms
Execution time: 21486.068 ms
(8 rows)
Time: 21486,905 ms
-- rebuild index with default fillfactor
ALTER INDEX updates_driver_id_time_idx SET (fillfactor = 90);
ALTER INDEX
Time: 0,682 ms
REINDEX INDEX updates_driver_id_time_idx;
REINDEX
Time: 23559,530 ms
-- recheck index sizes, updates_driver_id_idx should look pretty simmilar to others.
SELECT relname, relpages FROM pg_class WHERE relname LIKE 'updates%';
relname | relpages
-----------------------------+----------
updates | 129167
updates_driver_id_time_idx | 59684
updates_id_seq | 1
updates_pkey | 42502
updates_time_idx | 42502
updates_vehicle_id_time_idx | 59684
(6 rows)
Time: 0,452 ms
-- check behavior with regular sized index
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM updates WHERE driver_id = 100 ORDER BY "time" DESC LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..1.69 rows=1 width=36) (actual time=0.032..0.033 rows=1 loops=1)
Buffers: shared hit=2 read=3
-> Index Scan Backward using updates_driver_id_time_idx on updates (cost=0.56..529197.34 rows=469133 width=36) (actual time=0.032..0.032 rows=1 loops=1)
Index Cond: (driver_id = 100)
Buffers: shared hit=2 read=3
Planning time: 0.074 ms
Execution time: 0.046 ms
(7 rows)
Time: 0,312 ms
@Eric
How to solve the problem:
First of all check if this is the case, check indexes sizes, if you have this problem updates_driver_id_time_idx should be significantly bigger than others.
SELECT relname, relpages FROM pg_class WHERE relname LIKE 'updates%’;
Check index configuration to see if you have different fillfactor configuration
\d+ updates_driver_id_time_idx
If you have setup a different fillfactor, turn it to normal, that is 90%. I don’t see why you should have a low fillfactor, your data doesn’t seem to have frecuent updates, by the contrary, it seems only write and read data.
ALTER INDEX updates_driver_id_time_idx SET (fillfactor = 90)
If your index fillfactor is normal, there is a chance it got bloated, but this is rare.
Reindex your data.
REINDEX INDEX updates_driver_id_time_idx;
Run tests again.
Regards,
Daniel Blanch.
P.S. Here you have my full tests and output, you might find them useful. Don’t forget to show us index sizes and index configuration, please.
El 11 dic 2016, a las 7:04, Daniel Blanch Bataller <daniel.blanch.bataller@gmail.com> escribió:Hi all,
Thomas is absolutely right, the distribution I synthetically made, had 6M records but very old, 9M old, as you can see it had to skip 9M records before finding a suitable record using time index.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM updates WHERE driver_id = 100 ORDER BY "time" DESC LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..0.65 rows=1 width=36) (actual time=3827.807..3827.807 rows=1 loops=1)
Buffers: shared hit=24592 read=99594 written=659
-> Index Scan Backward using updates_time_idx on updates (cost=0.44..1284780.53 rows=6064800 width=36) (actual time=3827.805..3827.805 rows=1 loops=1)
Filter: (driver_id = 100)
Rows Removed by Filter: 9000000
Buffers: shared hit=24592 read=99594 written=659
Planning time: 0.159 ms
Execution time: 3827.846 ms
(8 rows)
Here you have my tests where I was able to reproduce the problem using default settings on 9.6, 9.5 and 9.3. 9.6 and 9.5 choose the wrong index, while 9.3 didn’t. (update: 9.5 didn’t fail last time)
<test_bad_index_choice.sql><bad_idx_choice.9.6.out><bad_idx_choice.9.5.out><bad_idx_choice.9.3.out>
However when I tried to add more than one value with this strange distribution ~ 30% of distribution to one value the index bad choice problem didn’t happen again in none of the different versions.
I Hope this helps. Regards,
Daniel Blanch.El 10 dic 2016, a las 21:34, Tomas Vondra <tomas.vondra@2ndquadrant.com> escribió:
Hi,
On 12/10/2016 12:51 AM, Tom Lane wrote:Eric Jiang <eric@doublemap.com> writes:I have a query that I *think* should use a multicolumn index, but
sometimes isn't, resulting in slow queries.
I tried to duplicate this behavior, without success. Are you running
with nondefault planner parameters?
My guess is this is a case of LIMIT the matching rows are uniformly distributed in the input data. The planner likely concludes that for a driver with a lot of data we'll find the first row using ix_updates_time very quickly, and that it will be cheaper than inspecting the larger multi-column index. But imagine a driver with a lots of data long time ago. That breaks the LIMIT fairly quickly.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance