Re: How to speed up min/max(id) in 50M rows table? - Mailing list pgsql-performance

From Tom Lane
Subject Re: How to speed up min/max(id) in 50M rows table?
Date
Msg-id 5754.1192226626@sss.pgh.pa.us
Whole thread Raw
In response to How to speed up min/max(id) in 50M rows table?  (henk de wit <henk53602@hotmail.com>)
Responses Re: How to speed up min/max(id) in 50M rows table?
Re: How to speed up min/max(id) in 50M rows table?
List pgsql-performance
henk de wit <henk53602@hotmail.com> writes:
> The plan looks like this:

> "Result  (cost=3D0.37..0.38 rows=3D1 width=3D0) (actual time=3D184231.636..=
> 184231.638 rows=3D1 loops=3D1)"
> "  InitPlan"
> "    ->  Limit  (cost=3D0.00..0.37 rows=3D1 width=3D8) (actual time=3D18423=
> 1.620..184231.622 rows=3D1 loops=3D1)"
> "          ->  Index Scan Backward using trans_payment_id_index on transact=
> ions  (cost=3D0.00..19144690.58 rows=3D51122691 width=3D8) (actual time=3D1=
> 84231.613..184231.613 rows=3D1 loops=3D1)"
> "                Filter: (payment_id IS NOT NULL)"
> "Total runtime: 184231.755 ms"

The only way I can see for that to be so slow is if you have a very
large number of rows where payment_id is null --- is that the case?

There's not a lot you could do about that in existing releases :-(.
In 8.3 it'll be possible to declare the index as NULLS FIRST, which
moves the performance problem from the max end to the min end ...

> select min(time) from transactions where payment_id =3D 67

> There are indexes on both the time (a timestamp with time zone) and payment=
> _id (a bigint) columns.

Creating indexes at random with no thought about how the system could
use them is not a recipe for speeding up your queries.  What you'd need
to make this query fast is a double-column index on (payment_id, time)
so that a forward scan on the items with payment_id = 67 would
immediately find the minimum time entry.  Neither of the single-column
indexes offers any way to find the desired entry without scanning over
lots of unrelated entries.

            regards, tom lane

pgsql-performance by date:

Previous
From: henk de wit
Date:
Subject: Re: Huge amount of memory consumed during transaction
Next
From: Alan Hodgson
Date:
Subject: Re: How to speed up min/max(id) in 50M rows table?