Thread: Rules, Windows and ORDER BY

Rules, Windows and ORDER BY

From
Jason Dusek
Date:
Hello List,

I have a simple table of keys and values which periodically
receives updated values. It's desirable to keep older values
but, most of the time, we query only for the latest value of a
particular key.

  CREATE TABLE kv
  ( k bytea NOT NULL,
    at timestamptz NOT NULL,
    realm bytea NOT NULL,
    v bytea NOT NULL );
  CREATE INDEX ON kv USING hash(k);
  CREATE INDEX ON kv (t);
  CREATE INDEX ON kv USING hash(realm);

  SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;

It would be nice to encapsulate this common query with a VIEW;
for example:

  CREATE VIEW kv_new AS
    SELECT * FROM kv WHERE at =
      ( SELECT at FROM kv AS _
         WHERE _.k = kv.k AND _.realm = kv.realm
         ORDER BY at DESC LIMIT 1 );

I tried partition functions, at first, but they were really very
slow. This view is pretty sprightly but has a more complicated
plan than the original query, which only has a sort followed by
an index scan, and is consequently not as fast. Please find the
plans below my signature.

Ideally, I'd be able to create a rule where the ORDER BY and
LIMIT were simply appended to whatever SELECT was given; but I
am at a loss as to how to do that. Creating a VIEW with the
order and limit just gives me a table with one row in it (of
course).

Is there something better than a sub-select here? I tried using
one with max(at) but it's not noticeably faster. I would be
interested to see how others have approached this kind of log-
-structured storage in Postgres. The window functions make,
alas, no use of indexes.

--
Jason Dusek
pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B




  EXPLAIN (COSTS FALSE, FORMAT YAML)
    SELECT * FROM kv WHERE k = ... AND realm = ... ORDER BY at LIMIT 1;
  -[ RECORD 1 ]-----------------------------------------
  QUERY PLAN | - Plan:
             |     Node Type: "Limit"
             |     Plans:
             |       - Node Type: "Sort"
             |         Parent Relationship: "Outer"
             |         Sort Key:
             |           - "at"
             |         Plans:
             |           - Node Type: "Index Scan"
             |             Parent Relationship: "Outer"
             |             Scan Direction: "NoMovement"
             |             Index Name: "kv_k_idx"
             |             Relation Name: "kv"
             |             Alias: "kv"
             |             Index Cond: "(k = ...)"
             |             Filter: "(realm = ...)"


  EXPLAIN (COSTS FALSE, FORMAT YAML)
    SELECT * FROM kv_new WHERE k = ... AND realm = ...;
  -[ RECORD 1 ]-----------------------------------------------------
  QUERY PLAN | - Plan:
             |     Node Type: "Index Scan"
             |     Scan Direction: "NoMovement"
             |     Index Name: "kv_k_idx"
             |     Relation Name: "kv"
             |     Alias: "kv"
             |     Index Cond: "(k = ...)"
             |     Filter: "((realm = ...) AND (at = (SubPlan 1)))"
             |     Plans:
             |       - Node Type: "Limit"
             |         Parent Relationship: "SubPlan"
             |         Subplan Name: "SubPlan 1"
             |         Plans:
             |           - Node Type: "Sort"
             |             Parent Relationship: "Outer"
             |             Sort Key:
             |               - "_.at"
             |             Plans:
             |               - Node Type: "Index Scan"
             |                 Parent Relationship: "Outer"
             |                 Scan Direction: "NoMovement"
             |                 Index Name: "kv_k_idx"
             |                 Relation Name: "kv"
             |                 Alias: "_"
             |                 Index Cond: "(k = kv.k)"
             |                 Filter: "(realm = kv.realm)"


Re: Rules, Windows and ORDER BY

From
Tom Lane
Date:
Jason Dusek <jason.dusek@gmail.com> writes:
> I have a simple table of keys and values which periodically
> receives updated values. It's desirable to keep older values
> but, most of the time, we query only for the latest value of a
> particular key.

>   CREATE TABLE kv
>   ( k bytea NOT NULL,
>     at timestamptz NOT NULL,
>     realm bytea NOT NULL,
>     v bytea NOT NULL );
>   CREATE INDEX ON kv USING hash(k);
>   CREATE INDEX ON kv (t);
>   CREATE INDEX ON kv USING hash(realm);

>   SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;

If you want to make that fast, an index on (k,realm,at) would help.
Those indexes that you did create are next to useless for this,
and furthermore hash indexes are quite unsafe for production.

            regards, tom lane


Re: Rules, Windows and ORDER BY

From
Jason Dusek
Date:
2012/8/23 Tom Lane <tgl@sss.pgh.pa.us>:
> Jason Dusek <jason.dusek@gmail.com> writes:
>> I have a simple table of keys and values which periodically
>> receives updated values. It's desirable to keep older values
>> but, most of the time, we query only for the latest value of a
>> particular key.
>
>>   CREATE TABLE kv
>>   ( k bytea NOT NULL,
>>     at timestamptz NOT NULL,
>>     realm bytea NOT NULL,
>>     v bytea NOT NULL );
>>   CREATE INDEX ON kv USING hash(k);
>>   CREATE INDEX ON kv (t);
>>   CREATE INDEX ON kv USING hash(realm);
>
>>   SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;
>
> If you want to make that fast, an index on (k,realm,at) would
> help.  Those indexes that you did create are next to useless
> for this, and furthermore hash indexes are quite unsafe for
> production.

Thanks for pointing out the unsafety of hash indexes. I think I
got in the habit of using them for a project with large,
temporary data sets.

Why are the individual indices not useful? The tests that the
query does -- equality on key and realm and ordering on at --
are each supported by indices. Does it have to do with the cost
of loading the three indices?

--
Jason Dusek
pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B


Re: Rules, Windows and ORDER BY

From
Martijn van Oosterhout
Date:
On Fri, Aug 24, 2012 at 09:32:32AM +0000, Jason Dusek wrote:
> 2012/8/23 Tom Lane <tgl@sss.pgh.pa.us>:
> > Jason Dusek <jason.dusek@gmail.com> writes:
> >>   CREATE TABLE kv
> >>   ( k bytea NOT NULL,
> >>     at timestamptz NOT NULL,
> >>     realm bytea NOT NULL,
> >>     v bytea NOT NULL );
> >>   CREATE INDEX ON kv USING hash(k);
> >>   CREATE INDEX ON kv (t);
> >>   CREATE INDEX ON kv USING hash(realm);
> >
> >>   SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;
> >
> > If you want to make that fast, an index on (k,realm,at) would
> > help.  Those indexes that you did create are next to useless
> > for this, and furthermore hash indexes are quite unsafe for
> > production.
>
> Why are the individual indices not useful? The tests that the
> query does -- equality on key and realm and ordering on at --
> are each supported by indices. Does it have to do with the cost
> of loading the three indices?

I'm not entirely sure, but I'll take a stab at it. I think it has to do
with the fact that you want order. Combining multiple indexes so you
use them at the same time works as an BitmapAnd. That is, it uses each
index to determine blocks that are interesting and then find the blocks
that are listed by all tindexes, and then it loads the blocks and chcks
them.

The problem here is that you want ORDER BY at, which makes the above
scheme fall apart, because order is not preversed. So it falls back on
either scanning the 'at' index and probing checking the rows to see if
they match, or using all indexes, and then sorting the result.

In theory you could BitmapAnd the 'k' and 'realm' indexes and then scan
the 'at' index only checking rows that the bitmap shows are
interesting.  But I'm not sure if postgres can do that.

Anyway, the suggested three column index will match your query in a
single lookup and hence be much faster than any of the above
suggestions, so if this is a really important query then it may be
worth it here.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer

Attachment

Re: Rules, Windows and ORDER BY

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Fri, Aug 24, 2012 at 09:32:32AM +0000, Jason Dusek wrote:
>> Why are the individual indices not useful? The tests that the
>> query does -- equality on key and realm and ordering on at --
>> are each supported by indices. Does it have to do with the cost
>> of loading the three indices?

> I'm not entirely sure, but I'll take a stab at it. I think it has to do
> with the fact that you want order. Combining multiple indexes so you
> use them at the same time works as an BitmapAnd. That is, it uses each
> index to determine blocks that are interesting and then find the blocks
> that are listed by all tindexes, and then it loads the blocks and chcks
> them.

Yeah.  While you *can* in principle solve the problem with the
individual indexes, it's much less efficient than a single index.
In particular, BitmapAnd plans are far from being a magic bullet
for combining two individually-not-very-selective conditions.
(That realm constraint is surely not very selective; dunno about
the key one.)  That implies reading a large number of entries from
each index, forming a rather large bitmap for each one, and then
ANDing those bitmaps to get a smaller one.  And even after all that
work, you're still not done, because you have no idea which bit in
the bitmap represents the row with largest "at" value.

> In theory you could BitmapAnd the 'k' and 'realm' indexes and then scan
> the 'at' index only checking rows that the bitmap shows are
> interesting.  But I'm not sure if postgres can do that.

No, it can't, and that likely wouldn't be a very effective plan anyway;
you could end up scanning a very large fraction of the "at" index, since
you'd have to start at the end (the latest entry anywhere in the table).
Even if you didn't make many trips to the heap, that's not cheap.

In constrast, given a three-column btree index organized with the
equality-constrained columns first, the btree code can descend the
index tree straight to the entry you want.  We've expended a lot of
sweat on optimizing that case, and it will absolutely blow the doors
off anything involving a bitmap scan.

Of course the downside is that the three-column index might be
relatively useless for queries of forms other than this one.
So it's a tradeoff between flexibility and performance.  But since
the OP is asking, I'm assuming he cares a lot about performance of
queries of this exact form.

            regards, tom lane