Rules, Windows and ORDER BY - Mailing list pgsql-general

From Jason Dusek
Subject Rules, Windows and ORDER BY
Date
Msg-id CAO3NbwMz011_Tve5iTW86uNx3BkRk+fd=aL2fP28sSbg20zArA@mail.gmail.com
Whole thread Raw
Responses Re: Rules, Windows and ORDER BY  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hello List,

I have a simple table of keys and values which periodically
receives updated values. It's desirable to keep older values
but, most of the time, we query only for the latest value of a
particular key.

  CREATE TABLE kv
  ( k bytea NOT NULL,
    at timestamptz NOT NULL,
    realm bytea NOT NULL,
    v bytea NOT NULL );
  CREATE INDEX ON kv USING hash(k);
  CREATE INDEX ON kv (t);
  CREATE INDEX ON kv USING hash(realm);

  SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;

It would be nice to encapsulate this common query with a VIEW;
for example:

  CREATE VIEW kv_new AS
    SELECT * FROM kv WHERE at =
      ( SELECT at FROM kv AS _
         WHERE _.k = kv.k AND _.realm = kv.realm
         ORDER BY at DESC LIMIT 1 );

I tried partition functions, at first, but they were really very
slow. This view is pretty sprightly but has a more complicated
plan than the original query, which only has a sort followed by
an index scan, and is consequently not as fast. Please find the
plans below my signature.

Ideally, I'd be able to create a rule where the ORDER BY and
LIMIT were simply appended to whatever SELECT was given; but I
am at a loss as to how to do that. Creating a VIEW with the
order and limit just gives me a table with one row in it (of
course).

Is there something better than a sub-select here? I tried using
one with max(at) but it's not noticeably faster. I would be
interested to see how others have approached this kind of log-
-structured storage in Postgres. The window functions make,
alas, no use of indexes.

--
Jason Dusek
pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B




  EXPLAIN (COSTS FALSE, FORMAT YAML)
    SELECT * FROM kv WHERE k = ... AND realm = ... ORDER BY at LIMIT 1;
  -[ RECORD 1 ]-----------------------------------------
  QUERY PLAN | - Plan:
             |     Node Type: "Limit"
             |     Plans:
             |       - Node Type: "Sort"
             |         Parent Relationship: "Outer"
             |         Sort Key:
             |           - "at"
             |         Plans:
             |           - Node Type: "Index Scan"
             |             Parent Relationship: "Outer"
             |             Scan Direction: "NoMovement"
             |             Index Name: "kv_k_idx"
             |             Relation Name: "kv"
             |             Alias: "kv"
             |             Index Cond: "(k = ...)"
             |             Filter: "(realm = ...)"


  EXPLAIN (COSTS FALSE, FORMAT YAML)
    SELECT * FROM kv_new WHERE k = ... AND realm = ...;
  -[ RECORD 1 ]-----------------------------------------------------
  QUERY PLAN | - Plan:
             |     Node Type: "Index Scan"
             |     Scan Direction: "NoMovement"
             |     Index Name: "kv_k_idx"
             |     Relation Name: "kv"
             |     Alias: "kv"
             |     Index Cond: "(k = ...)"
             |     Filter: "((realm = ...) AND (at = (SubPlan 1)))"
             |     Plans:
             |       - Node Type: "Limit"
             |         Parent Relationship: "SubPlan"
             |         Subplan Name: "SubPlan 1"
             |         Plans:
             |           - Node Type: "Sort"
             |             Parent Relationship: "Outer"
             |             Sort Key:
             |               - "_.at"
             |             Plans:
             |               - Node Type: "Index Scan"
             |                 Parent Relationship: "Outer"
             |                 Scan Direction: "NoMovement"
             |                 Index Name: "kv_k_idx"
             |                 Relation Name: "kv"
             |                 Alias: "_"
             |                 Index Cond: "(k = kv.k)"
             |                 Filter: "(realm = kv.realm)"


pgsql-general by date:

Previous
From: Chris Travers
Date:
Subject: Re: At what point does a big table start becoming too big?
Next
From: Vincent Veyron
Date:
Subject: Re: Amazon High I/O instances