[HACKERS] POC: Sharing record typmods between backends - Mailing list pgsql-hackers

From Thomas Munro
Subject [HACKERS] POC: Sharing record typmods between backends
Msg-id CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
Responses Re: [HACKERS] POC: Sharing record typmods between backends  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Hi hackers,

Tuples can have type RECORDOID and a typmod number that identifies a
"blessed" TupleDesc in a backend-private cache.  To support the
sharing of such tuples through shared memory and temporary files, I
think we need a typmod registry in shared memory.  Here's a
proof-of-concept patch for discussion.  I'd be grateful for any
feedback and/or flames.

This is a problem I ran into in my parallel hash join project.  Robert
pointed it out to me and told me to go read tqueue.c for details, and
my first reaction was: I'll code around this by teaching the planner
to avoid sharing tuples from paths that produce transient record types
based on tlist analysis[1].  Aside from being a cop-out, that approach
doesn't work because the planner doesn't actually know what types the
executor might come up with since some amount of substitution for
structurally-similar records seems to be allowed[2] (though I'm not
sure I can explain that).  So... we're gonna need a bigger boat.

The patch still uses typcache.c's backend-private cache, but while the
backend is "attached" to a shared registry the private cache functions
as a write-through cache.  There is no cache-invalidation problem
because registered typmods are never unregistered.  parallel.c exports
the leader's existing record typmods into a shared registry, and
attaches to it in workers.  A DSM detach hook returns backends to
private cache mode when parallelism ends.
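
To make the write-through idea concrete, here is a toy sketch in plain
C.  It is not code from the patch: every name in it (ToyTupleDesc,
ToyRegistry, ToyBackend and the toy_* functions) is hypothetical, the
"shared" registry is an ordinary struct rather than DSA memory, there
is no locking, and the behaviour on detach (pull everything into the
private cache so later private typmods cannot collide) is my own
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TYPMODS 64
#define TOY_MAX_ATTS 8

/* Hypothetical stand-in for a TupleDesc: attribute count plus type OIDs. */
typedef struct ToyTupleDesc {
    int      natts;
    unsigned atttypids[TOY_MAX_ATTS];
} ToyTupleDesc;

/* Hypothetical shared registry; a real one would need shared memory and a lock. */
typedef struct ToyRegistry {
    int          next_typmod;
    ToyTupleDesc descs[TOY_MAX_TYPMODS];
} ToyRegistry;

/* Hypothetical backend-private cache, indexed by typmod. */
typedef struct ToyBackend {
    ToyRegistry *shared;                 /* NULL in private cache mode */
    int          next_typmod;            /* next typmod in private mode */
    int          known[TOY_MAX_TYPMODS]; /* 1 if descs[i] is valid locally */
    ToyTupleDesc descs[TOY_MAX_TYPMODS];
} ToyBackend;

static int toy_desc_equal(const ToyTupleDesc *a, const ToyTupleDesc *b) {
    return a->natts == b->natts &&
           memcmp(a->atttypids, b->atttypids,
                  sizeof(unsigned) * (size_t) a->natts) == 0;
}

/* Remember a descriptor in the private cache under the given typmod. */
static void toy_cache_locally(ToyBackend *be, int typmod, const ToyTupleDesc *d) {
    assert(typmod >= 0 && typmod < TOY_MAX_TYPMODS);
    be->descs[typmod] = *d;
    be->known[typmod] = 1;
}

/* "Bless" a descriptor: return its typmod, registering it if it is new. */
static int toy_assign_typmod(ToyBackend *be, const ToyTupleDesc *d) {
    int typmod;

    /* Private cache hit: the registry is not touched at all. */
    for (typmod = 0; typmod < TOY_MAX_TYPMODS; typmod++)
        if (be->known[typmod] && toy_desc_equal(&be->descs[typmod], d))
            return typmod;

    if (be->shared != NULL) {
        ToyRegistry *reg = be->shared;

        /* Write through: find or create the entry in the registry. */
        for (typmod = 0; typmod < reg->next_typmod; typmod++)
            if (toy_desc_equal(&reg->descs[typmod], d))
                break;
        if (typmod == reg->next_typmod) {
            assert(reg->next_typmod < TOY_MAX_TYPMODS);
            reg->descs[reg->next_typmod++] = *d;
        }
    } else {
        assert(be->next_typmod < TOY_MAX_TYPMODS);
        typmod = be->next_typmod++;      /* private mode: local counter */
    }
    toy_cache_locally(be, typmod, d);
    return typmod;
}

/* Look up a typmod; on a private miss, fall back to the registry. */
static const ToyTupleDesc *toy_lookup_typmod(ToyBackend *be, int typmod) {
    if (typmod < 0 || typmod >= TOY_MAX_TYPMODS)
        return NULL;
    if (!be->known[typmod]) {
        if (be->shared == NULL || typmod >= be->shared->next_typmod)
            return NULL;
        toy_cache_locally(be, typmod, &be->shared->descs[typmod]);
    }
    return &be->descs[typmod];
}

/* Leader: export the existing private typmods, then attach. */
static void toy_export_and_attach(ToyBackend *leader, ToyRegistry *reg) {
    memset(reg, 0, sizeof(*reg));
    for (int typmod = 0; typmod < leader->next_typmod; typmod++)
        reg->descs[typmod] = leader->descs[typmod];
    reg->next_typmod = leader->next_typmod;
    leader->shared = reg;
}

/* Worker: start with an empty private cache attached to the registry. */
static void toy_attach(ToyBackend *worker, ToyRegistry *reg) {
    memset(worker, 0, sizeof(*worker));
    worker->shared = reg;
}

/* Detach (assumed behaviour): pull every entry, return to private mode. */
static void toy_detach(ToyBackend *be) {
    for (int typmod = 0; typmod < be->shared->next_typmod; typmod++)
        toy_cache_locally(be, typmod, &be->shared->descs[typmod]);
    be->next_typmod = be->shared->next_typmod;
    be->shared = NULL;
}
```

In this sketch a zero-initialized ToyBackend plays the leader in
private mode.  After toy_export_and_attach() in the leader and
toy_attach() in a worker, a descriptor blessed by the worker gets a
typmod that the leader can resolve with toy_lookup_typmod().  Because
entries are only ever added, a private copy of an entry can never go
stale, which is the reason no invalidation is needed.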

Some thoughts:

* Maybe it would be better to have just one DSA area, rather than the
one controlled by execParallel.c (for executor nodes to use) and this
new one controlled by parallel.c (for the ParallelContext).  Those
scopes are approximately the same at least in the parallel query case,
but...

* It would be nice for the SharedRecordTypmodRegistry to be able to
survive longer than a single parallel query, perhaps in a per-session
DSM segment.  Perhaps eventually we will want to consider a
query-scoped area, a transaction-scoped area and a session-scoped
area?  I didn't investigate that for this POC.

* It seemed to be a reasonable goal to avoid allocating an extra DSM
segment for every parallel query, so the new DSA area is created
in-place.  192KB turns out to be enough to hold an empty
SharedRecordTypmodRegistry due to dsa.c's superblock allocation scheme
(that's two 64KB size class superblocks + some DSA control
information).  The DSA area will create a new DSM segment as soon as
you start using blessed records, and will do so for every parallel
query you start from then on with the same backend.  Erm, maybe adding
192KB to every parallel query DSM segment won't be popular...

* Perhaps simplehash + an LWLock would be better than dht, but I
haven't looked into that.  Can it be convinced to work in DSA memory
and to grow on demand?

Here's one way to hit the new code path, so that record types blessed
in a worker are accessed from the leader:

  CREATE TABLE foo AS SELECT generate_series(1, 10) AS x;
  CREATE OR REPLACE FUNCTION make_record(n int)
    RETURNS RECORD LANGUAGE plpgsql PARALLEL SAFE AS
  $$
  BEGIN
    RETURN CASE n
             WHEN 1 THEN ROW(1)
             WHEN 2 THEN ROW(1, 2)
             WHEN 3 THEN ROW(1, 2, 3)
             WHEN 4 THEN ROW(1, 2, 3, 4)
             ELSE ROW(1, 2, 3, 4, 5)
           END;
  END;
  $$;
  SET force_parallel_mode = 1;
  SELECT make_record(x) FROM foo;

PATCH

1.  Apply dht-v3.patch[3].
2.  Apply shared-record-typmod-registry-v1.patch.
3.  Apply rip-out-tqueue-remapping-v1.patch.

[1] https://www.postgresql.org/message-id/CAEepm%3D2%2Bzf7L_-eZ5hPW5%3DUS%2Butdo%3D9tMVD4wt7ZSM-uOoSxWg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CA+TgmoZMH6mJyXX=YLSOvJ8jULFqGgXWZCr_rbkc1nJ+177VSQ@mail.gmail.com
[3] https://www.postgresql.org/message-id/flat/CAEepm%3D3d8o8XdVwYT6O%3DbHKsKAM2pu2D6sV1S_%3D4d%2BjStVCE7w%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com

