Re: Fast Default WIP patch for discussion - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Fast Default WIP patch for discussion
Date
Msg-id CA+Tgmoa9hm1e+XB5Tc-ANyyZVc7CHFsA4_dZ_MdSJVpSL643Eg@mail.gmail.com
Whole thread Raw
In response to Fast Default WIP patch for discussion  (Serge Rielau <serge@rielau.com>)
Responses Re: Fast Default WIP patch for discussion  (Serge Rielau <serge@rielau.com>)
List pgsql-hackers
On Fri, Oct 21, 2016 at 7:15 PM, Serge Rielau <serge@rielau.com> wrote:
> Some key design points requiring discussion:
> 1. Storage of the “exist” (working name) default
>    Right now the patch stores the default value in its binary form as it
> would be in the tuple into a BYTEA.
>    It would be feasible to store the pretty printed literal, however this
> requires calling the io functions when a
>    tuple descriptor is built.

A pretty-printed literal is a terrible idea because input functions
need not be immutable (e.g. timestamptz).  I would have expected a
pg_node_tree column containing a CONST node, but your bytea thing
might be OK.

> 2. The exist default is cached alongside the “current” default in the tuple
> descriptor’s constraint structure.
>     Seems most natural too me, but debatable.

No opinion (yet).

> 3. Delayed vs. early expansion of the tuples.
>     To avoid having to decide when to copy tuple descriptors with or without
> constraints and defaults
>     I have opted to expand the tuple at the core entry points.
>     How do I know I have them all? An omission means wrong results!

Early expansion seems right.  "How do I know I have them all?" - and
not only all of the current ones but all future ones - seems like a
core sticking point for this patch.

> 4. attisnull()
>     This routine is used in many places, but to give correct result sit must
> now be accompanied
>     by the tuple descriptor. This becomes moderately messy and it’s not
> always clear where to get that.
>     Interestingly most usages are related to catalog lookups.
>     Assuming we have no intention to support fast default for catalog tables
> we could keep using the
>     existing attisnull() api for catalog lookups and use a new version (name
> tbd) for user tables.

attisnull is not a thing.  There's heap_attisnull and slot_attisnull.

> 5. My head hurts looking at the PK/FK code - it’s not always clear which
> tuple descriptor belongs
>     to which tuple

I suggest ibuprofen and a stiff upper lip.

> 6. Performance of the expansion code.
>     The current code needs to walk all defaults and then start padding by
> filling in values.
>     But the outcome is always the same. We will produce the same payload and
> the name null map.
>     It would be feasible to cache an “all defaults tuple”, remember the
> offsets (and VARWIDTH, HASNULL)
>     for each attribute and then simply splice the short and default tuples
> together.
>     This ought to be faster, but the meta data to cache is not insignificant
> and the expansion code is messy enough
>     without this already.

You could experiment to figure out if it makes any difference.

> 7. Grooming
>     Obviously we can remove all exist defaults for a table from pg_attribute
> whenever the table is rewrittem.
>     That’s easy.

But might cause the table to expand a lot, which would suck.

>     But could we/should we keep track of the short tuples and either
> eliminate them or drop exist defaults once they
>     become obsolete because there is no tuple short enough for them to
> matter.

I wouldn't mess with it.

> 8. Do we need to worry about toasted defaults?

Presumably.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: David Steele
Date:
Subject: Re: Streaming basebackups vs pg_stat_tmp
Next
From: Julien Rouhaud
Date:
Subject: Overlook in 2bd9e412?