Re: Efficient slicing/substring of TOAST values (reprise) - Mailing list pgsql-patches

From John Gray
Subject Re: Efficient slicing/substring of TOAST values (reprise)
Date
Msg-id 1004196937.1219.973.camel@adzuki
Whole thread Raw
In response to Re: Efficient slicing/substring of TOAST values (reprise)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
I've been away, so this is a very delayed response... and some of it is
my attempt to work out why things do what they do, and I apologise for
the length. I've wandered off into a mini dissertation on TOAST value
ids as well, and maybe spotted a very unlikely case of TOAST data
corruption on OID / toast value id wraparound.

On Tue, 2001-10-16 at 22:56, Tom Lane wrote:
> John Gray <jgray@azuli.co.uk> writes:
> > Things I've noticed in passing:
>
> > 1. utils/adt/varlena.c There could be some performance gains for the
> > length functions if the PG_GETARG API allowed for finding the length of
> > a value without detoasting it.
>
> This is doable, but it's uglier because the functions need to know a lot
> more about toasting; is it really worth it?

That's a fair point. The length functions aren't going to be that
heavily used anyway.
>
> > 2. commands/command.c Some of the recursion to inherited tables passes
> > the inhOpt from the parent rather than setting false.
>
> That would be a bug, but I can't see any such error in current CVS.
> Where are you looking?
Maybe I dreamt it :) I've looked again, and it's not there... Oh well, I
was tired when I wrote that email.

>
> > 3. alter table add constraint doesn't (on the face of it) prevent adding
> > constraints to system tables if you're the superuser.
>
> Should it?  They'd be ignored anyway by most internal operations.
> I suppose at the very least it should check usecatupd...
>
Well, it was really a consistency point -whether alter table add
constraint could use AlterTableColumnSetup -but it can't (as it stands
now) because AlterTableColumnSetup does a heap_close on the relation
-and AlterTableAddConstraint does a heap scan on the relation so needs
to keep it open.

> > 4. New function-call interface is mainly documented in fmgr/README which
> > is in the future tense. Should this go into a reference manual section
> > instead? (those bits that it's good for programmer-users to know)
>
> There is some documentation in xfunc.sgml, but I have no objection to
> transposing more of the README into the SGML docs.  Just haven't got
> round to it.

I'll have another look and might move a little more (not implementation
details though) across in another patch.
>
> > 2) TOAST valueids. If MVCC works just as well on TOAST tables, then the
> > update process is much simplified as an amended value doesn't need a new
> > valueid.
>
> Not sure that that's safe; need to think more.
>
You're right. I've been probing a bit, and I start to understand the use
of TOAST valueids. Here is my understanding of what happens when
heap_tuple_toast_attrs() is called:

1. On update: each attribute in the new tuple (to be toasted) could
contain:

a. a plain value
b. a TOASTed value which hasn't been changed
c. a TOASTed value which was copied from another relation.

2. On insert: each attribute in the new tuple (to be toasted) could
contain:

a. a plain value
b. a TOASTed value which was copied from another relation.

Cases 1b and 2a are straightforward.

Cases 1c and 2b: we need to make a copy of the TOAST value content. The
current implementation does this by detoasting (and in the case of 1c,
deleting the old value if necessary) and retoasting, which is simpler
than messing around with copying chunks in the toast relation.

Case 1a: The plain value might be the new value for an attribute which
was previously toasted. The new value might not be a candidate for
toasting (e.g. very short) in which case the old toast value is
discarded. Thus discarding the old toast value and producing a new one
(with a new valueid) is the simplest way to handle this case.

Update of TOAST values if the header doesn't change (i.e. length remains
the same) could be done just by altering chunks in the toast relation.
However, at present, there's no MVCC visibility control on TOAST table
accesses because heap_fetch in toast_fetch_datum uses SnapshotAny. This
works around a significant amount of time qualification checking that
would otherwise be required (e.g. if SnapshotNow were used instead.)
-the comments in HeapTupleSatisifies suggest that time qualification
checking is more lengthy than key checking.

Therefore, attempting to use MVCC to optimise partial updates of values
would add an efficiency penalty to all TOAST accesses. The only way I
see to avoid this (and it's not a good way!) would be to have a flag on
a toast relation (or base table?) that indicated whether to use
SnapshotAny or ?SnapshotNow/?SnapshotUpdate i.e. having two different
sets of toast relation semantics.

So, in conclusion: The current scheme doesn't offer any easy shortcuts
to partial updates. The best approach (in terms of avoiding detoasting)
would be to implement toast_modify_datum which would insert (under a new
valueid) a copy of an existing toast value (with necessary chunks
modified) into the toast relation. It would copy chunk by chunk and thus
avoid the memory impact of detoasting. Should I add the underlying
routines for this to my current patch, or keep it for another one?

Finally, I'm wondering about the impact of OID wraparound on TOAST
valueids. ISTM that if you have a large table (say for document
management) and some documents are (essentially) static and some are
very frequently modified, when the OID counter wraps round, lots of
value toasting will fail because of duplicate keys on the index insert.

Because we only fetch TOAST chunks via the index, we escape corruption
problems. Before the transaction aborted, chunk 0 would have been added
to the heap. It will be dead because its xmin is marked as aborted, but
that wouldn't matter to toast_fetch_datum (SnapshotAny again) which
would see two copies of chunk 0. We get away with it because there will
only be one index entry (which will point to the correct item for chunk
0) and we only use an indexscan to access the toast relation.

There is also a (very theoretical) corruption problem if you never
VACUUM the toast table because deleted TOAST values remain visible to
SnapshotAny, and the index insert after wraparound will succeed because
it doesn't consider the deleted TOAST chunks to be live duplicates. I
can't honestly see this happening in reality though: the relations would
be absolutely huge if there had been a few billion updates and no
vacuum!

Apologies for the essay!

Regards

John

--
John Gray
Azuli IT    http://www.azuli.co.uk      +44 121 693 3397



pgsql-patches by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Add regression test for ALTER TABLE / ADD UNIQUE
Next
From: Rene Pijlman
Date:
Subject: Re: [DOCS] NULLs and sort order