Thread: Efficient slicing/substring of TOAST values (reprise)

Efficient slicing/substring of TOAST values (reprise)

From
John Gray
Date:
I attach the new version of my patch for providing
PG_GETARG_xxx_P_SLICE() macros and associated support routines.

Contents of this patch:

1. Routines in src/backend/access/tuptoaster.c for fetching only
necessary chunks of a toasted value.

2. Amended text_substr and bytea_substr to use new methods.

3. Added new command: ALTER TABLE tabname ALTER COLUMN colname SET
STORAGE {PLAIN | EXTERNAL | EXTENDED | MAIN} to parser and documented in
alter-table.sgml. (NB I used ColId as the item type for the storage
mode, I hope this makes sense!). All this does is sets attstorage for
the specified column.

4. Rearrangements of ALTER TABLE ALTER COLUMN to tidy it up.
AlterTableAlterColumnStatistics is now AlterTableAlterColumnFlags and
uses the subtype code to distinguish. Setup code (permissions checks
etc.) moved to new routine, macro for the loop that generates all
inheritors. I have merged Tom Lane's patches (though he may want to
check that it's been done correctly).

5. Documented new macros (and also the PG_GETARG_xxx_P_COPY macros) is
xfunc.sgml.

Things I've noticed in passing:

1. utils/adt/varlena.c There could be some performance gains for the
length functions if the PG_GETARG API allowed for finding the length of
a value without detoasting it.

2. commands/command.c Some of the recursion to inherited tables passes
the inhOpt from the parent rather than setting false. If
find_all_inheritors does what it says (i.e. it finds all children,
grandchildren etc.), we could save some wasted effort by using false
instead.

3. alter table add constraint doesn't (on the face of it) prevent adding
constraints to system tables if you're the superuser.

4. New function-call interface is mainly documented in fmgr/README which
is in the future tense. Should this go into a reference manual section
instead? (those bits that it's good for programmer-users to know)

I haven't done anything on the update issue. We need to think about:

1) Syntax (how to specify a partial update of a column).

and

2) TOAST valueids. If MVCC works just as well on TOAST tables, then the
update process is much simplified as an amended value doesn't need a new
valueid.

I'm going to be away now until the 23rd, but please let me know if
there's anything I can do on my return. I'm happy to take comments and
rework this as necessary.

Regards

John

--
John Gray
Azuli IT    http://www.azuli.co.uk      +44 121 693 3397
jgray@azuli.co.uk



Attachment

Re: Efficient slicing/substring of TOAST values (reprise)

From
Tom Lane
Date:
John Gray <jgray@azuli.co.uk> writes:
> Things I've noticed in passing:

> 1. utils/adt/varlena.c There could be some performance gains for the
> length functions if the PG_GETARG API allowed for finding the length of
> a value without detoasting it.

This is doable, but it's uglier because the functions need to know a lot
more about toasting; is it really worth it?

> 2. commands/command.c Some of the recursion to inherited tables passes
> the inhOpt from the parent rather than setting false.

That would be a bug, but I can't see any such error in current CVS.
Where are you looking?

> 3. alter table add constraint doesn't (on the face of it) prevent adding
> constraints to system tables if you're the superuser.

Should it?  They'd be ignored anyway by most internal operations.
I suppose at the very least it should check usecatupd...

> 4. New function-call interface is mainly documented in fmgr/README which
> is in the future tense. Should this go into a reference manual section
> instead? (those bits that it's good for programmer-users to know)

There is some documentation in xfunc.sgml, but I have no objection to
transposing more of the README into the SGML docs.  Just haven't got
round to it.

> 2) TOAST valueids. If MVCC works just as well on TOAST tables, then the
> update process is much simplified as an amended value doesn't need a new
> valueid.

Not sure that that's safe; need to think more.

            regards, tom lane

Re: Efficient slicing/substring of TOAST values (reprise)

From
John Gray
Date:
I've been away, so this is a very delayed response... and some of it is
my attempt to work out why things do what they do, and I apologise for
the length. I've wandered off into a mini dissertation on TOAST value
ids as well, and maybe spotted a very unlikely case of TOAST data
corruption on OID / toast value id wraparound.

On Tue, 2001-10-16 at 22:56, Tom Lane wrote:
> John Gray <jgray@azuli.co.uk> writes:
> > Things I've noticed in passing:
>
> > 1. utils/adt/varlena.c There could be some performance gains for the
> > length functions if the PG_GETARG API allowed for finding the length of
> > a value without detoasting it.
>
> This is doable, but it's uglier because the functions need to know a lot
> more about toasting; is it really worth it?

That's a fair point. The length functions aren't going to be that
heavily used anyway.
>
> > 2. commands/command.c Some of the recursion to inherited tables passes
> > the inhOpt from the parent rather than setting false.
>
> That would be a bug, but I can't see any such error in current CVS.
> Where are you looking?
Maybe I dreamt it :) I've looked again, and it's not there... Oh well, I
was tired when I wrote that email.

>
> > 3. alter table add constraint doesn't (on the face of it) prevent adding
> > constraints to system tables if you're the superuser.
>
> Should it?  They'd be ignored anyway by most internal operations.
> I suppose at the very least it should check usecatupd...
>
Well, it was really a consistency point -whether alter table add
constraint could use AlterTableColumnSetup -but it can't (as it stands
now) because AlterTableColumnSetup does a heap_close on the relation
-and AlterTableAddConstraint does a heap scan on the relation so needs
to keep it open.

> > 4. New function-call interface is mainly documented in fmgr/README which
> > is in the future tense. Should this go into a reference manual section
> > instead? (those bits that it's good for programmer-users to know)
>
> There is some documentation in xfunc.sgml, but I have no objection to
> transposing more of the README into the SGML docs.  Just haven't got
> round to it.

I'll have another look and might move a little more (not implementation
details though) across in another patch.
>
> > 2) TOAST valueids. If MVCC works just as well on TOAST tables, then the
> > update process is much simplified as an amended value doesn't need a new
> > valueid.
>
> Not sure that that's safe; need to think more.
>
You're right. I've been probing a bit, and I start to understand the use
of TOAST valueids. Here is my understanding of what happens when
heap_tuple_toast_attrs() is called:

1. On update: each attribute in the new tuple (to be toasted) could
contain:

a. a plain value
b. a TOASTed value which hasn't been changed
c. a TOASTed value which was copied from another relation.

2. On insert: each attribute in the new tuple (to be toasted) could
contain:

a. a plain value
b. a TOASTed value which was copied from another relation.

Cases 1b and 2a are straightforward.

Cases 1c and 2b: we need to make a copy of the TOAST value content. The
current implementation does this by detoasting (and in the case of 1c,
deleting the old value if necessary) and retoasting, which is simpler
than messing around with copying chunks in the toast relation.

Case 1a: The plain value might be the new value for an attribute which
was previously toasted. The new value might not be a candidate for
toasting (e.g. very short) in which case the old toast value is
discarded. Thus discarding the old toast value and producing a new one
(with a new valueid) is the simplest way to handle this case.

Update of TOAST values if the header doesn't change (i.e. length remains
the same) could be done just by altering chunks in the toast relation.
However, at present, there's no MVCC visibility control on TOAST table
accesses because heap_fetch in toast_fetch_datum uses SnapshotAny. This
works around a significant amount of time qualification checking that
would otherwise be required (e.g. if SnapshotNow were used instead.)
-the comments in HeapTupleSatisifies suggest that time qualification
checking is more lengthy than key checking.

Therefore, attempting to use MVCC to optimise partial updates of values
would add an efficiency penalty to all TOAST accesses. The only way I
see to avoid this (and it's not a good way!) would be to have a flag on
a toast relation (or base table?) that indicated whether to use
SnapshotAny or ?SnapshotNow/?SnapshotUpdate i.e. having two different
sets of toast relation semantics.

So, in conclusion: The current scheme doesn't offer any easy shortcuts
to partial updates. The best approach (in terms of avoiding detoasting)
would be to implement toast_modify_datum which would insert (under a new
valueid) a copy of an existing toast value (with necessary chunks
modified) into the toast relation. It would copy chunk by chunk and thus
avoid the memory impact of detoasting. Should I add the underlying
routines for this to my current patch, or keep it for another one?

Finally, I'm wondering about the impact of OID wraparound on TOAST
valueids. ISTM that if you have a large table (say for document
management) and some documents are (essentially) static and some are
very frequently modified, when the OID counter wraps round, lots of
value toasting will fail because of duplicate keys on the index insert.

Because we only fetch TOAST chunks via the index, we escape corruption
problems. Before the transaction aborted, chunk 0 would have been added
to the heap. It will be dead because its xmin is marked as aborted, but
that wouldn't matter to toast_fetch_datum (SnapshotAny again) which
would see two copies of chunk 0. We get away with it because there will
only be one index entry (which will point to the correct item for chunk
0) and we only use an indexscan to access the toast relation.

There is also a (very theoretical) corruption problem if you never
VACUUM the toast table because deleted TOAST values remain visible to
SnapshotAny, and the index insert after wraparound will succeed because
it doesn't consider the deleted TOAST chunks to be live duplicates. I
can't honestly see this happening in reality though: the relations would
be absolutely huge if there had been a few billion updates and no
vacuum!

Apologies for the essay!

Regards

John

--
John Gray
Azuli IT    http://www.azuli.co.uk      +44 121 693 3397