Thread: My "TOAST slicing" patch -explanation

My "TOAST slicing" patch -explanation

From
John Gray
Date:
After reading the summary of Peter's comments on the patches in the
list, I thought I would provide a little more explanation, as he
suggests.

At present, functions that take text arguments retrieve them via the
PG_GETARG macros. This reconstructs the whole value into memory
(potentially time-consuming for a large value). This may be unnecessary
for some applications, where only a specified part of the value is
needed e.g. the header data from an image.

Because a TOASTed value is stored in equal-size chunks, it is
straightforward (for an uncompressed value) to determine which chunk(s)
contain a given substring. My patch adds accessor methods similar to the
existing macros which fetch the minimal number of chunks to satisfy a
particular substring request. (In the case of a compressed value,
"minimal number" == "all of them"). In other words, instead of writing
PG_GETARG_TEXT_P(0) and receiving back a large allocated value, you can
say PG_GETARG_TEXT_P_SLICE(0,0,500) to receive only the first 500 bytes.

This has a minimal performance impact -an indexscan is already used to
retrieve TOAST chunks -the extra accessor routines just specify the
second key as well as the first.

I have rewritten text_substr and bytea_substr to use these methods.

I have also added an ALTER TABLE x ALTER COLUMN y SET STORAGE command to
provide a simple way to change attstorage for a column. (The macros
above will be most efficient with uncompressed values.)

Finally, I am aware of the following items which are not covered by the
patch:

1) Efficient updating of parts of a value. This is not trivial[1].
2) Should the large object interface be handled via TOAST?[2]

As for the applications, well, it foes allow you to stream MP3s out of
the database reasonably efficiently :-)

Any questions, please ask...

John

[1] And I don't know how it might be used. Maybe if anyone has any ideas
about substring assignment (someone did point me in the direction of an
OVERLAY() function) or its applications I can think about that.
[2] AIUI, the underlying mechanism is similar, but implemented
separately.




Re: My "TOAST slicing" patch -explanation

From
Tom Lane
Date:
John Gray <jgray@azuli.co.uk> writes:
> Finally, I am aware of the following items which are not covered by the
> patch:

> 1) Efficient updating of parts of a value. This is not trivial[1].

Actually, based on subsequent discussion I now understand that efficient
updating of parts of a TOASTed value is impossible, if by that you mean
rewriting only the modified part.  This is so because TOAST does not
use MVCC, really: it relies on MVCC for the owning tuple to determine
visibility of a tuple value.  The only safe way to update a TOAST item
is to rewrite the whole thing with a new TOAST id number and then
update the owning tuple to reference that new id.

Despite this, it'd be a really good idea to offer functions that allow
applications to write part of a large TOASTed value.  Even if it can't
be as efficient as we'd like, we could still eliminate pushing the rest
of the value back and forth to the client.

> 2) Should the large object interface be handled via TOAST?[2]

Probably not, given the above facts.  We do have MVCC behavior for
partial updates of large objects, and we shouldn't lose it.


Having said all that, I think John's patch for substring extraction is
fine in concept.  I haven't looked at it in detail, but I think we
should review it and expect to apply it (possibly with some cleanups).
        regards, tom lane


Re: My "TOAST slicing" patch -explanation

From
Hannu Krosing
Date:
On Sun, 2002-02-24 at 23:52, Tom Lane wrote:
> John Gray <jgray@azuli.co.uk> writes:
> > Finally, I am aware of the following items which are not covered by the
> > patch:
> 
> > 1) Efficient updating of parts of a value. This is not trivial[1].
> 
> Actually, based on subsequent discussion I now understand that efficient
> updating of parts of a TOASTed value is impossible, if by that you mean
> rewriting only the modified part.  This is so because TOAST does not
> use MVCC, really: it relies on MVCC for the owning tuple to determine
> visibility of a tuple value.

Do TOAST tables participate in WAL ?

> The only safe way to update a TOAST item
> is to rewrite the whole thing with a new TOAST id number and then
> update the owning tuple to reference that new id.

Can't we still devise some way to reuse the chunks that did not change ?

With some kind of double indirection and associated bookkeeping perhaps?

> Despite this, it'd be a really good idea to offer functions that allow
> applications to write part of a large TOASTed value.  Even if it can't
> be as efficient as we'd like, we could still eliminate pushing the rest
> of the value back and forth to the client.

I guess this can be already done with creative use of substring() and ||

> > 2) Should the large object interface be handled via TOAST?[2]
> 
> Probably not, given the above facts.  We do have MVCC behavior for
> partial updates of large objects, and we shouldn't lose it.

It would feel "cleaner" to have one representation for LOs - can't TOAST
just be made to participate in MVCC?

We could restict WAL of LOs to UPDATES only (and force fsync on TOAST
FILE after INSERT) just to conserve log space.

----------------
Hannu



Re: My "TOAST slicing" patch -explanation

From
Tom Lane
Date:
Hannu Krosing <hannu@krosing.net> writes:
> Do TOAST tables participate in WAL ?

Sure.
        regards, tom lane