Re: Slicing TOAST - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Slicing TOAST
Date
Msg-id 5191ECDA.3080204@krosing.net
Whole thread Raw
In response to Slicing TOAST  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 05/14/2013 10:05 AM, Simon Riggs wrote:
> I'm proposing this now as a possible GSoC project:
>
> In 1-byte character encodings (i.e. not UTF-8), SUBSTR() is optimised
> to allow seeking straight to the exact slice when retrieving a large
> toasted value. This reduces I/O considerably when you have large
> toasted values since it is an O(1) action rather than an O(N).
>
> This is possible because the slicing of toasted values is predictable
> on 1 byte encodings.
>
> It would be useful to have a predictable function perform the slicing,
> so we could use that knowledge later to optimise searches in a wider
> range of situations. More specifically, since UTF-8 is so common, to
> allow optimisations in that encoding of common data: text, XML, JSON.
>
> e.g. if we knew that an XML document has a required element called
> TITLE and that occurs only once and always in the first slice, it
> would be useful information to use in search functions. (Not sure, but
> it may be possible to assign non-consecutive slice numbers to allow
> variable data mid-way through a column value if needed).
>
> e.g. in UTF-8 free text we could put 500 characters in each slice, so
> that even if that could be anywhere between 500 and 2000 bytes it
> would still fit just fine.
>
> e.g. for arrays, if we put say 200 elements per slice, then accessing
> particular elements would require only 1 slice retrieval.
>
> Doing this would *possibly* reduce packing density, but not certainly
> so. But it would greatly improve access times to large structured
> toast values.
On the contrary, as it would enable us to pack the chunks fitting
more on the page, especially for :)

That is, first chunk into N bytes, then compress each chunk

-----------------
Hannu



pgsql-hackers by date:

Previous
From: Dimitri Fontaine
Date:
Subject: Re: erroneous restore into pg_catalog schema
Next
From: Marti Raudsepp
Date:
Subject: Re: PostgreSQL 9.3 beta breaks some extensions "make install"