Pluggable toaster - Mailing list pgsql-hackers
From | Teodor Sigaev |
---|---|
Subject | Pluggable toaster |
Date | |
Msg-id | 224711f9-83b7-a307-b17f-4457ab73aa0a@sigaev.ru |
List | pgsql-hackers |
Hi!

We are working on a custom toaster for JSONB [1]. The current TOAST is universal for any data type, and because of that it has some disadvantages:
- "one toast fits all" may not be the best solution for a particular type and/or use case
- it doesn't know the internal structure of the data type, so it cannot choose an optimal toast strategy
- it can't share common parts between different rows, or even between versions of the same row

Modifying the current toaster to handle all tasks and cases looks too complex; moreover, it will not work for custom data types. Postgres is an extensible database, so why not extend its extensibility even further and have pluggable TOAST? We propose separating the toaster from the heap by means of a toaster API, similar to the table AM API etc. The following patches apply on top of the patch in [1].

1) 1_toaster_interface_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_interface
Introduces syntax for storage and a formal toaster API. Adds the column atttoaster to pg_attribute; by design this column should not be equal to the invalid oid for any toastable datatype, i.e. it must hold a correct oid for any type (not column) with non-plain storage. Since a toaster may support only particular datatypes, core should check the correctness of the toaster being set by calling the toaster's validate method. The new commands can be found in src/test/regress/sql/toaster.sql.

The on-disk toast pointer now has one more possible struct, varatt_custom, with a fixed header and a variable tail that is used as storage by custom toasters. The format of the built-in toaster is kept to allow simple pg_upgrade logic.

Since the toaster for a column could be changed during the table's lifetime, we had two options for the toaster's drop operation:
- if a column's toaster has been changed, then we need to re-toast all values, which could be extremely expensive.
  In any case, functions and operators should be ready to work with values toasted by different toasters, although every toaster must implement the simple toast/detoast operations, which allows any existing code to work with the new approach. Tracking dependencies between toasters and rows looks like a bad idea.
- disallow dropping a toaster. We don't believe that there will be many toasters at the same time (the number of AMs isn't very high either, and we don't believe that will change significantly in the near future), so prohibiting the drop of a toaster looks reasonable.
In this patch set we chose the second option.

The toaster API includes a get_vtable method, which is intended to give access to custom toaster features that aren't covered by this API. The idea is that the toaster returns some structure with values and/or pointers to the toaster's methods, and the caller can use it for particular purposes; see patch 4). The kind of structure is identified by a magic number, which should be the first field in the structure.

Also added contrib/dummy_toaster to simplify checking. psql and pg_dump are modified to support the toaster object concept.

2) 2_toaster_default_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_default
The built-in toaster is implemented (with some refactoring) using the toaster API, as the generic (or default) toaster. dummy_toaster here is a minimal workable example: it saves the value directly in the toast pointer and fails if the value is greater than 1kb.

3) 3_toaster_snapshot_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_snapshot
The patch implements a technique for distinguishing row versions in toasted values, so that common parts of toasted values can be shared between different versions of rows.

4) 4_bytea_appendable_toaster_v1.patch.gz
https://github.com/postgrespro/postgres/tree/bytea_appendable_toaster
Contrib module that implements a toaster for non-compressed bytea columns, which allows fast appending to an existing bytea value. The appended tail is stored directly in the toast pointer if there is enough room for it.
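As an illustration only (this is not code from the patches), the get_vtable idea and the inline-tail append of patch 4) might be sketched in C roughly as follows. Every name here (DemoToastPointer, AppendVtable, APPEND_VTABLE_MAGIC, INLINE_TAIL_MAX, demo_get_vtable and so on) is hypothetical, and the 64-byte tail limit is an arbitrary assumption; the real structures and signatures are defined in the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes below are hypothetical, chosen for illustration. */
#define APPEND_VTABLE_MAGIC 0x41505044u /* arbitrary tag for this vtable kind */
#define INLINE_TAIL_MAX 64              /* assumed room for an inline tail */

/* Stand-in for a custom toast pointer: fixed header plus a variable tail. */
typedef struct DemoToastPointer
{
    uint32_t      stored_len; /* bytes already kept in external storage */
    uint32_t      tail_len;   /* bytes currently held in the inline tail */
    unsigned char tail[INLINE_TAIL_MAX];
} DemoToastPointer;

/* Structure a toaster could hand out; the magic number is the first field. */
typedef struct AppendVtable
{
    uint32_t magic;
    int    (*append)(DemoToastPointer *ptr, const void *data, size_t len);
} AppendVtable;

/*
 * Append to the inline tail if it fits. Returns 1 on success, 0 when there
 * is not enough room and the caller has to fall back to a slower path.
 */
static int
demo_append(DemoToastPointer *ptr, const void *data, size_t len)
{
    assert(ptr != NULL && ptr->tail_len <= INLINE_TAIL_MAX);
    if (len == 0)
        return 1;
    if (len > (size_t) (INLINE_TAIL_MAX - ptr->tail_len))
        return 0;
    memcpy(ptr->tail + ptr->tail_len, data, len);
    ptr->tail_len += (uint32_t) len;
    return 1;
}

static const AppendVtable demo_vtable = {APPEND_VTABLE_MAGIC, demo_append};

/* Stand-in for the toaster's get_vtable method: returns an opaque pointer. */
static const void *
demo_get_vtable(void)
{
    return &demo_vtable;
}

/* Caller side: read the leading magic number before trusting the layout. */
static const AppendVtable *
as_append_vtable(const void *vtable)
{
    uint32_t magic;

    if (vtable == NULL)
        return NULL;
    memcpy(&magic, vtable, sizeof(magic));
    return magic == APPEND_VTABLE_MAGIC ? (const AppendVtable *) vtable : NULL;
}
```

A caller such as a concatenation function would call as_append_vtable(demo_get_vtable()), and use the append method only when the result is non-NULL; a zero return from append signals that the tail no longer fits and the value has to be toasted again the ordinary way.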
Note: the patch modifies byteacat() to support the contrib toaster. This looks ugly; the contrib module should probably create a new concatenation function instead.

We are open to any questions, discussion, objections and advice. Thank you.

People behind this work:
Oleg Bartunov
Nikita Gluhov
Nikita Malakhov
Teodor Sigaev

[1] https://www.postgresql.org/message-id/flat/de83407a-ae3d-a8e1-a788-920eb334f25b@sigaev.ru

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/