Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Custom compression methods
Date
Msg-id CA+TgmoaKDW1Oi9V=jc9hOGyf77NbkNEABuqgHD1Cq==1QsOcxg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Custom compression methods  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [HACKERS] Custom compression methods
Re: [HACKERS] Custom compression methods
Re: [HACKERS] Custom compression methods
Re: [HACKERS] Custom compression methods
List pgsql-hackers
On Wed, Nov 11, 2020 at 9:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> There were a few problems in this rebased version, basically, the
> compression options were not passed while compressing values from the
> brin_form_tuple, so I have fixed this.

Since the authorship history of this patch is complicated, it would be
nice if you would include authorship information and relevant
"Discussion" links in the patches.

Design level considerations and overall notes:

configure is autogenerated from configure.in, so the patch shouldn't
include changes only to the former.

Looking over the changes to src/include:

+       PGLZ_COMPRESSION_ID,
+       LZ4_COMPRESSION_ID

I think that it would be good to assign values to these explicitly.

+/* compresion handler routines */

Spelling.

+       /* compression routine for the compression method */
+       cmcompress_function cmcompress;
+
+       /* decompression routine for the compression method */
+       cmcompress_function cmdecompress;

Don't reuse cmcompress_function; that's confusing. Just have a typedef
per structure member, even if they end up being the same.

 #define TOAST_COMPRESS_SET_RAWSIZE(ptr, len) \
-       (((toast_compress_header *) (ptr))->rawsize = (len))
+do { \
+       Assert(len > 0 && len <= RAWSIZEMASK); \
+       ((toast_compress_header *) (ptr))->info = (len); \
+} while (0)

Indentation.

+#define TOAST_COMPRESS_SET_COMPRESSION_METHOD(ptr, cm_method) \
+       ((toast_compress_header *) (ptr))->info |= ((cm_method) << 30);

What about making TOAST_COMPRESS_SET_RAWSIZE() take another argument?
And possibly also rename it to TEST_COMPRESS_SET_SIZE_AND_METHOD() or
something? It seems not great to have separate functions each setting
part of a 4-byte quantity. Too much chance of failing to set both
parts. I guess you've got a function called
toast_set_compressed_datum_info() for that, but it's just a wrapper
around two macros that could just be combined, which would reduce
complexity overall.

+       T_CompressionRoutine,           /* in access/compressionapi.h */

This looks misplaced. I guess it should go just after these:

    T_FdwRoutine,               /* in foreign/fdwapi.h */
    T_IndexAmRoutine,           /* in access/amapi.h */
    T_TableAmRoutine,           /* in access/tableam.h */

Looking over the regression test changes:

The tests at the top of create_cm.out that just test that we can
create tables with various storage types seem unrelated to the purpose
of the patch. And the file doesn't test creating a compression method
either, as the file name would suggest, so either the file name needs
to be changed (compression, compression_method?) or the tests don't go
here.

+-- check data is okdd

I guess whoever is responsible for this comment prefers vi to emacs.

I don't quite understand the purpose of all of these tests, and there
are some things that I feel like ought to be tested that seemingly
aren't. Like, you seem to test using an UPDATE to move a datum from a
table to another table with the same compression method, but not one
with a different compression method. Testing the former is nice and
everything, but that's the easy case: I think we also need to test the
latter. I think it would be good to verify not only that the data is
readable but that it's compressed the way we expect. I think it would
be a great idea to add a pg_column_compression() function in a similar
spirit to pg_column_size(). Perhaps it could return NULL when
compression is not in use or the data type is not varlena, and the
name of the compression method otherwise. That would allow for better
testing of this feature, and it would also be useful to users who are
switching methods, to see what data they still have that's using the
old method. It could be useful for debugging problems on customer
systems, too.

I wonder if we need a test that moves data between tables through an
intermediary. For instance, suppose a plpgsql function or DO block
fetches some data and stores it in a plpgsql variable and then uses
the variable to insert into another table. Hmm, maybe that would force
de-TOASTing. But perhaps there are other cases. Maybe a more general
way to approach the problem is: have you tried running a coverage
report and checked which parts of your code are getting exercised by
the existing tests and which parts are not? The stuff that isn't, we
should try to add more tests. It's easy to get corner cases wrong with
this kind of thing.

I notice that LIKE INCLUDING COMPRESSION doesn't seem to be tested, at
least not by 0001, which reinforces my feeling that the tests here are
not as thorough as they could be.

+NOTICE:  pg_compression contains unpinned initdb-created object(s)

This seems wrong to me - why is it OK?

-       result = (struct varlena *)
-               palloc(TOAST_COMPRESS_RAWSIZE(attr) + VARHDRSZ);
-       SET_VARSIZE(result, TOAST_COMPRESS_RAWSIZE(attr) + VARHDRSZ);
+       cmoid = GetCompressionOidFromCompressionId(TOAST_COMPRESS_METHOD(attr));

-       if (pglz_decompress(TOAST_COMPRESS_RAWDATA(attr),
-                                               TOAST_COMPRESS_SIZE(attr),
-                                               VARDATA(result),
-
TOAST_COMPRESS_RAWSIZE(attr), true) < 0)
-               elog(ERROR, "compressed data is corrupted");
+       /* get compression method handler routines */
+       cmroutine = GetCompressionRoutine(cmoid);

-       return result;
+       return cmroutine->cmdecompress(attr);

I'm worried about how expensive this might be, and I think we could
make it cheaper. The reason why I think this might be expensive is:
currently, for every datum, you have a single direct function call.
Now, with this, you first have a direct function call to
GetCompressionOidFromCompressionId(). Then you have a call to
GetCompressionRoutine(), which does a syscache lookup and calls a
handler function, which is quite a lot more expensive than a single
function call. And the handler isn't even returning a statically
allocated structure, but is allocating new memory every time, which
involves more function calls and maybe memory leaks. Then you use the
results of all that to make an indirect function call.

I'm not sure exactly what combination of things we could use to make
this better, but it seems like there are a few possibilities:

(1) The handler function could return a pointer to the same
CompressionRoutine every time instead of constructing a new one every
time.
(2) The CompressionRoutine to which the handler function returns a
pointer could be statically allocated instead of being built at
runtime.
(3) GetCompressionRoutine could have an OID -> handler cache instead
of relying on syscache + calling the handler function all over again.
(4) For the compression types that have dedicated bit patterns in the
high bits of the compressed TOAST size, toast_compress_datum() could
just have hard-coded logic to use the correct handlers instead of
translating the bit pattern into an OID and then looking it up over
again.
(5) Going even further than #4 we could skip the handler layer
entirely for such methods, and just call the right function directly.

I think we should definitely do (1), and also (2) unless there's some
reason it's hard. (3) doesn't need to be part of this patch, but might
be something to consider later in the series. It's possible that it
doesn't have enough benefit to be worth the work, though. Also, I
think we should do either (4) or (5). I have a mild preference for (5)
unless it looks too ugly.

Note that I'm not talking about hard-coding a fast path for a
hard-coded list of OIDs - which would seem a little bit unprincipled -
but hard-coding a fast path for the bit patterns that are themselves
hard-coded. I don't think we lose anything in terms of extensibility
or even-handedness there; it's just avoiding a bunch of rigamarole
that doesn't really buy us anything.

All these points apply equally to toast_decompress_datum_slice() and
toast_compress_datum().

+       /* Fallback to default compression method, if not specified */
+       if (!OidIsValid(cmoid))
+               cmoid = DefaultCompressionOid;

I think that the caller should be required to specify a legal value,
and this should be an elog(ERROR) or an Assert().

The change to equalTupleDescs() makes me wonder. Like, can we specify
the compression method for a function parameter, or a function return
value? I would think not. But then how are the tuple descriptors set
up in that case? Under what circumstances do we actually need the
tuple descriptors to compare unequal?

lz4.c's header comment calls it cm_lz4.c, and the pathname is wrong too.

I wonder if we should try to adopt a convention for the names of these
files that isn't just the compression method name, like cmlz4 or
compress_lz4. I kind of like the latter one. I am a little worried
that just calling it lz4.c will result in name collisions later - not
in this directory, of course, but elsewhere in the system. It's not a
disaster if that happens, but for example verbose error reports print
the file name, so it's nice if it's unambiguous.

+               if (!IsBinaryUpgrade &&
+                       (relkind == RELKIND_RELATION ||
+                        relkind == RELKIND_PARTITIONED_TABLE))
+                       attr->attcompression =
+
GetAttributeCompressionMethod(attr, colDef->compression);
+               else
+                       attr->attcompression = InvalidOid;

Storing InvalidOid in the IsBinaryUpgrade case looks wrong. If
upgrading from pre-v14, we need to store PGLZ_COMPRESSION_OID.
Otherwise, we need to preserve whatever value was present in the old
version. Or am I confused here?

I think there should be tests for the way this interacts with
partitioning, and I think the intended interaction should be
documented. Perhaps it should behave like TABLESPACE, where the parent
property has no effect on what gets stored because the parent has no
storage, but is inherited by each new child.

I wonder in passing about TOAST tables and materialized views, which
are the other things that have storage. What gets stored for
attcompression? For a TOAST table it probably doesn't matter much
since TOAST table entries shouldn't ever be toasted themselves, so
anything that doesn't crash is fine (but maybe we should test that
trying to alter the compression properties of a TOAST table doesn't
crash, for example). For a materialized view it seems reasonable to
want to set column properties, but I'm not quite sure how that works
today for things like STORAGE anyway. If we do allow setting STORAGE
or COMPRESSION for materialized view columns then dump-and-reload
needs to preserve the values.

+       /*
+        * Use default compression method if the existing compression method is
+        * invalid but the new storage type is non plain storage.
+        */
+       if (!OidIsValid(attrtuple->attcompression) &&
+               (newstorage != TYPSTORAGE_PLAIN))
+               attrtuple->attcompression = DefaultCompressionOid;

You have a few too many parens in there.

I don't see a particularly good reason to treat plain and external
differently. More generally, I think there's a question here about
when we need an attribute to have a valid compression type and when we
don't. If typstorage is plan or external, then there's no point in
ever having a compression type and maybe we should even reject
attempts to set one (but I'm not sure). However, the attstorage is a
different case. Suppose the column is created with extended storage
and then later it's changed to plain. That's only a hint, so there may
still be toasted values in that column, so the compression setting
must endure. At any rate, we need to make sure we have clear and
sensible rules for when attcompression (a) must be valid, (b) may be
valid, and (c) must be invalid. And those rules need to at least be
documented in the comments, and maybe in the SGML docs.

I'm out of time for today, so I'll have to look at this more another
day. Hope this helps for a start.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: scram-sha-256 broken with FIPS and OpenSSL 1.0.2
Next
From: James Coleman
Date:
Subject: Why does create_gather_merge_plan need make_sort?