Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: [HACKERS] Custom compression methods
Date
Msg-id CAFiTN-um3+YD3Su=vEwgY8SgcQFuBY5xVvKgE4ks+=-zb6T+Sw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Custom compression methods  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Feb 11, 2021 at 3:26 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> In CompareCompressionMethodAndDecompress, I think this is still
> playing a bit fast and loose with the rules around slots. I think we
> can do better. Suppose that at the point where we discover that we
> need to decompress at least one attribute, we create the new slot
> right then, and also memcpy tts_values and tts_isnull. Then, for that
> attribute and any future attributes that need decompression, we reset
> tts_values in the *new* slot, leaving the old one untouched. Then,
> after finishing all the attributes, the if (decompressed_any) block,
> you just have a lot less stuff to do. The advantage of this is that
> you haven't tainted the old slot; it's still got whatever contents it
> had before, and is in a clean state, which seems better to me.

Fixed

>
> It's unclear to me whether this function actually needs to
> ExecMaterializeSlot(newslot). It definitely does need to
> ExecStoreVirtualTuple(newslot) and I think it's a very good idea, if
> not absolutely mandatory, for it not to modify anything about the old
> slot. But what's the argument that the new slot needs to be
> materialized at this point? It may be needed, if the old slot would've
> had to be materialized at this point. But it's something to think
> about.

I think if the original slot was materialized then materialing the new
slot make more sense to me so done that way.

>
> The CREATE TABLE documentation says that COMPRESSION is a kind of
> column constraint, but that's wrong. For example, you can't write
> CREATE TABLE a (b int4 CONSTRAINT thunk COMPRESSION lz4), for example,
> contrary to what the syntax summary implies. When you fix this so that
> the documentation matches the grammar change, you may also need to
> move the longer description further up in create_table.sgml so the
> order matches.

Fixed

> The use of VARHDRSZ_COMPRESS in toast_get_compression_oid() appears to
> be incorrect. VARHDRSZ_COMPRESS is offsetof(varattrib_4b,
> va_compressed.va_data). But what gets externalized in the case of a
> compressed datum is just VARDATA(dval), which excludes the length
> word, unlike VARHDRSZ_COMPRESS, which does not. This has no
> consequences since we're only going to fetch 1 chunk either way, but I
> think we should make it correct.

Fixed

> TOAST_COMPRESS_SET_SIZE_AND_METHOD() could Assert something about cm_method.

While replying to the comments, I realised that I have missed it.  I
will fix it in the next version.

> Small delta patch with a few other suggested changes attached.

Merged

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: [HACKERS] Custom compression methods
Next
From: Zhihong Yu
Date:
Subject: Re: PoC Refactor AM analyse API