Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers
From | Dilip Kumar
---|---
Subject | Re: [HACKERS] Custom compression methods
Date |
Msg-id | CAFiTN-tyWCqoHzsKzTbk1R6PMO0pC3g-=5=giB644rLjfZkCYA@mail.gmail.com
In response to | Re: [HACKERS] Custom compression methods (Dilip Kumar <dilipbalaut@gmail.com>)
Responses | Re: [HACKERS] Custom compression methods
List | pgsql-hackers
On Tue, Feb 9, 2021 at 2:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, Feb 7, 2021 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Feb 5, 2021 at 8:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, Feb 3, 2021 at 2:07 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > > >
> > > > Even more review comments, still looking mostly at 0001:
> > > >
> > > > If there's a reason why parallel_schedule is arranging to run the
> > > > compression test in parallel with nothing else, the comment in that
> > > > file should explain the reason. If there isn't, it should be added
> > > > to a parallel group that doesn't have the maximum number of tests
> > > > yet, probably the last such group in the file.
> > > >
> > > > serial_schedule should add the test in a position that roughly
> > > > corresponds to where it appears in parallel_schedule.
> > >
> > > Done
> > >
> > > > I believe it's relatively standard practice to put variable
> > > > declarations at the top of the file. compress_lz4.c and
> > > > compress_pglz.c instead put those declarations nearer to the point
> > > > of use.
> > >
> > > Do you mean pglz_compress_methods and lz4_compress_methods? I
> > > followed that style from heapam_handler.c. If you think that doesn't
> > > look good then I can move them up.
> > >
> > > > compressamapi.c has an awful lot of #include directives for the
> > > > code it actually contains. I believe that we should cut that down
> > > > to what is required by 0001, and other patches can add more later
> > > > as required. In fact, it's tempting to just get rid of this .c file
> > > > altogether and make the two functions it contains static inline
> > > > functions in the header, but I'm not 100% sure that's a good idea.
> > >
> > > I think it looks better to move them to compressamapi.h, so I have
> > > done that.
> > >
> > > > The copyright dates in a number of the file headers are out of
> > > > date.
> > >
> > > Fixed
> > >
> > > > binary_upgrade_next_pg_am_oid and the related changes to
> > > > CreateAccessMethod don't belong in 0001, because it doesn't support
> > > > non-built-in compression methods. These changes and the related
> > > > pg_dump change should be moved to the patch that adds support for
> > > > that.
> > >
> > > Fixed
> > >
> > > > The comments added to dumpTableSchema() say that "compression is
> > > > assigned by ALTER" but don't give a reason. I think they should. I
> > > > don't know how much they need to explain about what the code does,
> > > > but they definitely need to explain why it does it. Also, isn't
> > > > this bad? If we create the column with the wrong compression
> > > > setting initially and then ALTER it, we have to rewrite the table.
> > > > If it's empty, that's cheap, but it'd still be better not to do it
> > > > at all.
> > >
> > > Yeah, actually that part should go in the 0003 patch where we
> > > implement the custom compression method. In that patch we need to
> > > alter and set it because we want to keep the preserved method as
> > > well, so I will add it there.
> > >
> > > > I'm not sure it's a good idea for dumpTableSchema() to leave out
> > > > specifying the compression method if it happens to be pglz. I think
> > > > we definitely shouldn't do it in binary-upgrade mode. What if we
> > > > changed the default in a future release? For that matter, even 0002
> > > > could make the current approach unsafe.... I think, anyway.
> > >
> > > Fixed
> > >
> > > > The changes to pg_dump.h look like they haven't had a visit from
> > > > pgindent. You should probably try to do that for the whole patch,
> > > > though it's a bit annoying since you'll have to manually remove
> > > > unrelated changes to the same files that are being modified by the
> > > > patch. Also, why the extra blank line here?
> > >
> > > Fixed, ran pgindent for other files as well.
> > >
> > > > GetAttributeCompression() is hard to understand. I suggest changing
> > > > the comment to "resolve column compression specification to an OID"
> > > > and somehow rejigger the code so that you aren't using one not-NULL
> > > > test and one NULL test on the same variable. Like maybe change the
> > > > first part to if (!IsStorageCompressible(typstorage)) { if
> > > > (compression == NULL) return InvalidOid; ereport(ERROR, ...); }
> > >
> > > Done
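To make the suggested shape concrete, here is a rough sketch of how the
rejiggered function could look. This only illustrates the control flow
described above; IsStorageCompressible and the ereport() come from the
quoted snippet, while the signature, DEFAULT_COMPRESSION_OID, and
get_compression_am_oid() are assumptions for illustration, not the
patch's actual code.

---
/* Resolve column compression specification to an OID (sketch). */
static Oid
GetAttributeCompression(Form_pg_attribute att, char *compression)
{
    if (!IsStorageCompressible(att->attstorage))
    {
        /* No spec on an incompressible column is fine; a spec is an error. */
        if (compression == NULL)
            return InvalidOid;
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("column data type %s does not support compression",
                        format_type_be(att->atttypid))));
    }

    /* With no explicit specification, fall back to the default method. */
    if (compression == NULL)
        return DEFAULT_COMPRESSION_OID;

    /* Hypothetical lookup of the compression AM OID by name. */
    return get_compression_am_oid(compression, false);
}
---

This shape tests the NULL case exactly once per branch, which is the
point of the suggestion.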
> > > > It puzzles me that CompareCompressionMethodAndDecompress() calls
> > > > slot_getallattrs() just before clearing the slot. It seems like
> > > > this ought to happen before we loop over the attributes, so that
> > > > we don't need to call slot_getattr() every time. See the comment
> > > > for that function. But even if we didn't do that for some reason,
> > > > why would we do it here? If it's already been done, it shouldn't
> > > > do anything, and if it hasn't been done, it might overwrite some
> > > > of the values we just poked into tts_values. It also seems
> > > > suspicious that we can get away with clearing the slot and then
> > > > again marking it valid. I'm not sure it really works like that.
> > > > Like, can't clearing the slot invalidate pointers stored in
> > > > tts_values[]? For instance, if they are pointers into an in-memory
> > > > heap tuple, tts_heap_clear() is going to free the tuple; if they
> > > > are pointers into a buffer, tts_buffer_heap_clear() is going to
> > > > unpin it. I think the supported procedure for this sort of thing
> > > > is to have a second slot, set tts_values, tts_isnull etc. and then
> > > > materialize the slot. After materializing the new slot, it's
> > > > independent of the old slot, which can then be cleared. See for
> > > > example tts_virtual_materialize(). The whole approach you've taken
> > > > here might need to be rethought a bit. I think you are right to
> > > > want to avoid copying everything over into a new slot if nothing
> > > > needs to be done, and I think we should definitely keep that
> > > > optimization, but I think if you need to copy stuff, you have to
> > > > do the above procedure and then continue using the other slot
> > > > instead of the original one. Some places I think we have functions
> > > > that return either the original slot or a different one depending
> > > > on how it goes; that might be a useful idea here. But, you also
> > > > can't just spam-create slots; it's important that whatever ones we
> > > > end up with get reused for every tuple.
> > >
> > > I have changed this algorithm, so now if we have to decompress
> > > anything we will use a new slot, and we will stick that new slot
> > > into the ModifyTableState, DR_transientrel for matviews, and
> > > DR_intorel for CTAS. Does this look okay, or do we need to do
> > > something else? If this logic looks fine then maybe we can think of
> > > some more optimization and cleanup in this function.
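For reference, a minimal sketch of the two-slot procedure outlined
above, assuming a virtual newslot that the caller allocates once and
reuses for every tuple (requires executor/tuptable.h and
access/detoast.h); decompress_into_new_slot() is a hypothetical name,
and detoast_attr() stands in for whatever decompression routine the
patch actually uses:

---
static TupleTableSlot *
decompress_into_new_slot(TupleTableSlot *srcslot, TupleTableSlot *newslot,
                         TupleDesc tupdesc)
{
    int         natts = tupdesc->natts;

    /* Deform once up front rather than calling slot_getattr() per column. */
    slot_getallattrs(srcslot);

    ExecClearTuple(newslot);
    memcpy(newslot->tts_values, srcslot->tts_values, natts * sizeof(Datum));
    memcpy(newslot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));

    for (int i = 0; i < natts; i++)
    {
        Form_pg_attribute att = TupleDescAttr(tupdesc, i);
        struct varlena *val;

        /* Only non-null, inline-compressed varlena columns need work. */
        if (att->attlen != -1 || srcslot->tts_isnull[i] ||
            !VARATT_IS_COMPRESSED(DatumGetPointer(srcslot->tts_values[i])))
            continue;

        val = detoast_attr((struct varlena *)
                           DatumGetPointer(srcslot->tts_values[i]));
        newslot->tts_values[i] = PointerGetDatum(val);
    }

    ExecStoreVirtualTuple(newslot);

    /* Make newslot self-contained before the caller clears srcslot. */
    ExecMaterializeSlot(newslot);

    return newslot;
}
---

A caller would return srcslot untouched when nothing was compressed and
otherwise continue with the returned slot, which is the "return either
the original slot or a different one" idea mentioned above.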
> > > > Doesn't the change to describeOneTableDetails() require changing
> > > > the declaration of char *headers[11] to char *headers[12]? How
> > > > does this not fail Assert(cols <= lengthof(headers))?
> > >
> > > Fixed
> > >
> > > > Why does describeOneTableDetails() arrange to truncate the printed
> > > > value? We don't seem to do that for the other column properties,
> > > > and it's not like this one is particularly long.
> > >
> > > Not required, fixed.
> > >
> > > > Perhaps the changes to pg_am.dat shouldn't remove the blank line?
> > >
> > > Fixed
> > >
> > > > I think the comment in pg_attribute.h could be rephrased to say
> > > > something like: "OID of compression AM. Must be InvalidOid if and
> > > > only if typstorage is 'a' or 'b'," replacing 'a' and 'b' with
> > > > whatever the right letters are. This would be shorter and I think
> > > > also clearer than what you have.
> > >
> > > Fixed
> > >
> > > > The first comment change in postgres.h is wrong. You changed
> > > > va_extsize to "size in va_extinfo" but the associated structure
> > > > definition is unchanged, so the comment shouldn't be changed
> > > > either.
> > >
> > > Yup, not required.
> > >
> > > > In toast_internals.h, you end up using 30 as a constant several
> > > > times but have no #define for it. You do have a #define for
> > > > RAWSIZEMASK, but that's really a derived value from 30. Also, it's
> > > > not a great name because it's kind of generic. So how about
> > > > something like:
> > > >
> > > > #define TOAST_RAWSIZE_BITS 30
> > > > #define TOAST_RAWSIZE_MASK ((1 << (TOAST_RAWSIZE_BITS + 1)) - 1)
> > > >
> > > > But then again on second thought, this 30 seems to be the same 30
> > > > that shows up in the changes to postgres.h, and there again
> > > > 0x3FFFFFFF shows up too. So maybe we should actually be defining
> > > > these constants there, using names like VARLENA_RAWSIZE_BITS and
> > > > VARLENA_RAWSIZE_MASK and then having toast_internals.h use those
> > > > constants as well.
> > >
> > > Done. IMHO it should be
> > >
> > > #define VARLENA_RAWSIZE_BITS 30
> > > #define VARLENA_RAWSIZE_MASK ((1 << VARLENA_RAWSIZE_BITS) - 1)
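One nit on the two derivations above: the ((1 << (TOAST_RAWSIZE_BITS +
1)) - 1) form would set 31 bits, while the corrected version matches
the 0x3FFFFFFF that already appears in postgres.h. A compile-time check
makes the relationship explicit; StaticAssertDecl (or StaticAssertStmt
inside a function) is PostgreSQL's stock compile-time assertion, and
the defines are exactly the ones from the message:

---
#define VARLENA_RAWSIZE_BITS    30
#define VARLENA_RAWSIZE_MASK    ((1 << VARLENA_RAWSIZE_BITS) - 1)

/* 2^30 - 1 == 0x3FFFFFFF, the existing constant in postgres.h */
StaticAssertDecl(VARLENA_RAWSIZE_MASK == 0x3FFFFFFF,
                 "VARLENA_RAWSIZE_MASK must cover exactly 30 bits");
---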
> > > > Taken with the email I sent yesterday, I think this is a more or
> > > > less complete review of 0001. Although there are a bunch of
> > > > things to fix here still, I don't think this is that far from
> > > > being committable. I don't at this point see too much in terms of
> > > > big design problems. Probably CompareCompressionMethodAndDecompress()
> > > > is the closest to a design-level problem, and certainly something
> > > > needs to be done about it, but even that is a fairly localized
> > > > problem in the context of the entire patch.
> > >
> > > 0001 is attached; the pending parts are now:
> > >
> > > - Confirm the new design of CompareCompressionMethodAndDecompress
> > > - Performance test, especially lz4 with small varlena
> >
> > I have tested the performance, pglz vs lz4:
> >
> > Test 1: A small, simple string; pglz doesn't attempt compression,
> > but lz4 does since it has no minimum size limit.
> > Table: 100 varchar columns
> > Test: insert 1000 tuples, each column a 25-byte string (32 bytes is
> > the minimum limit for pglz)
> > Result:
> > pglz: 1030 ms (doesn't attempt compression, so externalizes)
> > lz4: 212 ms
> >
> > Test 2: A small incompressible string; pglz doesn't attempt
> > compression, lz4 attempts it but cannot compress.
> > Table: 100 varchar columns
> > Test: insert 1000 tuples, each column a 25-byte string (32 bytes is
> > the minimum limit for pglz)
> > Result:
> > pglz: 1030 ms (doesn't attempt compression, so externalizes)
> > lz4: 1090 ms (attempts to compress, then externalizes)
> >
> > Test 3: A few columns with large random data.
> > Table: 3 varchar columns
> > Test: insert 1000 tuples, 3 columns of size (3500 bytes, 4200 bytes,
> > 4900 bytes)
> > Result:
> > pglz: 150 ms (compression ratio: 3.02%)
> > lz4: 30 ms (compression ratio: 2.3%)
> >
> > Test 4: Test 3 with large, slightly compressible random data; needs
> > to compress + externalize.
> > Table: 3 varchar columns
> > Insert: 1000 tuples, 3 columns of size (8192, 8192, 8192)
> > CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS
> > 'select array_agg(md5(g::text))::text from generate_series(1, 256) g';
> > Test: insert into t1 select large_val(), large_val(), large_val()
> > from generate_series(1,1000);
> > Result:
> > pglz: 2000 ms
> > lz4: 1500 ms
> >
> > Conclusion:
> > 1. In most cases lz4 is faster and compresses better as well.
> > 2. In Test 2, where the small data are incompressible, lz4 tries to
> > compress whereas pglz doesn't try, so there is some performance
> > loss. If we want, we can fix this by setting some minimum size limit
> > for lz4 as well, maybe the same size as pglz (a sketch of such a
> > guard follows below)?
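The guard floated in point 2 could be as simple as the following
sketch. LZ4_compressBound() and LZ4_compress_default() are the real
liblz4 entry points; the function name, the 32-byte threshold
mirroring pglz's floor, and VARHDRSZ_COMPRESSED as the compressed
header size are assumptions for illustration, not the patch's code:

---
#include <lz4.h>

#define LZ4_MIN_COMPRESS_SIZE   32  /* assumed: same floor as pglz */

static struct varlena *
lz4_cmcompress(struct varlena *value)
{
    int32       valsize = VARSIZE_ANY_EXHDR(value);
    int32       max_size;
    int32       len;
    struct varlena *compressed;

    /* Skip tiny inputs instead of paying for a doomed attempt. */
    if (valsize < LZ4_MIN_COMPRESS_SIZE)
        return NULL;            /* caller stores the datum uncompressed */

    max_size = LZ4_compressBound(valsize);
    compressed = (struct varlena *) palloc(max_size + VARHDRSZ_COMPRESSED);

    len = LZ4_compress_default(VARDATA_ANY(value),
                               (char *) compressed + VARHDRSZ_COMPRESSED,
                               valsize, max_size);

    /* Incompressible data (the Test 2 case): give up and store as-is. */
    if (len <= 0 || len >= valsize)
    {
        pfree(compressed);
        return NULL;
    }

    SET_VARSIZE_COMPRESSED(compressed, len + VARHDRSZ_COMPRESSED);
    return compressed;
}
---

With such a floor in place, the lz4 number for Test 2 should land near
pglz's 1030 ms rather than 1090 ms, since neither method would attempt
compression.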
> > > - Rebase other patches atop this patch
> > > - comment in ddl.sgml
> >
> > Other changes in the patch:
> > - Now we are dumping the default compression method in
> > binary-upgrade mode, so the pg_dump test needed some change; fixed
> > that.
> > - In compress_pglz.c and compress_lz4.c we were using
> > toast_internals.h macros, so I removed those and used the varlena
> > macros instead.
> >
> > While testing, I noticed that if the compressed data are
> > externalized, then pg_column_compression() doesn't fetch the
> > compression method from the toast chunk. I think we should do that.
> > I will analyze this and fix it in the next version.
>
> While trying to fix this, I have realized this problem also exists in
> CompareCompressionMethodAndDecompress; see the code below.
> ---
> + new_value = (struct varlena *)
> + DatumGetPointer(slot->tts_values[attnum - 1]);
> +
> + /* nothing to be done, if it is not compressed */
> + if (!VARATT_IS_COMPRESSED(new_value))
> + continue;
> ---
>
> Basically, we are just checking whether the stored value is
> compressed or not, but we are clearly ignoring the fact that it might
> be compressed and stored externally on disk. So if the value is
> stored externally, we cannot know whether the external data were
> compressed without fetching them from the toast table. I think that
> instead of fetching the complete data from toast we can just fetch
> the header using 'toast_fetch_datum_slice'.
>
> Any other thoughts on this?

I think I was partially wrong here. Basically, there is a way to know
whether the external data are compressed or not, using the
VARATT_EXTERNAL_IS_COMPRESSED macro. However, if the data are
compressed, we will still have to fetch a toast slice of size
toast_compress_header to know the compression method.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
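To make that last point concrete, here is a hedged sketch of the check
being described. VARATT_IS_EXTERNAL_ONDISK, the toast-pointer macros,
and VARATT_EXTERNAL_IS_COMPRESSED are real; toast_fetch_datum_slice()
is file-local to the server's toast code, so a real implementation
would have to live there, and the cmid field is a hypothetical stand-in
for however the patch's toast_compress_header records the method, which
the message implies but does not show:

---
static Oid
get_external_compression_method(struct varlena *attr)
{
    struct varatt_external toast_pointer;
    struct varlena *slice;
    toast_compress_header *hdr;
    Oid         cmoid;

    Assert(VARATT_IS_EXTERNAL_ONDISK(attr));
    VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);

    /* Cheap case: the toast pointer itself says "not compressed". */
    if (!VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer))
        return InvalidOid;

    /* Fetch only the compression header, not the whole datum. */
    slice = toast_fetch_datum_slice(attr, 0, sizeof(toast_compress_header));
    hdr = (toast_compress_header *) VARDATA(slice);
    cmoid = hdr->cmid;          /* hypothetical field naming the method */
    pfree(slice);

    return cmoid;
}
---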