Re: [proposal] de-TOAST'ing using a iterator - Mailing list pgsql-hackers

From John Naylor
Subject Re: [proposal] de-TOAST'ing using a iterator
Date
Msg-id CACPNZCsX4nt8AN-OMQTc2rf8-=tQnet4KML_RG782RC5h-Wx=w@mail.gmail.com
Whole thread Raw
In response to Re: [proposal] de-TOAST'ing using a iterator  (Binguo Bao <djydewang@gmail.com>)
Responses Re: [proposal] de-TOAST'ing using a iterator  (Binguo Bao <djydewang@gmail.com>)
List pgsql-hackers
On Thu, Jul 25, 2019 at 10:21 PM Binguo Bao <djydewang@gmail.com> wrote:
>
> Hi John!
> Sorry for the late reply. It took me some time to fix a random bug.

Don't worry, it's not late at all! :-)

>> In the case where we don't know the slice size, how about the other
>> aspect of my question above: Might it be simpler and less overhead to
>> decompress entire chunks at a time? If so, I think it would be
>> enlightening to compare performance.
>
>
> Good idea. I've tested  your propopal with scripts and patch v5 in the attachment:
>
>                   master          patch v4          patch v5
> comp. beg.        4364ms          1505ms          1529ms
> comp. end       28321ms        31202ms          26916ms
> uncomp. beg.     3474ms          1513ms          1523ms
> uncomp. end     27416ms        30260ms          25888ms
>
> The proposal improves suffix query performance greatly
> with less calls to the decompression function.

Looks good. I repeated my CIA fact book test and found no difference
with compression, but found that suffix search in the uncompressed
case had less regression (~5%) than v4 (>8%). Let's pursue this
further.

> Besides, do you have any other suggestions for the structure of DetoastIterator or ToastBuffer?

My goal for this stage of review was to understand more fully what the
code is doing, and make it as simple and clear as possible, starting
at the top level. In doing so, it looks like I found some additional
performance gains. I haven't looked much yet at the TOAST fetching
logic.


1). For every needle comparison, text_position_next_internal()
calculates how much of the value is needed and passes that to
detoast_iterate(), which then calculates if it has to do something or
not. This is a bit hard to follow. There might also be a performance
penalty -- the following is just a theory, but it sounds plausible:
The CPU can probably correctly predict that detoast_iterate() will
usually return the same value it did last time, but it still has to
call the function and make sure, which I imagine is more expensive
than advancing the needle. Ideally, we want to call the iterator only
if we have to.

In the attached patch (applies on top of your v5),
text_position_next_internal() simply compares hptr to the detoast
buffer limit, and calls detoast_iterate() until it can proceed. I
think this is clearer. (I'm not sure of the error handling, see #2.)
In this scheme, the only reason to know length is to pass to
pglz_decompress_iterate() in the case of in-line compression. As I
alluded to in my first review, I don't think it's worth the complexity
to handle that iteratively since the value is only a few kB. I made it
so in-line datums are fully decompressed as in HEAD and removed struct
members to match. I also noticed that no one updates or looks at
"toast_iter.done" so I removed that as well.

Now pglz_decompress_iterate() doesn't need length at all. For testing
I just set decompress_all = true and let the compiler optimize away
the rest. I left finishing it for you if you agree with these changes.

With this additional patch, the penalty for suffix search in my CIA
fact book test is only ~2% in the compressed case, and might even be
slightly faster than HEAD in the uncompressed case.


2). detoast_iterate() and fetch_datum_iterate() return a value but we
don't check it or do anything with it. Should we do something with it?
It's also not yet clear if we should check the iterator state instead
of return values. I've added some XXX comments as a reminder. We
should also check the return value of pglz_decompress_iterate().


3). Speaking of pglz_decompress_iterate(), I diff'd it with
pglz_decompress(), and I have some questions on it:

a).
+ srcend = (const unsigned char *) (source->limit == source->capacity
? source->limit : (source->limit - 4));

What does the 4 here mean in this expression? Is it possible it's
compensating for this bit in init_toast_buffer()?

+ buf->limit = VARDATA(buf->buf);

It seems the initial limit should also depend on whether the datum is
compressed, right? Can we just do this:

+ buf->limit = buf->position;

b).
- while (sp < srcend && dp < destend)
...
+ while (sp + 1 < srcend && dp < destend &&
...

Why is it here "sp + 1"?


4. Note that varlena.c has a static state variable, and a cleanup
function that currently does:

static void
text_position_cleanup(TextPositionState *state)
{
/* no cleanup needed */
}

It seems to be the detoast iterator could be embedded in this state
variable, and then free-ing can happen here. That has a possible
advantage that the iterator struct would be on the same cache line as
the state data. That would also remove the need to pass "iter" as a
parameter, since these functions already pass "state". I'm not sure if
this would be good for other users of the iterator, so maybe we can
hold off on that for now.

5. Would it be a good idea to add tests (not always practical), or
more Assert()'s? You probably already know this, but as a reminder
it's good to develop with asserts enabled, but never build with them
for performance testing.

I think that's enough for now. If you have any questions or
counter-arguments, let me know. I've set the commitfest entry to
waiting on author.


--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fix typos and inconsistencies for HEAD (take 8)
Next
From: Ashutosh Sharma
Date:
Subject: Re: COPY command on a table column marked as GENERATED ALWAYS