Thread: Compressed TOAST Slicing

Compressed TOAST Slicing

From: Paul Ramsey
Date:
Currently, PG_DETOAST_DATUM_SLICE when run on a compressed TOAST entry will first decompress the whole object, then extract the relevant slice. 

When the desired slice is at or near the front of the object, this is obviously non-optimal. 

The attached patch adds in a code path to do a partial decompression of the TOAST entry, when the requested slice is at the start of the object.
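Roughly, the shape of the change (a simplified sketch, not the patch text itself; the patch also has to teach pglz_decompress() to stop early and to relax its "both buffers fully consumed" check):

    /* Sketch: decompress only the first slicelength bytes of a
     * compressed inline datum, rather than the whole thing. */
    static struct varlena *
    toast_decompress_datum_slice(struct varlena *attr, int32 slicelength)
    {
        struct varlena *result;
        int32       rawsize;

        result = (struct varlena *) palloc(slicelength + VARHDRSZ);

        /* stop after slicelength output bytes instead of decompressing all */
        rawsize = pglz_decompress(TOAST_COMPRESS_RAWDATA(attr),
                                  VARSIZE(attr) - TOAST_COMPRESS_HDRSZ,
                                  VARDATA(result), slicelength);
        if (rawsize < 0)
            elog(ERROR, "compressed data is corrupted");

        SET_VARSIZE(result, rawsize + VARHDRSZ);
        return result;
    }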

As an example of the possible improvement, consider this trivial case:

    create table slicingtest (
      id serial primary key,
      a text
    );

    insert into slicingtest (a) select repeat('xyz123', 10000) as a from generate_series(1,10000);
    \timing
    select sum(length(substr(a, 0, 20))) from slicingtest;

On master, in the current state on my wee laptop, I get

    Time: 1426.737 ms (00:01.427)

With the patch, on my wee laptop, I get

    Time: 46.886 ms

As usual, doing less work is faster.

Interesting note to motivate a follow-on patch: the substr() function does attempt to slice, but the left() function does not. So, if this patch is accepted, the next patch will be to add slicing behaviour to left().
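For illustration, on the slicingtest table above (substr() requests a slice of the datum and benefits from this patch; left() currently detoasts the whole value and does not, yet):

    select sum(length(substr(a, 0, 20))) from slicingtest;  -- sliced, fast
    select sum(length(left(a, 20))) from slicingtest;       -- full detoast, still slow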

If nobody lights me on fire, I'll submit to commitfest shortly.

P.

Re: Compressed TOAST Slicing

From: Stephen Frost
Date:
Greetings,

* Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> The attached patch adds in a code path to do a partial decompression of the
> TOAST entry, when the requested slice is at the start of the object.

Neat!

> As usual, doing less work is faster.

Definitely.

> Interesting note to motivate a follow-on patch: the substr() function does
> attempt to slice, but the left() function does not. So, if this patch is
> accepted, next patch will be to left() to add slicing behaviour.

Makes sense to me.

There are two things that I wonder about in the patch- first, whether it
would be of any use to try to allocate on an as-needed basis instead of
just allocating the whole chunk up to the toast size, and second, why we
wouldn't consider handling a non-zero offset.  A non-zero offset would,
of course, still require decompressing from the start and then just
throwing away what we skip over, but we're going to be doing that anyway,
aren't we?  Why not stop when we get to the end, at least, and save
ourselves the trouble of decompressing the rest and then throwing it
away?

> If nobody lights me on fire, I'll submit to commitfest shortly.

Sounds like a good idea to me.

Thanks!

Stephen


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:


On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> > The attached patch adds in a code path to do a partial decompression of the
> > TOAST entry, when the requested slice is at the start of the object.
>
> There are two things that I wonder about in the patch- first, whether it
> would be of any use to try to allocate on an as-needed basis instead of
> just allocating the whole chunk up to the toast size,

I'm not sure what I was thinking when I rejected allocating the slice size in favour of the whole uncompressed size... I cannot see why that wouldn't work.
 
> and second, why we wouldn't consider handling a non-zero offset.  A
> non-zero offset would, of course, still require decompressing from the
> start and then just throwing away what we skip over, but we're going to
> be doing that anyway, aren't we?  Why not stop when we get to the end,
> at least, and save ourselves the trouble of decompressing the rest and
> then throwing it away?

I was worried about changing the pg_lz code too much because it scared me, but debugging some stuff made me read it more closely so I fear it less now, and doing interior slices seems not unreasonable, so I will give it a try.
 
P

Re: Compressed TOAST Slicing

From: Tom Lane
Date:
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost <sfrost@snowman.net> wrote:
>> and second, why we wouldn't consider handling a non-zero offset.  A
>> non-zero offset would, of course, still require decompressing from the
>> start and then just throwing away what we skip over, but we're going to
>> be doing that anyway, aren't we?  Why not stop when we get to the end,
>> at least, and save ourselves the trouble of decompressing the rest and
>> then throwing it away?

> I was worried about changing the pg_lz code too much because it scared me,
> but debugging some stuff made me read it more closely so I fear it less
> now, and doing interior slices seems not unreasonable, so I will give it a
> try.

I think Stephen was just envisioning decompressing from offset 0 up to
the end of what's needed, and then discarding any data before the start
of what's needed; at least, that's what'd occurred to me.  It sounds like
you were thinking about hacking pg_lz to not write the leading data
anywhere.  While that'd likely be a win for cases where there was leading
data to discard, I'm worried about adding any cycles to the inner loops
of the decompressor.  We don't want to pessimize every other use of pg_lz
to buy a little bit for these cases.

            regards, tom lane


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:


On Thu, Nov 1, 2018 at 4:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
> > On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost <sfrost@snowman.net> wrote:
> >> and second, why we wouldn't consider handling a non-zero offset.  A
> >> non-zero offset would, of course, still require decompressing from the
> >> start and then just throwing away what we skip over, but we're going to
> >> be doing that anyway, aren't we?  Why not stop when we get to the end,
> >> at least, and save ourselves the trouble of decompressing the rest and
> >> then throwing it away?
>
> > I was worried about changing the pg_lz code too much because it scared me,
> > but debugging some stuff made me read it more closely so I fear it less
> > now, and doing interior slices seems not unreasonable, so I will give it a
> > try.
>
> I think Stephen was just envisioning decompressing from offset 0 up to
> the end of what's needed, and then discarding any data before the start
> of what's needed; at least, that's what'd occurred to me.

Understood, that makes lots of sense and is a very small change, it turns out :)
Allocating just what is needed also makes things faster yet, which is nice, and no big surprise.
Some light testing seems to show no obvious regression in speed of decompression for the usual "decompress it all" case.

P
 

Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:
As threatened, I have also added a patch so that left() uses sliced access as well.

Re: Compressed TOAST Slicing

From: Rafia Sabih
Date:
On Fri, Nov 2, 2018 at 11:55 PM Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
> As threatened, I have also added a patch so that left() uses sliced access as well.

Hi Paul,

The idea looks good, and judging by your performance evaluation it seems
like a practical one too.

I had a look at this patch and here are my initial comments,
1.
- if (dp != destend || sp != srcend)
+ if (!is_slice && (dp != destend || sp != srcend))
  return -1;

A comment explaining how this check differs for the is_slice case would be helpful.
2.
- int len = VARSIZE_ANY_EXHDR(str);
- int n = PG_GETARG_INT32(1);
- int rlen;
+ int n = PG_GETARG_INT32(1);

Looks like PG indentation is not followed here for n.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:
On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih <rafia.sabih@enterprisedb.com> wrote:
>
> The idea looks good, and judging by your performance evaluation it seems
> like a practical one too.

Thank you kindly for the review!

> A comment explaining how this check differs for the is_slice case would be helpful.
> Looks like PG indentation is not followed here for n.

I have attached updated patches that add the comment and adhere to the Pg variable declaration indentation styles.
ATB!
P

--
Paul Ramsey


Re: Compressed TOAST Slicing

From: Andres Freund
Date:
Hi Stephen,


On 2018-12-06 12:54:18 -0800, Paul Ramsey wrote:
> On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih <rafia.sabih@enterprisedb.com>
> wrote:
> >
> > The idea looks good, and judging by your performance evaluation it seems
> > like a practical one too.
> 
> Thank you kindly for the review!
> 
> > A comment explaining how this check differs for the is_slice case would
> > be helpful.
> > Looks like PG indentation is not followed here for n.
> 
> I have attached updated patches that add the comment and adhere to the Pg
> variable declaration indentation styles.
> ATB!
> P

You were mentioning committing this at the Brussels meeting... :)

Greetings,

Andres Freund


Re: Compressed TOAST Slicing

From: Simon Riggs
Date:
On Thu, 6 Dec 2018 at 20:54, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih <rafia.sabih@enterprisedb.com> wrote:
> >
> > The idea looks good, and judging by your performance evaluation it seems
> > like a practical one too.
>
> Thank you kindly for the review!

Sounds good.

Could we get a similarly optimized implementation of the -> operator for JSONB as well?

Are there any other potential uses? Best to fix em all up at once and then move on to other things. Thanks.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:
On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs <simon@2ndquadrant.com> wrote:

> Could we get a similarly optimized implementation of the -> operator for JSONB as well?
> Are there any other potential uses? Best to fix em all up at once and then move on to other things. Thanks.

Oddly enough, I couldn't find many/any things that were sensitive to
left-end decompression. The only exception is "LIKE this%", which
clearly would be helped, but unfortunately that wouldn't be a quick
drop-in; it would require a rather major reorganization of the regex
handling.

I had a look at "->" and I couldn't see how a slice could be used to
make it faster: we don't know a priori how big a slice would give us
what we want. This again makes Stephen's case for an iterator, but of
course all the iterator benefits only come when the actual function at
the top (in this case the json parser) is also updated to be
iterative.

Committing this little change doesn't preclude an iterator, or even
make doing one more complicated... :)

P.


Re: Compressed TOAST Slicing

From: Юрий Соколов
Date:
Some time ago I posted a PoC patch with an alternative TOAST compression scheme: instead of "compress-then-chunk" I suggested "chunk-then-compress". It decreases the compression ratio, but allows efficient arbitrary slicing.
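Schematically, the layout idea is something like this (a sketch only, not the actual PoC patch):

    /* "chunk-then-compress": each raw chunk is compressed independently,
     * so a directory of compressed lengths lets us seek to any chunk
     * directly, without decompressing the chunks before it. */
    typedef struct ChunkedCompressHeader
    {
        int32   rawsize;        /* total uncompressed size */
        int32   chunk_rawsize;  /* uncompressed bytes per chunk */
        int32   nchunks;
        int32   chunk_sizes[FLEXIBLE_ARRAY_MEMBER]; /* compressed lengths */
    } ChunkedCompressHeader;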

On Wed, 20 Feb 2019 at 2:09, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > Could we get a similarly optimized implementation of the -> operator for JSONB as well?
> > Are there any other potential uses? Best to fix em all up at once and then move on to other things. Thanks.
>
> Oddly enough, I couldn't find many/any things that were sensitive to
> left-end decompression. The only exception is "LIKE this%", which
> clearly would be helped, but unfortunately that wouldn't be a quick
> drop-in; it would require a rather major reorganization of the regex
> handling.
>
> I had a look at "->" and I couldn't see how a slice could be used to
> make it faster: we don't know a priori how big a slice would give us
> what we want. This again makes Stephen's case for an iterator, but of
> course all the iterator benefits only come when the actual function at
> the top (in this case the json parser) is also updated to be
> iterative.
>
> Committing this little change doesn't preclude an iterator, or even
> make doing one more complicated... :)
>
> P.

Re: Compressed TOAST Slicing

From: Simon Riggs
Date:
On Tue, 19 Feb 2019 at 23:09, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > Could we get a similarly optimized implementation of the -> operator for JSONB as well?
> > Are there any other potential uses? Best to fix em all up at once and then move on to other things. Thanks.
>
> Oddly enough, I couldn't find many/any things that were sensitive to
> left-end decompression. The only exception is "LIKE this%", which
> clearly would be helped, but unfortunately that wouldn't be a quick
> drop-in; it would require a rather major reorganization of the regex
> handling.
>
> I had a look at "->" and I couldn't see how a slice could be used to
> make it faster: we don't know a priori how big a slice would give us
> what we want. This again makes Stephen's case for an iterator, but of
> course all the iterator benefits only come when the actual function at
> the top (in this case the json parser) is also updated to be
> iterative.
>
> Committing this little change doesn't preclude an iterator, or even
> make doing one more complicated... :)

Sure, but we have the choice between something that benefits just a few cases and one that benefits more widely.

If we all only work on the narrow use cases that are right in front of us at the present moment then we would not have come this far. I'm sure many GIS applications also store JSONB data, so you would be helping the performance of the whole app, even if there isn't much JSON in PostGIS.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Compressed TOAST Slicing

From: Andres Freund
Date:
On 2019-02-20 08:39:38 +0000, Simon Riggs wrote:
> On Tue, 19 Feb 2019 at 23:09, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> 
> > On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > > Could we get a similarly optimized implementation of the -> operator
> > > for JSONB as well?
> > > Are there any other potential uses? Best to fix em all up at once and
> > > then move on to other things. Thanks.
> >
> > Oddly enough, I couldn't find many/any things that were sensitive to
> > left-end decompression. The only exception is "LIKE this%", which
> > clearly would be helped, but unfortunately that wouldn't be a quick
> > drop-in; it would require a rather major reorganization of the regex
> > handling.
> >
> > I had a look at "->" and I couldn't see how a slice could be used to
> > make it faster: we don't know a priori how big a slice would give us
> > what we want. This again makes Stephen's case for an iterator, but of
> > course all the iterator benefits only come when the actual function at
> > the top (in this case the json parser) is also updated to be
> > iterative.
> >
> > Committing this little change doesn't preclude an iterator, or even
> > make doing one more complicated... :)
> >
> 
> Sure, but we have the choice between something that benefits just a few
> cases and one that benefits more widely.
> 
> If we all only work on the narrow use cases that are right in front of us
> at the present moment then we would not have come this far. I'm sure many
> GIS applications also store JSONB data, so you would be helping the
> performance of the whole app, even if there isn't much JSON in PostGIS.

-1, I think this is blowing up the complexity of an already useful patch,
even though there's no increase in complexity due to the patch proposed
here.  I totally get wanting incremental decompression for jsonb, but I
don't see why Paul should be held hostage for that.

Greetings,

Andres Freund


Re: Compressed TOAST Slicing

From: Robert Haas
Date:
On Wed, Feb 20, 2019 at 11:27 AM Andres Freund <andres@anarazel.de> wrote:
> -1, I think this is blowing up the complexity of an already useful patch,
> even though there's no increase in complexity due to the patch proposed
> here.  I totally get wanting incremental decompression for jsonb, but I
> don't see why Paul should be held hostage for that.

I concur.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Compressed TOAST Slicing

From: Simon Riggs
Date:
On Wed, 20 Feb 2019 at 16:27, Andres Freund <andres@anarazel.de> wrote:

> > Sure, but we have the choice between something that benefits just a few
> > cases and one that benefits more widely.
> >
> > If we all only work on the narrow use cases that are right in front of us
> > at the present moment then we would not have come this far. I'm sure many
> > GIS applications also store JSONB data, so you would be helping the
> > performance of the whole app, even if there isn't much JSON in PostGIS.
>
> -1, I think this is blowing up the complexity of an already useful patch,
> even though there's no increase in complexity due to the patch proposed
> here.  I totally get wanting incremental decompression for jsonb, but I
> don't see why Paul should be held hostage for that.

Not sure I agree with your emotive language. Review comments != holding hostages.

If we add one set of code now and need to add another different one later, we will have 2 sets of code that do similar things.

I'm surprised to hear you think that is a good thing.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:

On Feb 20, 2019, at 10:37 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> > -1, I think this is blowing up the complexity of an already useful patch,
> > even though there's no increase in complexity due to the patch proposed
> > here.  I totally get wanting incremental decompression for jsonb, but I
> > don't see why Paul should be held hostage for that.
>
> Not sure I agree with your emotive language. Review comments != holding hostages.
>
> If we add one set of code now and need to add another different one later, we will have 2 sets of code that do similar things.

So, the current state is: when asked for a datum slice, we can now decompress just the parts we need to get that slice. This allows us to speed up anything that knows in advance how big a slice it is going to want. At this moment all I’ve found is left() and substr() for the start-at-front case.

What this does not support: any function that probably wants less-than-everything, but doesn’t know how big a slice to look for. Stephen thinks I should put an iterator on decompression, which would be an interesting piece of work. Having looked at the json code a little, doing partial searches would require a lot of re-work that is above my paygrade, but if there was an iterator in place, at least that next stop would then be open.

Note that adding an iterator isn’t adding two ways to do the same thing, since the iterator would slot nicely underneath the existing slicing API, and just iterate to the requested slice size. So this is easily just “another step” along the train line to providing streaming access to compressed and TOASTed data.
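To sketch what I mean (hypothetical names, none of this exists yet):

    /* Opaque incremental decompression state (hypothetical). */
    typedef struct pglz_iter pglz_iter;

    extern pglz_iter *pglz_iter_init(const char *source, int32 slen);
    extern int32 pglz_iter_next(pglz_iter *it, char *dest, int32 destlen);
    extern void pglz_iter_free(pglz_iter *it);

    /* The "first N bytes" slicing entry point then becomes a thin
     * wrapper that iterates until the requested size is produced. */
    static int32
    pglz_decompress_slice(const char *source, int32 slen,
                          char *dest, int32 slicelength)
    {
        pglz_iter  *it = pglz_iter_init(source, slen);
        int32       got = pglz_iter_next(it, dest, slicelength);

        pglz_iter_free(it);
        return got;
    }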

I’d hate for the simple slice ability to get stuck behind the other work, since it both (a) is useful and (b) exists. If you are concerned the iterator will never get done, I can only offer my word that, since it seems important to multiple people on this list, I will do it. (Just not, maybe, very well :)

P.

Re: Compressed TOAST Slicing

From: "Daniel Verite"
Date:
    Paul Ramsey wrote:

> > Oddly enough, I couldn't find many/any things that were sensitive to
> > left-end decompression. The only exception is "LIKE this%", which
> > clearly would be helped, but unfortunately that wouldn't be a quick
> > drop-in; it would require a rather major reorganization of the regex
> > handling.

What about starts_with(string, prefix)?

text_starts_with(arg1,arg2) in varlena.c does a full decompression
of arg1 when it could limit itself to the length of the smaller arg2:

Datum
text_starts_with(PG_FUNCTION_ARGS)
....
  len1 = toast_raw_datum_size(arg1);
  len2 = toast_raw_datum_size(arg2);
  if (len2 > len1)
      result = false;
  else
  {
      text     *targ1 = DatumGetTextPP(arg1);
      text     *targ2 = DatumGetTextPP(arg2);

      result = (memcmp(VARDATA_ANY(targ1), VARDATA_ANY(targ2),
           VARSIZE_ANY_EXHDR(targ2)) == 0);
...
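A sliced version could look something like this (a sketch, using DatumGetTextPSlice() to detoast only the needed prefix; an illustration, not a tested patch):

    Datum
    text_starts_with(PG_FUNCTION_ARGS)
    {
        Datum       arg1 = PG_GETARG_DATUM(0);
        Datum       arg2 = PG_GETARG_DATUM(1);
        bool        result;
        Size        len1 = toast_raw_datum_size(arg1);
        Size        len2 = toast_raw_datum_size(arg2);

        if (len2 > len1)
            result = false;
        else
        {
            /* decompress only the first len2 bytes of arg1 */
            text   *targ1 = DatumGetTextPSlice(arg1, 0, len2 - VARHDRSZ);
            text   *targ2 = DatumGetTextPP(arg2);

            result = (memcmp(VARDATA_ANY(targ1), VARDATA_ANY(targ2),
                             VARSIZE_ANY_EXHDR(targ2)) == 0);
        }
        PG_RETURN_BOOL(result);
    }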



Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Compressed TOAST Slicing

From: Robert Haas
Date:
On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey <pramsey@cleverelephant.ca> wrote:
> What this does not support: any function that probably wants
> less-than-everything, but doesn’t know how big a slice to look for.
> Stephen thinks I should put an iterator on decompression, which would be
> an interesting piece of work. Having looked at the json code a little,
> doing partial searches would require a lot of re-work that is above my
> paygrade, but if there was an iterator in place, at least that next stop
> would then be open.
>
> Note that adding an iterator isn’t adding two ways to do the same thing,
> since the iterator would slot nicely underneath the existing slicing API,
> and just iterate to the requested slice size. So this is easily just
> “another step” along the train line to providing streaming access to
> compressed and TOASTed data.

Yeah.  Plus, I'm not sure the iterator thing is even the right design
for the JSONB case.  It might be better to think, for that case, about
whether there's some way to operate directly on the compressed data.
If you could somehow jigger the format and the chunking so that you
could jump directly to the right chunk and decompress from there,
rather than having to walk over all of the earlier chunks to figure
out where the data you want is, you could probably obtain a large
performance benefit.  But figuring out how to design such a scheme
seems pretty far afield from the topic at hand.

I'd actually be inclined not to add an iterator until we have a real
user for it, for exactly the reason that we don't know that it is the
right thing.  But there is certain value in decompressing partially,
to a known byte position, as your patch does, no matter what we decide
to do about that stuff.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:
On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite <daniel@manitou-mail.org> wrote:
>
>         Paul Ramsey wrote:
>
> > > Oddly enough, I couldn't find many/any things that were sensitive to
> > > left-end decompression. The only exception is "LIKE this%", which
> > > clearly would be helped, but unfortunately that wouldn't be a quick
> > > drop-in; it would require a rather major reorganization of the regex
> > > handling.
>
> What about starts_with(string, prefix)?
>
> text_starts_with(arg1,arg2) in varlena.c does a full decompression
> of arg1 when it could limit itself to the length of the smaller arg2:

Nice catch, I didn't find that one as it's not user visible; it seems to
be only called in spgist (!!)
./backend/access/spgist/spgtextproc.c:
DatumGetBool(DirectFunctionCall2(text_starts_with

Thanks, I'll add that.

P


Re: Compressed TOAST Slicing

From: "Daniel Verite"
Date:
    Paul Ramsey wrote:

> > text_starts_with(arg1,arg2) in varlena.c does a full decompression
> > of arg1 when it could limit itself to the length of the smaller arg2:
>
> Nice catch, I didn't find that one as it's not user visible; it seems to
> be only called in spgist (!!)

It's also exposed in SQL since v11, as
  starts_with(string,prefix) returns bool
and as an operator:
  text ^@ text
I guess it's meant to be more efficient than (string LIKE prefix||'%')
or strpos(string,prefix)=1, and it will be even more so if we can
avoid some amount of decompression :)


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Compressed TOAST Slicing

From: Tom Lane
Date:
Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> On Feb 20, 2019, at 10:37 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If we add one set of code now and need to add another different one
>> later, we will have 2 sets of code that do similar things.

> Note that adding an iterator isn’t adding two ways to do the same thing,
> since the iterator would slot nicely underneath the existing slicing
> API, and just iterate to the requested slice size. So this is easily
> just “another step” along the train line to providing streaming access
> to compressed and TOASTed data.

Yeah, I find Paul's argument fairly convincing there.  There wouldn't be
much code duplication, and for the places that can use it, a "fetch the
first N bytes" API is probably going to be more natural and easier to
use than an iteration-based API.  So we'd likely want to keep it, even
if it ultimately becomes just a thin wrapper over the iterator.

I've not reviewed the patch, but as far as the proposed functionality
goes, it seems fine to accept this rather than waiting for something
much more difficult to happen.

            regards, tom lane


Re: Compressed TOAST Slicing

From: Tomas Vondra
Date:
On 2/20/19 7:50 PM, Robert Haas wrote:
> On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>> What this does not support: any function that probably wants
>> less-than-everything, but doesn’t know how big a slice to look
>> for. Stephen thinks I should put an iterator on decompression,
>> which would be an interesting piece of work. Having looked at
>> the json code a little, doing partial searches would require a lot
>> of re-work that is above my paygrade, but if there was an iterator
>> in place, at least that next stop would then be open.
>>
>> Note that adding an iterator isn’t adding two ways to do the same 
>> thing, since the iterator would slot nicely underneath the existing
>> slicing API, and just iterate to the requested slice size. So this
>> is easily just “another step” along the train line to providing
>> streaming access to compressed and TOASTed data.
> 
> Yeah.  Plus, I'm not sure the iterator thing is even the right design
> for the JSONB case.  It might be better to think, for that case, about
> whether there's some way to operate directly on the compressed data.
> If you could somehow jigger the format and the chunking so that you
> could jump directly to the right chunk and decompress from there,
> rather than having to walk over all of the earlier chunks to figure
> out where the data you want is, you could probably obtain a large
> performance benefit.  But figuring out how to design such a scheme
> seems pretty far afield from the topic at hand.
> 

I doubt working directly on compressed data is doable / efficient unless
the compression was designed with that use case in mind. Which pglz
almost certainly was not. Furthermore, I think there's an issue with
layering - the compression currently happens in the TOAST infrastructure
(and Paul's patch does not change this), but operating on compressed
data is inherently specific to a given data type.

> I'd actually be inclined not to add an iterator until we have a real 
> user for it, for exactly the reason that we don't know that it is
> the right thing.  But there is certain value in decompressing
> partially, to a known byte position, as your patch does, no matter
> what we decide to do about that stuff.
> 

Well, I think Simon's suggestion was that we should also use the
iterator from the JSONB code, so that would be the use of it. And if Paul
implemented the current slicing on top of the iterator, that would also
be a user (even without the JSONB stuff).

But I think Andres is right this might increase the complexity of the
patch too much, possibly pushing it from PG12. I don't see anything
wrong with doing the simple approach now and then extending it to handle
JSONB later, if someone wants to invest their time into it.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Compressed TOAST Slicing

From: Stephen Frost
Date:
Greetings,

* Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite <daniel@manitou-mail.org> wrote:
> >
> >         Paul Ramsey wrote:
> >
> > > Oddly enough, I couldn't find many/any things that were sensitive to
> > > left-end decompression. The only exception is "LIKE this%", which
> > > clearly would be helped, but unfortunately that wouldn't be a quick
> > > drop-in; it would require a rather major reorganization of the regex
> > > handling.
> >
> > What about starts_with(string, prefix)?
> >
> > text_starts_with(arg1,arg2) in varlena.c does a full decompression
> > of arg1 when it could limit itself to the length of the smaller arg2:
>
> Nice catch, I didn't find that one as it's not user visible; it seems to
> be only called in spgist (!!)
> ./backend/access/spgist/spgtextproc.c:
> DatumGetBool(DirectFunctionCall2(text_starts_with
>
> Thanks, I'll add that.

That sounds good to me, I look forward to an updated patch.  As Andres
mentioned, he and I chatted a bit about this approach vs the iterator
approach at FOSDEM, and he convinced me that this is worthwhile to do even
if we might add an iterator approach later- which also seems to be the
consensus of this thread, so once the patch has been updated to catch
this case, I'll review it (again) with an eye on committing it soon.

Thanks!

Stephen


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:
On Wed, Feb 20, 2019 at 1:12 PM Stephen Frost <sfrost@snowman.net> wrote:
>
> * Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> > On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite <daniel@manitou-mail.org> wrote:
> > >
> > > What about starts_with(string, prefix)?
> > Thanks, I'll add that.
>
> That sounds good to me, I look forward to an updated patch.
> once the patch has been updated to catch
> this case, I'll review it (again) with an eye on committing it soon.

Merci! Attached are updated patches.


Re: Compressed TOAST Slicing

From: Darafei Praliaskouski
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

I have read the patch and have no problems with it.

The feature is super valuable for complex PostGIS-enabled databases.

Re: Compressed TOAST Slicing

From: Alvaro Herrera
Date:
On 2019-Mar-11, Darafei Praliaskouski wrote:

> The feature is super valuable for complex PostGIS-enabled databases.

After having to debug a perf problem in this area, I agree, +1 for the patch.

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Compressed TOAST Slicing

From: Regina Obe
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            not tested

No need for documentation as this is a performance improvement patch.

I tested on windows mingw64 (as of a week ago) and confirmed the patch
applies cleanly and is significantly faster for left, substr tests than head.

Re: Compressed TOAST Slicing

From: Andrey Borodin
Date:
Hi!

> On Feb 21, 2019, at 23:50, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
> Merci! Attached are updated patches.
> <compressed-datum-slicing-20190221a.patch><compressed-datum-slicing-left-20190221a.patch>

As noted before, these patches are extremely useful.
So, I've looked into the code too.

I've got some questions about pglz_decompress() changes:

1.
+                    if (dp >= destend)    /* check for buffer overrun */
+                        break;        /* do not clobber memory */
This is done inside the byte loop. But can we just calculate len = min(len, destend - dp) beforehand?

2. The is_slice function argument only prevents the error.

+     * If we are slicing, then we won't necessarily
+     * be at the end of the source or dest buffers
+     * when we hit a stop, so we don't test then.

But I do not get why we should get that error. If we have a limited dest, why shouldn't we fill it entirely?

3. And I'd use memmove() despite the comment explaining why we do not do that. It is SSE-optimized and cache-optimized nowadays.
But this is not the point of this patch, so let's discard this point.

Best regards, Andrey Borodin.

Re: Compressed TOAST Slicing

From: Michael Paquier
Date:
On Mon, Mar 11, 2019 at 08:38:56PM +0000, Regina Obe wrote:
> I tested on windows mingw64 (as of a week ago) and confirmed the
> patch applies cleanly and is significantly faster for left, substr
> tests than head.

int32
pglz_decompress(const char *source, int32 slen, char *dest,
-                               int32 rawsize)
+                               int32 rawsize, bool is_slice)
The performance improvements are nice, but breaking a published API is
less nice, particularly since some work has been done to make pglz more
pluggable (see 60838df9; guess who wrote that).  Could it be possible
to rework this part please?  It's been some time since I touched this
code, but it would be really nice if we didn't have an extra parameter,
and just didn't bypass the sanity checks at the end.  Using a parameter
to bypass those checks may cause problems for future callers of it.
--
Michael


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:

> On Mar 11, 2019, at 10:42 PM, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Mar 11, 2019 at 08:38:56PM +0000, Regina Obe wrote:
>> I tested on windows mingw64 (as of a week ago) and confirmed the
>> patch applies cleanly and is significantly faster for left, substr
>> tests than head.
>
> int32
> pglz_decompress(const char *source, int32 slen, char *dest,
> -                               int32 rawsize)
> +                               int32 rawsize, bool is_slice)
> The performance improvements are nice, but breaking a published API is
> less nice, particularly since some work has been done to make pglz more
> pluggable (see 60838df9; guess who wrote that).  Could it be possible
> to rework this part please?  It's been some time since I touched this
> code, but it would be really nice if we didn't have an extra parameter,
> and just didn't bypass the sanity checks at the end.  Using a parameter
> to bypass those checks may cause problems for future callers of it.

The sanity check is just that both buffers are completely read when they reach
their respective ends. With a partial buffer on one side, that check just will
definitionally not happen when slicing (it’s not possible to know a priori what
location in the compressed buffer corresponds to a location in the uncompressed
one). I can ensure the old API still holds for pglz_decompress() and add a new
pglz_decompress_slice() that takes the parameter, is that sufficient?

P.




Re: Compressed TOAST Slicing

From: Andrey Borodin
Date:

> On Mar 12, 2019, at 19:40, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
>> On Mar 11, 2019, at 10:42 PM, Michael Paquier <michael@paquier.xyz> wrote:
>>
>> int32
>> pglz_decompress(const char *source, int32 slen, char *dest,
>> -                               int32 rawsize)
>> +                               int32 rawsize, bool is_slice)
>
> The sanity check is just that both buffers are completely read when they
> reach their respective ends. With a partial buffer on one side, that check
> just will definitionally not happen when slicing (it’s not possible to know
> a priori what location in the compressed buffer corresponds to a location in
> the uncompressed one). I can ensure the old API still holds for
> pglz_decompress() and add a new pglz_decompress_slice() that takes the
> parameter, is that sufficient?

I think that providing two separate entry points for this functionality is the better option.
The word "slice" is widely used for [start:end] slicing, so I'm not sure it's a good word. But I'm not good in English.

Either way we could replace

if (dp != destend || sp != srcend)
    return -1;

with

if (dp != destend && sp != srcend)
    return -1;

and that's it. || defends against data corruption and some kinds of programming
mistakes, but actually that's not the purpose of data compression.

Best regards, Andrey Borodin.

Re: Compressed TOAST Slicing

From: Andres Freund
Date:
On 2019-03-12 14:42:14 +0900, Michael Paquier wrote:
> On Mon, Mar 11, 2019 at 08:38:56PM +0000, Regina Obe wrote:
> > I tested on windows mingw64 (as of a week ago) and confirmed the
> > patch applies cleanly and is significantly faster for left, substr
> > tests than head. 
> 
> int32
> pglz_decompress(const char *source, int32 slen, char *dest,
> -                               int32 rawsize)
> +                               int32 rawsize, bool is_slice)

> The performance improvements are nice, but breaking a published API is
> less nice, particularly since some work has been done to make pglz more
> pluggable (see 60838df9; guess who wrote that).

I don't think that should stop us from breaking the API. You've got to
do quite low level stuff to need pglz directly, in which case such an
API change should be the least of your problems between major versions.

Greetings,

Andres Freund


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:

> On Mar 12, 2019, at 9:13 AM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2019-03-12 14:42:14 +0900, Michael Paquier wrote:
>> On Mon, Mar 11, 2019 at 08:38:56PM +0000, Regina Obe wrote:
>>> I tested on windows mingw64 (as of a week ago) and confirmed the
>>> patch applies cleanly and is significantly faster for left, substr
>>> tests than head.
>>
>> int32
>> pglz_decompress(const char *source, int32 slen, char *dest,
>> -                               int32 rawsize)
>> +                               int32 rawsize, bool is_slice)
>
>> The performance improvements are nice, but breaking a published API is
>> less nice, particularly since some work has been done to make pglz more
>> pluggable (see 60838df9; guess who wrote that).
>
> I don't think that should stop us from breaking the API. You've got to
> do quite low level stuff to need pglz directly, in which case such an
> API change should be the least of your problems between major versions.

I was going to say that the function is only used twice in the code base, but
I see it’s now used four times. So maybe leave the old signature in place and
add the new one for my purposes after all. Though with only four internal
calls, I am guessing Michael is more concerned about external users than with
internal ones?

P.

Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:


On Mar 12, 2019, at 9:45 AM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:



>> On Mar 12, 2019, at 9:13 AM, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2019-03-12 14:42:14 +0900, Michael Paquier wrote:
>>> On Mon, Mar 11, 2019 at 08:38:56PM +0000, Regina Obe wrote:
>>>> I tested on windows mingw64 (as of a week ago) and confirmed the
>>>> patch applies cleanly and is significantly faster for left, substr
>>>> tests than head.
>>>
>>> int32
>>> pglz_decompress(const char *source, int32 slen, char *dest,
>>> -                               int32 rawsize)
>>> +                               int32 rawsize, bool is_slice)
>>>
>>> The performance improvements are nice, but breaking a published API is
>>> less nice, particularly since some work has been done to make pglz more
>>> pluggable (see 60838df9; guess who wrote that).
>>
>> I don't think that should stop us from breaking the API. You've got to
>> do quite low level stuff to need pglz directly, in which case such an
>> API change should be the least of your problems between major versions.
>
> I was going to say that the function is only used twice in the code base,
> but I see it’s now used four times. So maybe leave the old signature in
> place and add the new one for my purposes after all. Though with only four
> internal calls, I am guessing Michael is more concerned about external
> users than with internal ones?

So…
- two signatures?
- old signature but reduced error checking?
- elephant?

P

Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:


On Mar 11, 2019, at 10:22 PM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

> Hi!
>
>> On Feb 21, 2019, at 23:50, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>>
>> Merci! Attached are updated patches.
>> <compressed-datum-slicing-20190221a.patch><compressed-datum-slicing-left-20190221a.patch>
>
> As noted before, these patches are extremely useful.
> So, I've looked into the code too.
>
> I've got some questions about pglz_decompress() changes:
>
> 1.
> +                    if (dp >= destend)    /* check for buffer overrun */
> +                        break;        /* do not clobber memory */
> This is done inside the byte loop. But can we just calculate len = min(len, destend - dp) beforehand?
>
> 2. The is_slice function argument only prevents the error.
>
> +     * If we are slicing, then we won't necessarily
> +     * be at the end of the source or dest buffers
> +     * when we hit a stop, so we don't test then.
>
> But I do not get why we should get that error. If we have a limited dest, why shouldn't we fill it entirely?

Wow, took some sleuthing to figure out why this has to be the way that it is…

In the current code, there’s no true slicing; at some point all the slicing logic flows down into toast_decompress_datum(), which knows via TOAST_COMPRESS_RAWSIZE() [3] exactly how big the original object was.

In the new code, we have to deal with a slice, which isn’t always exactly sized to the expected output. For example, the text_substring() function only knows it will have N *characters* [2], which it turns into a slice size of N*max_character_encoding_size (effectively N*4 for UTF-8). We could use TOAST_COMPRESS_RAWSIZE to know the exact size of the entire original object and only pass in a buffer of that size, but we cannot know the size of the slice. It could be 4 chars of English and 4 bytes, or 4 chars of Chinese and 16 bytes.

Once you stop using TOAST_COMPRESS_RAWSIZE to exactly size your destination buffer, it’s possible (and it does happen) to have a destination buffer length passed into pglz_decompress that is much larger than the actual contents of the decompressed data, so we end up in a stopping condition in which we’ve run out of source buffer but still have plenty of destination buffer available to fill (illegal in the current code [1]).

Anyways, all that to say, we can count on one of sp == srcend (had to decompress the whole thing) or dp == dstend (filled buffer before running out of source data) being true at the end of a slicing decompression, but not both at the same time. The only time they are going to be true at the same time is if we know a priori the size of the output, which we don’t, always (as in text_substring, where we only know the upper bounded size of the output).
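For reference, the sizing logic is roughly this (simplified from text_substring() in varlena.c):

    /* S1 = start, L1 = length, both in characters: the slice we request
     * is an upper bound in bytes, usually larger than the actual text. */
    int32   eml = pg_database_encoding_max_length();    /* 4 for UTF-8 */
    int32   slice_size = (S1 + L1) * eml;               /* chars -> max bytes */
    text   *slice = DatumGetTextPSlice(str, 0, slice_size);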

Sorry, lots of words there. Anyways, we can still retain the old signature if that’s a deal breaker.

P.




Re: Compressed TOAST Slicing

From: Michael Paquier
Date:
On Tue, Mar 12, 2019 at 11:08:15AM -0700, Paul Ramsey wrote:
>> On Mar 12, 2019, at 9:45 AM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>> I was going to say that the function is only used twice in the code
>> base, but I see it’s now used four times. So maybe leave the old
>> signature in place and add the new one for my purposes after
>> all. Though with only four internal calls, I am guessing Michael is
>> more concerned about external users than with internal ones?

Yes, external tools are the part I worry about.  It is better to avoid
breaking compatibility if there are workarounds we can apply.  Now I
agree that this particular one may not be the most used ever in the
community ecosystem.

> So…
> - two signatures?
> - old signature but reduced error checking?
> - elephant?

I have not looked at how much effort it would be to keep the current
API and still make the slicing happy, sorry ;p

One way to sort things out would be to have a new _extended() API
layer which is able to take a set of custom flags with one extra
argument, using bits16 for example.  This way, your new option becomes
a flag in an extensible set, and we don't need to worry about breaking
the API again in the future if more options are added.
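For example, something like this (the names are just a sketch):

    /* flag bits for pglz_decompress_extended() */
    #define PGLZ_DECOMPRESS_CHECK_COMPLETE  0x0001  /* require both buffers
                                                     * to end together */

    extern int32 pglz_decompress_extended(const char *source, int32 slen,
                                          char *dest, int32 rawsize,
                                          bits16 flags);

    /* the published API keeps its signature as a thin wrapper */
    int32
    pglz_decompress(const char *source, int32 slen, char *dest, int32 rawsize)
    {
        return pglz_decompress_extended(source, slen, dest, rawsize,
                                        PGLZ_DECOMPRESS_CHECK_COMPLETE);
    }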
--
Michael

Attachment

Re: Compressed TOAST Slicing

From: Andres Freund
Date:

On March 12, 2019 6:58:12 PM PDT, Michael Paquier <michael@paquier.xyz> wrote:
>On Tue, Mar 12, 2019 at 11:08:15AM -0700, Paul Ramsey wrote:
>>> On Mar 12, 2019, at 9:45 AM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>>> I was going to say that the function is only used twice in the code
>>> base, but I see it’s now used four times. So maybe leave the old
>>> signature in place and add the new one for my purposes after
>>> all. Though with only four internal calls, I am guessing Michael is
>>> more concerned about external users than with internal ones?
>
>Yes, external tools are the part I worry about.  It is better to avoid
>breaking compatibility if there are workarounds we can apply.  Now I
>agree that this particular one may not be the most used ever in the
>community ecosystem.
>
>> So…
>> - two signatures?
>> - old signature but reduced error checking?
>> - elephant?
>
>I have not looked at how much effort it would be to keep the current
>API and still make the slicing happy, sorry ;p
>
>One way to sort things out would be to have a new _extended() API
>layer which is able to take a set of custom flags with one extra
>argument, using bits16 for example.  This way, your new option becomes
>a flag in an extensible set, and we don't need to worry about breaking
>the API again in the future if more options are added.

I don't think this is even close to popular enough to incur the maybe of a
separate function / more complicated interface. By this logic we can change
basically no APIs anymore.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: Compressed TOAST Slicing

From: Michael Paquier
Date:
On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote:
> I don't think this is even close to popular enough to incur the
> maybe of a separate function / more complicated interface. By this
> logic we can change basically no APIs anymore.

Well, if folks here think that it is not worth worrying about, I won't
cry on that either.  If only the original API is kept, could it just
be possible to make it extensible with some bits16 flags then?  Adding
only a boolean is not really appealing.
--
Michael


Re: Compressed TOAST Slicing

From: Tomas Vondra
Date:
On 3/13/19 3:19 AM, Michael Paquier wrote:
> On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote:
>> I don't think this is even close to popular enough to incur the
>> maybe of a separate function / more complicated interface. By this
>> logic we can change basically no APIs anymore. 
> 
> Well, if folks here think that it is not worth worrying about, I won't
> cry on that either.  If only the original API is kept, could it just
> be possible to make it extensible with some bits16 flags then?  Adding
> only a boolean is not really appealing.

In my experience "extensible" APIs with bitmasks are terrible - it's a
PITA to both use them and maintain stuff that calls them. That is not to
say there is no use for that design pattern, or that I like API breaks.
But I very much prefer when an API change breaks things, alerting me of
places that may require attention.

And I'm with Andres here about the complexity being rather unwarranted
here - I don't think we've changed pglz API in years (if ever), so what
is the chance we'd actually benefit from the extensibility soon?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:

> On Mar 13, 2019, at 3:09 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On 3/13/19 3:19 AM, Michael Paquier wrote:
>> On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote:
>>> I don't think this is even close to popular enough to incur the
>>> maybe of a separate function / more complicated interface. By this
>>> logic we can change basically no APIs anymore.
>>
>> Well, if folks here think that it is not worth worrying about, I won't
>> cry on that either.  If only the original API is kept, could it just
>> be possible to make it extensible with some bits16 flags then?  Adding
>> only a boolean is not really appealing.
>
> In my experience "extensible" APIs with bitmasks are terrible - it's a
> PITA to both use them and maintain stuff that calls them. That is not to
> say there is no use for that design pattern, or that I like API breaks.
> But I very much prefer when an API change breaks things, alerting me of
> places that may require attention.
>
> And I'm with Andres here about the complexity being rather unwarranted
> here - I don't think we've changed pglz API in years (if ever), so what
> is the chance we'd actually benefit from the extensibility soon?

I’m just going to saw the baby in half, retaining the old pglz_decompress()
signature and calling into a pglz_decompress_checked() signature that allows
one to optionally turn off the checking at the end (which is all the split
boolean argument does, so probably my current name is not the best name for
that argument).

Scream madly at me if you consider this inappropriate.

P

Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:


On Mar 13, 2019, at 8:25 AM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:



>> On Mar 13, 2019, at 3:09 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On 3/13/19 3:19 AM, Michael Paquier wrote:
>>> On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote:
>>>> I don't think this is even close to popular enough to incur the
>>>> maybe of a separate function / more complicated interface. By this
>>>> logic we can change basically no APIs anymore.
>>>
>>> Well, if folks here think that it is not worth worrying about, I won't
>>> cry on that either.  If only the original API is kept, could it just
>>> be possible to make it extensible with some bits16 flags then?  Adding
>>> only a boolean is not really appealing.
>>
>> In my experience "extensible" APIs with bitmasks are terrible - it's a
>> PITA to both use them and maintain stuff that calls them. That is not to
>> say there is no use for that design pattern, or that I like API breaks.
>> But I very much prefer when an API change breaks things, alerting me of
>> places that may require attention.
>>
>> And I'm with Andres here about the complexity being rather unwarranted
>> here - I don't think we've changed pglz API in years (if ever), so what
>> is the chance we'd actually benefit from the extensibility soon?
>
> I’m just going to saw the baby in half, retaining the old pglz_decompress()
> signature and calling into a pglz_decompress_checked() signature that allows
> one to optionally turn off the checking at the end (which is all the split
> boolean argument does, so probably my current name is not the best name for
> that argument).
>
> Scream madly at me if you consider this inappropriate.

Here is a new (final?) patch that adds the extra signature and, for good measure, includes the minor changes to varlena.c necessary to turn on slicing behaviour for left() and starts_with() (text_substring() already slices by default, just to no avail without this patch).

P.


Re: Compressed TOAST Slicing

From: Andrey Borodin
Date:

> On Mar 13, 2019, at 21:05, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
> Here is a new (final?) patch ...
>
> <compressed-datum-slicing-20190313a.patch>

This check

@@ -744,6 +748,8 @@ pglz_decompress(const char *source, int32 slen, char *dest,
                 {
                     *dp = dp[-off];
                     dp++;
+                    if (dp >= destend)    /* check for buffer overrun */
+                        break;        /* do not clobber memory */
                 }

is still done for every byte. You can precompute the maximum allowed length before that cycle. Here's a diff:

diff --git a/src/common/pg_lzcompress.c b/src/common/pg_lzcompress.c
index 6b48892a8f..05b2b3d5d1 100644
--- a/src/common/pg_lzcompress.c
+++ b/src/common/pg_lzcompress.c
@@ -744,12 +744,11 @@ pglz_decompress_checked(const char *source, int32 slen, char *dest,
                                 * memcpy() here, because the copied areas could overlap
                                 * extremely!
                                 */
+                               len = Min(len, destend - dp);
                                while (len--)
                                {
                                        *dp = dp[-off];
                                        dp++;
-                                       if (dp >= destend)      /* check for buffer overrun */
-                                               break;          /* do not clobber memory */
                                }
                        }
                        else


Best regards, Andrey Borodin.

Re: Compressed TOAST Slicing

From: Paul Ramsey
Date:
> On Mar 13, 2019, at 9:32 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
>> On Mar 13, 2019, at 21:05, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>>
>> Here is a new (final?) patch ...
>>
>> <compressed-datum-slicing-20190313a.patch>
>
> This check
>
> @@ -744,6 +748,8 @@ pglz_decompress(const char *source, int32 slen, char *dest,
>                 {
>                     *dp = dp[-off];
>                     dp++;
> +                    if (dp >= destend)    /* check for buffer overrun */
> +                        break;        /* do not clobber memory */
>                 }
>
> is still done for every byte. You can precompute the maximum allowed length before that cycle. Here's a diff

Thanks! Attached is the change.

P

Re: Compressed TOAST Slicing

From: Stephen Frost
Date:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-03-12 14:42:14 +0900, Michael Paquier wrote:
> > On Mon, Mar 11, 2019 at 08:38:56PM +0000, Regina Obe wrote:
> > > I tested on windows mingw64 (as of a week ago) and confirmed the
> > > patch applies cleanly and is significantly faster for left, substr
> > > tests than head.
> >
> > int32
> > pglz_decompress(const char *source, int32 slen, char *dest,
> > -                               int32 rawsize)
> > +                               int32 rawsize, bool is_slice)
>
> > The performance improvements are nice, but breaking a published API is
> > less nice, particularly since some work has been done to make pglz more
> > pluggable (see 60838df9; guess who wrote that).
>
> I don't think that should stop us from breaking the API. You've got to
> do quite low level stuff to need pglz directly, in which case such an
> API change should be the least of your problems between major versions.

Agreed, this is across a major version and I don't think it's an issue
to break the API.

Thanks!

Stephen


Re: Compressed TOAST Slicing

From: Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Andres Freund (andres@anarazel.de) wrote:
>> I don't think that should stop us from breaking the API. You've got to
>> do quite low level stuff to need pglz directly, in which case such an
>> API change should be the least of your problems between major versions.

> Agreed, this is across a major version and I don't think it's an issue
> to break the API.

Yeah.  We don't normally hesitate to change internal APIs across major
versions, as long as
(a) the breakage will be obvious when recompiling an extension, and
(b) it will be clear how to get the same behavior as before.

Adding an argument qualifies on both counts.  Sometimes, if a very
large number of call sites would be affected, it makes sense to use
a wrapper function so that we don't have to touch so many places;
but that doesn't apply here.

            regards, tom lane


Re: Compressed TOAST Slicing

From: Robert Haas
Date:
On Mon, Mar 18, 2019 at 10:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Andres Freund (andres@anarazel.de) wrote:
> >> I don't think that should stop us from breaking the API. You've got to
> >> do quite low level stuff to need pglz directly, in which case such an
> >> API change should be the least of your problems between major versions.
>
> > Agreed, this is across a major version and I don't think it's an issue
> > to break the API.
>
> Yeah.  We don't normally hesitate to change internal APIs across major
> versions, as long as
> (a) the breakage will be obvious when recompiling an extension, and
> (b) it will be clear how to get the same behavior as before.
>
> Adding an argument qualifies on both counts.  Sometimes, if a very
> large number of call sites would be affected, it makes sense to use
> a wrapper function so that we don't have to touch so many places;
> but that doesn't apply here.

+1.  I think Paul had it right originally.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Compressed TOAST Slicing

From
Paul Ramsey
Date:

> On Mar 18, 2019, at 7:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Mar 18, 2019 at 10:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Stephen Frost <sfrost@snowman.net> writes:
>>> * Andres Freund (andres@anarazel.de) wrote:
>>>> I don't think that should stop us from breaking the API. You've got to
>>>> do quite low level stuff to need pglz directly, in which case such an
>>>> API change should be the least of your problems between major versions.
>>
>>> Agreed, this is across a major version and I don't think it's an issue
>>> to break the API.
>>
>> Yeah.  We don't normally hesitate to change internal APIs across major
>> versions, as long as
>> (a) the breakage will be obvious when recompiling an extension, and
>> (b) it will be clear how to get the same behavior as before.
>>
>> Adding an argument qualifies on both counts.  Sometimes, if a very
>> large number of call sites would be affected, it makes sense to use
>> a wrapper function so that we don't have to touch so many places;
>> but that doesn't apply here.
>
> +1.  I think Paul had it right originally.

In that spirit, here is a “one pglz_decompress function, new parameter” version for commit.
Thanks!
P

Attachment

Re: Compressed TOAST Slicing

From
Stephen Frost
Date:
Greetings,

* Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> > On Mar 18, 2019, at 7:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > +1.  I think Paul had it right originally.
>
> In that spirit, here is a “one pglz_decompress function, new parameter” version for commit.

Alright, I've been working through this and have made a few improvements
(the big comment block at the top of pg_lzcompress.c needed updating,
among a couple other minor things), but I was trying to wrap my head
around this:

+/* ----------
+ * toast_decompress_datum_slice -
+ *
+ * Decompress the front of a compressed version of a varlena datum.
+ * offset handling happens in heap_tuple_untoast_attr_slice.
+ * Here we just decompress a slice from the front.
+ */
+static struct varlena *
+toast_decompress_datum_slice(struct varlena *attr, int32 slicelength)
+{
+    struct varlena *result;
+    int32 rawsize;
+
+    Assert(VARATT_IS_COMPRESSED(attr));
+
+    result = (struct varlena *) palloc(slicelength + VARHDRSZ);
+    SET_VARSIZE(result, slicelength + VARHDRSZ);
+
+    rawsize = pglz_decompress(TOAST_COMPRESS_RAWDATA(attr),
+                        VARSIZE(attr) - TOAST_COMPRESS_HDRSZ,
+                        VARDATA(result),
+                        slicelength, false);
+    if (rawsize < 0)
         elog(ERROR, "compressed data is corrupted");

+    SET_VARSIZE(result, rawsize + VARHDRSZ);
     return result;
 }

Specifically, the two SET_VARSIZE() calls, do we really need both..?
Are we sure that we're setting the length correctly there..?  Is there
any cross-check we can do?

I have to admit that I find the new argument to pglz_decompress() a bit
awkward to describe and document; if you have any thoughts as to how
that could be improved, that'd be great.

Thanks!

Stephen

Attachment

Re: Compressed TOAST Slicing

From
Paul Ramsey
Date:

> On Mar 19, 2019, at 4:47 AM, Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Paul Ramsey (pramsey@cleverelephant.ca) wrote:
>>> On Mar 18, 2019, at 7:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> +1.  I think Paul had it right originally.
>>
>> In that spirit, here is a “one pglz_decompress function, new parameter” version for commit.
>
> Alright, I've been working through this and have made a few improvements
> (the big comment block at the top of pg_lzcompress.c needed updating,
> among a couple other minor things), but I was trying to wrap my head
> around this:
>
>
> Specifically, the two SET_VARSIZE() calls, do we really need both..?
> Are we sure that we're setting the length correctly there..?  Is there
> any cross-check we can do?

Well, we don’t need to do the two SET_VARSIZE() calls, but we *do* need to use rawsize in the call before the return,
since we cannot be sure that the size of the uncompressed bit is as large as the requested slice (even though it will
be 99 times out of 100).
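
As a toy illustration (a stand-in, not pglz itself): a capped decompressor may produce fewer bytes than the cap, so the caller must size the result from the return value rather than from the requested slice length.

    #include <string.h>
    #include <stdint.h>

    /*
     * Toy stand-in for a capped decompressor: writes at most cap
     * bytes and returns how many bytes it actually produced, which
     * can be fewer when the datum itself is short.
     */
    static int32_t
    decompress_capped(const char *src, int32_t srclen,
                      char *dst, int32_t cap)
    {
        int32_t n = (srclen < cap) ? srclen : cap;

        memcpy(dst, src, n);    /* stand-in for real decompression */
        return n;               /* size the result from this, not cap */
    }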


> I have to admit that I find the new argument to pglz_decompress() a bit
> awkward to describe and document; if you have any thoughts as to how
> that could be improved, that'd be great.

The only thing I can see is loosening the integrity check in pglz_decompress, which is a guardrail on something I’m not
sure we ever hit. Instead of checking that both the src and dst buffers are fully used up, a test that at least one of
them is used up should come up true in all error-free happy cases.
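
In code terms, that loosening would be something like this sketch (argument names made up; not necessarily what gets committed):

    #include <stdbool.h>

    /* Strict check: both source and destination must be exhausted. */
    static bool
    decompress_ok_strict(const char *sp, const char *srcend,
                         const char *dp, const char *destend)
    {
        return sp == srcend && dp == destend;
    }

    /*
     * Loosened check: a capped (sliced) decompression legitimately
     * stops early on one side, so require only one side to be
     * exhausted.
     */
    static bool
    decompress_ok_loose(const char *sp, const char *srcend,
                        const char *dp, const char *destend)
    {
        return sp == srcend || dp == destend;
    }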

P

Re: Compressed TOAST Slicing

From
Stephen Frost
Date:
Greetings,

* Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> > On Mar 19, 2019, at 4:47 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Paul Ramsey (pramsey@cleverelephant.ca) wrote:
> >>> On Mar 18, 2019, at 7:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>> +1.  I think Paul had it right originally.
> >>
> >> In that spirit, here is a “one pglz_decompress function, new parameter” version for commit.
> >
> > Alright, I've been working through this and have made a few improvements
> > (the big comment block at the top of pg_lzcompress.c needed updating,
> > among a couple other minor things), but I was trying to wrap my head
> > around this:
> >
> >
> > Specifically, the two SET_VARSIZE() calls, do we really need both..?
> > Are we sure that we're setting the length correctly there..?  Is there
> > any cross-check we can do?
>
> Well, we don’t need to do the two SET_VARSIZE() calls, but we *do* need to use rawsize in the call before the return,
> since we cannot be sure that the size of the uncompressed bit is as large as the requested slice (even though it will
> be 99 times out of 100).

Sure, of course, that makes sense, what didn't make much sense was
setting it and then setting it again to something different.

I'll pull out the extra one then.

> > I have to admit that I find the new argument to pglz_decompress() a bit
> > awkward to describe and document; if you have any thoughts as to how
> > that could be improved, that'd be great.
>
> The only thing I can see is loosening the integrity check in pglz_decompress, which is a guardrail on something I’m
> not sure we ever hit. Instead of checking that both the src and dst buffers are fully used up, a test that at least
> one of them is used up should come up true in all error-free happy cases.

Hrmpf.  I don't really like loosening up the integrity check in the
cases where we should be using up everything though.  As such, I'll go
with what you've proposed here.  We can adjust it later if we end up
deciding that reducing the error-checking is reasonable.

I'll plan to push this tomorrow with the above change (and a few
additional comments to explain what all is going on..).

Thanks!

Stephen

Attachment

Re: Compressed TOAST Slicing

From
Darafei "Komяpa" Praliaskouski
Date:
Hi!
 
> I'll plan to push this tomorrow with the above change (and a few
> additional comments to explain what all is going on..).

Is everything ok? Can it be pushed?

I'm looking here and haven't found it pushed, so I'm worried about it:
https://github.com/postgres/postgres/commits/master
 

--
Darafei Praliaskouski

Re: Compressed TOAST Slicing

From
Stephen Frost
Date:
Greetings,

* Darafei "Komяpa" Praliaskouski (me@komzpa.net) wrote:
> > I'll plan to push this tomorrow with the above change (and a few
> > additional comments to explain what all is going on..).
>
> Is everything ok? Can it be pushed?

This has been pushed now.

Thanks!

Stephen

Attachment

Re: Compressed TOAST Slicing

From
Andrey Borodin
Date:
Hi!

> On Mar 12, 2019, at 10:22, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> 3. And I'd use memmove despite the comment explaining why we do not do that. It is SSE-optimized and cache-optimized nowadays.


So, I've pushed the idea a little further and showed the decompression byte-copy loop to Vladimir Leskov.
while (len--)
{
    *dp = dp[-off];    /* may re-read a byte written earlier in this loop */
    dp++;
}

He advised me to use an algorithm that splits the copied region into smaller non-overlapping subregions of exponentially increasing size.

while (off <= len)
{
    memcpy(dp, dp - off, off);
    len -= off;
    dp += off;
    off *= 2;
}
memcpy(dp, dp - off, len);

On Paul's original test, without the patch from this thread, this optimization gave about a 2.5x speedup.
I've composed more detailed tests [0] and tested against current master. Now it only gives a 20%-25% decompression speedup, but I think it is still useful.
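
For reference, a self-contained sketch of the doubling expansion (it assumes the off bytes immediately preceding dp are already filled in):

    #include <stdio.h>
    #include <string.h>

    /*
     * Replicate the off bytes preceding dp until len bytes have been
     * written, using doubling memcpy calls over regions that never
     * overlap.
     */
    static void
    expand_doubling(char *dp, int off, int len)
    {
        while (off <= len)
        {
            memcpy(dp, dp - off, off);  /* non-overlapping copy */
            len -= off;
            dp += off;
            off *= 2;                   /* available pattern doubled */
        }
        memcpy(dp, dp - off, len);      /* remainder; off > len here */
    }

    int
    main(void)
    {
        char buf[32] = "ab";            /* seed pattern, off = 2 */

        expand_doubling(buf + 2, 2, 12);
        buf[14] = '\0';
        puts(buf);                      /* prints "ababababababab" */
        return 0;
    }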

Best regards, Andrey Borodin.

[0] Here's the test

create table if not exists slicingtest1 as select repeat('0', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest2 as select repeat('01', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest3 as select repeat('012', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest4 as select repeat('0123', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest5 as select repeat('01234', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest6 as select repeat('012345', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest7 as select repeat('0123456', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest8 as select repeat('01234567', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest16 as select repeat('0123456789ABCDEF', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest32 as select repeat('0x1x2x3x4x5x6x7x8x9xAxBxCxDxExFx', 10000) as a from generate_series(1,10000);
create table if not exists slicingtest64 as select repeat('0xyz1xyz2xyz3xyz4xyz5xyz6xyz7xyz8xyz9xyzAxyzBxyzCxyzDxyzExyzFxyz', 10000) as a from generate_series(1,10000);

\timing off
select sum(length(a)) from slicingtest1; -- repeat for every stride length
\timing on
select sum(length(a)) from slicingtest1;



Attachment

Re: Compressed TOAST Slicing

From
Paul Ramsey
Date:
> On Apr 9, 2019, at 10:09 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> He advised me to use an algorithm that splits the copied region into smaller non-overlapping subregions of
> exponentially increasing size.
>
> while (off <= len)
> {
>    memcpy(dp, dp - off, off);
>    len -= off;
>    dp += off;
>    off *= 2;
> }
> memcpy(dp, dp - off, len);
>
> On Paul's original test, without the patch from this thread, this optimization gave about a 2.5x speedup.
> I've composed more detailed tests [0] and tested against current master. Now it only gives a 20%-25% decompression
> speedup, but I think it is still useful.

Wow, well beyond slicing, just being able to decompress 25% faster is a win for pretty much any TOAST use case. I guess
the $100 question is: portability? The whole reason for the old-skool code that’s there now was concerns about
memcpy’ing overlapping addresses and Bad Things happening.

P.


Re: Compressed TOAST Slicing

From
Andrey Borodin
Date:

> On Apr 9, 2019, at 22:12, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
>
> Wow, well beyond slicing, just being able to decompress 25% faster is a win for pretty much any TOAST use case. I
> guess the $100 question is: portability? The whole reason for the old-skool code that’s there now was concerns about
> memcpy’ing overlapping addresses and Bad Things happening.

Yeah, I've observed Bad Things (actually pretty cool and neat things) during memcpy of overlapping regions even on my
laptop.
But here we never copy overlapping addresses in one memcpy() call.

Though my tests are very synthetic. Do we have a natural way to demonstrate a performance improvement? Like a
reference pile of bytes...

Best regards, Andrey Borodin.


Re: Compressed TOAST Slicing

From
Andres Freund
Date:
On 2019-04-09 10:12:56 -0700, Paul Ramsey wrote:
> 
> > On Apr 9, 2019, at 10:09 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > 
> > He advised me to use an algorithm that splits the copied region into smaller non-overlapping subregions of
> > exponentially increasing size.
> > 
> > while (off <= len)
> > {
> >    memcpy(dp, dp - off, off);
> >    len -= off;
> >    dp += off;
> >    off *= 2;
> > }
> > memcpy(dp, dp - off, len);
> > 
> > On Paul's original test, without the patch from this thread, this optimization gave about a 2.5x speedup.
> > I've composed more detailed tests [0] and tested against current master. Now it only gives a 20%-25% decompression
> > speedup, but I think it is still useful.
> 
> Wow, well beyond slicing, just being able to decompress 25% faster is a win for pretty much any TOAST use case. I
> guess the $100 question is: portability? The whole reason for the old-skool code that’s there now was concerns about
> memcpy’ing overlapping addresses and Bad Things happening.

Just use memmove? It's usually as fast these days.



Re: Compressed TOAST Slicing

From
Andrey Borodin
Date:

> On Apr 9, 2019, at 22:20, Andres Freund <andres@anarazel.de> wrote:
>
> Just use memmove? It's usually as fast these days.
No, unfortunately, that fixes things in an incompatible way.
In pglz the side effects of overlapping addresses are necessary; memmove deliberately avoids exactly those.

I.e. given the bytes
01234
  ^ copy three bytes to here (off = 2)
memmove will give
01012
but we want
01010
    ^ this 0 is taken from the result of the first byte's overwrite.
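
A tiny standalone program showing the difference:

    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        char a[] = "01234";
        char b[] = "01234";
        int  off = 2, len = 3;

        /* memmove copies the original bytes: "01012" */
        memmove(a + 2, a, len);

        /* the byte loop re-reads bytes it just wrote: "01010" */
        for (char *dp = b + 2; len-- > 0; dp++)
            *dp = dp[-off];

        printf("%s %s\n", a, b);    /* prints "01012 01010" */
        return 0;
    }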

Best regards, Andrey Borodin.


Re: Compressed TOAST Slicing

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2019-04-09 10:12:56 -0700, Paul Ramsey wrote:
>> Wow, well beyond slicing, just being able to decompress 25% faster is a win for pretty much any TOAST use case. I
>> guess the $100 question is: portability? The whole reason for the old-skool code that’s there now was concerns about
>> memcpy’ing overlapping addresses and Bad Things happening.

> Just use memmove? It's usually as fast these days.

If I recall what this is trying to do, memmove will give the wrong
result.  We want the expansion to replicate the same data multiple
times, which in normal use of memcpy/memmove would be thought to be
the Wrong Thing.

The proposal is kind of cute, but I'll bet it's a net loss for
small copy lengths --- likely we'd want some cutoff below which
we do it with the dumb byte-at-a-time loop.
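
A sketch of that hybrid, with the cutoff value purely illustrative:

    #include <string.h>

    #define MEMCPY_CUTOFF 16    /* illustrative threshold, not tuned */

    /*
     * Expand an off/len back-reference at dp; fall back to the dumb
     * byte-at-a-time loop for short matches, where memcpy setup cost
     * would dominate.
     */
    static void
    expand_match(char *dp, int off, int len)
    {
        if (len < MEMCPY_CUTOFF)
        {
            while (len-- > 0)
            {
                *dp = dp[-off];
                dp++;
            }
            return;
        }

        while (off <= len)
        {
            memcpy(dp, dp - off, off);
            len -= off;
            dp += off;
            off *= 2;
        }
        memcpy(dp, dp - off, len);
    }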

            regards, tom lane



Re: Compressed TOAST Slicing

From
Andrey Borodin
Date:

> On Apr 9, 2019, at 22:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> The proposal is kind of cute, but I'll bet it's a net loss for
> small copy lengths --- likely we'd want some cutoff below which
> we do it with the dumb byte-at-a-time loop.

True.
I've made a simple extension to compare decompression time on pgbench-generated WAL [0].

Using the smart memcpy unless the match length is smaller than 16 (a sane but arbitrary value) gives about a 20% speedup in decompression time.
Using memcpy alone gives a smaller effect.

We will dig into this further.

Best regards, Andrey Borodin.


[0] https://github.com/x4m/test_pglz