Re: texteq/byteaeq: avoid detoast [REVIEW] - Mailing list pgsql-hackers

From: Pavel Stehule
Subject: Re: texteq/byteaeq: avoid detoast [REVIEW]
Date:
Msg-id: AANLkTimBy4QCXodj7XHqQhEqCqWZhme=WirzNa1vAEbF@mail.gmail.com
In response to: texteq/byteaeq: avoid detoast [REVIEW] (Andy Colson <andy@squeakycode.net>)
Responses: Re: texteq/byteaeq: avoid detoast [REVIEW] (Noah Misch <noah@leadboat.com>)
List: pgsql-hackers
Hello

I looked at this patch too.

It's a good idea.

I think we could have a function or macro that compares the varlena
sizes, something like:

Datum texteq(PG_FUNCTION_ARGS)
{
    if (!datumsHasSameLength(PG_GETARG_DATUM(0), PG_GETARG_DATUM(1)))
        PG_RETURN_BOOL(false);
    ... actual code ...
}
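
A minimal sketch of such a helper, assuming it is built on the existing
toast_raw_datum_size() call, which reads a value's original length from
the varlena header or TOAST pointer without detoasting it (the helper
name is just the hypothetical one above):

static inline bool
datumsHasSameLength(Datum d1, Datum d2)
{
    /* toast_raw_datum_size() never detoasts; it only inspects the
     * varlena header or the TOAST pointer */
    return toast_raw_datum_size(d1) == toast_raw_datum_size(d2);
}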

Regards

Pavel Stehule
2011/1/16 Andy Colson <andy@squeakycode.net>:
> This is a review of:
> https://commitfest.postgresql.org/action/patch_view?id=468
>
> Purpose:
> ========
> Equal and not-equal _may_ be quickly determined if the lengths are
> different.   This _may_ be a huge speed up if we don't have to detoast.
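>
> The shape of the fast path is roughly the following sketch (not the
> patch's exact code; toast_raw_datum_size() is the existing call that
> returns a value's size without detoasting it):
>
> Datum
> texteq(PG_FUNCTION_ARGS)
> {
>     Size len1 = toast_raw_datum_size(PG_GETARG_DATUM(0));
>     Size len2 = toast_raw_datum_size(PG_GETARG_DATUM(1));
>
>     /* unequal raw lengths can never compare equal; skip detoasting */
>     if (len1 != len2)
>         PG_RETURN_BOOL(false);
>
>     /* lengths match: detoast and compare bytes as before */
>     ...
> }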
>
>
> The Patch:
> ==========
> I was able to read and understand the patch; it's a simple change and it
> looked correct to me (a non-PG hacker).
> It applies cleanly to git head, and compiles and runs fine with debug enabled.
>
> make check passes
>
>
> Usability:
> ==========
> I used _may_ above.  The benchmark included with the patch, showing huge
> speedups, is really contrived.  It uses a where clause with a
> thousand-character constant:  (where c =
> 'long...long...long...long...ConstantText...etc').  In my opinion this is
> very uncommon (the author does note this is a "best case").  If you have a
> field large enough to be toasted, you are not going to be searching on it;
> you are going to have an ID field that is indexed (select c where id = 7).
>
> This also only touches = and <>; the <, >, and LIKE operators are not
> touched, since ordering and pattern matching cannot be decided from the
> lengths alone.  So I think the scope of this is limited.
>
> THAT being said, the patch is simple, and if you do happen to hit the code,
> it will speed things up.  As a user of PG I'd like to have this included.
> It's a corner case, but a big corner; it's a small, simple change, and it
> won't slow anything else down.
>
>
> Performance:
> ============
> I created a more real-world test for myself, with a table with indexes,
> IDs, and a large toasted field.
>
> create table junk(id serial primary key, xgroup integer, c text);
> create index junk_group on junk(xgroup);
>
>
> I filled it full of junk:
>
> do $$
>        declare i integer;
>        declare j integer;
> begin
>        for i in 1..100 loop
>                for j in 1..500 loop
>                        insert into junk(xgroup, c) values (j, 'c'||i);
>                        insert into junk(xgroup, c)
>                                select j, repeat('abc', 2000) || n
>                                from generate_series(1, 5) n;
>                end loop;
>        end loop;
> end$$;
>
>
> This will make about 600 records within the same xgroup, as well as a
> simple 'c15'-style value in c that we can search for.  My thinking is you
> may not know the exact unique id, but you do know what group it's in, so
> that'll cut out 90% of the records, and then you'll have to add
> " and c = 'c15'" to get the exact one you want.
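>
> For example, the kind of lookup this models (an illustrative query, not
> part of the benchmark script below):
>
> select count(*) from junk where xgroup = 15 and c = 'c15';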
>
> I still saw a nice performance boost.
>
> Old PG:
> $ psql < bench3.sql
> Timing is on.
> DO
> Time: 2010.412 ms
>
> Patched:
> $ psql < bench3.sql
> Timing is on.
> DO
> Time: 184.602 ms
>
>
> bench3.sql:
> do $$
>        declare i integer;
> begin
>        for i in 1..400 loop
>                perform count(*) from junk
>                        where xgroup = i and c like 'c' || i;
>        end loop;
> end$$;
>
>
>
> Summary:
> ========
> Performance speed-up:  Oh yeah!  That is, if you just happen to hit it;
> and if you do hit it, you might want to re-think your layout a little bit.
>
> Do I want it?  Yes please.
>
>
>

