Re: WIP patch: reducing overhead for repeat de-TOASTing - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: WIP patch: reducing overhead for repeat de-TOASTing
Date:
Msg-id: 17434.1214937830@sss.pgh.pa.us
In response to: Re: WIP patch: reducing overhead for repeat de-TOASTing  (Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk>)
Responses: Re: WIP patch: reducing overhead for repeat de-TOASTing  (Alvaro Herrera <alvherre@commandprompt.com>)
           Re: WIP patch: reducing overhead for repeat de-TOASTing  (Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk>)
List: pgsql-hackers
[ back on-list ]

Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> writes:
> Thanks very much for supplying the WIP patch. In the interest of testing 
> and feedback, I've just downloaded PostgreSQL CVS head and applied your 
> patch, compiled up a slightly modified version of PostGIS (without 
> RECHECKs on the GiST opclass) and loaded in the same dataset.
> Unfortunately I have to report back that with your WIP patch applied, 
> timings seem to have become several orders of magnitude *worse*:

OK, I've reproduced the test case locally.  I believe that when you
say "worse", you mean "worse than 8.3", right?  And you did tell me
offlist that you were testing with --enable-cassert.  CVS HEAD has
very substantially greater cassert overhead because of the
randomize_memory addition --- oprofile output for this test looks like

samples  %        image name               symbol name
1239580  78.7721  postgres                 randomize_mem
143544    9.1218  libc-2.7.so              memcpy
48039     3.0528  libc-2.7.so              memset
13838     0.8794  postgres                 LWLockAcquire
12176     0.7738  postgres                 index_getnext
11697     0.7433  postgres                 LWLockRelease
10406     0.6613  postgres                 hash_search_with_hash_value
4739      0.3012  postgres                 toast_fetch_datum
4099      0.2605  postgres                 _bt_checkkeys
3905      0.2482  postgres                 AllocSetAlloc
3751      0.2384  postgres                 PinBuffer
3545      0.2253  postgres                 UnpinBuffer

I'm inclined to think that we'd better turn that off by default,
since it's not looking like it's catching anything new.

However, even with cassert turned off, the de-TOAST patch isn't
helping your test case; though it does help if I turn the query
into a join.  Here's what I get for
                              8.3    HEAD    HEAD + patch
original query [1]           4575    4669    4613
original + force_2d [2]       196     195     201
converted to join [3]        4603    4667     209
original, indexscans off     2335    2391    2426
join, indexscans off         2350    2373     124

[1] select count(*) from geography where centroid && (select the_geom from geography where id=69495);

[2] select count(*) from geography where centroid && (select force_2d(the_geom) from geography where id=69495);

[3] select count(*) from geography g1, geography g2 where g1.centroid && g2.the_geom and g2.id=69495;

All times in msec, median of 3 trials using psql \timing (times seem
to be reproducible within about 1%, if data is already cached).
Default parameters all around.

The join form of the query produces the results I expected, with
g2.the_geom coming from the outer side of a nestloop join and getting
detoasted only once.  After some digging I found the reason why the
original query isn't getting any benefit: it's copying the_geom up
from an InitPlan, and nodeSubplan.c does that like this:
       /*
        * We need to copy the subplan's tuple into our own context, in case
        * any of the params are pass-by-ref type --- the pointers stored in
        * the param structs will point at this copied tuple! node->curTuple
        * keeps track of the copied tuple for eventual freeing.
        */
       if (node->curTuple)
           heap_freetuple(node->curTuple);
       node->curTuple = ExecCopySlotTuple(slot);

       /*
        * Now set all the setParam params from the columns of the tuple
        */
       foreach(l, subplan->setParam)
       {
           int            paramid = lfirst_int(l);
           ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);

           prm->execPlan = NULL;
           prm->value = heap_getattr(node->curTuple, i, tdesc,
                                     &(prm->isnull));
           i++;
       }

So the Param is pointing at a bare toasted Datum, not an indirect
pointer in a Slot, and there's no way to avoid detoasting each time
the Param is referenced.
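
(Just to illustrate where the repeat work happens --- this is a made-up
support function, not PostGIS's actual code: any pointer-type argument
gets run through PG_DETOAST_DATUM on entry, so a still-toasted Param
Datum costs a fresh fetch/decompress on every call.)

    #include "postgres.h"
    #include "fmgr.h"

    PG_MODULE_MAGIC;

    PG_FUNCTION_INFO_V1(demo_overlaps);

    Datum
    demo_overlaps(PG_FUNCTION_ARGS)
    {
        /* each call detoasts its arguments afresh */
        struct varlena *g1 = PG_DETOAST_DATUM(PG_GETARG_DATUM(0));
        struct varlena *g2 = PG_DETOAST_DATUM(PG_GETARG_DATUM(1));

        /* placeholder comparison, just so the sketch is complete */
        PG_RETURN_BOOL(VARSIZE(g1) == VARSIZE(g2));
    }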

It would be simple enough to fix nodeSubplan.c to copy the data into
an upper-level Slot rather than a bare tuple.  But this makes me wonder
how many other places are like this.  In the past there wasn't that much
benefit to pulling data from a Slot instead of a bare tuple, so I'm
afraid we might have a number of similar gotchas we'd have to track
down.
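
Roughly, the nodeSubplan.c change could look like this (untested sketch
only; the "curSlot" field on SubPlanState is hypothetical): keep a
long-lived slot owned by the node and fetch the params through it
instead of a bare copied tuple.

       /* sketch: copy the subplan's result into a node-owned slot */
       if (node->curSlot == NULL)
           node->curSlot = MakeSingleTupleTableSlot(tdesc);
       ExecCopySlot(node->curSlot, slot);

       foreach(l, subplan->setParam)
       {
           int            paramid = lfirst_int(l);
           ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);

           prm->execPlan = NULL;
           /* going through slot_getattr means a slot-aware detoast cache
            * could reuse an already-detoasted copy */
           prm->value = slot_getattr(node->curSlot, i, &(prm->isnull));
           i++;
       }

The slot would then be responsible for freeing the copied tuple, much as
node->curTuple is now.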

The other thing that worries me is that the force_2d query measurements
suggest a runtime penalty of two or three percent for the patch, which
is higher than I was hoping for.  But that isn't that much more than
the noise level in this test.

On the whole I'm still feeling pretty discouraged about this patch ...
        regards, tom lane

