Re: Terrible performance on wide selects - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Terrible performance on wide selects
Date
Msg-id 1043316668.2348.15.camel@localhost.localdomain
In response to Re: Terrible performance on wide selects  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Terrible performance on wide selects
List pgsql-hackers
Tom Lane wrote on Thu, 23.01.2003 at 02:18:
> "Dann Corbit" <DCorbit@connx.com> writes:
> > Why not waste a bit of memory and make the row buffer the maximum
> > possible length?
> > E.g. for varchar(2000) allocate 2000 characters + size element and point
> > to the start of that thing.
>
> Surely you're not proposing that we store data on disk that way.
>
> The real issue here is avoiding overhead while extracting columns out of
> a stored tuple.  We could perhaps use a different, less space-efficient
> format for temporary tuples in memory than we do on disk, but I don't
> think that will help a lot.  The nature of O(N^2) bottlenecks is you
> have to kill them all --- for example, if we fix printtup and don't do
> anything with ExecEvalVar, we can't do more than double the speed of
> Steve's example, so it'll still be slow.  So we must have a solution for
> the case where we are disassembling a stored tuple, anyway.
>
> I have been sitting here toying with a related idea, which is to use the
> heap_deformtuple code I suggested before to form an array of pointers to
> Datums in a specific tuple (we could probably use the TupleTableSlot
> mechanisms to manage the memory for these).  Then subsequent accesses to
> individual columns would just need an array-index operation, not a
> nocachegetattr call.  The trick with that would be that if only a few
> columns are needed out of a row, it might be a net loss to compute the
> Datum values for all columns.  How could we avoid slowing that case down
> while making the wide-tuple case faster?

Make the pointer array incrementally, for O(N) performance:

i.e. for a tuple with 100 cols, allocate an array of 100 pointers, plus
keep a count of how many are actually valid,

so the first call to get col[5] will fill the first 5 positions in the
array, save said nr 5, and then access tuple[ptrarray[5]]

the next call to get col[75] will start from col[5] and fill up to col[75]

the next call to get col[76] will start from col[75] and fill up to col[76]

the next call to get col[60] will just get tuple[ptrarray[60]], since that
entry is already valid

the above description assumes 1-based non-C arrays ;)
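
A rough sketch of the incremental scheme in C, just to make the idea
concrete. Everything in it is assumed for illustration: the tuple layout
(each column stored as one length byte followed by that many data bytes)
and the names LazyTuple, lazy_init, lazy_getattr and MAX_COLS are made up
and are not the real heap tuple format or any existing backend function.
Nulls and alignment are ignored. It stores byte offsets rather than
pointers, and keeps the 1-based column numbering used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_COLS 128   /* hypothetical upper bound, for this sketch only */

/* Hypothetical tuple wrapper with a lazily filled offset array. */
typedef struct {
    const uint8_t *data;          /* packed columns: [len][bytes...] repeated */
    int            ncols;         /* number of columns in the tuple */
    int            nvalid;        /* offset[1..nvalid] are already computed */
    size_t         offset[MAX_COLS + 1]; /* offset[i] = start of column i (1-based) */
    int            steps;         /* columns walked so far (demonstration only) */
} LazyTuple;

static void lazy_init(LazyTuple *t, const uint8_t *data, int ncols)
{
    assert(ncols >= 0 && ncols <= MAX_COLS);
    t->data = data;
    t->ncols = ncols;
    t->nvalid = 0;
    t->steps = 0;
}

/*
 * Return a pointer to the data bytes of column `col` (1-based) and store
 * its length in *len.  Offsets are computed only for columns beyond the
 * last one already visited, so each column is walked at most once.
 */
static const uint8_t *lazy_getattr(LazyTuple *t, int col, uint8_t *len)
{
    assert(col >= 1 && col <= t->ncols);

    while (t->nvalid < col) {
        size_t next;

        if (t->nvalid == 0) {
            next = 0;                           /* first column starts at 0 */
        } else {
            size_t prev = t->offset[t->nvalid];
            next = prev + 1 + t->data[prev];    /* skip length byte + payload */
        }
        t->nvalid++;
        t->offset[t->nvalid] = next;
        t->steps++;
    }

    /* From here on it is a plain array-index operation. */
    size_t off = t->offset[col];
    *len = t->data[off];
    return t->data + off + 1;
}
```

With this, asking for col 5, then 75, then 76, then 60 walks 76 columns
in total, and the last request does no walking at all. The cost when only
a few leading columns are needed stays proportional to the highest column
number actually requested.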

--
Hannu Krosing <hannu@tm.ee>
