Re: Update on sort-compression stuff - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Update on sort-compression stuff
Msg-id 1148411046.2755.185.camel@localhost.localdomain
In response to Re: Update on sort-compression stuff  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2006-05-23 at 14:27 -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > - Test a way of storing tuples with less overhead than a HeapTuple
> > header. If you could do it for in-memory sorts, that'd mean you could
> > fit more tuples in memory before spilling to disk. Given the
> > "compression" in that case is extremely cheap, it'd be much more likely
> > to be beneficial.
> 
> I looked into this and decided that trimming the headers for the
> in-memory copies is not as attractive as all that.  The killer problem
> is that comparetup_heap() needs to be able to apply heap_getattr() to
> the stored tuples to extract sort keys.  Unless we want to support a
> variant copy of the heap_getattr() infrastructure just for sort tuples,
> it ain't gonna work.  Another issue is that we'd be increasing the
> palloc traffic for in-memory sorts, because tuplesort_gettuple() would
> have to cons up a freshly palloc'd complete tuple to hand back to the
> caller.
> 
> However, we can definitely trim a lot of overhead from what gets written
> to "tape", so I'll have a go at doing that.

If we write the tuples in compressed form and read them back in that
same form, there wouldn't be any extra palloc overhead at all. The
freelists would fill up with blocks that are too large, but that might
not be much of a problem.
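
To make the "same form on tape and in memory" idea concrete, here is a
minimal sketch. It is not tuplesort.c code: MiniSortTuple, Tape,
mini_form(), tape_write() and tape_read() are hypothetical names, the
"tape" is just a flat buffer, and malloc() stands in for palloc(). The
point is only that a read is one allocation plus one copy, with no
header to rebuild.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compact tuple: a length word followed by the data area.
 * It carries none of the visibility fields of a full heap tuple header. */
typedef struct MiniSortTuple
{
    uint32_t      len;      /* total size in bytes, including this word */
    unsigned char data[];   /* attribute bytes */
} MiniSortTuple;

/* Stand-in for a sort tape: a flat byte buffer with two cursors. */
typedef struct Tape
{
    unsigned char buf[1024];
    size_t        write_pos;
    size_t        read_pos;
} Tape;

/* Build a compact tuple from raw payload bytes (one allocation). */
static MiniSortTuple *
mini_form(const void *payload, uint32_t payload_len)
{
    uint32_t       len = (uint32_t) offsetof(MiniSortTuple, data) + payload_len;
    MiniSortTuple *t = malloc(len);

    if (t == NULL)
        return NULL;
    t->len = len;
    memcpy(t->data, payload, payload_len);
    return t;
}

/* Write the in-memory image to the tape unchanged. Returns 0 on success. */
static int
tape_write(Tape *tape, const MiniSortTuple *t)
{
    assert(t->len >= offsetof(MiniSortTuple, data));
    if (tape->write_pos + t->len > sizeof(tape->buf))
        return -1;              /* tape full */
    memcpy(tape->buf + tape->write_pos, t, t->len);
    tape->write_pos += t->len;
    return 0;
}

/* Read the next tuple back in the same form: one allocation, one copy.
 * Returns NULL at end of tape or on allocation failure. */
static MiniSortTuple *
tape_read(Tape *tape)
{
    uint32_t       len;
    MiniSortTuple *t;

    if (tape->read_pos + sizeof(len) > tape->write_pos)
        return NULL;
    memcpy(&len, tape->buf + tape->read_pos, sizeof(len));
    t = malloc(len);
    if (t == NULL)
        return NULL;
    memcpy(t, tape->buf + tape->read_pos, len);
    tape->read_pos += len;
    return t;
}
```

The sketch allocates exactly the tuple's size, so it does not show the
freelist effect mentioned above; that only appears with an allocator
that rounds requests up to block sizes.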

heap_getattr() is called from so few other places that it makes sense
to have a sort-specific version.
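
As a rough illustration of what a sort-specific fetch could look like,
here is a toy version. Everything in it is an assumption made for the
example: SortTuple, sort_formtuple(), sort_getattr() and
sort_comparetup() are hypothetical names, attributes are limited to at
most eight int32 values, and there is no tuple descriptor. The real
heap_getattr() handles varlena and aligned types and is considerably
more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SORT_MAX_ATTS 8

/* Hypothetical trimmed tuple: attribute count, null bitmap, packed data. */
typedef struct SortTuple
{
    uint8_t       natts;
    uint8_t       nullbits;     /* bit i set => attribute i+1 is NULL */
    unsigned char data[SORT_MAX_ATTS * sizeof(int32_t)];
} SortTuple;

/* Pack the non-null values back to back; NULLs take no data space. */
static void
sort_formtuple(SortTuple *t, int natts, const int32_t *values,
               const bool *isnull)
{
    size_t off = 0;

    assert(natts >= 0 && natts <= SORT_MAX_ATTS);
    t->natts = (uint8_t) natts;
    t->nullbits = 0;
    for (int i = 0; i < natts; i++)
    {
        if (isnull[i])
        {
            t->nullbits |= (uint8_t) (1u << i);
            continue;
        }
        memcpy(t->data + off, &values[i], sizeof(int32_t));
        off += sizeof(int32_t);
    }
}

/* Sort-specific attribute fetch; attnum is 1-based. Out-of-range
 * attribute numbers are reported as NULL. */
static int32_t
sort_getattr(const SortTuple *t, int attnum, bool *isnull)
{
    size_t  off = 0;
    int32_t value;

    if (attnum < 1 || attnum > t->natts ||
        ((t->nullbits >> (attnum - 1)) & 1))
    {
        *isnull = true;
        return 0;
    }
    /* Skip over the preceding non-null attributes. */
    for (int i = 0; i < attnum - 1; i++)
    {
        if (!((t->nullbits >> i) & 1))
            off += sizeof(int32_t);
    }
    memcpy(&value, t->data + off, sizeof(value));
    *isnull = false;
    return value;
}

/* Compare two trimmed tuples on one key attribute; NULLs sort last. */
static int
sort_comparetup(const SortTuple *a, const SortTuple *b, int keyatt)
{
    bool    anull, bnull;
    int32_t av = sort_getattr(a, keyatt, &anull);
    int32_t bv = sort_getattr(b, keyatt, &bnull);

    if (anull || bnull)
        return (anull ? 1 : 0) - (bnull ? 1 : 0);
    return (av > bv) - (av < bv);
}
```

The comparator never touches a full tuple header, which is the property
a sort-only variant would need.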

--  Simon Riggs EnterpriseDB          http://www.enterprisedb.com