Re: Compression and on-disk sorting - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Compression and on-disk sorting
Date	May 19, 2006 15:53:21
Msg-id	3134.1148064785@sss.pgh.pa.us
In response to	Re: Compression and on-disk sorting ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses	Re: Compression and on-disk sorting
List	pgsql-hackers

"Jim C. Nasby" <jnasby@pervasive.com> writes:
> On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote:
>> I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost
>> unbelievable. What's in the table? It would seem to imply that our
>> tuple format is far more compressible than we expected.

> It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a;
> If the tape routines were actually storing visibility information, I'd
> expect that to be pretty compressible in this case since all the tuples
> were presumably created in a single transaction by pgbench.

It's worse than that: IIRC what passes through a heaptuple sort are
tuples manufactured by heap_form_tuple, which will have consistently
zeroed header fields.  However, the above isn't very helpful since the
rest of us have no idea what that "accounts" table contains.  How wide
is the tuple data, and what's in it?

(This suggests that we might try harder to strip unnecessary header info
from tuples being written to tape inside tuplesort.c.  I think most of
the required fields could be reconstructed given the TupleDesc.)
        regards, tom lane
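
The point about zeroed header fields can be illustrated with a small sketch. It is not PostgreSQL code and does not use the real tuple layout: HEADER_SIZE, PAYLOAD_SIZE, and the helper names are hypothetical, and the payload is random bytes chosen so that any compression gain comes from the zeroed headers alone. It shows that a stream of records with all-zero headers shrinks under a general-purpose compressor, and that dropping the headers before writing and re-creating them on read gives a similar saving without compression.

```python
import random
import zlib

# All sizes are assumptions for illustration, not PostgreSQL's real layout.
HEADER_SIZE = 24   # hypothetical fixed per-tuple header, all zero bytes
PAYLOAD_SIZE = 40  # hypothetical user data per tuple
N = 2000           # number of simulated tuples


def make_stream(n, seed=0):
    """Build n records, each a zeroed header followed by a random payload."""
    rng = random.Random(seed)
    zero_header = bytes(HEADER_SIZE)
    records = []
    for _ in range(n):
        payload = bytes(rng.getrandbits(8) for _ in range(PAYLOAD_SIZE))
        records.append(zero_header + payload)
    return b"".join(records)


def strip_headers(stream):
    """Keep only the payload of each fixed-size record."""
    rec = HEADER_SIZE + PAYLOAD_SIZE
    return b"".join(
        stream[i + HEADER_SIZE:i + rec] for i in range(0, len(stream), rec)
    )


def restore_headers(stripped):
    """Rebuild full records by prepending a zeroed header to each payload."""
    zero_header = bytes(HEADER_SIZE)
    return b"".join(
        zero_header + stripped[i:i + PAYLOAD_SIZE]
        for i in range(0, len(stripped), PAYLOAD_SIZE)
    )


full = make_stream(N)
stripped = strip_headers(full)

print("raw with headers:       ", len(full))
print("compressed with headers:", len(zlib.compress(full)))
print("raw without headers:    ", len(stripped))
```

In this toy setup the header can be rebuilt from a constant; in the real case the reconstruction would have to come from the TupleDesc, which the sketch does not attempt to model.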
