Re: Large (8M) cache vs. dual-core CPUs - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: Large (8M) cache vs. dual-core CPUs
Msg-id 20060426222456.GB97354@pervasive.com
In response to Re: Large (8M) cache vs. dual-core CPUs  (mark@mark.mielke.cc)
List pgsql-performance
On Wed, Apr 26, 2006 at 02:48:53AM -0400, mark@mark.mielke.cc wrote:
> You said that DB accesses are random. I'm not so sure. In PostgreSQL,
> are not the individual pages often scanned sequentially, especially
> because all records are variable length? You don't think PostgreSQL
> will regularly read 32 bytes (8 bytes x 4) at a time, in sequence?
> Whether for table pages, or index pages - I'm not seeing why the
> accesses wouldn't be sequential. You believe PostgreSQL will access
> the table pages and index pages randomly on a per-byte basis? What
> is the minimum PostgreSQL record size again? Isn't it 32 bytes or
> over? :-)

Data within a page can absolutely be accessed randomly; it would be
horribly inefficient to slog through 8K of data every time you needed to
find a single row.
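
To make that concrete, here is a minimal sketch of why finding one row
does not require reading the whole page: a small array of slots near the
front of the page records where each item lives, so fetching item n is a
single array lookup. The names here (Slot, Page, page_add_item,
page_get_item) and the fixed 64-slot limit are hypothetical
simplifications for illustration, not PostgreSQL's actual page code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192   /* 8K page, as in the discussion above */
#define MAX_SLOTS 64     /* arbitrary limit chosen for this sketch */

/* Hypothetical slot entry: where an item starts and how long it is. */
typedef struct {
    uint16_t offset;     /* byte offset of the item inside data[] */
    uint16_t length;     /* item length in bytes; 0 means unused */
} Slot;

/* Hypothetical, simplified page: slot array plus item storage. */
typedef struct {
    uint16_t nslots;     /* number of slots in use */
    uint16_t upper;      /* items are packed downward from here */
    Slot slots[MAX_SLOTS];
    unsigned char data[PAGE_SIZE];
} Page;

void page_init(Page *p)
{
    assert(p != NULL);
    p->nslots = 0;
    p->upper = PAGE_SIZE;
}

/* Store an item at the end of free space; return its slot number or -1. */
int page_add_item(Page *p, const void *item, uint16_t len)
{
    assert(p != NULL && item != NULL);
    if (p->nslots >= MAX_SLOTS || len == 0 || len > p->upper)
        return -1;
    p->upper = (uint16_t) (p->upper - len);
    memcpy(p->data + p->upper, item, len);
    p->slots[p->nslots].offset = p->upper;
    p->slots[p->nslots].length = len;
    return p->nslots++;
}

/* Fetch item n directly through its slot; no other item is touched. */
const unsigned char *page_get_item(const Page *p, uint16_t n, uint16_t *len)
{
    assert(p != NULL && len != NULL);
    if (n >= p->nslots || p->slots[n].length == 0)
        return NULL;
    *len = p->slots[n].length;
    return p->data + p->slots[n].offset;
}
```

The lookup cost is the same for the first item and the last one, which
is the sense in which access within a page is random rather than
sequential.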

The header size of tuples is ~23 bytes, depending on your version of
PostgreSQL, and data fields have to start on the proper alignment
(generally 4 bytes). So essentially the smallest row you can get is 28
bytes.
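
One way to read that arithmetic, as a rough sketch: round the 23-byte
header up to the next 4-byte boundary (24), then add one aligned data
field. The 4-byte field width is my assumption for illustration (think
of a single integer column), and align_up and min_row_size are
hypothetical helper names, not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Smallest row under the assumptions above: aligned header + one aligned field. */
size_t min_row_size(size_t header_len, size_t field_len, size_t align)
{
    return align_up(header_len, align) + align_up(field_len, align);
}
```

With a 23-byte header, a 4-byte field and 4-byte alignment,
min_row_size(23, 4, 4) gives 24 + 4 = 28.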

I know that tuple headers are dealt with as a C structure, but I don't
know whether that means accessing any one header field costs the same as
accessing the whole thing. I also don't know whether PostgreSQL can
access fields within a tuple without scanning through at least the first
part of the preceding fields, though I suspect it can directly access
fixed-width fields that sit before any varlena fields (without scanning
through the other fields).
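
The distinction I have in mind can be sketched like this. If every
field before the one you want is fixed-width, its offset follows from
the column widths alone and never needs to look at the tuple data. Once
a varlena field comes first, you have to read its length out of the
data to know where the next field starts. This is only an illustration
of the idea: the VARLENA marker, the function names, and the 4-byte
self-counting length word are assumptions, and alignment and nulls are
ignored to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VARLENA (-1)   /* hypothetical marker for a variable-length field */

/*
 * Offset of field n computed from the column widths alone.
 * Returns -1 if a varlena field precedes it, i.e. the offset
 * cannot be known without looking at the tuple data.
 */
long fixed_offset(const int *widths, int n)
{
    assert(widths != NULL && n >= 0);
    long off = 0;
    for (int i = 0; i < n; i++) {
        if (widths[i] == VARLENA)
            return -1;
        off += widths[i];
    }
    return off;
}

/*
 * Offset of field n found by walking the tuple data. Each varlena
 * is assumed to begin with a 4-byte length word that counts itself.
 */
size_t walked_offset(const int *widths, int n, const unsigned char *data)
{
    assert(widths != NULL && data != NULL && n >= 0);
    size_t off = 0;
    for (int i = 0; i < n; i++) {
        if (widths[i] == VARLENA) {
            uint32_t len;
            memcpy(&len, data + off, sizeof len);   /* must touch the data */
            off += len;
        } else {
            off += (size_t) widths[i];
        }
    }
    return off;
}
```

For columns of widths {4, 8, VARLENA, 4}, the first three offsets (0, 4,
12) come from fixed_offset alone; only the last column forces a walk
through the varlena that precedes it.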

If we ever got to the point of divorcing the in-memory tuple layout from
the table layout, it'd be interesting to experiment with having all
varlena length info stored immediately after all fixed-width fields;
that could potentially make random access to varlena fields faster. Note
that null fields are indicated as such in the null bitmap, so I'm pretty
sure that their in-tuple position doesn't matter much. Of course, if you
want the definitive answer, Use The Source.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
