Re: Column lookup in a row performance - Mailing list pgsql-general

From David Rowley
Subject Re: Column lookup in a row performance
Date
Msg-id CAKJS1f-iBtXZJ8uP8At5wJujfJdMD8P1tXiaf+3dX+fkmxHsuQ@mail.gmail.com
Whole thread Raw
In response to Column lookup in a row performance  (Павлухин Иван <vololo100@gmail.com>)
List pgsql-general
On Fri, 22 Mar 2019 at 19:13, Павлухин Иван <vololo100@gmail.com> wrote:
> I am learning deeply how tuples are organized and column values are
> accessed in different databases. As far as undertood postgres does not
> store all column positions in a tuple (e.g. in header or footer). In
> contrast MySQL InnoDB stores column lengths in a record header [1].
> From the first glance it seems that a postgres format can have a
> significant performance penalty when accessing a single column which
> is located after multiple variable-length columns because searching a
> column value position in a row requires multiple jumps. And in InnoDB
> a position of a particular column can be found right after reading a
> header.

When the tuple contains no NULLs, PostgreSQL does cache the offsets to
the attribute position in the tuple up until the first variable length
field. This allows code to directly access the value without having to
deform all fields that come before the required field, and since the
offset value is the same for all tuples of this type, then it can be
stored just once, in what we call the tuple descriptor.

I'm not aware of what innodb does, but if it stores actual offsets
fields that come after a variable length field, then that sounds like
something that needs to be stored per-tuple, so there's a trade-off;
speed vs storage space.   In PostgreSQL, one way you can obtain faster
access is to not have NULLs and put fixed width fields first. That's,
of course, not always possible, but useful to keep in mind when
deciding the order to have your columns in the table.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-general by date:

Previous
From: Aleš Zelený
Date:
Subject: Logical replication - DDL sub transactions for script executed insingle transaction?
Next
From: Thomas Güttler
Date:
Subject: Script which shows performance of ByteA: ascii vs binary