
Postgres-R: tuple serialization - Mailing list pgsql-hackers

From	Markus Wanner
Subject	Postgres-R: tuple serialization
Date	July 22, 2008 05:08:17
Msg-id	48859484.70306@bluegap.ch
Responses	Re: Postgres-R: tuple serialization
List	pgsql-hackers


Hi,

yesterday, I promised to outline the requirements of Postgres-R for
tuple serialization, which we have been talking about before. There are
basically three ways to serialize a tuple change, depending on
whether it originates from an INSERT, UPDATE or DELETE. For updates and
deletes, the old pkey is saved together with the origin (a global
transaction id) of the tuple (required for consistent serialization on
remote nodes). For inserts and updates, all added or changed attributes
need to be serialized as well.
           pkey+origin   changes
  INSERT        -           x
  UPDATE        x           x
  DELETE        x           -
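
As a rough illustration of the table above, the mapping from change type to serialized parts could be written down as follows. This is only a sketch: the enum and function names are hypothetical and are not taken from the Postgres-R source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three kinds of tuple change. */
typedef enum ChangeType {
    CHANGE_INSERT,
    CHANGE_UPDATE,
    CHANGE_DELETE
} ChangeType;

/*
 * UPDATE and DELETE refer to an existing tuple, so the old pkey and the
 * origin (global transaction id) have to be serialized.
 */
static bool needs_pkey_and_origin(ChangeType t)
{
    assert(t == CHANGE_INSERT || t == CHANGE_UPDATE || t == CHANGE_DELETE);
    return t == CHANGE_UPDATE || t == CHANGE_DELETE;
}

/* INSERT and UPDATE carry added or changed attribute values. */
static bool needs_attribute_changes(ChangeType t)
{
    assert(t == CHANGE_INSERT || t == CHANGE_UPDATE || t == CHANGE_DELETE);
    return t == CHANGE_INSERT || t == CHANGE_UPDATE;
}
```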

Note that the pkey attributes may never be null, so an isnull bit field
can be skipped for those attributes. For the insert case, all attributes
(including primary key attributes) are serialized. Updates require an
additional bit field (well, I'm using chars ATM) to store which
attributes have changed. Only those should be transferred.
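
To make the "which attributes have changed" bit field more concrete, here is a minimal sketch that packs one bit per attribute instead of one char. It assumes a heavily simplified attribute representation (a null flag plus a 32-bit integer) in place of real typed values, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified attribute for this sketch: a null flag and an int value. */
typedef struct Attr {
    bool    isnull;
    int32_t value;
} Attr;

/* Two attributes are equal if both are null or both hold the same value. */
static bool attr_equal(const Attr *a, const Attr *b)
{
    if (a->isnull || b->isnull)
        return a->isnull && b->isnull;
    return a->value == b->value;
}

/*
 * Set bit i of 'changed' for every attribute i that differs between the
 * old and the new tuple. 'changed' must provide (natts + 7) / 8 bytes.
 * Returns the number of changed attributes, i.e. how many values would
 * have to be transferred for an UPDATE.
 */
static size_t build_changed_bitmap(const Attr *oldt, const Attr *newt,
                                   size_t natts, uint8_t *changed)
{
    size_t nchanged = 0;

    assert(oldt != NULL && newt != NULL && changed != NULL);
    memset(changed, 0, (natts + 7) / 8);

    for (size_t i = 0; i < natts; i++) {
        if (!attr_equal(&oldt[i], &newt[i])) {
            changed[i / 8] |= (uint8_t)(1u << (i % 8));
            nchanged++;
        }
    }
    return nchanged;
}
```

Compared to one char per attribute, the packed form costs a shift and a mask per lookup but shrinks the per-tuple overhead by a factor of eight.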

I'm tempted to unify that, so that inserts are serialized as the
difference against the default values or NULL. That would make things
easier for Postgres-R. However, how about other uses of such a fast
tuple applicator? Does such a use case exist at all? I mean, for
parallelizing COPY FROM STDIN, one certainly doesn't want to serialize
all input tuples into that format before feeding multiple helper
backends. Instead, I'd recommend letting the helper backends do the
parsing and therefore parallelize that as well.
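
The unification idea mentioned above (serializing an insert as the difference against the default values or NULL) might look roughly like this. It is a sketch under simplifying assumptions: columns are reduced to a null flag plus a 32-bit integer, a column without a default is represented by a NULL baseline, and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified column value. */
typedef struct ColValue {
    bool    isnull;
    int32_t value;
} ColValue;

/*
 * Treat an INSERT like an UPDATE against a baseline row made of the
 * column defaults. flags[i] is set to 1 if column i differs from its
 * default and therefore has to be serialized, and to 0 otherwise
 * (one char per column). Returns the number of differing columns.
 */
static size_t diff_insert_against_defaults(const ColValue *defaults,
                                           const ColValue *row,
                                           size_t ncols, char *flags)
{
    size_t ndiff = 0;

    assert(defaults != NULL && row != NULL && flags != NULL);

    for (size_t i = 0; i < ncols; i++) {
        bool same = (defaults[i].isnull && row[i].isnull) ||
                    (!defaults[i].isnull && !row[i].isnull &&
                     defaults[i].value == row[i].value);

        flags[i] = same ? 0 : 1;
        if (!same)
            ndiff++;
    }
    return ndiff;
}
```

With this, inserts and updates would share one code path: only the baseline differs (the old tuple for an update, the defaults for an insert).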

For other features, like parallel pg_dump or even parallel query
execution, this tuple serialization code doesn't help much, IMO. So I'm
thinking that optimizing it for Postgres-R's internal use is the best
way to go.

Comments? Opinions?

Regards

Markus
