Re: Database restore speed - Mailing list pgsql-performance

From	David Lang
Subject	Re: Database restore speed
Date	December 4, 2005 00:21:34
Msg-id	Pine.LNX.4.62.0512031657510.2807@qnivq.ynat.uz
In response to	Re: Database restore speed ("Luke Lonergan" <llonergan@greenplum.com>)
List	pgsql-performance

On Sat, 3 Dec 2005, Luke Lonergan wrote:

> Tom,
>
> On 12/3/05 12:32 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>
>> "Luke Lonergan" <llonergan@greenplum.com> writes:
>>> Last I looked at the Postgres binary dump format, it was not portable or
>>> efficient enough to suit the need.  The efficiency problem with it was that
>>> there was descriptive information attached to each individual data item, as
>>> compared to the approach where that information is specified once for the
>>> data group as a template for input.
>>
>> Are you complaining about the length words?  Get real...
>
> Hmm - "<sizeof int><int>" repeat, efficiency is 1/2 of "<int>" repeat.  I
> think that's worth complaining about.

but how does it compare to the ASCII representation of that int? (remember
to include your separator characters as well)

yes, it seems less efficient, and it may be better to send a record
description header that gives the size of each item and then send the
records following that without the per-item sizes, but either way it
should still be an advantage over the existing ASCII messages.
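
to make the size comparison concrete, here is a rough sketch in Python that
encodes the same rows of integers three ways: ASCII with separators, a length
word in front of every field, and a single header that describes the field
sizes once. everything in it is an assumption for illustration: the function
names are made up, all columns are taken to be 4-byte ints, the length word is
taken to be 4 bytes, and the header layout (a 2-byte column count followed by
a 2-byte size per column) is invented here, not an existing Postgres format.

```python
import struct


def encode_ascii(rows):
    """Tab-separated fields, one newline-terminated line per row."""
    lines = ("\t".join(str(v) for v in row) + "\n" for row in rows)
    return "".join(lines).encode("ascii")


def encode_length_prefixed(rows):
    """Every field is sent as <4-byte length><4-byte int> (assumed sizes)."""
    out = bytearray()
    for row in rows:
        for v in row:
            out += struct.pack("!ii", 4, v)
    return bytes(out)


def encode_with_header(rows):
    """Hypothetical layout: field sizes are sent once, then bare values.

    Assumes at least one row and that all rows have the same column count.
    """
    ncols = len(rows[0])
    out = bytearray(struct.pack("!H", ncols))            # column count
    out += struct.pack("!%dH" % ncols, *([4] * ncols))   # size of each column
    for row in rows:
        out += struct.pack("!%di" % ncols, *row)         # values only
    return bytes(out)


if __name__ == "__main__":
    small = [(1, 2, 3)] * 1000
    large = [(1234567890, -1234567890, 2000000000)] * 1000
    for name, rows in (("small values", small), ("large values", large)):
        print(name)
        print("  ascii:          ", len(encode_ascii(rows)))
        print("  length-prefixed:", len(encode_length_prefixed(rows)))
        print("  header + values:", len(encode_with_header(rows)))
```

this only counts bytes on the wire. single-digit values come out smaller in
ASCII than in either binary layout, while ten-digit values come out larger, so
the size argument depends on the data; the cost of parsing the text on the
server is a separate question that this sketch does not measure.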

also, how large is the <sizeof int> in the message?

there are other optimizations that can be done as well, but if there's
still a question about whether it's worth doing the parsing on the client,
then a first implementation should be done without making too many changes,
so that the idea can be tested.

also, some of the optimizations need measurements to see whether they are
worth it (even something that seems as obvious as separating the sizeof
from the data itself, as you suggest above, has a penalty: it spreads the
data that needs to be accessed to process a line across different cache
lines, so in some cases it won't be worth it)

David Lang
