Home > mailing lists

Re: bytea vs. pg_dump - Mailing list pgsql-hackers

From	Bruce Momjian
Subject	Re: bytea vs. pg_dump
Date	May 28, 2009 22:26:49
Msg-id	200905290126.n4T1QZP01411@momjian.us Whole thread Raw
In response to	Re: bytea vs. pg_dump (Merlin Moncure <mmoncure@gmail.com>)
Responses	Re: bytea vs. pg_dump
List	pgsql-hackers

Tree view

Added to TODO:
|Improve bytea COPY format* http://archives.postgresql.org/pgsql-hackers/2009-05/msg00192.php

---------------------------------------------------------------------------

Merlin Moncure wrote:
> On Sat, May 16, 2009 at 11:23 AM, Stefan Kaltenbrunner
> <stefan@kaltenbrunner.cc> wrote:
> > Bernd Helmle wrote:
> >>
> >> --On Mittwoch, Mai 06, 2009 19:04:21 -0400 Tom Lane <tgl@sss.pgh.pa.us>
> >> wrote:
> >>
> >>> So I'm now persuaded that a better textual representation for bytea
> >>> should indeed make things noticeably better here. ?It would be
> >>> useful though to cross-check this thought by profiling a case that
> >>> dumps a comparable volume of text data that contains no backslashes...
> >>
> >> This is a profiling result of the same data converted into a printable
> >> text format without any backslashes. The data amount is quite the same and
> >> as you already guessed, calls to appendBinaryStringInfo() and friends gives
> >> the expected numbers:
> >>
> >>
> >> time ? seconds ? seconds ? ?calls ? s/call ? s/call ?name
> >> 35.13 ? ? 24.67 ? ?24.67 ? 134488 ? ? 0.00 ? ? 0.00 ?byteaout
> >> 32.61 ? ? 47.57 ? ?22.90 ? 134488 ? ? 0.00 ? ? 0.00 ?CopyOneRowTo
> >> 28.92 ? ? 67.88 ? ?20.31 ? ?85967 ? ? 0.00 ? ? 0.00 ?pglz_decompress
> >> ?0.67 ? ? 68.35 ? ? 0.47 ?4955300 ? ? 0.00 ? ? 0.00
> >> hash_search_with_hash_value
> >> ?0.28 ? ? 68.55 ? ? 0.20 11643046 ? ? 0.00 ? ? 0.00 ?LWLockRelease
> >> ?0.28 ? ? 68.75 ? ? 0.20 ?4828896 ? ? 0.00 ? ? 0.00 ?index_getnext
> >> ?0.24 ? ? 68.92 ? ? 0.17 ?1208577 ? ? 0.00 ? ? 0.00 ?StrategyGetBuffer
> >> ?0.23 ? ? 69.08 ? ? 0.16 11643046 ? ? 0.00 ? ? 0.00 ?LWLockAcquire
> >> ...
> >> ?0.00 ? ? 70.23 ? ? 0.00 ? 134498 ? ? 0.00 ? ? 0.00 ?enlargeStringInfo
> >> ?0.00 ? ? 70.23 ? ? 0.00 ? 134497 ? ? 0.00 ? ? 0.00
> >> ?appendBinaryStringInfo
> >> ?0.00 ? ? 70.23 ? ? 0.00 ? 134490 ? ? 0.00 ? ? 0.00 ?AllocSetReset
> >> ?0.00 ? ? 70.23 ? ? 0.00 ? 134490 ? ? 0.00 ? ? 0.00 ?resetStringInfo
> >> ?0.00 ? ? 70.23 ? ? 0.00 ? 134488 ? ? 0.00 ? ? 0.00 ?CopySendChar
> >> ?0.00 ? ? 70.23 ? ? 0.00 ? 134488 ? ? 0.00 ? ? 0.00 ?CopySendEndOfRow
> >
> >
> > while doing some pg_migrator testing I noticed that dumping a database seems
> > to be much slower than IO-system is capable off. ie i get 100% CPU usage
> > with no IO-wait at all with between 15-30MB/s read rate if i say do a
> > pg_dumpall > /dev/null.
> 
> Part of the problem is the decompression.  Can't do much about that
> except to not compress your data.
> 
> I don't have any hard statistics on hand at the moment, but a while
> back we compared 'COPY' vs a hand written SPI routine that got the
> tuple data in binary and streamed it out field by field raw to a file.
>  The speed difference was enormous..I don't recall the exact
> difference but copy was at least 2x slower.  This seems to suggest
> there are many potential improvements to copy (my test was mainly
> bytea as well).
> 
> merlin
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

pgsql-hackers by date:

From: Andrew Dunstan
Date: 28 May 2009, 22:25:57
Subject: Re: pg_migrator and an 8.3-compatible tsvector data type

From: Robert Haas
Date: 28 May 2009, 22:30:18
Subject: explain analyze rows=%.0f

Re: bytea vs. pg_dump - Mailing list pgsql-hackers

Previous

Next