Re: Slow dump with pg_dump/pg_restore ? How to improve ? - Mailing list pgsql-general

From Soeren Gerlach
Subject Re: Slow dump with pg_dump/pg_restore ? How to improve ?
Date
Msg-id 200407010008.46744.soeren@all-about-shift.com
Whole thread Raw
In response to Re: Slow dump with pg_dump/pg_restore ? How to improve ?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Slow dump with pg_dump/pg_restore ? How to improve ?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
> > So...the dump in the above format needs some 14 minutes, the restore 10
> > minutes. This seems to be very slow as it means something like 100K/sec
> > for dumping and restoring. The drive is cappable of 40 Meg/seconds, so
> > thats not the bottleneck ,-) Anyhow postmaster and pg_dump seem to max
> > out the CPU cycles as it's running at nearly 100% while dumping and
> > restoring.
>
> What datatypes have you got in the large tables?  Also, what character
> set encoding are you using?

datestyle = 'ISO,European' ?

> The only reason I can think of for dump to be that slow is if conversion
> of the data to text is a big time sink.  This would involve the
> datatype's own output routine plus possibly a character set conversion.
> You should at least make sure that no character set conversion needs to
> happen (offhand I think this would only be an issue if pg_dump is
> invoked with PGCLIENTENCODING set in its environment).

This are the schemas of the two big tables mentioned:

-------------------------------------------------------------------------
CREATE TABLE public.agent_action_history
(
  aah_tag_id bigserial NOT NULL,
  aah_action_type int2 NOT NULL DEFAULT 0,
  aah_quantity float4 NOT NULL DEFAULT 0,
  aah_price float4 NOT NULL DEFAULT 0,
  aah_sim_flag bool NOT NULL DEFAULT false,
  aah_timestamp timestamp NOT NULL,
  aah_action_pl float4 NOT NULL DEFAULT 0
) WITHOUT OIDS;

CREATE TABLE public.tick_history
(
  tkh_id serial NOT NULL,
  tkh_cdt_id int8 NOT NULL,
  tkh_price float4 NOT NULL,
  tkh_price_type int2 NOT NULL,
  tkh_volume float4 NOT NULL,
  tkh_system_time timestamp(6) NOT NULL,
  tkh_exchange_time timestamp(6) NOT NULL,
  CONSTRAINT "PK_tick_history" PRIMARY KEY (tkh_id)
) WITHOUT OIDS;
-------------------------------------------------------------------------

> Also I trust you are using dump with the default COPY-style output,
> not dump-as-INSERTs?

Yes. I'm dumping using "-c -Fc -v" as options. I just rerun the whole:

 * Total rows: 904,000
 * Dumping plain format: 21min == 1260 sec == 717 rows/sec
 * Dumping "-c -Fc" format: 26min == 1560 sec == 580 rows/sec
 * pg_dump takes 2/3 of the [single] CPU, postmaster the other 1/3 for both
   dumps
 * the resulting "-Fc" dump is 1/10 of the size of the plain dump

It would be interesting to know from people with DBs > 1 Gig how fast they
can dump in terms of rows/second.

Another info: I just noticed that I've 7.4.1 running not 7.4.3. So I'll
retest this tomorrow again and provide you with numbers for 7.4.3 too.


I'm likely to install Sybase as a reference to Postgres next week. But I can
remember to got there something like 5-10,000 rows/sec some one year ago
when I tested a 11.5 version for this issue too.



Thanks,
Soeren Gerlach


pgsql-general by date:

Previous
From: Dennis Gearon
Date:
Subject: Internationalization
Next
From: Alvaro Herrera
Date:
Subject: Re: Internationalization