Thread: Large table load (40 millon rows) - WAL hassles

Large table load (40 millon rows) - WAL hassles

From
simon@gonzo.magellic.com (simon lai)
Date:
Version: Postgres 7.1.2

A product we are developing requires frequent loading
of a large number of rows into a table.  We are using
the "copy file" command, but with WAL we are effectively
doubling the amount of disk writing we are doing.

After the rows are loaded we do a "create index".

Is there a way to turn off WAL or otherwise speeding up
the table loading process? Would db_restore be faster?

The rows are loaded in sorted order.  Does this impact
index creation negatively or positively?

We are currently working with test data but we estimate
production data to be 6 - 9 billion rows.  Is anyone
else running with these volumes?

Simon

Re: Large table load (40 millon rows) - WAL hassles

From
Bruce Momjian
Date:
> Version: Postgres 7.1.2
>
> A product we are developing requires frequent loading
> of a large number of rows into a table.  We are using
> the "copy file" command, but with WAL we are effectively
> doubling the amount of disk writing we are doing.
>
> After the rows are loaded we do a "create index".
>
> Is there a way to turn off WAL or otherwise speeding up
> the table loading process? Would db_restore be faster?
>
> The rows are loaded in sorted order.  Does this impact
> index creation negatively or positively?
>
> We are currently working with test data but we estimate
> production data to be 6 - 9 billion rows.  Is anyone
> else running with these volumes?
>

We have a problem with 7.1.X where the WAL logs stay around longer than
needed.  Maybe someone has a patch with that fix.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026