On Tue, 2005-01-04 at 14:02 -0500, Rod Taylor wrote:
> > 1)the 250 million records are currently whipped and reinserted as a
> > "daily snapshot" and the fastest way I have found "COPY" to do this from
> > a file is no where near fast enough to do this. SQL*Loader from Oracle
> > does some things that I need, ie Direct Path to the db files access
> > (skipping the RDBMS), inherently ignoring indexing rules and saving a
> > ton of time (Dropping the index, COPY'ing 250 million records, then
> > Recreating the index just takes way too long).
>
> If you have the hardware for it, instead of doing 1 copy, do 1 copy
> command per CPU (until your IO is maxed out anyway) and divide the work
> amongst them. I can push through 100MB/sec using methods like this --
> which makes loading 100GB of data much faster.
>
> Ditto for indexes. Don't create a single index on one CPU and wait --
> send off one index creation command per CPU.
Not sure what you mean by "whipped". If you mean select and re-insert
then perhaps using a pipe would produce better performance, since no
disk access for the data file would be involved.
In 8.0 COPY and CREATE INDEX is optimised to not use WAL at all if
archive_command is not set. 8 is great...
> > 2)Finding a way to keep this many records in a fashion that can be
> > easily queried. I even tried breaking it up into almost 2800 separate
> > tables, basically views of the data pre-broken down, if this is a
> > working method it can be done this way, but when I tried it, VACUUM, and
> > the COPY's all seemed to slow down extremely.
>
> Can you send us EXPLAIN ANALYSE output for the slow selects and a little
> insight into what your doing? A basic table structure, and indexes
> involved would be handy. You may change column and table names if you
> like.
There's a known issue using UNION ALL views in 8.0 that makes them
slightly more inefficient than using a single table. Perhaps that would
explain your results.
There shouldn't be any need to do the 2800 table approach in this
instance.
--
Best Regards, Simon Riggs