Thread: moving data between tables causes the db to overwhelm the system
Hi all;

We have a table that's > 2 billion rows big and growing fast. We've set up
monthly partitions for it. Upon running the first of many select * from
bigTable insert into partition statements (330 million rows per month), the
entire box eventually goes out to lunch.

Any thoughts/suggestions?

Thanks in advance
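[For context: in the PostgreSQL releases current at the time, monthly partitions are inheritance children, so the per-month move being described presumably takes roughly the shape below. The child-table name, the check constraint, and the epoch boundaries are illustrative guesses, not taken from the poster's schema.]

CREATE TABLE bigtable_2009_08 (
    CHECK ("time" >= 1249084800 AND "time" < 1251763200)   -- August 2009
) INHERITS (bigTable);

INSERT INTO bigtable_2009_08
SELECT * FROM bigTable
WHERE "time" >= 1249084800 AND "time" < 1251763200;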
> Hi all;
>
> We have a table that's > 2 billion rows big and growing fast. We've set up
> monthly partitions for it. Upon running the first of many select * from
> bigTable insert into partition statements (330 million rows per month), the
> entire box eventually goes out to lunch.
>
> Any thoughts/suggestions?
>
> Thanks in advance

Sorry, but your post does not provide enough information, so it's practically
impossible to give you any suggestions :-(

Provide at least this information:

1) basic info about the hardware (number and type of CPUs, amount of RAM,
   controller, number of disk drives)

2) more detailed information about the table size and structure (see the
   pg_class and pg_stat_* views), plus information about indexes and triggers
   created on the table

3) explain plan of the problematic queries - in this case the 'select * from
   bigtable' etc.

4) a detailed description of what 'goes out to lunch' means - is the CPU 100%
   occupied, or is there a problem with I/O (use vmstat / dstat or something
   like that)

I've probably forgotten something, but this might be a good starting point.

regards
Tomas
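[For completeness, the catalog queries that answer point 2 might look like the following. This is only a sketch; 'bigtable' stands in for the real, lower-cased table name.]

SELECT relname, relpages, reltuples::bigint,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relname = 'bigtable';

SELECT seq_scan, idx_scan, n_tup_ins, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'bigtable';

SELECT indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'bigtable';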
Re: moving data between tables causes the db to overwhelm the system
From: Pierre Frédéric Caillaud
> We have a table that's > 2 billion rows big and growing fast. We've set up
> monthly partitions for it. Upon running the first of many select * from
> bigTable insert into partition statements (330 million rows per month), the
> entire box eventually goes out to lunch.
>
> Any thoughts/suggestions?
>
> Thanks in advance

Did you create the indexes on the partition before or after inserting the
330M rows into it?

What is your hardware config, and where is xlog?
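[A quick way to answer Pierre's two questions from psql might be the sketch below; the partition name is an assumption.]

-- indexes currently defined on the (assumed) partition
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'bigtable_2009_08';

-- pg_xlog lives under this directory unless it has been symlinked elsewhere
SHOW data_directory;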
On Tuesday 01 September 2009 03:26:08 Pierre Frédéric Caillaud wrote:
> > We have a table that's > 2 billion rows big and growing fast. We've set up
> > monthly partitions for it. Upon running the first of many select * from
> > bigTable insert into partition statements (330 million rows per month),
> > the entire box eventually goes out to lunch.
> >
> > Any thoughts/suggestions?
> >
> > Thanks in advance
>
> Did you create the indexes on the partition before or after inserting the
> 330M rows into it?
> What is your hardware config, where is xlog?

Indexes are on the partitions, my bad. Also, HW is a Dell server with 2 quad
cores and 32G of RAM.

We have a Dell MD3000 disk array with an MD1000 expansion bay, 2 controllers,
2 HBAs/mount points running RAID 10.

The explain plan looks like this:

explain SELECT * from bigTable
where
  "time" >= extract ('epoch' from timestamp '2009-08-31 00:00:00')::int4
  and "time" <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int
;

                                   QUERY PLAN
------------------------------------------------------------------------------------------------
 Index Scan using bigTable_time_index on bigTable  (cost=0.00..184.04 rows=1 width=129)
   Index Cond: (("time" >= 1251676800) AND ("time" <= 1251763199))
(2 rows)
> On Tuesday 01 September 2009 03:26:08 Pierre Frédéric Caillaud wrote:
>> > We have a table that's > 2 billion rows big and growing fast. We've set
>> > up monthly partitions for it. Upon running the first of many select *
>> > from bigTable insert into partition statements (330 million rows per
>> > month), the entire box eventually goes out to lunch.
>> >
>> > Any thoughts/suggestions?
>> >
>> > Thanks in advance
>>
>> Did you create the indexes on the partition before or after inserting the
>> 330M rows into it?
>> What is your hardware config, where is xlog?
>
> Indexes are on the partitions, my bad. Also, HW is a Dell server with 2 quad
> cores and 32G of RAM.
>
> We have a Dell MD3000 disk array with an MD1000 expansion bay, 2 controllers,
> 2 HBAs/mount points running RAID 10.
>
> The explain plan looks like this:
>
> explain SELECT * from bigTable
> where
>   "time" >= extract ('epoch' from timestamp '2009-08-31 00:00:00')::int4
>   and "time" <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int
> ;
>
>                                    QUERY PLAN
> ------------------------------------------------------------------------------------------------
>  Index Scan using bigTable_time_index on bigTable  (cost=0.00..184.04 rows=1 width=129)
>    Index Cond: (("time" >= 1251676800) AND ("time" <= 1251763199))
> (2 rows)

This looks like a single row matches your conditions. Have you run ANALYZE on
the table recently? Try to run "ANALYZE bigTable" and then the explain again.

BTW what version of PostgreSQL are you running?

Tomas
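[Spelled out, the steps Tomas suggests are simply the following, using the query already shown in the thread:]

ANALYZE bigTable;

EXPLAIN SELECT * FROM bigTable
WHERE "time" >= extract ('epoch' from timestamp '2009-08-31 00:00:00')::int4
  AND "time" <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int4;

SELECT version();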
Re: moving data between tables causes the db to overwhelm the system
From: Pierre Frédéric Caillaud
> Indexes are on the partitions, my bad.

If you need to insert lots of data, it is faster to create the indexes
afterwards (and then you can also create them in parallel, since you have
lots of RAM and cores).

> The explain plan looks like this:
> explain SELECT * from bigTable
> where
>   "time" >= extract ('epoch' from timestamp '2009-08-31 00:00:00')::int4
>   and "time" <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int
> ;
>
>                                    QUERY PLAN
> ------------------------------------------------------------------------------------------------
>  Index Scan using bigTable_time_index on bigTable  (cost=0.00..184.04 rows=1 width=129)
>    Index Cond: (("time" >= 1251676800) AND ("time" <= 1251763199))
> (2 rows)

What is slow, then - is it the insert or is it the select? Can you EXPLAIN
ANALYZE the SELECT?

If "bigTable" is not clustered on "time" you'll get lots of random accesses,
and it'll be slow.

If you want to partition your huge data set by "time", and the data isn't
already ordered by "time" on disk, you could do this (see the sketch after
this message):

SET work_mem TO something very large like 10GB since you got 32GB RAM
  (check your shared buffers etc first);
CREATE TABLE tmp AS SELECT * FROM bigTable ORDER BY "time";   <- huge sort, will take some time

SET maintenance_work_mem TO something very large;
CREATE INDEX tmp_time ON tmp( "time" );

CREATE TABLE partition1 AS SELECT * FROM tmp WHERE "time" BETWEEN beginning AND end;
(repeat...)

Since tmp is clustered on "time" you'll get a nice fast bitmap-scan, and you
won't need to seq-scan N times (or randomly index-scan) bigTable.
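[A runnable version of the sequence Pierre outlines might look like the following. The memory settings, partition name, and month boundaries are placeholders to adapt to the real schema, and can be raised well beyond what is shown here if the hardware allows.]

SET work_mem = '1GB';                    -- per-sort memory; raise as Pierre suggests if the build permits
CREATE TABLE tmp AS SELECT * FROM bigTable ORDER BY "time";   -- the one big sort

SET maintenance_work_mem = '1GB';        -- speeds up the index build
CREATE INDEX tmp_time ON tmp ("time");
ANALYZE tmp;

-- load one partition at a time, reading tmp in "time" order
INSERT INTO bigtable_2009_08
SELECT * FROM tmp
WHERE "time" >= extract ('epoch' from timestamp '2009-08-01 00:00:00')::int4
  AND "time" <  extract ('epoch' from timestamp '2009-09-01 00:00:00')::int4;

-- build the partition's indexes only after it is loaded
CREATE INDEX bigtable_2009_08_time ON bigtable_2009_08 ("time");
ANALYZE bigtable_2009_08;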
> If you want to partition your huge data set by "time", and the data
> isn't already ordered by "time" on disk, you could do this:
>
> SET work_mem TO something very large like 10GB since you got 32GB RAM
>   (check your shared buffers etc first);
> CREATE TABLE tmp AS SELECT * FROM bigTable ORDER BY "time";   <- huge sort, will take some time
>
> SET maintenance_work_mem TO something very large;
> CREATE INDEX tmp_time ON tmp( "time" );
>
> CREATE TABLE partition1 AS SELECT * FROM tmp WHERE "time" BETWEEN beginning AND end;
> (repeat...)
>
> Since tmp is clustered on "time" you'll get a nice fast bitmap-scan,
> and you won't need to seq-scan N times (or randomly index-scan) bigTable.

I went through the same exercise a couple of months ago with a table that had
~1.7 billion rows. I used a similar approach to what's described above, but in
my case I didn't create the tmp table; instead I put the ORDER BY on each
select from bigTable that fed an insert (I didn't have many of them). My data
was structured such that this was easier than doing the huge sort.

In any event, it worked great and my smaller partitions are much, much faster.

Bob
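[Bob's variant - skipping the intermediate table and sorting on the way into each partition - would look roughly like this, with the same caveat that the names and boundaries are assumptions:]

INSERT INTO bigtable_2009_08
SELECT * FROM bigTable
WHERE "time" >= extract ('epoch' from timestamp '2009-08-01 00:00:00')::int4
  AND "time" <  extract ('epoch' from timestamp '2009-09-01 00:00:00')::int4
ORDER BY "time";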