Thread: 121+ million record table perf problems

121+ million record table perf problems

From: cyber-postgres@midnightfantasy.com

I need some help on recommendations to solve a perf problem.

I've got a table with ~121 million records in it.  Select count on it
currently takes ~45 minutes, and an update to the table to set a value on
one of the columns I finally killed after it ran 17 hours and had still
not completed.  Queries into the table are butt slow.

System:   SUSE LINUX 10.0 (X86-64)
Postgresql:   PostgreSQL 8.2.1
Index type:   btree

A select count took ~48 minutes before I made some changes to the
postgresql.conf, going from default values to these:
shared_buffers = 24MB
work_mem = 256MB
maintenance_work_mem = 512MB
random_page_cost = 100
stats_start_collector = off
stats_row_level = off

As a test I am trying to do an update on state using the following queries:
update res set state=5001;
select count(res_id) from res;

The update query that started this all I had to kill after 17 hours.  It
should have updated all 121+ million records.  That brought my select
count down to 19 minutes, but still a far cry from acceptable.

The system has 2GB of RAM (more is already on order), but top doesn't show
it running away with RAM.  If anything, I don't think it's using enough (I
only see about 6 processes using 26-27 MB each), and it is running on a
single disk (I guess I will at minimum have to go to RAID1).  The workload
will primarily consist of queries against the indexes (hence so many of
them) and updates to single records from about 10 clients, where each
record will have its md5, state, rval, speed, audit, and date columns
updated.  Those updates don't seem to be a problem, and are generally
processed in batches of 500 to 5000 at a time.
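
For reference, each of those per-record updates looks something like this
(the values and res_id here are made up):

UPDATE res
   SET md5 = 'd41d8cd98f00b204e9800998ecf8427e', state = 2, rval = 0,
       speed = 1.5, audit = 0.0, date = now()
 WHERE res_id = 12345;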

Here is the schema for the table giving me problems:

CREATE TABLE res
(
   res_id integer NOT NULL DEFAULT nextval('result_id_seq'::regclass),
   res_client_id integer NOT NULL,
   "time" real DEFAULT 0,
   error integer DEFAULT 0,
   md5 character(32) DEFAULT 0,
   res_tc_id integer NOT NULL,
   state smallint DEFAULT 0,
   priority smallint,
   rval integer,
   speed real,
   audit real,
   date timestamp with time zone,
   gold_result_id integer,
   CONSTRAINT result_pkey PRIMARY KEY (res_id),
   CONSTRAINT unique_res UNIQUE (res_client_id, res_tc_id)
)
WITHOUT OIDS;
ALTER TABLE res OWNER TO postgres;

CREATE INDEX index_audit
   ON res
   USING btree
   (audit);

CREATE INDEX index_event
   ON res
   USING btree
   (error);

CREATE INDEX index_priority
   ON res
   USING btree
   (priority);

CREATE INDEX index_rval
   ON res
   USING btree
   (rval);

CREATE INDEX index_speed
   ON res
   USING btree
   (speed);

CREATE INDEX index_state
   ON res
   USING btree
   (state);

CREATE INDEX index_tc_id
   ON res
   USING btree
   (res_tc_id);

CREATE INDEX index_time
   ON res
   USING btree
   ("time");

Re: 121+ million record table perf problems

From: Andrew Sullivan

On Fri, May 18, 2007 at 12:43:40PM -0500, cyber-postgres@midnightfantasy.com wrote:
> I've got a table with ~121 million records in it.  Select count on it
> currently takes ~45 minutes, and an update to the table to set a value on
> one of the columns I finally killed after it ran 17 hours and had still
> not completed.  Queries into the table are butt slow.

I don't think you've told us anything like enough to get started on
solving your problem.  But to start with, you know that in Postgres,
an unrestricted count() on a table always results in reading the
entire table, right?

Standard question: have you performed any vacuum or analyse on it?
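
If not, that is the first thing I'd try; something like the following
(VERBOSE will also tell you how many dead row versions it finds):

VACUUM VERBOSE ANALYZE res;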

Your update statement is also a case where you have to touch every
row.  Note that, given that you seem to be setting the state field to
the same value for everything, an index on there will do you not one
jot of good until there's greater selectivity.

How fast is the disk?  Is it fast enough to read and touch every one
of those rows on the table inside of 17 hours?

Note also that your approach of updating all 121 million records in
one statement is approximately the worst way to do this in Postgres,
because it creates 121 million dead tuples on your table.  (You've
created some number of those by killing the query as well.)
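
If you really do need to set that value on every row, one way to keep the
dead tuples from piling up all at once is to do it in chunks on the
primary key and vacuum in between.  Roughly (the ranges here are
arbitrary):

UPDATE res SET state = 5001 WHERE res_id BETWEEN 1 AND 10000000;
VACUUM res;
UPDATE res SET state = 5001 WHERE res_id BETWEEN 10000001 AND 20000000;
VACUUM res;
-- ...and so on through the rest of the key range, so the dead space gets
-- reused as you go instead of accumulating 121 million dead tuples.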

All of that said, 17 hours seems kinda long.

> As a test I am trying to do an update on state using the following queries:
> update res set state=5001;
> select count(res_id) from res;

What is this testing?

> The update query that started this all I had to kill after 17 hours.

Does that suggest that the update you're trying to make work well is
_not_ update res set state = 5001?

> each) and is running on a single disk (guess I will likely have to at the
> minimum go to a RAID1).  Workload will primarily be comprised of queries

I bet that single disk is your problem.  Iostat is your friend, I'd
say.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
Everything that happens in the world happens at some place.
        --Jane Jacobs

Re: 121+ million record table perf problems

From: "Joshua D. Drake"

cyber-postgres@midnightfantasy.com wrote:
> I need some help on recommendations to solve a perf problem.
>
> I've got a table with ~121 million records in it.  Select count on it
> currently takes ~45 minutes, and an update to the table to set a value
> on one of the columns I finally killed after it ran 17 hours and had
> still not completed.  Queries into the table are butt slow.

Scanning 121 million rows is going to be slow even on 16 disks.

>
> System:   SUSE LINUX 10.0 (X86-64)
> Postgresql:   PostgreSQL 8.2.1
> Index type:   btree

You really should be running 8.2.4.

>
> A select count took ~48 minutes before I made some changes to the
> postgresql.conf, going from default values to these:
> shared_buffers = 24MB

This could be increased.

> work_mem = 256MB
> maintenance_work_mem = 512MB
> random_page_cost = 100
> stats_start_collector = off
> stats_row_level = off
>
> As a test I am trying to do an update on state using the following queries:
> update res set state=5001;

You are updating 121 million rows; that takes a lot of time, considering
that at a low level you are actually marking 121 million rows dead and
inserting 121 million new ones.

> The update query that started this all I had to kill after 17 hours.  It
> should have updated all 121+ million records.  That brought my select
> count down to 19 minutes, but still a far cry from acceptable.

Not quite sure what you would consider acceptable, given what you are
trying to do.


Sincerely,

Joshua D. Drake




--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: 121+ million record table perf problems

From: Brian Hurt

cyber-postgres@midnightfantasy.com wrote:

> I need some help on recommendations to solve a perf problem.
>
> I've got a table with ~121 million records in it.  Select count on it
> currently takes ~45 minutes, and an update to the table to set a value
> on one of the columns I finally killed after it ran 17 hours and had
> still not completed.  Queries into the table are butt slow.
>
This is way too long.  I just did a select count(*) on a table of mine
that has 48 million rows and it took only 178 seconds.  And this is on a
serious POS disk subsystem that's giving me about 1/2 the read speed of
a single off-the-shelf SATA disk.

As select count(*) has to read the whole table sequentially, the time it
takes is linear with the size of the table (once you get large enough
that the whole table doesn't get cached in memory).  So I'd be surprised
if a 121 million record table took more than 500 or so seconds to read,
and would expect it to be less.
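
Back-of-the-envelope, assuming something like 120 bytes per row (tuple
overhead plus the columns in the posted schema) and an ordinary 60MB/sec
of sequential read:

121,000,000 rows * ~120 bytes  ~=  14-15 GB of table (ignoring any bloat)
15 GB / 60 MB/sec              ~=  roughly 250 seconds of sequential read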

So my advice: vacuum.  I'll bet you've got a whole boatload of dead
tuples kicking around.  Then analyze.  Then consider firing off a
reindex and/or cluster against the table.  The other thing I'd consider
is dropping the money on some more hardware: a few hundred bucks to get
a battery-backed RAID card and half a dozen SATA drives would probably
do wonders for your performance.
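
Roughly this sequence, using names from the posted schema (note that
CLUSTER takes an exclusive lock while it rewrites the table, and the
syntax below is the 8.2 form):

VACUUM VERBOSE res;          -- make dead row versions reusable, report how many
ANALYZE res;                 -- refresh the planner statistics
REINDEX TABLE res;           -- rebuild the (probably bloated) indexes
CLUSTER result_pkey ON res;  -- optional: rewrite the table in pkey order, minus the dead space
ANALYZE res;                 -- re-analyze after the rewrite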

>
> shared_buffers = 24MB

Up your shared_buffers.  This is a mistake I made originally as well:
this setting is the total amount of shared buffer memory used by the
system, not something allocated per backend.  I had originally assumed
the actual usage was this value times the number of backends, but it's
not.

With 2GB of memory, I'd start with shared_buffers of 512MB, and consider
upping it to 768MB or even 1024MB.  This will also really help performance.
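
In postgresql.conf terms (same memory-unit syntax as the posted config):

shared_buffers = 512MB        # then try 768MB or 1024MB and re-test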

> stats_start_collector = off
> stats_row_level = off
>
I think I'd also recommend turning these back on.

Brian


Re: 121+ million record table perf problems

From: Alan Hodgson

On Friday 18 May 2007 11:51, "Joshua D. Drake" <jd@commandprompt.com> wrote:
> > The update query that started this all I had to kill after 17 hours.  It
> > should have updated all 121+ million records.  That brought my select
> > count down to 19 minutes, but still a far cry from acceptable.

You're going to want to drop all your indexes before trying to update 121
million records.  Updates in PostgreSQL are really quite slow, mostly due
to all the index updates. Drop indexes, do the updates, create a primary
key, cluster the table on that key to free up the dead space, then recreate
the rest of the indexes. That's about as fast as you can get that process.
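
Spelled out against the posted schema, that sequence is roughly the
following (whether you can also drop unique_res for the duration is a
judgment call):

DROP INDEX index_audit;      -- likewise index_event, index_priority, index_rval,
DROP INDEX index_state;      --   index_speed, index_tc_id and index_time
ALTER TABLE res DROP CONSTRAINT result_pkey;

UPDATE res SET state = 5001;                 -- the big update, with no indexes to maintain

ALTER TABLE res ADD CONSTRAINT result_pkey PRIMARY KEY (res_id);
CLUSTER result_pkey ON res;                  -- 8.2 syntax; rewrites the table without the dead space
ANALYZE res;
CREATE INDEX index_state ON res USING btree (state);  -- then recreate the others the same way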

Of course, doing anything big on one disk is also going to be slow, no
matter what you do. I don't think a table scan should take 19 minutes,
though, not for 121 million records. You should be able to get at least
60-70MB/sec out of anything modern. I can only assume your disk is
thrashing doing something else at the same time as the select.

--
"We can no more blame our loss of freedom on Congressmen than we can
prostitution on pimps.  Both simply provide broker services for their
customers." -- Dr. Walter Williams


Re: 121+ million record table perf problems

From: Tom Lane

Andrew Sullivan <ajs@crankycanuck.ca> writes:
> All of that said, 17 hours seems kinda long.

I imagine he's done a bunch of those full-table UPDATEs without
vacuuming, and now has approximately a gazillion dead tuples bloating
the table.
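
A quick way to get a feel for how bad the bloat is (the numbers are only
as fresh as the last ANALYZE):

SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 'res';
-- relpages wildly out of proportion to ~121M live rows means the table
-- is carrying a lot of dead space.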

            regards, tom lane

Re: 121+ million record table perf problems

From: Craig James

>
> I've got a table with ~121 million records in it.  Select count on it
> currently takes ~45 minutes, and an update to the table to set a value
> on one of the columns I finally killed after it ran 17 hours and had
> still not completed.  Queries into the table are butt slow.
>
> The update query that started this all I had to kill after 17 hours.
> It should have updated all 121+ million records.  That brought my
> select count down to 19 minutes, but still a far cry from acceptable.

If you have a column that needs to be updated often for all rows,
separate it into a different table, and create a view that joins it back
to the main table so that your application still sees the old schema.

This will greatly speed up your update, since (in Postgres) an update is
the same as a delete+insert.  By updating that one column, you're
re-writing your entire 121 million rows.  If you separate it out, you're
only rewriting that one column.  Don't forget to vacuum/analyze and
reindex when you're done.
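
Something along these lines; the new table and view names here (res_state,
res_view) are just placeholders:

CREATE TABLE res_state (
    res_id integer PRIMARY KEY REFERENCES res (res_id),
    state  smallint DEFAULT 0
);
INSERT INTO res_state (res_id, state) SELECT res_id, state FROM res;

-- drop res.state (this also drops index_state) and present the old
-- column layout through a view
ALTER TABLE res DROP COLUMN state;
CREATE VIEW res_view AS
    SELECT r.res_id, r.res_client_id, r."time", r.error, r.md5,
           r.res_tc_id, s.state, r.priority, r.rval, r.speed, r.audit,
           r.date, r.gold_result_id
    FROM res r JOIN res_state s USING (res_id);

-- the expensive bulk update now rewrites only the narrow two-column table
UPDATE res_state SET state = 5001;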

Better yet, if you can stand a short down time, you can drop indexes on
that column, truncate, then do 121 million inserts, and finally
reindex.  That will be MUCH faster.

Craig



Re: 121+ million record table perf problems

From: Alvaro Herrera

Craig James wrote:

> Better yet, if you can stand a short down time, you can drop indexes on
> that column, truncate, then do 121 million inserts, and finally
> reindex.  That will be MUCH faster.

Or you can do a CLUSTER, which does all the same things automatically.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: 121+ million record table perf problems

From: Greg Smith

On Fri, 18 May 2007, cyber-postgres@midnightfantasy.com wrote:

> shared_buffers = 24MB
> work_mem = 256MB
> maintenance_work_mem = 512MB

You should take a minute to follow the suggestions at
http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm and set
dramatically higher values for shared_buffers and effective_cache_size for
your server.  Also, your work_mem figure may be OK for now, but if you ever
do have 10 people connect to this database at once and run big queries you
could have an issue with it set that high; that's a per-client setting.

After you're done with that, you should also follow the suggestions there
to do a VACUUM ANALYZE.  That may knock out two other potential issues at
once.  It will take a while to run, but I think you need it badly to sort
out what you've already done.

> random_page_cost = 100

I'm not sure what logic prompted this change, but after you correct the
above you should return this to its default; if this is helping now it's
only because other things are so far off from where they should be.

> update res set state=5001;
> The update query that started this all I had to kill after 17 hours.  It
> should have updated all 121+ million records.  That brought my select count
> down to 19 minutes, but still a far cry from acceptable.

You should work on the select side of this first.  If that isn't running
in a moderate amount of time, trying to get the much more difficult update
to happen quickly is hopeless.

Once the select is under control, there are a lot of parameters to adjust
that will affect the speed of the updates.  The first thing to do is
dramatically increase checkpoint_segments; I would set that to at least 30
in your situation.
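
Pulled together, the postgresql.conf changes would look something like the
following; the exact numbers are judgment calls for a 2GB machine and
worth re-testing:

shared_buffers = 512MB         # up dramatically from 24MB
effective_cache_size = 1GB     # rough estimate of what the OS will cache
work_mem = 32MB                # remember this is per client, per sort
random_page_cost = 4           # back to the default
checkpoint_segments = 30       # helps sustained bulk writes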

Also:  going to RAID-1 won't make a bit of difference to your update
speed; it could even make it worse.  Adding more RAM may not help much
either.  If you don't have one already, the real key to improving
performance in a heavy update situation is to get a better disk controller
with a cache that helps accelerate writes.  Then the next step is to
stripe this data across multiple disks in a RAID-0 configuration to split
the I/O up.

You have a lot of work ahead of you.  Even after you resolve the gross
issues here, you have a table that has around 10 indexes on it.
Maintaining those is far from free; every time you update a single record
in that table, the system has to update each of those indexes on top of
the record update itself.  So you're really asking your system to do
around 1.2 billion disk-related operations when you throw out your simple
batch update against every row, and good luck getting that to run in a
time frame that's less than days long.

The right way to get a feel for what's going on is to drop all the indexes
except for the constraints and see how the bulk update runs after the
parameter changes suggested above are in place and the database has been
cleaned up with vacuum+analyze.  Once you have a feel for that, add some
indexes back in and see how it degrades.  Then you'll know how adding each
one of them impacts your performance.  I suspect you're going to have to
redesign your indexing scheme before this is over.  I don't think your
current design is ever going to work the way you expect it to.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: 121+ million record table perf problems

From: Vivek Khera

On May 18, 2007, at 2:30 PM, Andrew Sullivan wrote:

> Note also that your approach of updating all 121 million records in
> one statement is approximately the worst way to do this in Postgres,
> because it creates 121 million dead tuples on your table.  (You've
> created some number of those by killing the query as well.)
>
> All of that said, 17 hours seems kinda long.

I don't think that is too long.  Growing the table one page at a time
takes a long time when you add a lot of pages to a table that big.
Add in the single disk and you're flying the disk head all over the
place so it will just be slow.  No way around it.

And just for good measure, I ran a count on one of my big tables
which consists of two integers and a varchar(7):

db=> select count(*) from mytable;
   count
-----------
 311994721
(1 row)

Time: 157689.057 ms

So I'm going to bet $1 that you're I/O starved.

Also, for memory usage, postgres won't use more than you tell it to...