Thread: Help tuning postgres

Help tuning postgres

From: Csaba Nagy

Hi all,

After a long time of reading the general list it's time to subscribe to
this one...

We have adapted our application (originally written for Oracle) to
postgres, and switched part of our business over to a postgres database.

The database has around 150 million rows in its main tables; the whole
data set took ~30 GB after the initial migration. After ~1 month of
usage that bloated to ~100 GB. We installed autovacuum after ~2 weeks.

The main table is heavily updated during the active periods of usage,
which come in bursts.

Now Oracle on the same hardware has no problem handling the load, but
postgres comes to a crawl. Examining the pg_stat_activity view I see the
updates on the main table as the biggest problem: they are very slow.
The table has a few indexes on it; I wonder if they are updated too on
an update, even though the indexed fields are not changing? In any
case, I can't explain why the updates are so much slower on postgres.

Sorry for being a bit fuzzy; I spent quite some time figuring out what
I can do, and now I have to give up and ask for help.

The machine running the DB is a Debian Linux box; details:

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 11
model name      : Intel(R) Pentium(R) III CPU family      1266MHz
stepping        : 1
cpu MHz         : 1263.122
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips        : 2490.36

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 11
model name      : Intel(R) Pentium(R) III CPU family      1266MHz
stepping        : 1
cpu MHz         : 1263.122
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips        : 2514.94


$ uname -a
Linux *** 2.6.12.3 #1 SMP Tue Oct 11 13:13:00 CEST 2005 i686 GNU/Linux


$ cat /proc/meminfo
MemTotal:      4091012 kB
MemFree:        118072 kB
Buffers:         18464 kB
Cached:        3393436 kB
SwapCached:          0 kB
Active:         947508 kB
Inactive:      2875644 kB
HighTotal:     3211264 kB
HighFree:          868 kB
LowTotal:       879748 kB
LowFree:        117204 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:           13252 kB
Writeback:           0 kB
Mapped:         829300 kB
Slab:            64632 kB
CommitLimit:   2045504 kB
Committed_AS:  1148064 kB
PageTables:      75916 kB
VmallocTotal:   114680 kB
VmallocUsed:        96 kB
VmallocChunk:   114568 kB


The disk used for the data is an external RAID array; I don't know much
about it right now except that I think it is some relatively fast IDE
stuff. In any case the operations should be cache friendly, as we don't
scan over and over the big tables...

The postgres server configuration is attached.

I have looked in the postgres statistics tables; it looks like most of
the needed data is always cached, as in the most accessed tables the
load/hit ratio (blocks read from disk vs. found in the buffer cache) is
mostly something like 1/100, or at least 1/30.
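
For reference, that kind of ratio can be read from the standard
pg_statio_user_tables view; a minimal sketch (nothing here is specific
to our schema):

SELECT relname, heap_blks_read, heap_blks_hit,
       idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables
ORDER BY heap_blks_hit DESC;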


Is anything in my config very wrong for the given machine, and what
else should I investigate further? If I can't make this fly, the obvious
solution will be to move back to Oracle, pay for the license and forget
about postgres forever...

TIA,
Csaba.



Re: Help tuning postgres

From: Emil Briggs

> Hi all,
>
> After a long time of reading the general list it's time to subscribe to
> this one...
>
> We have adapted our application (originally written for oracle) to
> postgres, and switched part of our business to a postgres data base.
>
> The data base has in the main tables around 150 million rows, the whole
> data set takes ~ 30G after the initial migration. After ~ a month of
> usage that bloated to ~ 100G. We installed autovacuum after ~ 2 weeks.
>

Have you tried reindexing your active tables?

Emil

Re: Help tuning postgres

From: Csaba Nagy

[snip]
> Have you tried reindexing your active tables?
>
Not yet, the db is in production use and I have to plan for a down-time
for that... or is it not impacting the activity on the table ?

> Emil
>


Re: Help tuning postgres

From: Emil Briggs

> [snip]
>
> > Have you tried reindexing your active tables?
>
> Not yet, the db is in production use and I have to plan for a down-time
> for that... or is it not impacting the activity on the table ?
>

It will cause some performance hit while you are doing it. It sounds
like something is bloating rapidly on your system, and the indexes are
one possible place where that could be happening.

Re: Help tuning postgres

From: "Merlin Moncure"

> The disk used for the data is an external RAID array; I don't know
> much about it right now except that I think it is some relatively fast
> IDE stuff. In any case the operations should be cache friendly, as we
> don't scan over and over the big tables...

Maybe you are I/O bound.  Do you know if your RAID array is caching
your writes?  An easy way to check is to run with fsync off and look for
obvious performance differences.  Maybe playing with the sync method
could help here.
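
For example, in postgresql.conf (only for a quick throwaway test; fsync
= off is unsafe and should never stay set on a production system, and
the wal_sync_method value is just one of the possible settings):

fsync = off                    # test only: is write flushing the bottleneck?
#wal_sync_method = fdatasync   # other options include fsync, open_sync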

Merlin


Re: Help tuning postgres

From: Matthew Nuzum

On 10/12/05, Csaba Nagy <nagy@ecircle-ag.com> wrote:

> We have adapted our application (originally written for oracle) to
> postgres, and switched part of our business to a postgres data base.

> The data base has in the main tables around 150 million rows, the whole
> data set takes ~ 30G after the initial migration. After ~ a month of
> usage that bloated to ~ 100G. We installed autovacuum after ~ 2 weeks.
>
> The main table is heavily updated during the active periods of usage,
> which is coming in bursts.
>
> Now Oracle on the same hardware has no problems handling it (the load),
> but postgres comes to a crawl. Examining the pg_stats_activity table I
> see the updates on the main table as being the biggest problem, they are
> very slow. The table has a few indexes on it, I wonder if they are
> updated too on an update ? The index fields are not changing. In any
> case, I can't explain why the updates are so much slower on postgres.

I'm not the most experienced person on this list, but I've got some big
tables I work with. Doing an update on these big tables often involves
a sequential scan, which can be quite slow.

I would suggest posting the explain analyze output for one of your
slow updates. I'll bet it is much more revealing and takes out a lot
of the guesswork.
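
Something along these lines, with made-up table and column names:

EXPLAIN ANALYZE
UPDATE big_table SET some_col = 0 WHERE id = 12345;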

--
Matthew Nuzum
www.bearfruit.org

Re: Help tuning postgres

From: Csaba Nagy

Ok, that was the first thing I did: checking out the explain of the
query. I don't really need the analyze part, as the plan is going for
the index, which is the right decision. The updates are simple one-row
updates of one column, qualified by the primary key condition.

This part is OK; a single query is not taking extremely long, but the
problem is that we execute 500 of them in a transaction, and that takes
too long and blocks other activities.

Actually I've done an iostat run in the meantime (just learned how to
use it), and it looks like the disk is 100% saturated. So it clearly is
a disk issue in this case. And it turns out the Oracle hardware has an
edge of 3 years over what I have for postgres, so that might very well
explain the performance difference... Oh well.
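
For the record, the kind of iostat invocation that shows extended
per-device statistics (the 5-second interval is arbitrary) is:

$ iostat -x 5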

Next we'll upgrade the postgres hardware, and then I'll come back to
report if it's working better... sorry for the noise for now.

Cheers,
Csaba.

BTW, is the config file good enough for the kind of machine I have?
It's the first time I've had to make a production configuration, and
most of the settings are taken from the annotated config guide from
varlena, with some guesswork added...

> I'm not the most experience person on this list, but I've got some big
> tables I work with. Doing an update on these big tables often involves
> a sequential scan which can be quite slow.
>
> I would suggest posting the explain analyze output for one of your
> slow updates. I'll bet it is much more revealing and takes out a lot
> of the guesswork.
>
> --
> Matthew Nuzum
> www.bearfruit.org


Re: Help tuning postgres

From: Tom Lane

Emil Briggs <emil@baymountain.com> writes:
>> Not yet, the db is in production use and I have to plan for a down-time
>> for that... or is it not impacting the activity on the table ?

> It will cause some performance hit while you are doing it.

It'll also lock out writes on the table until the index is rebuilt,
so he does need to schedule downtime.

            regards, tom lane

Re: Help tuning postgres

From: Steve Poe

Would it not be faster to do a dump/reload of the table than reindex or
is it about the same?

Steve Poe

On Wed, 2005-10-12 at 13:21 -0400, Tom Lane wrote:
> Emil Briggs <emil@baymountain.com> writes:
> >> Not yet, the db is in production use and I have to plan for a down-time
> >> for that... or is it not impacting the activity on the table ?
>
> > It will cause some performance hit while you are doing it.
>
> It'll also lock out writes on the table until the index is rebuilt,
> so he does need to schedule downtime.
>
>             regards, tom lane
>


Re: Help tuning postgres

From: "Merlin Moncure"

>
> Would it not be faster to do a dump/reload of the table than reindex
> or is it about the same?
>
>
reindex is probably faster, but that's not the point. You can reindex a
running system, whereas dump/restore requires downtime unless you work
everything into a transaction, which is a headache, and dangerous.

reindex locking is very granular, in that it only acquires an exclusive
lock on one index at a time, and while doing so reading is possible
(writes will wait).

In 8.1 we get a fire-and-forget REINDEX DATABASE xyz, which is about as
good as it gets without a dump/load or a full vacuum.
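
For instance (index and table names are placeholders), one index at a
time, or a whole table:

REINDEX INDEX big_table_status_idx;
REINDEX TABLE big_table;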

Merlin

Re: Help tuning postgres

From: Andrew Sullivan

On Wed, Oct 12, 2005 at 06:55:30PM +0200, Csaba Nagy wrote:
> Ok, that was the first thing I've done, checking out the explain of the
> query. I don't really need the analyze part, as the plan is going for
> the index, which is the right decision. The updates are simple one-row

How do you know?  You _do_ need the ANALYSE, because it'll tell you
what the query _actually did_ as opposed to what the planner thought
it was going to do.

Note that EXPLAIN ANALYSE actually performs the work, so you better
do it in a transaction and ROLLBACK if it's a production system.
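
A minimal sketch of that, with made-up names:

BEGIN;
EXPLAIN ANALYSE
UPDATE big_table SET some_col = some_col + 1 WHERE id = 12345;
ROLLBACK;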

> Actually I've done an iostat run in the meantime (just learned how to
> use it), and looks like the disk is 100 saturated. So it clearly is a
> disk issue in this case. And it turns out the Oracle hardware has an

Yes, but it could be a disk issue because you're doing more work than
you need to.  If your UPDATEs are chasing down a lot of dead tuples,
for instance, you'll peg your I/O even though you ought to have I/O
to burn.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.
        --George Orwell

Re: Help tuning postgres

From: Csaba Nagy

[snip]
> Yes, but it could be a disk issue because you're doing more work than
> you need to.  If your UPDATEs are chasing down a lot of dead tuples,
> for instance, you'll peg your I/O even though you ought to have I/O
> to burn.

OK, this sounds interesting, but I don't understand: why would an update
"chase down a lot of dead tuples" ? Should I read up on some docs, cause
I obviously don't know enough about how updates work on postgres...

And how would the analyze help in finding this out ? I thought it would
only show me additionally the actual timings, not more detail in what
was done...

Thanks,
Csaba.



Re: Help tuning postgres

From: "Markus Wollny"

pgsql-performance-owner@postgresql.org wrote:

>>> Have you tried reindexing your active tables?

> It will cause some performance hit while you are doing it. It
> sounds like something is bloating rapidly on your system and
> the indexes is one possible place that could be happening.

You might consider using contrib/oid2name to monitor physical growth of
tables and indexes. There have been some issues with bloat in PostgreSQL
versions prior to 8.0; however, there might still be some issues under
certain circumstances even now, so it does pay to cast an eye on what's
going on. If you haven't run vacuum regularly, this might lead to
regular vacuums not reclaiming enough dead tuples in one go; so if
you've had quite a lot of UPDATE/DELETE activity going on in the past
and only just started to use pg_autovacuum after the DB has been in
production for quite a while, you might indeed have to run a VACUUM FULL
and/or REINDEX on the affected tables, both of which will more or less
lock out any client access to the tables as long as they're running.
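
A rough way to keep an eye on physical size without any extra tools
(this just lists the largest relations by on-disk pages, 8 kB each on a
default build; nothing here is schema-specific):

SELECT relname, relkind, relpages, reltuples
FROM pg_class
WHERE relkind IN ('r', 'i', 't')
ORDER BY relpages DESC
LIMIT 20;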

Kind regards

   Markus

Re: Help tuning postgres

From: "Markus Wollny"

pgsql-performance-owner@postgresql.org wrote:

> Next we'll upgrade the postgres hardware, and then I'll come
> back to report if it's working better... sorry for the noise for now.

There have been some discussions about which hardware suits PostgreSQL's
needs best under certain load characteristics. We experienced quite a
write-performance boost just from switching from a RAID5 config to
RAID10 (mirroring & striping), even though we had been using some
supposedly sufficiently powerful dedicated battery-backed
SCSI RAID adapters with lots of on-board cache. You can't beat simple,
although it will cost disk space. Anyway, you might want to search the
archives for discussions on RAID configurations.

Re: Help tuning postgres

From: Andrew Sullivan

On Thu, Oct 13, 2005 at 10:15:03AM +0200, Csaba Nagy wrote:
>
> OK, this sounds interesting, but I don't understand: why would an update
> "chase down a lot of dead tuples" ? Should I read up on some docs, cause
> I obviously don't know enough about how updates work on postgres...

Right.  Here's the issue:

MVCC does not replace rows when you update.  Instead, it marks the
old row as expired, and sets the new values.  The old row is still
there, and it's available for other transactions who need to see it.
As the docs say (see
<http://www.postgresql.org/docs/8.0/interactive/transaction-iso.html>),
"In effect, a SELECT query sees a snapshot of the database as of the
instant that that query begins to run."  And that can be true because
the original data is still there, although marked as expired for
subsequent transactions.

UPDATE works the same way as SELECT in terms of searching for rows
(so does any command that searches for data).

Now, when you select data, you actually have to traverse all the
existing versions of the tuple in order to get the one that's live
for you.  This is normally not a problem: VACUUM goes around and
cleans out old, expired data that is not live for _anyone_.  It does
this by looking for the oldest transaction that is open.  (As far as
I understand it, this is actually the oldest transaction in the
entire back end; but I've never understood why that should be the
case, and I'm too incompetent/dumb to understand the code, so I may
be wrong on this point.)  If you have very long-running transactions,
then, you can end up with a lot of versions of dead tuples on the
table, and so reading the few records you want can turn out actually
to be a very expensive operation, even though it ought to be cheap.

You can see this by using the VERBOSE option to VACUUM:

test=# VACUUM VERBOSE eval1 ;
INFO:  vacuuming "public.eval1"
INFO:  "eval1": found 0 removable, 0 nonremovable row versions in 0
pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_18831"
INFO:  index "pg_toast_18831_index" now contains 0 row versions in 1
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_18831": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Note those "removable" and "nonremovable" row versions.  It's the
unremovable ones that can hurt.  WARNING: doing VACUUM on a big table
on a disk that's already pegged is going to cause you performance
pain, because it scans the whole table.  In some cases, though, you
have no choice: if the winds are already out of your sails, and
you're effectively stopped, anything that might get you moving again
is an improvement.

> And how would the analyze help in finding this out ? I thought it would
> only show me additionally the actual timings, not more detail in what
> was done...

Yes, it shows the actual timings, and the actual number of rows.  But
if the estimates that the planner makes are wildly different than the
actual results, then you know your statistics are wrong, and that the
planner is going about things the wrong way.  ANALYSE is a big help.
There's also a verbose option to it, but it's usually less useful in
production situations.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
It is above all style through which power defers to reason.
        --J. Robert Oppenheimer

Re: Help tuning postgres

From: Csaba Nagy

Thanks Andrew, this explanation about the dead rows was enlightening.
It might be the reason for the slowdown I see on occasion, but not for
the case I was first observing: there the updated rows are different
for each update. It is possible that each row has a few dead versions,
but not too many, since each row is updated just a limited number of
times.

However, we have other updates which access the same row 1000s of times
(up to millions of times), and that could hurt if it's like you said,
i.e. if each update has to crawl over all the dead rows... I have now
autovacuum in place, and I'm sure it will kick in at ~ a few 10000s of
updates, but in the meantime it could get bad.
In any case, I suppose that those disk pages should be in OS cache
pretty soon and stay there, so I still don't understand why the disk
usage is 100% in this case (with very low CPU activity, the CPUs are
mostly waiting/idle)... the amount of actively used data is not that
big.

I'll try to vacuum the tables most exposed to this
multiple-dead-row-versions symptom through cron jobs, because autovacuum
probably won't do it often enough. Let's see if it helps...
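
A sketch of such a crontab entry (the database and table names are
placeholders), vacuuming and analyzing one hot table every hour:

0 * * * *  vacuumdb --dbname mydb --table big_table --analyze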

Thanks,
Csaba.


On Thu, 2005-10-13 at 14:40, Andrew Sullivan wrote:
> [snip]


Re: Help tuning postgres

From: Matthew Nuzum

On 10/13/05, Csaba Nagy <nagy@ecircle-ag.com> wrote:
> On Thu, 2005-10-13 at 14:40, Andrew Sullivan wrote:
> > On Thu, Oct 13, 2005 at 10:15:03AM +0200, Csaba Nagy wrote:
> > > And how would the analyze help in finding this out ? I thought it would
> > > only show me additionally the actual timings, not more detail in what
> > > was done...
> >
> > Yes, it shows the actual timings, and the actual number of rows.  But
> > if the estimates that the planner makes are wildly different than the
> > actual results, then you know your statistics are wrong, and that the
> > planner is going about things the wrong way.  ANALYSE is a big help.
> > There's also a verbose option to it, but it's usually less useful in
> > production situations.

This is the point I was trying to make. I've seen special instances
where people have posted an explain analyze for a select/update to
the list and suggestions have arisen allowing major performance
improvements.

If this task is where your database is performing its worst then it is
the best place to start with optimizing, short of the obvious stuff,
which it sounds like you've covered.

Sometimes, and I think this has often been true for databases that are
either very large or very small, statistics can be tweaked to get
better performance. One good example is when a sequential scan is
being chosen when an index scan may be better; something like this
would definitely peg your disk i/o.
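
Two sketches of that kind of tweak (the table and column names and the
statistics target are made up; ANALYZE must be re-run for a new target
to take effect):

ALTER TABLE big_table ALTER COLUMN status SET STATISTICS 100;
ANALYZE big_table;

-- or, as a blunt per-session experiment only:
SET enable_seqscan = off;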

Throwing more hardware at your problem will definitely help, but I'm a
performance freak and I like to optimize everything to the max.
*Sometimes* you can get drastic improvements without adding any
hardware. I have seen some truly miraculous turn-arounds from tweaking
some non-obvious settings based on suggestions made on this list.
--
Matthew Nuzum
www.bearfruit.org

Re: Help tuning postgres

From: Andrew Sullivan

On Thu, Oct 13, 2005 at 03:14:44PM +0200, Csaba Nagy wrote:
> In any case, I suppose that those disk pages should be in OS cache
> pretty soon and stay there, so I still don't understand why the disk
> usage is 100% in this case (with very low CPU activity, the CPUs are
> mostly waiting/idle)... the amount of actively used data is not that
> big.

Ah, but if the sum of all the dead rows is large enough that they
start causing your shared memory (== Postgres buffers) to thrash,
then you start causing the memory subsystem to thrash on the box,
which means less RAM is available for disk buffers because the OS is
doing more work; and the disk buffers are full of a lot of garbage
_anyway_, so then you may find that you're ending up hitting the disk
for some of these reads after all.  Around the office I have called
this the "buffer death spiral".  And note that once you've managed to
get into a vacuum-starvation case, your free space map might be
exceeded, at which point your database performance really won't
recover until you've done VACUUM FULL (prior to 7.4 there's also an
index problem that's even worse, and that needs occasional REINDEX to
solve; I forget which version you said you were using).
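
The free space map limits live in postgresql.conf; a sketch of the
knobs involved (the numbers are purely illustrative, not a
recommendation):

max_fsm_pages = 200000     # should cover the pages with free space across the cluster
max_fsm_relations = 1000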

The painful part about tuning a production system is really that you
have to keep about 50 variables juggling in your head, just so you
can uncover the one thing that you have to put your finger on to make
it all play nice.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
        --Alexander Hamilton

Re: Help tuning postgres

From: Robert Treat

reindex should be faster, since you're not dumping/reloading the table
contents on top of rebuilding the index, you're just rebuilding the
index.


Robert Treat
emdeon Practice Services
Alachua, Florida

On Wed, 2005-10-12 at 13:32, Steve Poe wrote:
>
> Would it not be faster to do a dump/reload of the table than reindex or
> is it about the same?
>
> Steve Poe

--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: Help tuning postgres

From: Csaba Nagy

In the light of what you've explained below about "nonremovable" row
versions reported by vacuum, I wonder if I should worry about the
following type of report:

INFO:  vacuuming "public.some_table"
INFO:  "some_table": removed 29598 row versions in 452 pages
DETAIL:  CPU 0.01s/0.04u sec elapsed 18.77 sec.
INFO:  "some_table": found 29598 removable, 39684 nonremovable row
versions in 851 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.02s/0.07u sec elapsed 23.16 sec.
VACUUM


Does that mean that the 39684 nonremovable row versions are actually
the active live rows in the table (as it reports 0 dead)? I'm sure I
don't have any long-running transaction, at least according to
pg_stat_activity (backed by the Linux ps too). Or should I run a vacuum
full...
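
For reference, a minimal query against that view (the column names are
the 8.x ones; current_query is only populated when stats_command_string
is on):

SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
ORDER BY query_start;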

This table is one of those with frequently updated rows.

TIA,
Csaba.


On Thu, 2005-10-13 at 14:40, Andrew Sullivan wrote:
> [snip]
> Note those "removable" and "nonremovable" row versions.  It's the
> unremovable ones that can hurt.
> [snip]


Re: Help tuning postgres

From: Csaba Nagy

First of all thanks all for the input.

I probably can't afford even the reindex till Christmas, when we have
about 2 weeks of company holiday... but I guess I'll have to do
something before Christmas.

The system should at least look like it's working all the time. I can
have downtime, but only for short periods, preferably less than 1
minute. The tables we're talking about have ~10 million rows (the
smaller ones) to ~150 million rows (the bigger ones), and I guess a
reindex will take quite some time.

I wonder if I could devise a scheme like:

 - create a temp table exactly like the production table, including
indexes and foreign keys;
 - create triggers on the production table which log all inserts,
deletes, updates to a log table;
 - activate these triggers;
 - copy all data from the production table to the temp table (this will
take the bulk of the time needed for the whole operation);
 - replay the log on the temp table repeatedly if necessary, until the
temp table is sufficiently close to the original;
 - rename the original table to something else, and then rename the temp
table to the original name, all this in a transaction (a sketch of this
step is below) - this would ideally be the only visible delay for the
user, and if the system is not busy, it should be quick I guess;
 - replay the log one more time;
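
A minimal sketch of the swap step (the table names are placeholders):

BEGIN;
ALTER TABLE big_table RENAME TO big_table_old;
ALTER TABLE big_table_copy RENAME TO big_table;
COMMIT;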

All this should happen at a point in time when there's little traffic
to the database.

Replaying could be as simple as a few delete triggers on the log table,
which replay the deleted record on the temp table, with the replay then
consisting of a delete operation on the log table. This is so that new
log entries can be replayed later without replaying again what was
already replayed.

The big tables I should do this procedure on have a low probability of
conflicting operations (like insert and immediate delete of the same
row, multiple inserts of the same row, multiple conflicting updates of
the same row, etc.); this is why I think replaying the log will work
fine... of course this whole setup will be a lot more work than just a
reindex...

I wonder if somebody has tried anything like this and whether it has a
chance of working?

Thanks,
Csaba.

On Tue, 2005-10-18 at 17:18, Robert Treat wrote:
> reindex should be faster, since you're not dumping/reloading the table
> contents on top of rebuilding the index, you're just rebuilding the
> index.
>
>
> Robert Treat
> emdeon Practice Services
> Alachua, Florida


Re: Help tuning postgres

From: Andrew Sullivan

On Tue, Oct 18, 2005 at 05:21:37PM +0200, Csaba Nagy wrote:
> INFO:  vacuuming "public.some_table"
> INFO:  "some_table": removed 29598 row versions in 452 pages
> DETAIL:  CPU 0.01s/0.04u sec elapsed 18.77 sec.
> INFO:  "some_table": found 29598 removable, 39684 nonremovable row
> versions in 851 pages
> DETAIL:  0 dead row versions cannot be removed yet.

> Does that mean that 39684 nonremovable pages are actually the active
> live pages in the table (as it reports 0 dead) ? I'm sure I don't have
> any long running transaction, at least according to pg_stats_activity
> (backed by the linux ps too). Or I should run a vacuum full...
>
> This table is one of which has frequently updated rows.

No, you should be ok there.  What that should tell you is that you
have about 40,000 live rows in the table.  But notice that your vacuum
process just removed dead row versions amounting to about 75% of the
live row count.  Moreover, your 39684 rows are taking 851 pages.  On a
standard installation that's 8 kB/page, so that's about 6,808 kB of
physical storage space you're using.  Is that consistent with the size
of your data?  If it's very large compared to the data you have stored
in there, you may want to ask if you're "leaking" space from the free
space map (because of that table turnover, which seems pretty severe).

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.
        --George Orwell