Thread: reindexing pg_shdepend

reindexing pg_shdepend

From
Joseph S
Date:
My pg_shdepend table has a size of 16,384, but
pg_shdepend_depender_index has a size of 19,169,280 and
pg_shdepend_reference_index has a size of 49,152.  When I try to reindex
the table I get:

ERROR:  shared table "pg_shdepend" can only be reindexed in stand-alone mode

So is there any way I can clear this bloat w/o restarting the server?

Re: reindexing pg_shdepend

From
Alvaro Herrera
Date:
Joseph S wrote:
> My pg_shdepend table has a size of 16,384, but pg_shdepend_depender_index
> has a size of 19,169,280 and pg_shdepend_reference_index has a size of
> 49,152.  When I try to reindex the table I get:
>
> ERROR:  shared table "pg_shdepend" can only be reindexed in stand-alone
> mode
>
> So is there any way I can clear this bloat w/o restarting the server?

Nope :-(

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: reindexing pg_shdepend

From
Tom Lane
Date:
Joseph S <jks@selectacast.net> writes:
> My pg_shdepend table has a size of 16,384, but
> pg_shdepend_depender_index has a size of 19,169,280 and
> pg_shdepend_reference_index has a size of 49,152.

I'd be interested to see the usage pattern that made it get like that
...

            regards, tom lane

Re: reindexing pg_shdepend

From
Tom Lane
Date:
Joseph S <jks@selectacast.net> writes:
> Me too.  I don't change my db schema that much, but I experience bloat
> in the pg_tables that I don't expect.  For instance pg_opclass needs a
> VACUUM FULL/REINDEX once a week or I notice the indexes are larger than
> the table itself.  Could it be my heavy use of temp tables?

pg_opclass?  That's read-only for most people.  What are you doing with
operator classes?

Heavy use of temp tables would expand pg_class, pg_type, and especially
pg_attribute, but as long as you have a decent vacuuming regimen (do you
use autovac?) they shouldn't get out of hand.

            regards, tom lane

Re: reindexing pg_shdepend

From
Joseph S
Date:
Me too.  I don't change my db schema that much, but I experience bloat
in the pg_tables that I don't expect.  For instance pg_opclass needs a
VACUUM FULL/REINDEX once a week or I notice the indexes are larger than
the table itself.  Could it be my heavy use of temp tables?

Today I noticed that pg_statistic (which I actually expect to be updated
in the normal course of operations) was over 20 meg (with large indexes
as well) so I gave it a VACUUM FULL/REINDEX and it now stands at 344,064.

These tables never get *really* large, so I've never noticed a big
performance hit, but they still get bigger than they should be and could
be slowing everything down a little.

around 3pm today I did a VACUUM FULL/REINDEX of all the trouble tables I
have in my list, and I didn't save the before/after sizes, bit it is
9:30 pm now and I can tell you how much they've grown since then:

BEFORE VACUUM FULL/REINDEX at 9:30:

pg_catalog    pg_class    table    196,608    21.526
pg_catalog    pg_class_oid_index    index    49,152    81.8
pg_catalog    pg_class_relname_nsp_index    index    172,032    21.526


pg_catalog    pg_type    table    180,224    15.045
pg_catalog    pg_type_oid_index    index    40,960    66.2
pg_catalog    pg_type_typname_nsp_index    index    106,496    25.462

AFTER:

pg_catalog    pg_class    table    81,920    41.1
pg_catalog    pg_class_oid_index    index    32,768    102.5
pg_catalog    pg_class_relname_nsp_index    index    57,344    58.714

pg_catalog    pg_type    table    65,536    41.375
pg_catalog    pg_type_oid_index    index    16,384    165.5
pg_catalog    pg_type_typname_nsp_index    index    49,152    55.167

My apologies for the tabs.  That was a cut & paste from a web page I set
up to monitor the database size.  The columns are: schema    relname
Type    bytes    tuplesperpage

The list of tables I have in my list are:
pg_attribute pg_class pg_depend pg_index pg_shdepend pg_proc
pg_statistic pg_type pg_trigger pg_shdepend
I put them in my list bec. I once noticed that their indexes seemed big
relative to the size of the table itself.  I didn't really analyze if
they were indeed recurring problems or just one time problems, but I
know pg_class and pg_opclass are ones where this is a recurring problem.

BTW Tom do you prefer the replies to go to you directly as well as to
the list?  Most of the time I just hit 'Reply' and since this list
doesn't set the Reply-to: the replies go to the OP as well.

Tom Lane wrote:
> Joseph S <jks@selectacast.net> writes:
>> My pg_shdepend table has a size of 16,384, but
>> pg_shdepend_depender_index has a size of 19,169,280 and
>> pg_shdepend_reference_index has a size of 49,152.
>
> I'd be interested to see the usage pattern that made it get like that
> ...
>
>             regards, tom lane


index bloat WAS: reindexing pg_shdepend

From
Joseph S
Date:
Tom Lane wrote:
> Joseph S <jks@selectacast.net> writes:
>> Me too.  I don't change my db schema that much, but I experience bloat
>> in the pg_tables that I don't expect.  For instance pg_opclass needs a
>> VACUUM FULL/REINDEX once a week or I notice the indexes are larger than
>> the table itself.  Could it be my heavy use of temp tables?
>
> pg_opclass?  That's read-only for most people.  What are you doing with
> operator classes?

I know.  I can't figure it out.  I barely know what operator classes
are, but I'm pretty sure I'm not modifying them in any way.
>
> Heavy use of temp tables would expand pg_class, pg_type, and especially
> pg_attribute, but as long as you have a decent vacuuming regimen (do you
> use autovac?) they shouldn't get out of hand.
>
I do use autovac.  Like I said they don't get really out of hand, only
up to 20 megs or so before I noticed that it was weird.  The large
indexes are what tipped me off that something strange was going on.

I only noticed this because I was making an effort to monitor index
bloat on my regular tables.  It could be there are a lot of people out
there who are experiencing this but don't notice because 20 megs here
and there don't cause any noticeable problems.

So how about it list?  Do you know how bloated your indexes are getting?
  I use this sql:

select (select nspname FROM pg_catalog.pg_namespace where oid =
relnamespace) AS schema, relname,  CASE c.relkind WHEN 'r' THEN 'table'
WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN
's' THEN 'special' END as "Type" ,CASE c.relkind  IN ('i','r','S','')
WHEN  true  THEN pg_relation_size(relname) END AS bytes, CASE relpages >
0 WHEN true THEN reltuples/relpages END AS tuplesperpage FROM
pg_catalog.pg_class c WHERE pg_catalog.pg_table_is_visible(c.oid) order
by schema, relname;

... and when I notice that the tuplesperpage for the indexes is low (or
that the indexes are bigger then the tables themselves) I know it is
time for a VACUUM FULL and REINDEX on that table.

If you really want to get fancy you can save the results of that into a
table with a timestamp.  Then every (insert time period here) run VACUUM
FULL/REINDEXs on the individual tables and store the new sizes with
timestamps.

Re: index bloat WAS: reindexing pg_shdepend

From
Tom Lane
Date:
Joseph S <jks@selectacast.net> writes:
> ... and when I notice that the tuplesperpage for the indexes is low (or
> that the indexes are bigger then the tables themselves) I know it is
> time for a VACUUM FULL and REINDEX on that table.

If you are taking the latter as a blind must-be-wrong condition, you are
fooling yourself -- it's not true for small tables.  For instance, in a
freshly initdb'd database:

postgres=# vacuum verbose pg_opclass;
INFO:  vacuuming "pg_catalog.pg_opclass"
INFO:  index "pg_opclass_am_name_nsp_index" now contains 107 row versions in 4 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_opclass_oid_index" now contains 107 row versions in 2 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_opclass": found 0 removable, 107 nonremovable row versions in 2 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.01s/0.00u sec elapsed 0.00 sec.
VACUUM
postgres=#

Have you checked whether the VACUUM FULL + REINDEX actually makes
anything smaller?

            regards, tom lane

Re: index bloat WAS: reindexing pg_shdepend

From
Michael Fuhr
Date:
On Thu, Aug 02, 2007 at 10:40:24PM -0400, Joseph S wrote:
> Tom Lane wrote:
> >Heavy use of temp tables would expand pg_class, pg_type, and especially
> >pg_attribute, but as long as you have a decent vacuuming regimen (do you
> >use autovac?) they shouldn't get out of hand.
>
> I do use autovac.  Like I said they don't get really out of hand, only
> up to 20 megs or so before I noticed that it was weird.  The large
> indexes are what tipped me off that something strange was going on.

Unexpected bloat in pg_shdepend led me to discover a problem with
statistics for shared tables a couple of months ago:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00190.php
http://archives.postgresql.org/pgsql-hackers/2007-06/msg00245.php

> I only noticed this because I was making an effort to monitor index
> bloat on my regular tables.  It could be there are a lot of people out
> there who are experiencing this but don't notice because 20 megs here
> and there don't cause any noticeable problems.

Anybody making heavy use of temporary tables and relying on autovacuum
is probably suffering bloat in pg_shdepend because no released
version of PostgreSQL has the fix for the statistics bug (it has
been fixed in CVS, however).  As I mention in the second message
above, vacuuming pg_shdepend resulted in an immediate performance
improvement in an application I was investigating.

--
Michael Fuhr

Re: index bloat WAS: reindexing pg_shdepend

From
Joseph Shraibman
Date:
Tom Lane wrote:
> Joseph S <jks@selectacast.net> writes:
>> ... and when I notice that the tuplesperpage for the indexes is low (or
>> that the indexes are bigger then the tables themselves) I know it is
>> time for a VACUUM FULL and REINDEX on that table.
>
> If you are taking the latter as a blind must-be-wrong condition, you are
> fooling yourself -- it's not true for small tables.

I know it isn't true for small tables.  Tables can have a size of 0 but
the minimum size for an index seems to be two pages.  Indexes can also
rival the size of the table when the table when the index is on all the
columns of the table.  But most of the time having an index bigger than
the table itself mean I need a REINDEX.


>
> Have you checked whether the VACUUM FULL + REINDEX actually makes
> anything smaller?

Yes.  I'm mostly seeing the problem on tables of counts that are updated
frequently by triggers on other tables.  It seems autovacuum can't keep
up with the frequency of updates.  The table size itself can shrink by
50%, but the indexes can shrink by 90%.

I just ran my VACUUM FULL/REINDEX script at 11am.  Last time I ran it
was 930pm last night.  Some before/afters:

BEFORE

pg_catalog    pg_class    table    172,032    19.476
pg_catalog    pg_class_oid_index    index    57,344    58.429
pg_catalog    pg_class_relname_nsp_index    index    180,224    18.591

AFTER

pg_catalog    pg_class    table    90,112    41.3
pg_catalog    pg_class_oid_index    index    32,768    103
pg_catalog    pg_class_relname_nsp_index    index    73,728    59

BEFORE

public    acount    table    434,176    119.302
public    acount_pkey    index    172,032    301.095
public    ad_x_idx    index    638,976    36.551

AFTER

public    acount    table    335,872    155.561
public    acount_pkey    index    163,840    318.9
public    a_x_idx    index    131,072    221.143

Re: index bloat WAS: reindexing pg_shdepend

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> On Thu, Aug 02, 2007 at 10:40:24PM -0400, Joseph S wrote:
>> I do use autovac.  Like I said they don't get really out of hand, only
>> up to 20 megs or so before I noticed that it was weird.  The large
>> indexes are what tipped me off that something strange was going on.

> Unexpected bloat in pg_shdepend led me to discover a problem with
> statistics for shared tables a couple of months ago:

> http://archives.postgresql.org/pgsql-hackers/2007-06/msg00190.php
> http://archives.postgresql.org/pgsql-hackers/2007-06/msg00245.php

Hmm ... that problem would have caused autovac to mostly ignore the
shared tables, but in such a scenario you'd expect the table itself
and both indexes to all be bloated.  The thing that struck me about
Joseph's report was that the one index was so much more bloated than
the other.  The index entries are only slightly larger (3 OIDs not 2)
so there's no obvious reason for this.

The fact that the indexes are bloated and the table itself not can be
explained by a history of manual VACUUM FULLs, but that should have
had similar effects on both indexes.

We know that vacuum's inability to merge mostly-but-not-entirely-empty
index pages can lead to index bloat given a sufficiently unfriendly
usage pattern, and I think that must be what happened here, but I'm not
clear what that usage pattern is.  If we had those details we could
possibly work around it by changing the column ordering in the index
--- AFAIR there isn't any particular reason for
pg_shdepend_depender_index to have one column order rather than another.

            regards, tom lane

Re: index bloat WAS: reindexing pg_shdepend

From
Tom Lane
Date:
OK, as far as I saw you never mentioned what PG version you are running,
but if it's 8.2.x then I think I know what's going on.  The thing that
was bothering me was the discrepancy in size of the two indexes.  Now
the entries in pg_shdepend_reference_index are all going to be
references to roles, and the index dump you sent me showed that they're
all references to the *same* role, ie, there's only one user ID doing
all the creation and deletion of temp tables.  On the other hand, the
entries in pg_shdepend_depender_index will be pg_class and pg_type
references with all different OIDs as the OID counter advances over
time.  So the keys in pg_shdepend_depender_index are all distinct
whereas the ones in pg_shdepend_reference_index are mostly the same.

What apparently is happening is that the 8.2 optimization to avoid
splitting btree pages when we are able to make room by removing
LP_DELETED index tuples kicks in for pg_shdepend_reference_index but not
for pg_shdepend_depender_index.  I think this is precisely because the
keys are all the same in the former: when we scan to mark the latest
temp table's entry deleted, we will visit all the other keys that aren't
yet LP_DELETED, and if they are now dead they'll get marked, and then
the next time we fill up the page we will find we can remove them.  But
in pg_shdepend_reference_index there is not any reason for the system to
revisit an old index entry and discover that it can be marked LP_DELETED.
So those entries will stay there until they get vacuumed.

In short, I think what happened is that pg_shdepend bloated because
autovacuum wasn't touching it (because of that statistics bug), and
pg_shdepend_depender_index bloated right along with it, but
pg_shdepend_reference_index didn't bloat because it was able to recycle
tuples sooner.  The current state you report (in which the table is
small too) would have been reached after a manual VACUUM or VACUUM FULL.

I was able to duplicate this behavior by having an unprivileged user
create and immediately drop a temp table, several thousand times in
succession, and then finally vacuuming pg_shdepend (autovac was off to
prevent bollixing the experiment).  The vacuum shows

regression=# vacuum verbose pg_shdepend;
INFO:  vacuuming "pg_catalog.pg_shdepend"
INFO:  scanned index "pg_shdepend_depender_index" to remove 4000 row versions
DETAIL:  CPU 0.00s/0.03u sec elapsed 0.02 sec.
INFO:  scanned index "pg_shdepend_reference_index" to remove 4000 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_shdepend": removed 4000 row versions in 26 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_shdepend_depender_index" now contains 1 row versions in 19 pages
DETAIL:  4000 index row versions were removed.
15 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_shdepend_reference_index" now contains 1 row versions in 2 pages
DETAIL:  346 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_shdepend": found 4000 removable, 1 nonremovable row versions in 26 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
26 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.03u sec elapsed 0.02 sec.
INFO:  "pg_shdepend": truncated 26 to 1 pages
DETAIL:  CPU 0.01s/0.00u sec elapsed 0.01 sec.
VACUUM

Notice that pg_shdepend_reference_index never actually split at all ---
it's still just the metapage and the root page.  So every time the root
filled, it was able to clean most of it out.

In short: nothing much to see here after all.  We'll just have to keep
in mind that the LP_DELETED recycling patch can have drastically
different effectiveness in different indexes of the same table, and
so it's very possible now for indexes to be of much different sizes.

            regards, tom lane