Re: Index bloat problem? - Mailing list pgsql-performance

From David Roussel
Subject Re: Index bloat problem?
Date
Msg-id 869771629a1d24a91b8df779f7cddfe9@diroussel.xsmail.com
Whole thread Raw
In response to Re: Index bloat problem?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Index bloat problem?
List pgsql-performance
On 22 Apr 2005, at 06:57, Tom Lane wrote:
> Bill Chandler <billybobc1210@yahoo.com> writes:
>> Client is reporting that the size of an index is
>> greater than the number of rows in the table (1.9
>> million vs. 1.5 million).
>
> This thread seems to have wandered away without asking the critical
> question "what did you mean by that?"
>
> It's not possible for an index to have more rows than there are in
> the table unless something is seriously broken.  And there aren't
> any SQL operations that let you inspect an index directly anyway.
> So: what is the actual observation that led you to the above
> conclusion?  Facts, please, not inferences.

I work for the client in question. Glad you picked up on that point.  I
covered the detail in my my post "How can an index be larger than a
table" on 21 Apr. 2005.  I guess I was too detailed, and too much info
put people off.
http://archives.postgresql.org/pgsql-performance/2005-04/msg00553.php

quoting from there...

|
|SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE
relname LIKE 'dave_data%';
|
|relname                                relkind reltuples relpages
|======================================= ======= ========= ========
|dave_data_update_events                r       1593600.0 40209
|dave_data_update_events_event_id_key   i       1912320.0 29271
|dave_data_update_events_event_id_seq   S       1.0       1
|dave_data_update_events_lds_idx        i       1593600.0 6139
|dave_data_update_events_obj_id_idx     i       1593600.0 6139
|iso_pjm_data_update_events_obj_id_idx  i       1593600.0 6139
|

Note that there are only 1593600 rows in the table, so why the 1912320
figure?

Of course I checked that the row count was correct...

|
|EXPLAIN ANALYZE
|select count(*) from iso_pjm_data_update_events
|
|QUERY PLAN
|Aggregate  (cost=60129.00..60129.00 rows=1 width=0) (actual
time=35933.292..35933.293 rows=1 loops=1)
|  ->  Seq Scan on iso_pjm_data_update_events  (cost=0.00..56145.00
rows=1593600 width=0) (actual time=0.213..27919.497 rows=1593600
loops=1)
|Total runtime: 35933.489 ms
|

and...

|
|select count(*) from iso_pjm_data_update_events
|
|count
|1593600
|

so it's not that there are any undeleted rows lying around


pgsql-performance by date:

Previous
From: "Dave Page"
Date:
Subject: Re: Joel's Performance Issues WAS : Opteron vs Xeon
Next
From: Richard van den Berg
Date:
Subject: Re: When are index scans used over seq scans?