Re: Curious about dead rows. - Mailing list pgsql-performance

From Jean-David Beyer
Subject Re: Curious about dead rows.
Date
Msg-id 473A5CC0.3060003@verizon.net
Whole thread Raw
In response to Re: Curious about dead rows.  ("Merlin Moncure" <mmoncure@gmail.com>)
Responses Re: Curious about dead rows.
List pgsql-performance
Merlin Moncure wrote:
> On Nov 10, 2007 1:38 PM, Jean-David Beyer <jeandavid8@verizon.net> wrote:
>> Tom Lane wrote:
>>> Jean-David Beyer <jeandavid8@verizon.net> writes:
>>>> I am doing lots of INSERTs on a table that starts out empty (I did a
>>>> TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
>>>> on. I moved logging up to debug2 level to see what was going on, and I get
>>>> things like this:
>>>>  "vl_as": scanned 3000 of 5296 pages, containing 232944 live rows and 1033
>>>> dead rows; 3000 rows in sample, 411224 estimated total rows
>>>> A little later, it says:
>>>> "vl_as": scanned 3000 of 6916 pages, containing 233507 live rows and 493
>>>> dead rows; 3000 rows in sample, 538311 estimated total rows
>>> Well, *something* is doing deletes or updates in that table.  Better
>>> look a bit harder at your application ...
>>>
>> OK, you agree that if I am doing only INSERTs, that there should not be any
>> dead rows. Therefore, I _must_ be doing deletes or updates.
>>
>> But the program is pretty simple, and I see no UPDATEs or DELETEs. I
>> searched all the program source files (that contain none of them) and all
>> the libraries I have written, and they have none either. Right now the
>> programs are not to the state where UPDATEs or DELETEs are required (though
>> they will be later). I am still developing them and it is easier to just
>> restore from backup or start over from the beginning since most of the
>> changes are data laundering from an ever-increasing number of spreadsheets.
>>
>> Am I right that TRUNCATE deletes all the rows of a table. They may then be
>> still there, but would not autovacuum clean out the dead rows? Or maybe it
>> has not gotten to them yet? I could do an explicit one earlier.
>
> what does pg_stat_all_tables say (assuming row level stats are on)?

# - Query/Index Statistics Collector -

#stats_start_collector = on
stats_start_collector = on

#stats_command_string = off
#stats_block_level = off

#stats_row_level = off
stats_row_level = on

#stats_reset_on_server_start = off

>
It says stuff like this:

 relname  | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del
----------+----------+--------------+----------+---------------+-----------+-
 ibd      |       75 |      9503850 |       11 |       2350555 |   2416845 |
        0 |         0
 vl_cf    |      139 |     38722575 |       22 |       5392609 |   5692814 |
        0 |         0
 vl_li    |      139 |     39992838 |       22 |       5569855 |   5885516 |
        0 |         0

I removed the relid and schemaname and squeezed the other columns so it
would not be quite so wide. Is this what you might like to know?

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 21:10:01 up 21 days, 14:28, 3 users, load average: 6.20, 5.69, 5.11

pgsql-performance by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Curious about dead rows.
Next
From: "Merlin Moncure"
Date:
Subject: Re: dell versus hp