Thread: BUG #7853: Incorrect statistics in table with many dead rows.
The following bug has been logged on the website:

Bug reference:      7853
Logged by:          James Skaggs
Email address:      jimbob@seagate.com
PostgreSQL version: 8.4.14
Operating system:   RHEL6

Description:

After "analyze verbose", the table shows 158 million rows. A
select count(1) yields 13.8 million rows.

INFO:  analyzing "public.stream_file"
INFO:  "stream_file": scanned 30000 of 2123642 pages, containing
184517 live rows and 2115512 dead rows; 30000 rows in sample,
158702435 estimated total rows

Here are the table statistics:

    Sequential Scans          81853
    Sequential Tuples Read    578848425234
    Index Scans               1976513672
    Index Tuples Fetched      2183339860
    Tuples Inserted           65122575
    Tuples Updated            308883671
    Tuples Deleted            51238760
    Tuples HOT Updated        2242897
    Live Tuples               163981972
    Dead Tuples               7056493
    Heap Blocks Read          43483331819
    Heap Blocks Hit           43121456487
    Index Blocks Read         134539277
    Index Blocks Hit          13606451182
    Toast Blocks Read
    Toast Blocks Hit
    Toast Index Blocks Read
    Toast Index Blocks Hit
    Last Vacuum               2013-02-04 10:06:44.058743-07
    Last Autovacuum           2013-02-04 16:11:34.289823-07
    Last Analyze              2013-02-04 14:22:27.848547-07
    Last Autoanalyze          2013-02-01 17:37:29.855553-07
    Table Size                17 GB
    Toast Table Size          none
    Indexes Size              34 GB

Query returned successfully with no result in 4094 ms.

Bad statistics led to a bad plan. We will cluster the table today to see if
that fixes it, but I think statistics should be correct regardless of the
state of a table. BTW, the Coverity product requires 8.x, and we'll upgrade
to 8.4.15 today. I didn't see anything about better statistics in the
8.4.15 changelog.
"jimbob@seagate.com" <jimbob@seagate.com> wrote:=0A=0A> INFO:=A0 analyzing = "public.stream_file"=0A> INFO:=A0 "stream_file": scanned 30000 of 2123642 p= ages, containing=0A> 184517 live rows and 2115512 dead rows; 30000 rows in = sample,=0A> 158702435 estimated total rows=0A=0A184517 live rows in 30000 r= andomly sampled pages out of 2123642=0Atotal pages, means that the statisti= cs predict that a select=0Acount(*) will find about=A0 13 million live rows= to count.=0A=0A> After "analyze verbose", the table shows 158 million rows= . A=0A> select count(1) yields 13.8 million rows.=0A=0AOK, the estimate was= 13 million and there were actually 13.8=0Amillion, but it is a random samp= le used to generate estimates. =0AThat seems worse than average, but close = enough to be useful.=0AThe 158.7 million total rows includes dead rows, whi= ch must be=0Avisited to determine visibility, but will not be counted becau= se=0Athey are not visible to the counting transaction.=A0 Having over 90%= =0Aof your table filled with dead rows is a bad situation to be in,=0Afrom = a performance standpoint.=A0 You should use aggressive=0Amaintenance (like = VACUUM FULL or CLUSTER) to fix the existing=0Aextreme bloat, and then revie= w your autovacuum settings and overall=0Avacuum regimen to prevent future b= loat.=0A=0AThis does not look like a bug from the information provided so f= ar.=0A=0A-Kevin
Kevin -

Agreed, we shouldn't have so many dead rows.  Our autovacuum is set on but
has default parameters.  We are clustering today.  This, of course, means
downtime and inconvenience to the users.

Here is the troublesome query:

    select
        sum(case when t1.cdlc < 0 then 0 else t1.cdlc end) as code_line_cnt,
        sum(case when t1.cmlc < 0 then 0 else t1.cmlc end) as comment_line_cnt,
        sum(case when t1.bllc < 0 then 0 else t1.bllc end) as blank_line_cnt
    from
        ( select
              max(sf.current_code_line_count) as cdlc,
              max(sf.current_comment_line_count) as cmlc,
              max(sf.current_blank_line_count) as bllc
          from
              stream_file sf
          group by sf.file_path_id, sf.current_source_md5
        ) as t1;

And the explain plan.  Against an optimized database, both the plan and the
execution time were better:

    Aggregate  (cost=32073953.42..32073953.43 rows=1 width=12) (actual time=77394.354..77394.354 rows=1 loops=1)
      ->  GroupAggregate  (cost=29271854.25..31808889.97 rows=15146482 width=53) (actual time=62490.514..77333.105 rows=168591 loops=1)
            ->  Sort  (cost=29271854.25..29650516.30 rows=151464819 width=53) (actual time=62490.492..73098.009 rows=14403547 loops=1)
                  Sort Key: sf.file_path_id, sf.current_source_md5
                  Sort Method:  external merge  Disk: 942440kB
                  ->  Seq Scan on stream_file sf  (cost=0.00..3514999.19 rows=151464819 width=53) (actual time=0.327..20620.230 rows=14403547 loops=1)
    Total runtime: 77836.949 ms

Thanks and Regards,

James Skaggs
IT/FIS Longmont
SeaTel: 8 684 1048
General: +1 612 367 6224

On Tue, Feb 5, 2013 at 3:00 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> "jimbob@seagate.com" <jimbob@seagate.com> wrote:
>
> > INFO:  analyzing "public.stream_file"
> > INFO:  "stream_file": scanned 30000 of 2123642 pages, containing
> > 184517 live rows and 2115512 dead rows; 30000 rows in sample,
> > 158702435 estimated total rows
>
> 184517 live rows in 30000 randomly sampled pages out of 2123642
> total pages, means that the statistics predict that a select
> count(*) will find about 13 million live rows to count.
>
> > After "analyze verbose", the table shows 158 million rows. A
> > select count(1) yields 13.8 million rows.
>
> OK, the estimate was 13 million and there were actually 13.8
> million, but it is a random sample used to generate estimates.
> That seems worse than average, but close enough to be useful.
> The 158.7 million total rows includes dead rows, which must be
> visited to determine visibility, but will not be counted because
> they are not visible to the counting transaction.  Having over 90%
> of your table filled with dead rows is a bad situation to be in,
> from a performance standpoint.  You should use aggressive
> maintenance (like VACUUM FULL or CLUSTER) to fix the existing
> extreme bloat, and then review your autovacuum settings and overall
> vacuum regimen to prevent future bloat.
>
> This does not look like a bug from the information provided so far.
>
> -Kevin
James R Skaggs <james.r.skaggs@seagate.com> wrote:

> Agreed, we shouldn't have so many dead rows.  Our autovacuum is
> set on but has default parameters.  We are clustering today.
> This, of course, means downtime and inconvenience to the users.

Right, which is why it's important to figure out why the bloat
happened.  Sometimes it is unavoidable, like when you delete 90% of
the rows in your table or a long-lived "idle in transaction"
connection prevents autovacuum from being able to do its work
normally.  To prevent further downtime it is important to figure
out what happened and make appropriate changes to your monitoring
or vacuuming.

> Here is the troublesome query:
>
>> select
>>     sum(case when t1.cdlc < 0 then 0 else t1.cdlc end) as code_line_cnt,
>>     sum(case when t1.cmlc < 0 then 0 else t1.cmlc end) as comment_line_cnt,
>>     sum(case when t1.bllc < 0 then 0 else t1.bllc end) as blank_line_cnt
>>   from
>>     ( select
>>         max(sf.current_code_line_count) as cdlc,
>>         max(sf.current_comment_line_count) as cmlc,
>>         max(sf.current_blank_line_count) as bllc
>>       from
>>         stream_file sf
>>       group by sf.file_path_id, sf.current_source_md5
>>     ) as t1;

Well, I don't see that the planner has a lot of choice there
besides whether to use a sort or a hash to do the inner
aggregation.  Are you saying that prior to the bloat it used a hash
aggregation, and that was faster?  And that you feel that it should
be using that even with the bloat?  That the dead rows seem to be
getting included in the statistics, driving to the slower plan, and
you feel they should be omitted?

Note that I'm not arguing one way or another on these points at the
moment; I'm just trying to understand your point clearly.

-Kevin
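As a practical aside (not part of the original messages): on 8.4, the long-lived "idle in transaction" sessions Kevin mentions can usually be spotted with a query along these lines; the column names procpid and current_query are the 8.4-era ones and were renamed in later releases:

    SELECT procpid, usename, xact_start, current_query
    FROM pg_stat_activity
    WHERE current_query = '<IDLE> in transaction'
    ORDER BY xact_start;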
Kevin -

Here is the plan *after* we clustered on the PK index.  It was probably that
way before the bloat, but I don't know.

    Aggregate  (cost=514557.38..514557.40 rows=1 width=12) (actual time=8823.973..8823.974 rows=1 loops=1)
      ->  HashAggregate  (cost=471029.03..492793.20 rows=1243667 width=53) (actual time=8725.324..8789.926 rows=159369 loops=1)
            ->  Seq Scan on stream_file sf  (cost=0.00..315570.68 rows=12436668 width=53) (actual time=0.013..1152.600 rows=12436753 loops=1)
    Total runtime: 8838.395 ms

I guess the statistics should reflect the true costs.  So if the "bloated"
statistics are representative of the true cost, I'm all for them.  We are
clustering the database as we speak, so I can't force a hash aggregate to
see if that is the right choice or not.

Thanks and Regards,

James Skaggs
IT/FIS Longmont
SeaTel: 8 684 1048
General: +1 612 367 6224

On Tue, Feb 5, 2013 at 4:12 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> James R Skaggs <james.r.skaggs@seagate.com> wrote:
>
> > Agreed, we shouldn't have so many dead rows.  Our autovacuum is
> > set on but has default parameters.  We are clustering today.
> > This, of course, means downtime and inconvenience to the users.
>
> Right, which is why it's important to figure out why the bloat
> happened.  Sometimes it is unavoidable, like when you delete 90% of
> the rows in your table or a long-lived "idle in transaction"
> connection prevents autovacuum from being able to do its work
> normally.  To prevent further downtime it is important to figure
> out what happened and make appropriate changes to your monitoring
> or vacuuming.
>
> > Here is the troublesome query:
> >
> >> select
> >>     sum(case when t1.cdlc < 0 then 0 else t1.cdlc end) as code_line_cnt,
> >>     sum(case when t1.cmlc < 0 then 0 else t1.cmlc end) as comment_line_cnt,
> >>     sum(case when t1.bllc < 0 then 0 else t1.bllc end) as blank_line_cnt
> >>   from
> >>     ( select
> >>         max(sf.current_code_line_count) as cdlc,
> >>         max(sf.current_comment_line_count) as cmlc,
> >>         max(sf.current_blank_line_count) as bllc
> >>       from
> >>         stream_file sf
> >>       group by sf.file_path_id, sf.current_source_md5
> >>     ) as t1;
>
> Well, I don't see that the planner has a lot of choice there
> besides whether to use a sort or a hash to do the inner
> aggregation.  Are you saying that prior to the bloat it used a hash
> aggregation, and that was faster?  And that you feel that it should
> be using that even with the bloat?  That the dead rows seem to be
> getting included in the statistics, driving to the slower plan, and
> you feel they should be omitted?
>
> Note that I'm not arguing one way or another on these points at the
> moment; I'm just trying to understand your point clearly.
>
> -Kevin
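For what it's worth (an aside, not from the thread): a one-off comparison of the two plan shapes does not require fixing the statistics first. Penalizing the sort path in a single session will usually push the planner toward the HashAggregate plan, for example:

    BEGIN;
    SET LOCAL enable_sort = off;   -- discourage the Sort + GroupAggregate path
    EXPLAIN ANALYZE
    select
        sum(case when t1.cdlc < 0 then 0 else t1.cdlc end) as code_line_cnt,
        sum(case when t1.cmlc < 0 then 0 else t1.cmlc end) as comment_line_cnt,
        sum(case when t1.bllc < 0 then 0 else t1.bllc end) as blank_line_cnt
    from
        ( select
              max(sf.current_code_line_count) as cdlc,
              max(sf.current_comment_line_count) as cmlc,
              max(sf.current_blank_line_count) as bllc
          from stream_file sf
          group by sf.file_path_id, sf.current_source_md5
        ) as t1;
    ROLLBACK;

SET LOCAL confines the change to the transaction, so the setting cannot leak into other sessions.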
On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> "jimbob@seagate.com" <jimbob@seagate.com> wrote:
>
>> INFO:  analyzing "public.stream_file"
>> INFO:  "stream_file": scanned 30000 of 2123642 pages, containing
>> 184517 live rows and 2115512 dead rows; 30000 rows in sample,
>> 158702435 estimated total rows
>
> 184517 live rows in 30000 randomly sampled pages out of 2123642
> total pages, means that the statistics predict that a select
> count(*) will find about 13 million live rows to count.
>
>> After "analyze verbose", the table shows 158 million rows. A
>> select count(1) yields 13.8 million rows.
>
> OK, the estimate was 13 million and there were actually 13.8
> million, but it is a random sample used to generate estimates.
> That seems worse than average, but close enough to be useful.
> The 158.7 million total rows includes dead rows, which must be
> visited to determine visibility, but will not be counted because
> they are not visible to the counting transaction.

To clarify here, the 158.7 million estimate does not *intentionally*
include dead rows.  As you say, the ANALYZE did get a very good
instantaneous estimate of the number of live rows.  However, ANALYZE
doesn't overwrite the old estimate, it averages its estimate into the
old one.  After the table shape changes dramatically, the ANALYZE
needs to be run repeatedly before the estimate will converge to the
new reality.  (Of course a cluster or vacuum full will blow away the
old statistics, so the next analyze after that will solely determine
the new statistics.)

I agree, not a bug.

Cheers,

Jeff
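A hedged way to observe the convergence Jeff describes (not part of his message) is to compare the planner's stored estimate with the stats collector's live-tuple count after each ANALYZE; something along these lines should work on 8.4:

    ANALYZE stream_file;
    SELECT c.reltuples::bigint AS planner_row_estimate,
           s.n_live_tup,
           s.n_dead_tup
    FROM pg_class c
    JOIN pg_stat_user_tables s ON s.relid = c.oid
    WHERE c.relname = 'stream_file';
    -- repeat both statements; reltuples should drift toward the true live-row count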
Okay, I have some more info.

Some background info.  This one table gets so many changes, I CLUSTER it
each night.  However, after I do this, the statistics still appear to be
incorrect.  Even after I do a "select pg_stat_reset();" followed by 3
ANALYZE runs with default_statistics_target set to 1, 10, and 100:

    select relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_del, n_tup_hot_upd
    from pg_stat_all_tables

    ('stream_file', 109061143L, 103913868L, 0L, 14201L, 0L)

Is it possible that there are still dead tuples after a CLUSTER?

EXPLAIN ANALYZE thinks we have 112M rows (in fact there are 10M), so it
needs to do a disk sort with work_mem=1GB:

    Aggregate  (cost=23622814.39..23622814.40 rows=1 width=12) (actual time=57512.462..57512.462 rows=1 loops=1)
      ->  GroupAggregate  (cost=21536989.70..23425506.64 rows=11274728 width=53) (actual time=50825.396..57457.421 rows=202394 loops=1)
            Output: max(sf.current_code_line_count), max(sf.current_comment_line_count), max(sf.current_blank_line_count)
            ->  Sort  (cost=21536989.70..21818857.90 rows=112747280 width=53) (actual time=50825.386..55004.916 rows=7683730 loops=1)
                  Output: sf.current_code_line_count, sf.current_comment_line_count, sf.current_blank_line_count, sf.file_path_id, sf.current_source_md5
                  Sort Key: sf.file_path_id, sf.current_source_md5
                  Sort Method:  external merge  Disk: 502288kB
                  ->  Seq Scan on stream_file sf  (cost=0.00..2604208.80 rows=112747280 width=53) (actual time=0.033..27922.485 rows=7683730 loops=1)
                        Output: sf.current_code_line_count, sf.current_comment_line_count, sf.current_blank_line_count, sf.file_path_id, sf.current_source_md5
    Total runtime: 57693.835 ms

Now we do many, many ANALYZE VERBOSE runs and converge on the correct value,
which is known to be about 10M rows:

    INFO:  analyzing "public.stream_file"
    INFO:  "stream_file": scanned 30000 of 1476736 pages, containing 158846 live rows
    and 2175512 dead rows; 30000 rows in sample, 112747282 estimated total rows
    Query returned successfully with no result in 9172 ms.

    ... 200X !...

    INFO:  analyzing "public.stream_file"
    INFO:  "stream_file": scanned 30000 of 1480611 pages, containing 158776 live rows
    and 2170410 dead rows; 30000 rows in sample, 9769236 estimated total rows
    Query returned successfully with no result in 441 ms.

Now the optimizer thinks we can do a hash aggregate in memory and we get
better performance:

    Aggregate  (cost=1734729.12..1734729.14 rows=1 width=12) (actual time=33816.049..33816.049 rows=1 loops=1)
      ->  HashAggregate  (cost=1700534.50..1717631.81 rows=976989 width=53) (actual time=33535.083..33712.787 rows=202404 loops=1)
            Output: max(sf.current_code_line_count), max(sf.current_comment_line_count), max(sf.current_blank_line_count)
            ->  Seq Scan on stream_file sf  (cost=0.00..1578410.89 rows=9769889 width=53) (actual time=392.435..26278.143 rows=7710223 loops=1)
                  Output: sf.id, sf.current_blank_line_count, sf.current_code_line_count, sf.current_comment_line_count, sf.current_source_md5, sf.component_id, sf.current_file_instance_id, sf.current_file_state_id, sf.file_path_id, sf.stream_element_id
    Total runtime: 33822.707 ms

But later in the day, the statistics revert back to the 100M number!

Any ideas?  Is there some kind of cache that is remembering the old
statistics?
Thanks and Regards,

James Skaggs
IT/FIS Longmont
SeaTel: 8 684 1048
General: +1 612 367 6224

On Sun, Feb 10, 2013 at 1:10 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> > "jimbob@seagate.com" <jimbob@seagate.com> wrote:
> >
> >> INFO:  analyzing "public.stream_file"
> >> INFO:  "stream_file": scanned 30000 of 2123642 pages, containing
> >> 184517 live rows and 2115512 dead rows; 30000 rows in sample,
> >> 158702435 estimated total rows
> >
> > 184517 live rows in 30000 randomly sampled pages out of 2123642
> > total pages, means that the statistics predict that a select
> > count(*) will find about 13 million live rows to count.
> >
> >> After "analyze verbose", the table shows 158 million rows. A
> >> select count(1) yields 13.8 million rows.
> >
> > OK, the estimate was 13 million and there were actually 13.8
> > million, but it is a random sample used to generate estimates.
> > That seems worse than average, but close enough to be useful.
> > The 158.7 million total rows includes dead rows, which must be
> > visited to determine visibility, but will not be counted because
> > they are not visible to the counting transaction.
>
> To clarify here, the 158.7 million estimate does not *intentionally*
> include dead rows.  As you say, the ANALYZE did get a very good
> instantaneous estimate of the number of live rows.  However, ANALYZE
> doesn't overwrite the old estimate, it averages its estimate into the
> old one.  After the table shape changes dramatically, the ANALYZE
> needs to be run repeatedly before the estimate will converge to the
> new reality.  (Of course a cluster or vacuum full will blow away the
> old statistics, so the next analyze after that will solely determine
> the new statistics.)
>
> I agree, not a bug.
>
> Cheers,
>
> Jeff
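Relating to James's question above about residual statistics after CLUSTER (an editorial aside, not part of the thread): one quick way to see how far the two bookkeeping paths have drifted from reality is to compare an actual count against both the planner's reltuples and the stats collector's n_live_tup. The count is slow on a table this size, but it is the ground truth:

    SELECT (SELECT count(*) FROM stream_file)            AS actual_live_rows,
           (SELECT reltuples::bigint FROM pg_class
             WHERE relname = 'stream_file')              AS planner_estimate,
           (SELECT n_live_tup FROM pg_stat_user_tables
             WHERE relname = 'stream_file')              AS stats_collector_estimate;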
On Sun, Feb 10, 2013 at 12:10 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> >
> > OK, the estimate was 13 million and there were actually 13.8
> > million, but it is a random sample used to generate estimates.
> > That seems worse than average, but close enough to be useful.
> > The 158.7 million total rows includes dead rows, which must be
> > visited to determine visibility, but will not be counted because
> > they are not visible to the counting transaction.
>
> To clarify here, the 158.7 million estimate does not *intentionally*
> include dead rows.  As you say, the ANALYZE did get a very good
> instantaneous estimate of the number of live rows.  However, ANALYZE
> doesn't overwrite the old estimate, it averages its estimate into the
> old one.  After the table shape changes dramatically, the ANALYZE
> needs to be run repeatedly before the estimate will converge to the
> new reality.  (Of course a cluster or vacuum full will blow away the
> old statistics, so the next analyze after that will solely determine
> the new statistics.)

I was incredibly wrong here.  The cluster or vacuum do not blow away the
stats so that the next analyze gets to solely determine them.  Rather, they
impose their own idea of live tuples, and then analyze can only update that
incrementally as it averages itself into the old value.

Worse, the two methods have very different ideas of what constitutes a live
tuple.  ANALYZE thinks tuples that are visible to a current/recent snapshot
are live, while CLUSTER and VACUUM think tuples that are possibly visible to
anyone are live.

I would say that this is a bug, or at least approaching to being one.  It is
not obvious whether reltuples and n_live_tuples should count the "recently
dead", but it should either be one way or the other and not an unholy
mixture of the two.  As it is now, a cluster or simple vacuum will snap
n_live_tuples so that it counts recently dead, then analyze will slowly
converge it to exclude recently dead, and then the next vacuum will snap it
back again.

Of course, all of this only comes into play in the presence of very
long-lived transactions that prevent tuples from going away.  Otherwise the
number recently dead is small enough not to matter.

    create table foo as select (random()*1000000)::integer as val
        from generate_series(1,50000000);

In a different session, open a transaction and leave it open:

    begin;
    create temp table adlfkj (x serial);

Back in the main session:

    delete from foo where val > 100;

Run this repeatedly and watch the rows estimate slowly decay:

    ANALYZE verbose foo;
    explain select count(*) from foo;

Then run this and watch it instantly spring back:

    VACUUM VERBOSE foo;
    explain select count(*) from foo;

Cheers,

Jeff
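To make the back-and-forth Jeff describes explicit, the same experiment can be instrumented by reading reltuples directly after each step. This is a small addition to Jeff's recipe, not part of his original message, and assumes the other session's transaction is still open:

    ANALYZE foo;
    SELECT reltuples::bigint AS rows_after_analyze
    FROM pg_class WHERE relname = 'foo';
    -- drifts toward the few thousand rows that remain visible

    VACUUM foo;
    SELECT reltuples::bigint AS rows_after_vacuum
    FROM pg_class WHERE relname = 'foo';
    -- snaps back toward 50 million, since the deleted tuples are still
    -- possibly visible to the long-open transaction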
So, I have some observations.  Is this what you are seeing as well?

When we CLUSTER a heavily-updated table:

- CLUSTER does appear to reset *n_dead_tup*, *n_tup_ins*, *n_tup_del*, and
  *n_tup_hot_upd*, but NOT *n_live_tup*.
- pg_stat_reset() truly clears out all the statistics counters.  I tried
  this because *n_live_tup* is not correct.
- A subsequent ANALYZE will update *n_dead_tup* and *n_live_tup* to some
  values that could not possibly be based on the newly CLUSTERed table.

So, how do we get correct statistics for a heavily updated table?  In my
experience, we only need to get the exponent correct, but we're not even
getting that.

BTW, I've upgraded to 8.4.15.
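A hedged way to reproduce the observation above (not from the thread; it assumes the table already has a clustering index set by an earlier CLUSTER ... USING, and that the stats collector has had a moment to absorb the counter messages before the second read):

    SELECT n_live_tup, n_dead_tup, n_tup_ins, n_tup_del, n_tup_hot_upd
    FROM pg_stat_user_tables WHERE relname = 'stream_file';

    CLUSTER stream_file;

    -- re-check which counters came back as zero and which were carried over
    SELECT n_live_tup, n_dead_tup, n_tup_ins, n_tup_del, n_tup_hot_upd
    FROM pg_stat_user_tables WHERE relname = 'stream_file';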
BTW "jimbob" and "James.R.Skaggs" are the same person. I just didn't want to use my "work" email for this.... -- View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-7853-Incorrect-statistics-in-table-with-many-dead-rows-tp5743845p5747000.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
On Fri, Feb 22, 2013 at 3:41 PM, James R Skaggs
<james.r.skaggs@seagate.com> wrote:
> Okay, I have some more info.
>
> Some background info.  This one table gets so many changes, I CLUSTER it
> each night.  However, after I do this, the statistics still appear to be
> incorrect.  Even after I do a "select pg_stat_reset();" followed by 3
> ANALYZE at default_statistics_target as 1, 10, and 100
>
> select relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_del, n_tup_hot_upd
> from pg_stat_all_tables
>
> ('stream_file', 109061143L, 103913868L, 0L, 14201L, 0L)
>
> Is it possible that there are still dead tuples after a CLUSTER?

Yes.  A cluster must bring along any tuples which are possibly visible to
any open transaction.

Your root problem seems to be that you have long-open transactions which
are preventing vacuum from doing its thing, which leads you to try
clustering, but the long-open transaction prevents that from doing its
thing effectively as well.

Perhaps PG could deal with this situation more gracefully, but fundamentally
you have to figure out why you have these ancient transactions lying around,
and fix them or kill them.

Cheers,

Jeff
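As a closing aside (not from the thread): finding the ancient transactions Jeff mentions is usually a matter of sorting pg_stat_activity by transaction start time. A sketch for 8.4, where the columns are procpid and current_query (newer releases use pid and query):

    SELECT procpid, usename, xact_start,
           now() - xact_start AS xact_age,
           current_query
    FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
    ORDER BY xact_start
    LIMIT 10;

    -- once identified, a stuck backend can be cancelled or, if necessary, terminated
    -- (pg_terminate_backend is available from 8.4 onward):
    -- SELECT pg_cancel_backend(<procpid>);
    -- SELECT pg_terminate_backend(<procpid>);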