Re: BUG #7853: Incorrect statistics in table with many dead rows. - Mailing list pgsql-bugs

From Kevin Grittner
Subject Re: BUG #7853: Incorrect statistics in table with many dead rows.
Date
Msg-id 1360105935.25882.YahooMailNeo@web162904.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: BUG #7853: Incorrect statistics in table with many dead rows.  (James R Skaggs <james.r.skaggs@seagate.com>)
Responses Re: BUG #7853: Incorrect statistics in table with many dead rows.  (James R Skaggs <james.r.skaggs@seagate.com>)
List pgsql-bugs
James R Skaggs <james.r.skaggs@seagate.com> wrote:=0A=0A> Agreed, we should=
n't have so many dead rows.=A0 Our autovacuum is=0A> set on but has default=
 parameters.=A0=A0 We are clustering today. =0A> This, of course, means dow=
ntime and inconvenience to the users.=0A=0ARight, which is why it's importa=
nt to figure out why the bloat=0Ahappened.=A0 Sometimes it is unavoidable, =
like when you delete 90% of=0Athe rows in your table or a long-lived "idle =
in transaction"=0Aconnection prevents autovacuum from being able to do its =
work=0Anormally.=A0 To prevent further downtime it is important to figure=
=0Aout what happened and make appropriate changes to your monitoring=0Aor v=
acuuming.=0A=0A> Here is the troublesome query:=0A>=0A>> select =0A>>=A0=A0=
=A0 sum(case when t1.cdlc < 0 then 0 else t1.cdlc end) as code_line_cnt, =
=0A>>=A0=A0=A0 sum(case when t1.cmlc < 0 then 0 else t1.cmlc end) as commen=
t_line_cnt, =0A>>=A0=A0=A0 sum(case when t1.bllc < 0 then 0 else t1.bllc en=
d) as blank_line_cnt =0A>>=A0 from =0A>>=A0=A0=A0 ( select =0A>>=A0=A0=A0=
=A0=A0=A0=A0 max(sf.current_code_line_count) as cdlc, =0A>>=A0=A0=A0=A0=A0=
=A0=A0 max(sf.current_comment_line_count) as cmlc,=0A>>=A0=A0=A0=A0=A0=A0=
=A0 max(sf.current_blank_line_count) as bllc =0A>>=A0=A0=A0=A0=A0 from =0A>=
>=A0=A0=A0=A0=A0=A0=A0 stream_file sf =0A>>=A0=A0=A0=A0=A0 group by sf.file=
_path_id, sf.current_source_md5 =0A>>=A0=A0=A0 ) as t1;=0A=0AWell, I don't =
see that the planner has a lot of choice there=0Abesides whether to use a s=
ort or a hash to do the inner=0Aaggregation.=A0 Are you saying that prior t=
o the bloat it used a hash=0Aaggregation, and that was faster?=A0 And that =
you feel that it should=0Abe using that even with the bloat?=A0 That the de=
ad rows seem to be=0Agetting included in the statistics, driving to the slo=
wer plan, and=0Ayou feel they should be omitted?=0A=0ANote that I'm not aru=
ing one way or another on these points at the=0Amoment; I'm just trying to =
understand your point clearly.=0A=0A-Kevin

pgsql-bugs by date:

Previous
From: James R Skaggs
Date:
Subject: Re: BUG #7853: Incorrect statistics in table with many dead rows.
Next
From: Dave Page
Date:
Subject: Re: BUG #7851: Installer crash with message: An error occured executing the Microsoft VC++ runtimeinstaller