Re: 10.1: hash index size exploding on vacuum full analyze - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: 10.1: hash index size exploding on vacuum full analyze
Date
Msg-id CAA4eK1KLTFFLPuzx2Xfkq4Qd+xz-8Pt3rigwhq5fVL0UcR1sxQ@mail.gmail.com
Whole thread Raw
In response to Re: 10.1: hash index size exploding on vacuum full analyze  (Teodor Sigaev <teodor@sigaev.ru>)
Responses Re: 10.1: hash index size exploding on vacuum full analyze  (AP <pgsql@inml.weebeastie.net>)
List pgsql-bugs
On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> Hi!
>
>> I think if we update the stats in copy_heap_data after copying the
>> data, then we don't see such problem.  Attached patch should fix the
>> issue.  You can try this patch to see if it fixes the issue for you.
>
> I'm afraid I'm not able to reproduce the problem which patch should fix.
>
> What I did (today's master, without patch):
> autovacuum off
> pgbench -i -s 100
>
> select relname, relpages, reltuples from pg_class where relname =
> 'pgbench_accounts';
>      relname      | relpages | reltuples
> ------------------+----------+-----------
>  pgbench_accounts |   163935 |     1e+07
>
> vacuum full pgbench_accounts;
>
> # select relname, relpages, reltuples from pg_class where relname =
> 'pgbench_accounts';
>      relname      | relpages | reltuples
> ------------------+----------+-----------
>  pgbench_accounts |   163935 |     1e+07
>
>
> I've tried to add hash index to that table and print notice about number of
> pages and tuples immediately after estimate_rel_size() in hashbuild(). hash
> index got right estimation even I deleted all rows before vacuum full. What
> am I doing wrong?
>

The estimation depends on the type of columns and stats.  I think we
need to use schema and stats in the way AP is using to see the effect
AP is seeing.  I was under impression that AP will help us in
verifying the problem as he can reproduce it, but it seems he is busy.
It seems Ashutosh is trying to reproduce the problem in a slightly
different way, let us see if with his test we can see the similar
effect.

> Patch looks good except, seems, updating stats is better to move to
> swap_relation_files(), then it will work even for  toast tables.
>

Initially, I have also thought of doing it in swap_relation_files, but
we don't have stats values there.  We might be able to pass it, but
not sure if there is any need for same.  As far as Toast table's case
is concerned, I don't see the problem because we are copying the data
row-by-row only for heap where the value of num_tuples and num_pages
could be different.  See  copy_heap_data.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


pgsql-bugs by date:

Previous
From: Amit Kapila
Date:
Subject: Re: 10.1: hash index size exploding on vacuum full analyze
Next
From: Ashutosh Sharma
Date:
Subject: Re: 10.1: hash index size exploding on vacuum full analyze