Re: [BUGS] Routine analyze of single column prevents standard autoanalyze from running at all - Mailing list pgsql-hackers

From Tomasz Ostrowski
Subject Re: [BUGS] Routine analyze of single column prevents standard autoanalyze from running at all
Date
Msg-id 3eebdfc2-95dc-9ede-7467-47a3c26b8892@ato.waw.pl
Whole thread Raw
In response to Re: [BUGS] Routine analyze of single column prevents standard autoanalyze from running at all  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [BUGS] Routine analyze of single column prevents standard autoanalyze from running at all  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
W dniu 2016-06-08 o 05:04, Tom Lane pisze:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> Is there any significant advantage to not analyzing all columns? Only
>> case I can think of is if you have a fair number of columns that have
>> been toasted; otherwise I'd think IO would completely swamp any other
>> considerations.
>
> Yeah, my guess is that the OP's example where analyzing just one column
> was significantly cheaper boiled down to some of the other columns being
> mostly toasted data.  Otherwise it's hard to see how there's much more
> expense in analyzing them all.

Actually no - this volatile column has smaller "statistics" than most of 
the table, so analyzing it is much faster when it's data is not in RAM. 
Here is a small exaggerated example showing a difference:

$ psql
tometzky=> create table test (id serial, data text);
tometzky=> insert into test(data) select 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. '
||generate_series(0,10000000)::text;
tometzky=> alter table test alter column id set statistics 10;
tometzky=> alter table test alter column data set statistics 1000;
tometzky=> \q

# Drop OS page cache and restart postgres
# so the table data won't be in RAM anymore:
$ sudo sh -c 'echo 1 >/proc/sys/vm/drop_caches'
$ sudo systemctl restart postgresql;

# Test single column analyze:
$ psql
tometzky=> \timing
Timing is on.
tometzky=> analyze verbose test(id);
INFO:  analyzing "public.test"
INFO:  "test": scanned 3000 of 123457 pages, containing 243000 live rows 
and 0 dead rows; 3000 rows in sample, 2712238 estimated total rows
ANALYZE
Time: 422,521 ms
tometzky=> \q

# Drop OS page cache and restart postgres again
$ sudo sh -c 'echo 1 >/proc/sys/vm/drop_caches'
$ sudo systemctl restart postgresql;

$ psql
tometzky=> \timing
Timing is on.
tometzky=> analyze verbose test;
INFO:  analyzing "public.test"
INFO:  "test": scanned 123457 of 123457 pages, containing 10000001 live 
rows and 0 dead rows; 300000 rows in sample, 10000001 estimated total rows
ANALYZE
Time: 9447,519 ms

-- 
Tomasz "Tometzky" Ostrowski



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Reviewing freeze map code
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: slower connect from hostnossl clients