Re: Reduce maximum error in tuples estimation after vacuum. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Reduce maximum error in tuples estimation after vacuum.
Date
Msg-id 20130626.101606.211650168.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Reduce maximum error in tuples estimation after vacuum.  (Amit Kapila <amit.kapila@huawei.com>)
Responses Re: Reduce maximum error in tuples estimation after vacuum.
List pgsql-hackers
Sorry for lots of mistakes I've made,

I noticed that I was dancing on the 9.3dev at sometime, so I
reran on 9.4devel up to date, having the same result from the
view of inaccuracy of pg_class.reltuples after vacuuming.

Although, what differs from my old testset, vacuum reported that
it sanned the whole table pages. It cannot be the case that it
gives such a reltuples value for the case since before...

Please let me have a bit of time to diagnose this. 

=====
> When I tried to run vactest.sh, it gives below error:
> linux:~/akapila/vacuum_nlivetup> ./vactest.sh 
> ./vactest.sh: line 11: syntax error near unexpected token `&' 
> ./vactest.sh: line 11: `        psql ${dbname} -c "vacuum verbose t" |&
> egrep "INFO: *\"t\": found " | sed -e 's/^.* versions in \([0-9]*\)
> .*$/\1/''

'|&' I carelessly used should not be understood by real
/bin/sh. Since I might use other notations out of sh syntex, it
seems to more helpful to replace the shbang with '/bin/bash'
instead of '/bin/sh'..

In addition, the test script I brought up here discards all
outputs of sql commands. The attached script shows the verbose
response from vacuum.
- replaced shbang with /bin/bash- replaced '|&' notation with '2>&1 |'- removed '> /dev/null' from psql commandline

> Can you help me in reproducing the problem by letting me know
> if I am doing something wrong or results of test are not
> predictable?

Thank you for getting involved and sorry for the insufficient
preparation. But please wait for a while to go on.

I ran the attached revised script for the distcleaned current
head on the master branch (9.4devel) on CentOS 6.4 and got the
same result as previous, shown below. But I found the result
ununderstandable. I'd like to have a bit time to diagnose this.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


===========
$ ./vactest.sh
test1 ratio = 0.4
DROP TABLE
CREATE TABLE
INSERT 0 1000000
UPDATE 699999
psql:<stdin>:5: INFO:  vacuuming "public.t"
psql:<stdin>:5: INFO:  "t": removed 699999 row versions in 4459 pages
psql:<stdin>:5: INFO:  "t": found 699999 removable, 1000000 nonremovable row versions in 10829 out of 10829 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.13u sec elapsed 0.13 sec.
VACUUM
DELETE 399999
test1 ratio = 0.99
DROP TABLE
CREATE TABLE
INSERT 0 1000000
UPDATE 699999
psql:<stdin>:5: INFO:  vacuuming "public.t"
psql:<stdin>:5: INFO:  "t": removed 699999 row versions in 4459 pages
psql:<stdin>:5: INFO:  "t": found 699999 removable, 1000000 nonremovable row versions in 10829 out of 10829 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.15u sec elapsed 0.25 sec.
VACUUM
DELETE 989999
test1 ratio = 1.00
DROP TABLE
CREATE TABLE
INSERT 0 1000000
UPDATE 699999
psql:<stdin>:5: INFO:  vacuuming "public.t"
psql:<stdin>:5: INFO:  "t": removed 699999 row versions in 4459 pages
psql:<stdin>:5: INFO:  "t": found 699999 removable, 1000000 nonremovable row versions in 10829 out of 10829 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.14u sec elapsed 0.19 sec.
VACUUM
DELETE 999999# | del% | ## | pages | n_live_tup | tups est | tups real |  est/real  | bufs 
---+------+----+-------+------------+----------+-----------+------------+------1 |  0.4 |  1 | 10829 |     600001 |
764808|    600001 |      1.275 | 25491 | 0.99 |  1 |  6370 |      10001 |   417600 |     10001 |     41.756 | 63081 |
1 |  1 |  6370 |          1 |   411673 |         1 | 411673.000 | 6371
 
(3 rows)
========
#! /bin/bash

dbname="postgres"
function insert_result() {psql ${dbname} -f - > /dev/null <<EOF
insert into result select $1, $2, $3, c.relpages, s.n_live_tup, c.reltuples, (select count(*) from t), reltuples::float
/(select count(*) from t) from pg_stat_user_tables s, pg_class c where s.relname = 't' and c.relname = 't';
 
EOF
}

function vac_with_bufs() {psql ${dbname} -c "vacuum verbose t" 2>&1 |  egrep "INFO: *\"t\": found " | sed -e 's/^.*
versionsin \([0-9]*\) .*$/\1/'
 
}

function update_result_bufs() {local test_no=$1local delratio=$2local vac_no=$3local bufs=$4
psql ${dbname} -c "update result set bufs=${bufs} where \"#\"=$test_no and \"del%\"=$delratio and \"##\"=$vac_no"
>/dev/null
}

function store_result() {local test_no=$1local delratio=$2scanned_bufs=`vac_with_bufs`insert_result $test_no $delratio
1update_result_bufs$test_no $delratio 1 $scanned_bufs
 
#    scanned_bufs=`vac_with_bufs`
#    insert_result $test_no $delratio 2
#    update_result_bufs $test_no $delratio 2 $scanned_bufs
}


function test1() {local delratio=$1
echo "test1 ratio = $delratio"psql ${dbname} -f - <<EOF
drop table if exists t;
create table t (a int, b int, c int, d int default 0, e int default 0, f int default 0);
insert into t (select a, (random() * 100000)::int from generate_series((select count(*) from t) + 1, $nrows) a);
update t set b = b + 1 where a <  (select count(*) from t) * 0.7;
vacuum verbose t;
delete from t where a < (select count(*) from t) * $delratio
EOFstore_result 1 $delratio
}

function test2() {local delratio=$1
echo "test2 ratio = $delratio"psql ${dbname} -f - <<EOF
drop table if exists t;
create table t (a int, b text);
insert into t (select a, 'abcdefg' from generate_series((select count(*) from t) + 1, $nrows) a);
update t set b = repeat('abcdefghij', 250) where a <  (select count(*) from t) * 0.7;
vacuum verbose t;
delete from t where a < (select count(*) from t) * $delratio;
EOFstore_result 2 $delratio
}

psql ${dbname} -f - > /dev/null <<EOF
drop table if exists result;
create table result ("#" int, "del%" float, "##" int, pages int, n_live_tup int, "tups est" int, "tups real" int,
"est/real"numeric(10, 3), bufs int default 0);
 
EOF
nrows=1000000
# test1 0.1
# test1 0.2
# test1 0.3
test1 0.4
# test1 0.5
# test1 0.6
# test1 0.7
# test1 0.8
# test1 0.9
# test1 0.95
test1 0.99
test1 1.00

# nrows=100000
# test2 0.1
# test2 0.2
# test2 0.3
# test2 0.4
# test2 0.5
# test2 0.6
# test2 0.7
# test2 0.8
# test2 0.9
# test2 0.95
# test2 0.99
# test2 1.00

psql ${dbname} -c 'select * from result order by "#", "del%", "##"'

pgsql-hackers by date:

Previous
From: David Johnston
Date:
Subject: Re: Kudos for Reviewers -- straw poll
Next
From: Tom Lane
Date:
Subject: Re: Possible bug in CASE evaluation