Thread: visibility map - what do i miss?

visibility map - what do i miss?

From
hubert depesz lubaczewski
Date:
--- repost to hackers as suggested by RhodiumToad ---

hi,
i tried to test new "visibility map" feature.

to do so i:
1. fetched postgresql sources from cvs
2. compiled
3. turned autovacuum off
4. started pg
5. ran this queries: - CREATE TABLE test_1 (i INT4); - CREATE TABLE test_2 (i INT4); - CREATE TABLE test_3 (i INT4); -
CREATETABLE test_4 (i INT4); - INSERT INTO test_1 SELECT generate_series(1, 100000000); - INSERT INTO test_2 SELECT
generate_series(1,100000000); - INSERT INTO test_3 SELECT generate_series(1, 100000000); - INSERT INTO test_4 SELECT
generate_series(1,100000000); - UPDATE test_2 SET i = i + 1 WHERE i < 10000000; - UPDATE test_3 SET i = i + 1 WHERE i <
50000000;- UPDATE test_4 SET i = i + 1 WHERE i < 90000000; - VACUUM test_1; - VACUUM test_2; - VACUUM test_3; - VACUUM
test_4;

I did it 2 times, first with sources of pg from 1st of november, and
second - with head from yesterday evening (warsaw, poland time).

results puzzled me.

First run - without visibility maps, timing of vacuums:
Time: 267844.822 ms
Time: 138854.592 ms
Time: 305467.950 ms
Time: 487133.179 ms

Second run - on head:

Time: 252218.609 ms
Time: 234388.763 ms
Time: 334016.413 ms
Time: 575698.750 ms

Now - as I understand the change - visilibity maps should make second run much faster?

Tests were performed on laptop. During first test I used it to browse the web,
read mail. During second test - nobody used the laptop.

Relation forms seem to exist:
# select oid from pg_database where datname = 'depesz'; oid
-------16389
(1 row)
# select relfilenode from pg_class where relname ~ 'test_';relfilenode
-------------      26756      26759      26762      26765
(4 rows)

=> ls -l {26756,26759,26762,26765}*
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:31 26756
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:33 26756.1
-rw------- 1 pgdba pgdba 1065066496 2008-12-06 01:34 26756.2
-rw------- 1 pgdba pgdba     811008 2008-12-06 01:34 26756_fsm
-rw------- 1 pgdba pgdba      57344 2008-12-06 01:34 26756_vm
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:35 26759
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:36 26759.1
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:37 26759.2
-rw------- 1 pgdba pgdba  312582144 2008-12-06 01:39 26759.3
-rw------- 1 pgdba pgdba     892928 2008-12-06 01:39 26759_fsm
-rw------- 1 pgdba pgdba      57344 2008-12-06 01:39 26759_vm
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:39 26762
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:49 26762.1
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:41 26762.2
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:42 26762.3
-rw------- 1 pgdba pgdba  523862016 2008-12-06 01:43 26762.4
-rw------- 1 pgdba pgdba    1204224 2008-12-06 01:43 26762_fsm
-rw------- 1 pgdba pgdba      81920 2008-12-06 01:53 26762_vm
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 02:01 26765
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 02:08 26765.1
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 02:18 26765.2
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:50 26765.3
-rw------- 1 pgdba pgdba 1073741824 2008-12-06 01:51 26765.4
-rw------- 1 pgdba pgdba  735141888 2008-12-06 02:00 26765.5
-rw------- 1 pgdba pgdba    1523712 2008-12-06 02:00 26765_fsm
-rw------- 1 pgdba pgdba      98304 2008-12-06 02:18 26765_vm

What do I miss?

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007


Re: visibility map - what do i miss?

From
"Guillaume Smet"
Date:
On Sat, Dec 6, 2008 at 12:53 PM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> First run - without visibility maps, timing of vacuums:
> Time: 267844.822 ms
> Time: 138854.592 ms
> Time: 305467.950 ms
> Time: 487133.179 ms
>
> Second run - on head:
>
> Time: 252218.609 ms
> Time: 234388.763 ms
> Time: 334016.413 ms
> Time: 575698.750 ms
>
> Now - as I understand the change - visilibity maps should make second run much faster?

If I understand correctly what Heikki explained, not if you run VACUUM
only once (and you confirmed me on IRC you run it only once). It's the
VACUUM which sets the PD_ALL_VISIBLE flag on the pages so the first
VACUUM should be slower with HEAD than with 8.3 as it has far more
work to do.

The second VACUUM should then be faster.

-- 
Guillaume


Re: visibility map - what do i miss?

From
Heikki Linnakangas
Date:
Guillaume Smet wrote:
> On Sat, Dec 6, 2008 at 12:53 PM, hubert depesz lubaczewski
> <depesz@depesz.com> wrote:
>> First run - without visibility maps, timing of vacuums:
>> Time: 267844.822 ms
>> Time: 138854.592 ms
>> Time: 305467.950 ms
>> Time: 487133.179 ms
>>
>> Second run - on head:
>>
>> Time: 252218.609 ms
>> Time: 234388.763 ms
>> Time: 334016.413 ms
>> Time: 575698.750 ms
>>
>> Now - as I understand the change - visilibity maps should make second run much faster?
> 
> If I understand correctly what Heikki explained, not if you run VACUUM
> only once (and you confirmed me on IRC you run it only once). It's the
> VACUUM which sets the PD_ALL_VISIBLE flag on the pages so the first
> VACUUM should be slower with HEAD than with 8.3 as it has far more
> work to do.
> 
> The second VACUUM should then be faster.

That diagnosis is not quite right, but the prognosis is correct. The 
first VACUUM after the UPDATEs doesn't set the PD_ALL_VISIBLE flags, 
because there's still dead tuples on the pages. The dead tuples are 
removed in the 2nd pass of the first vacuum, but it doesn't try to set 
the PD_ALL_VISIBLE flags; that's only done in the first phase.

The second vacuum is just as slow as the first one, because the 
visibility map doesn't have any bits set yet. The second vacuum will set 
the bits, though, so the *third* vacuum should go faster.

So setting the PD_ALL_VISIBLE flags doesn't slow things down. That 
should be just a tiny bit of extra CPU work per vacuumed page, not 
something that would show up in performance tests.

This is the 1st issue I mentioned in this mail:

http://archives.postgresql.org/message-id/4925664C.3090605@enterprisedb.com

There was some suggestions in that thread, but none has been implemented.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com