New gist vacuum. - Mailing list pgsql-hackers

From Костя Кузнецов
Subject New gist vacuum.
Date
Msg-id 1147341441925550@web17j.yandex.ru
Responses Re: New gist vacuum.  (Michael Paquier <michael.paquier@gmail.com>)
Re: New gist vacuum.  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
Hello. I am a student in the GSoC program.
My project is sequential access in GiST vacuum.

The new vacuum has two main steps: a scan of the index pages in physical order, and a cleanup pass after that scan.

Step 1 scans all pages, builds an information map (a hash map), and adds to a rescan stack the pages that need to be rescanned.

Step 2 works only with the pages from the rescan stack, that is, the pages where something changed.

Besides being faster, the new vacuum also deletes pages. Only leaf pages can be deleted. The page deletion process is: (1) delete the link to the page; (2) change rightlinks, if needed; (3) mark the page as deleted. I added two WAL actions (one for setting the deleted flag and one for changing rightlinks). When I delete links to leaf pages from an inner page, I always keep one link to a leaf, which avoids ending up with empty inner pages.

I attach some speed benchmarks.

I compared the old and new versions on my laptop (without an SSD). The test uses the table "point_tbl" from the regression database. I inserted about 200 million rows, then deleted 33 million and ran vacuum.

The size of the index is about 18 GB.

Old version:

INFO: vacuuming "public.point_tbl"
INFO: scanned index "gpointind" to remove 11184520 row versions
DETAIL: CPU 84.70s/72.26u sec elapsed 27007.14 sec.
INFO: "point_tbl": removed 11184520 row versions in 400715 pages
DETAIL: CPU 3.96s/3.10u sec elapsed 233.12 sec.
INFO: scanned index "gpointind" to remove 11184523 row versions
DETAIL: CPU 87.10s/69.05u sec elapsed 26410.44 sec.
INFO: "point_tbl": removed 11184523 row versions in 400715 pages
DETAIL: CPU 4.23s/3.36u sec elapsed 331.43 sec.
INFO: scanned index "gpointind" to remove 11184523 row versions
DETAIL: CPU 87.65s/65.73u sec elapsed 26230.35 sec.
INFO: "point_tbl": removed 11184523 row versions in 400715 pages
DETAIL: CPU 4.47s/3.41u sec elapsed 342.93 sec.
INFO: scanned index "gpointind" to remove 866 row versions
DETAIL: CPU 79.97s/39.64u sec elapsed 23341.88 sec.
INFO: "point_tbl": removed 866 row versions in 31 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "gpointind" now contains 201326592 row versions in 2336441 pages
DETAIL: 33554432 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.

New version:

INFO: vacuuming "public.point_tbl"
INFO: scanned index "gpointind" to remove 11184520 row versions
DETAIL: CPU 13.00s/27.57u sec elapsed 1864.22 sec.
INFO: "point_tbl": removed 11184520 row versions in 400715 pages
DETAIL: CPU 3.46s/2.86u sec elapsed 214.04 sec.
INFO: scanned index "gpointind" to remove 11184523 row versions
DETAIL: CPU 14.17s/27.02u sec elapsed 2163.67 sec.
INFO: "point_tbl": removed 11184523 row versions in 400715 pages
DETAIL: CPU 3.33s/2.99u sec elapsed 222.60 sec.
INFO: scanned index "gpointind" to remove 11184523 row versions
DETAIL: CPU 11.84s/25.23u sec elapsed 1828.71 sec.
INFO: "point_tbl": removed 11184523 row versions in 400715 pages
DETAIL: CPU 3.44s/2.81u sec elapsed 215.06 sec.
INFO: scanned index "gpointind" to remove 866 row versions
DETAIL: CPU 5.62s/6.68u sec elapsed 176.67 sec.
INFO: "point_tbl": removed 866 row versions in 31 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: index "gpointind" now contains 201326592 row versions in 2336360 pages
DETAIL: 33554432 index row versions were removed.
150833 index pages have been deleted, 150833 are currently reusable.
CPU 5.54s/2.08u sec elapsed 165.61 sec.
INFO: "point_tbl": found 33554432 removable, 201326592 nonremovable row versions in 1202176 out of 1202176 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 73.50s/116.82u sec elapsed 8300.73 sec.
INFO: analyzing "public.point_tbl"
INFO: "point_tbl": scanned 100 of 1202176 pages, containing 16756 live rows and 0 dead rows; 100 rows in sample, 201326601 estimated total rows
VACUUM

There is a big speedup, and we can also reuse some pages.

Thanks.
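
To make the two passes and the page deletion steps described above more concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the patch's C code. The `Page` class, the `vacuum` function, and the contents of the two maps (`parent_of`, `left_of`) are assumptions made for illustration. The rescan stack here holds only leaf pages that became empty, and WAL logging and locking are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Page:
    """Toy index page (hypothetical; not PostgreSQL's real page layout)."""
    is_leaf: bool
    tuples: List[int] = field(default_factory=list)    # leaf only: heap row ids
    children: List[int] = field(default_factory=list)  # inner only: downlinks (block numbers)
    rightlink: Optional[int] = None                    # block number of the right sibling
    deleted: bool = False


def vacuum(pages: List[Page], dead: Set[int]) -> int:
    """Two-pass vacuum over the toy index; returns the number of deleted leaf pages."""
    parent_of: Dict[int, int] = {}  # assumed map content: child block -> parent block
    left_of: Dict[int, int] = {}    # assumed map content: block -> its left sibling
    rescan: List[int] = []          # stack of leaf pages that became empty

    # Pass 1: visit every page once, in physical (block number) order.
    for blkno, page in enumerate(pages):
        if page.rightlink is not None:
            left_of[page.rightlink] = blkno
        if page.is_leaf:
            page.tuples = [t for t in page.tuples if t not in dead]
            if not page.tuples:
                rescan.append(blkno)
        else:
            for child in page.children:
                parent_of[child] = blkno

    # Pass 2: revisit only the pages recorded on the rescan stack.
    removed = 0
    while rescan:
        blkno = rescan.pop()
        page = pages[blkno]
        parent_blk = parent_of.get(blkno)
        # Skip a root leaf, and always keep one downlink so that
        # an inner page never becomes empty.
        if parent_blk is None or len(pages[parent_blk].children) == 1:
            continue
        pages[parent_blk].children.remove(blkno)  # 1. delete the link to the page
        left = left_of.get(blkno)
        if left is not None:                      # 2. change rightlinks (if needed)
            pages[left].rightlink = page.rightlink
        if page.rightlink is not None:
            # Keep the sibling map consistent for later deletions.
            if left is not None:
                left_of[page.rightlink] = left
            else:
                left_of.pop(page.rightlink, None)
        page.deleted = True                       # 3. set the deleted flag
        removed += 1
    return removed


if __name__ == "__main__":
    # One inner root (block 0) with three chained leaves (blocks 1 -> 2 -> 3).
    index = [
        Page(is_leaf=False, children=[1, 2, 3]),
        Page(is_leaf=True, tuples=[10, 11], rightlink=2),
        Page(is_leaf=True, tuples=[20, 21], rightlink=3),
        Page(is_leaf=True, tuples=[30]),
    ]
    print(vacuum(index, dead={20, 21}), index[0].children, index[1].rightlink)
```

In this model, pass 1 touches every page exactly once in block order, and pass 2 touches only the pages on the stack, which is where the sequential access pattern comes from.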

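As a rough check on the size of the speedup, the elapsed times of the four "scanned index" passes in each log above can be summed and compared. The short Python snippet below does only that arithmetic. The numbers are copied from the logs, and the variable names are mine.

```python
# Elapsed seconds of each "scanned index" pass, copied from the two logs.
old_scans = [27007.14, 26410.44, 26230.35, 23341.88]
new_scans = [1864.22, 2163.67, 1828.71, 176.67]

old_total = sum(old_scans)
new_total = sum(new_scans)
speedup = old_total / new_total

# Share of index pages reported as deleted and reusable by the new version.
deleted_pages = 150833
total_pages = 2336360
deleted_share = deleted_pages / total_pages

print(f"old index scans: {old_total:.2f} s")
print(f"new index scans: {new_total:.2f} s")
print(f"speedup: {speedup:.1f}x")
print(f"deleted pages: {deleted_share:.1%} of the index")
```

On these figures the index scans take roughly 17 times less time in total, and about 6.5% of the index pages become reusable. This is a single run on one machine, so it is only an indication.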