Bad query optimizer misestimation because of TOAST tables - Mailing list pgsql-performance

From Markus Schaber
Subject Bad query optimizer misestimation because of TOAST tables
Date
Msg-id 42010C24.90301@logi-track.com
Whole thread Raw
Responses Re: Bad query optimizer misestimation because of TOAST tables
Re: [postgis-users] Bad query optimizer misestimation because of
List pgsql-performance
[This mail goes as X-Post to both pgsql-perform and postgis-users
because postgis users may suffer from this problem, but I would prefer
to keep the Discussion on pgsql-performance as it is a general TOAST
problem and not specific to PostGIS alone.]

Hello,

Running PostGIS 0.8.1 under PostgreSQL 7.4.6-7 (Debian), I struggled
over the following problem:

logigis=# explain analyze SELECT geom FROM adminbndy1 WHERE geom &&
setsrid('BOX3D(9.4835390946502 47.39365740740741,9.5164609053498
47.40634259259259)'::box3d,4326);
                                                       QUERY PLAN


------------------------------------------------------------------------------------------------------------------------
 Seq Scan on adminbndy1  (cost=0.00..4.04 rows=1 width=121) (actual
time=133.591..7947.546 rows=5 loops=1)
   Filter: (geom && 'SRID=4326;BOX3D(9.4835390946502 47.3936574074074
0,9.5164609053498 47.4063425925926 0)'::geometry)
 Total runtime: 7947.669 ms
(3 Zeilen)

logigis=# set enable_seqscan to off;
SET
logigis=# explain analyze SELECT geom FROM adminbndy1 WHERE geom &&
setsrid('BOX3D(9.4835390946502 47.39365740740741,9.5164609053498
47.40634259259259)'::box3d,4326);
                                                             QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using adminbndy1_geom_idx on adminbndy1  (cost=0.00..4.44
rows=1 width=121) (actual time=26.902..27.066 rows=5 loops=1)
   Index Cond: (geom && 'SRID=4326;BOX3D(9.4835390946502
47.3936574074074 0,9.5164609053498 47.4063425925926 0)'::geometry)
 Total runtime: 27.265 ms
(3 Zeilen)

So the query planner choses to ignore the index, although it is
appropriate. My first idea was that the statistics, but that turned out
not to be the problem. As the above output shows, the query optimizer
already guesses a rowcount of 1 which is even smaller than the actual
number of 5 fetched rows, so this should really make the query planner
use the index.

Some amount of increasingly random tries later, I did the following:

logigis=# vacuum full freeze analyze verbose adminbndy1;
INFO:  vacuuming "public.adminbndy1"
INFO:  "adminbndy1": found 0 removable, 83 nonremovable row versions in
3 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 128 to 1968 bytes long.
There were 1 unused item pointers.
Total free space (including removable row versions) is 5024 bytes.
0 pages are or will become empty, including 0 at the end of the table.
3 pages containing 5024 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index "adminbndy1_geom_idx" now contains 83 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "adminbndy1": moved 0 row versions, truncated 3 to 3 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_19369"
INFO:  "pg_toast_19369": found 0 removable, 32910 nonremovable row
versions in 8225 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 37 to 2034 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 167492 bytes.
0 pages are or will become empty, including 0 at the end of the table.
66 pages containing 67404 free bytes are potential move destinations.
CPU 0.67s/0.04u sec elapsed 2.76 sec.
INFO:  index "pg_toast_19369_index" now contains 32910 row versions in
127 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.14 sec.
INFO:  "pg_toast_19369": moved 0 row versions, truncated 8225 to 8225 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.adminbndy1"
INFO:  "adminbndy1": 3 pages, 83 rows sampled, 83 estimated total rows
VACUUM
logigis=#

IMHO, this tells the reason. The query planner has a table size of 3
pages, which clearly is a case for a seqscan. But during the seqscan,
the database has to fetch an additional amount of 8225 toast pages and
127 toast index pages, and rebuild the geometries contained therein.

And the total number of 8355 pages = 68MB is a rather huge amount of
data to fetch.

I think this problem bites every user that has rather large columns that
get stored in the TOAST table, when querying on those column.

As a small workaround, I could imagine to add a small additional column
in the table that contains the geometry's bbox, and which I use the &&
operator against. This should avoid touching the TOAST for the skipped rows.

But the real fix should be to add the toast pages to the query planners
estimation for the sequential scan. What do you think about it?

Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com

pgsql-performance by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Bitmap indexes
Next
From: Bruce Momjian
Date:
Subject: Re: High end server and storage for a PostgreSQL OLTP system