Re: gincostestimate - Mailing list pgsql-hackers
From | Jan Urbański |
---|---|
Subject | Re: gincostestimate |
Date | |
Msg-id | 4C4CAFE1.5090208@wulczer.org Whole thread Raw |
In response to | gincostestimate (Teodor Sigaev <teodor@sigaev.ru>) |
Responses |
Re: gincostestimate
|
List | pgsql-hackers |
On 02/07/10 14:33, Teodor Sigaev wrote: > Patch implements much more accuracy estimation of cost for GIN index > scan than generic cost estimation function. Hi, I'm reviewing this patch, and to begin with it I tried to reproduce the problem that originally came up on -performance in http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php The links from that mail are now dead, so I set up my own test environment:* one table testfts(id serial, body text, body_ftstsvector)* 50000 rows, each with 1000 random words taken from /usr/share/dict/british-english-insane (the wbritish-insane Debian package) separated by a single space* each row also had the word "commonterm" at the end, 80% had commonterm80, 60% had commonterm60 etc (using the same methodology as Jesper, that commonterm60 can appear only if commonterm80 is in the row)* a GIN index on the tsvectors I was able to reproduce his issue, that is: select id from ftstest where body_fts @@ to_tsquery('commonterm80'); was choosing a sequential scan, which was resulting in much longer execution than the bitmap index plan that I got after disabling seqscans. I then applied the patch, recompiled PG and tried again... and nothing changed. I first tried running ANALYSE and then dropping and recreating the GIN index, but the planner still chooses the seq scan. Full explains below (the NOTICE is a debugging aid from the patch, which I temporarily enabled to see if it's picking up the code). I'll continue reading the code and trying to understand what it does, but in the meantime: am I doing something wrong that I don't see the planner switching to the bitmap index plan? I see that the difference in costs is small, so maybe I just need to tweak the planner knobs a bit? Is the output below expected? Cheers, Jan wulczer=# explain analyse select id from ftstest where body_fts @@ to_tsquery('commonterm80'); NOTICE: GIN stats: nEntryPages: 49297.000000 nDataPages: 16951.000000 nPendingPages :0.000000 nEntries: 277521.000000 QUERY PLAN ------------------------------------------------------------------------------------------------------------------Seq Scanon ftstest (cost=0.00..1567.00 rows=39890 width=4) (actual time=221.893..33179.794 rows=39923 loops=1) Filter: (body_fts @@ to_tsquery('commonterm80'::text))Total runtime: 33256.661ms (3 rows) wulczer=# set enable_seqscan to false; SET Time: 0.257 ms wulczer=# explain analyse select id from ftstest where body_fts @@ to_tsquery('commonterm80'); NOTICE: GIN stats: nEntryPages: 49297.000000 nDataPages: 16951.000000 nPendingPages :0.000000 nEntries: 277521.000000 QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------Bitmap HeapScan on ftstest (cost=449.15..1864.50 rows=39890 width=4) (actual time=107.421..181.284 rows=39923 loops=1) Recheck Cond: (body_fts @@ to_tsquery('commonterm80'::text)) -> BitmapIndex Scan on ftstest_gin_idx (cost=0.00..439.18 rows=39890 width=0) (actual time=97.057..97.057 rows=39923 loops=1) Index Cond: (body_fts @@ to_tsquery('commonterm80'::text))Totalruntime: 237.218 ms (5 rows) Time: 237.999 ms
pgsql-hackers by date: