Hi,
I am running PostgreSQL 8.3.3 on Linux 2.6.18
jdietrch@io:~/z$ uname -a
Linux io.mills.lan 2.6.18-6-vserver-k7 #1 SMP Sat Jun 7 01:10:29 UTC
2008 i686 GNU/Linux
Short version of the problem:
When I run a full-text search that requires a full scan of the table,
an error is returned (GIN indexes don't support sequential scan)
instead of falling back on a sequential scan which would return
the results of the query.
Long version:
I have a database with two tables named 'one' and 'two':
z1=> \d one
Table "public.one"
Column | Type | Modifiers
--------+------+-----------
a | text |
Indexes:
"fts_a" gin (to_tsvector('simple'::regconfig, a))
z1=> \d two
Table "public.two"
Column | Type | Modifiers
--------+------+-----------
b | text |
Indexes:
"fts_b" gin (to_tsvector('simple'::regconfig, b))
Table 'one' has 51 rows:
z1=> select * from one;
a
--------------
Two
Three
Four
Five
****<snip>****
Forty nine
Fifty
Fifty one
Fifty two
(51 rows)
Table 'two' has 5001 rows:
z1=> select * from two;
b
--------------------------------------------
Fifty three
Fifty four
Fifty five
Fifty six
****<snip>****
Five thousand fifty
Five thousand fifty one
Five thousand fifty two
Five thousand fifty three
(5001 rows)
(At the bottom of this email I have copied the commands
I used to create the database.)
Now I run a full-text query on table 'one':
z1=> select count(*) from one where to_tsvector('simple', a) @@
to_tsquery('simple', '!Four');
count
-------
47
(1 row)
Running the same query under 'explain analyze' shows that the index is not being
used, but a sequential scan is being done:
z1=> explain analyze select a from one where to_tsvector('simple', a)
@@ to_tsquery('simple', '!Four');
QUERY PLAN
-----------------------------------------------------------------------------------------------
Seq Scan on one (cost=0.00..1.77 rows=1 width=32) (actual
time=0.019..0.309 rows=47 loops=1)
Filter: (to_tsvector('simple'::regconfig, a) @@ '!''four'''::tsquery)
Total runtime: 0.341 ms
(3 rows)
That all works fine.
But all is not fine when I do the same thing on the longer table 'two':
z1=> \set VERBOSITY verbose
z1=> select count(*) from two where to_tsvector('simple', b) @@
to_tsquery('simple', '!Four');
ERROR: 0A000: query requires full scan, which is not supported by GIN indexes
LOCATION: gin_extract_tsquery, tsginidx.c:74
I understand that this query does require a full scan, and I understand that
GIN indexes don't support a full scan, but why couldn't the planner fall back
to a sequential scan in this case? Of course it's slower, but I would prefer a
slower answer than failure with an error and no answer at all. I can simulate
this solution by doing the following, which forces a sequential scan.
z1=> set enable_bitmapscan to off;
SET
z1=> set enable_indexscan to off;
SET
z1=> select count(*) from two where to_tsvector('simple', b) @@
to_tsquery('simple', '!Four');
count
-------
3277
(1 row)
z1=> explain analyze select count(*) from two where
to_tsvector('simple', b) @@ to_tsquery('simple', '!Four');
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Aggregate (cost=114.03..114.04 rows=1 width=0) (actual
time=91.171..91.171 rows=1 loops=1)
-> Seq Scan on two (cost=0.00..114.02 rows=5 width=0) (actual
time=0.028..89.598 rows=3277 loops=1)
Filter: (to_tsvector('simple'::regconfig, b) @@ '!''four'''::tsquery)
Total runtime: 91.208 ms
(4 rows)
Any thoughts? Is this something that could be fixed in a future version of
PostgreSQL? Or is the current behavior intentionally the way it is for
some reason I haven't thought of yet?
Thanks a lot for your help,
James Dietrich
P.S. Here is the procedure I used create the test database.
I can also provide the output of pg_dump upon request.
jdietrch@io:~/z$ psql template1 -U stariadmin -W
Password for user stariadmin:
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
template1=> create database z1;
CREATE DATABASE
template1=> \q
jdietrch@io:~/z$ psql z1 -U stariadmin -W -f a_commands
Password for user stariadmin:
CREATE TABLE
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE SEQUENCE
CREATE LANGUAGE
CREATE FUNCTION
nextval
---------
1
(1 row)
CREATE FUNCTION
CREATE FUNCTION
fill_one
----------
done
(1 row)
fill_two
----------
done
(1 row)
jdietrch@io:~/z$ cat a_commands
create table one(a text);
create index fts_a on one using gin(to_tsvector('simple', a));
create table two(b text);
create index fts_b on two using gin(to_tsvector('simple', b));
create sequence sequ;
create language plpgsql;
create or replace function num_word(out m_word text) as $$
declare
n integer;
o money;
begin
select into o nextval('sequ')::text::money;
select into m_word trim(substring(cash_words(o), 1,
position('dollars' in cash_words(o))-1));
end;
$$ language plpgsql;
select nextval('sequ');
create or replace function fill_one(out f_o text) as $$
begin
for i in 0..50 loop
insert into one values(num_word());
end loop;
f_o = 'done';
end;
$$ language plpgsql;
create or replace function fill_two(out f_tw text) as $$
begin
for i in 0..5000 loop
insert into two values(num_word());
end loop;
f_tw = 'done';
end;
$$ language plpgsql;
select fill_one();
select fill_two();
jdietrch@io:~/z$