Thread: BUG #8048: Text Search

BUG #8048: Text Search

From
luigisag@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      8048
Logged by:          Luigi
Email address:      luigisag@gmail.com
PostgreSQL version: 9.2.0
Operating system:   Windows 7
Description:        =


I've configured 2 table like this

        CREATE TABLE "User_Full_Text_Search" (
        "Email" varchar(50),
        "UserId" varchar(50),
        "Full_Text_Search" varchar(4096)
        )
        WITH (OIDS=3DFALSE)
        ;
        =

        ALTER TABLE "User_Full_Text_Search" OWNER TO "postgres";
        =

        CREATE INDEX IX_FullText ON "User_Full_Text_Search" USING
gin(to_tsvector('italian', "Full_Text_Search"));
        =

        =

        CREATE TABLE "User_Full_Text_Search_2" (
        "Email" varchar(50),
        "UserId" varchar(50),
        "Full_Text_Search" varchar(4096),
        "tsv" varchar(4096)
        )
        WITH (OIDS=3DFALSE)
        ;
        =

        ALTER TABLE "User_Full_Text_Search_2" OWNER TO "postgres";
        =

        CREATE INDEX IX_FullText_2 ON "User_Full_Text_Search_2" USING
gin("tsv");
    CREATE TRIGGER "tsvectorupdate" BEFORE INSERT OR UPDATE ON
"User_Full_Text_Search_2"
    FOR EACH ROW
    EXECUTE PROCEDURE "tsvector_update_trigger"('tsv', 'pg_catalog.italian',
'Full_Text_Search');

Column Full_Text_Search (table User_Full_Text_Search) is just a single word
or max 2 words separeted by space " " (ex: test tester), and tsv (table
User_Full_Text_Search_2) is populate by materializing column with a
ts_vector of Full_Text_Search.

Now if i perform those 2 queries

    select "UserId","Email" from "User_Full_Text_Search"    =

    where to_tsvector('italian',"Full_Text_Search") @@ to_tsquery('italian',
'test|developer')
    GROUP BY "UserId","Email"
    =

    select "UserId","Email" from "User_Full_Text_Search_2"    =

    where "tsv" @@ to_tsquery('italian', 'test|developer')
    GROUP BY "UserId","Email"

Records on Tables (are same) like 10 milion. =


Execution time of 1st query is 120 seconds (result set like 750.000)
Execution time of 2st query is 270 seconds (result set like 750.000) same
records

I don't understand why a materialized column is more slow than a calculeted
one...

Re: BUG #8048: Text Search

From
Kevin Grittner
Date:
"luigisag@gmail.com" <luigisag@gmail.com> wrote:=0A=0A>=A0=A0=A0=A0=A0=A0=
=A0=A0 CREATE TABLE "User_Full_Text_Search_2" (=0A>=A0=A0=A0=A0=A0=A0=A0=A0=
 "Email" varchar(50),=0A>=A0=A0=A0=A0=A0=A0=A0=A0 "UserId" varchar(50),=0A>=
=A0=A0=A0=A0=A0=A0=A0=A0 "Full_Text_Search" varchar(4096),=0A>=A0=A0=A0=A0=
=A0=A0=A0=A0 "tsv" varchar(4096)=0A>=A0=A0=A0=A0=A0=A0=A0=A0 )=0A>=A0=A0=A0=
=A0=A0=A0=A0=A0 WITH (OIDS=3DFALSE)=0A>=A0=A0=A0=A0=A0=A0=A0=A0 ;=0A=0A>=A0=
=A0=A0=A0=A0=A0=A0=A0 CREATE INDEX IX_FullText_2 ON "User_Full_Text_Search_=
2"=0A>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 USING gin("tsv");=0A=0A> I don't under=
stand why a materialized column is more slow than a=0A> calculeted one...=
=0A=0ATry materializing it as a tsvector instead of a varchar.=0A=0A--=0AKe=
vin Grittner=0AEnterpriseDB: http://www.enterprisedb.com=0AThe Enterprise P=
ostgreSQL Company

Re: BUG #8048: Text Search

From
Tom Lane
Date:
luigisag@gmail.com writes:
>         CREATE TABLE "User_Full_Text_Search_2" (
>         "Email" varchar(50),
>         "UserId" varchar(50),
>         "Full_Text_Search" varchar(4096),
>         "tsv" varchar(4096)
>         )
>         WITH (OIDS=FALSE)
>         ;

You declared tsv as a plain varchar column, not a tsvector, so text
searches on it aren't optimized.  I'm surprised the system even let
you build a GIN index on it --- maybe you have btree_gin installed?
Anyway that index isn't useful for answering a full-text-search
query, as you'll see if you compare EXPLAIN results.

            regards, tom lane