Re: procost for to_tsvector - Mailing list pgsql-hackers

From Tom Lane
Subject Re: procost for to_tsvector
Date
Msg-id 21515.1430676296@sss.pgh.pa.us
In response to Re: procost for to_tsvector  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Tom> and some experiments of my own, but I wonder why we are only
>> Tom> thinking of to_tsvector.  Isn't to_tsquery, for example, just
>> Tom> about as expensive?  What of other text search functions?

>> Making the same change for to_tsquery and plainto_tsquery would be
>> reasonable; that would help with the seqscan cost for cases like
>> to_tsvector('config',col) @@ to_tsquery('blah') where the non-immutable
>> form of to_tsquery is used.

> Works for me.

>> I don't recall seeing cases of any of the other functions figuring into
>> planner decisions.

> It's not so much "are they popular" as "do they involve parsing raw
> text".  Once you've got the tsvector or tsquery, later steps are
> (I think) much more efficient.

I poked at this a bit more, and noted that:

* ts_headline() also parses input text, and is demonstrably at least as
expensive per-input-byte as to_tsvector.

* ts_match_tt() and ts_match_tq() invoke to_tsvector internally,
and thus should certainly have as great a cost.

* tsquery_rewrite_query() actually executes a SQL query given as a string,
with cost that is uncertain, but treating it as a unit-cost function is
surely completely silly.  Since our default cost for PL-language functions
is 100, probably setting this one to 100 as well is a reasonable proposal.

So I think we should set procost for all of these functions to 100, as
per attached.  Any objections?

            regards, tom lane
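
For a sense of scale: procost is expressed in units of cpu_operator_cost, so the planner charges roughly procost * cpu_operator_cost for each call. The sketch below works through that arithmetic for a sequential-scan filter such as to_tsvector('config',col) @@ to_tsquery('blah'). It assumes the stock default cpu_operator_cost of 0.0025 and a made-up table of 100,000 rows, and it ignores I/O and cpu_tuple_cost. The helper names are hypothetical and are not planner code.

```python
# Illustration only: how procost scales into planner cost units.
# Assumption: cpu_operator_cost is at its stock default of 0.0025.
# The function names below are hypothetical helpers, not PostgreSQL APIs.

CPU_OPERATOR_COST = 0.0025


def per_call_cost(procost, cpu_operator_cost=CPU_OPERATOR_COST):
    """Planner charge for one evaluation of a function with this procost."""
    return procost * cpu_operator_cost


def filter_cpu_cost(rows, procosts, cpu_operator_cost=CPU_OPERATOR_COST):
    """CPU charge for a qual that calls each listed function once per row."""
    per_row = sum(per_call_cost(c, cpu_operator_cost) for c in procosts)
    return rows * per_row


if __name__ == "__main__":
    rows = 100_000  # assumed table size, for illustration
    # to_tsvector, to_tsquery, and the @@ match function (procost 1).
    before = filter_cpu_cost(rows, [1, 1, 1])
    after = filter_cpu_cost(rows, [100, 100, 1])
    print(f"filter CPU cost before: {before:.1f}, after: {after:.1f}")
```

With the patch, the same filter is charged about 67 times as much CPU cost, which is the kind of shift that can move the planner from a seqscan toward an index scan.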

diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 55c246e..0a0b2bb 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 3625 (  tsvector_conca
*** 4494,4501 ****

  DATA(insert OID = 3634 (  ts_match_vq            PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "3614 3615" _null_ _null_ _null_ _null_ _null_ ts_match_vq _null_ _null_ _null_ ));
  DATA(insert OID = 3635 (  ts_match_qv            PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "3615 3614" _null_ _null_ _null_ _null_ _null_ ts_match_qv _null_ _null_ _null_ ));
! DATA(insert OID = 3760 (  ts_match_tt            PGNSP PGUID 12 3 0 0 0 f f f f t f s 2 0 16 "25 25" _null_ _null_ _null_ _null_ _null_ ts_match_tt _null_ _null_ _null_ ));
! DATA(insert OID = 3761 (  ts_match_tq            PGNSP PGUID 12 2 0 0 0 f f f f t f s 2 0 16 "25 3615" _null_ _null_ _null_ _null_ _null_ ts_match_tq _null_ _null_ _null_ ));

  DATA(insert OID = 3648 (  gtsvector_compress    PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ gtsvector_compress _null_ _null_ _null_ ));
  DESCR("GiST tsvector support");
--- 4494,4501 ----

  DATA(insert OID = 3634 (  ts_match_vq            PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "3614 3615" _null_ _null_ _null_ _null_ _null_ ts_match_vq _null_ _null_ _null_ ));
  DATA(insert OID = 3635 (  ts_match_qv            PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "3615 3614" _null_ _null_ _null_ _null_ _null_ ts_match_qv _null_ _null_ _null_ ));
! DATA(insert OID = 3760 (  ts_match_tt            PGNSP PGUID 12 100 0 0 0 f f f f t f s 2 0 16 "25 25" _null_ _null_ _null_ _null_ _null_ ts_match_tt _null_ _null_ _null_ ));
! DATA(insert OID = 3761 (  ts_match_tq            PGNSP PGUID 12 100 0 0 0 f f f f t f s 2 0 16 "25 3615" _null_ _null_ _null_ _null_ _null_ ts_match_tq _null_ _null_ _null_ ));

  DATA(insert OID = 3648 (  gtsvector_compress    PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ gtsvector_compress _null_ _null_ _null_ ));
  DESCR("GiST tsvector support");
*************** DESCR("show real useful query for GiST i
*** 4554,4560 ****

  DATA(insert OID = 3684 (  ts_rewrite        PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 3615 "3615 3615 3615" _null_ _null_ _null_ _null_ _null_ tsquery_rewrite _null_ _null_ _null_ ));
  DESCR("rewrite tsquery");
! DATA(insert OID = 3685 (  ts_rewrite        PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 3615 "3615 25" _null_ _null_ _null_ _null_ _null_ tsquery_rewrite_query _null_ _null_ _null_ ));
  DESCR("rewrite tsquery");

  DATA(insert OID = 3695 (  gtsquery_compress                PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ gtsquery_compress _null_ _null_ _null_ ));
--- 4554,4560 ----

  DATA(insert OID = 3684 (  ts_rewrite        PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 3615 "3615 3615 3615" _null_ _null_ _null_ _null_ _null_ tsquery_rewrite _null_ _null_ _null_ ));
  DESCR("rewrite tsquery");
! DATA(insert OID = 3685 (  ts_rewrite        PGNSP PGUID 12 100 0 0 0 f f f f t f v 2 0 3615 "3615 25" _null_ _null_ _null_ _null_ _null_ tsquery_rewrite_query _null_ _null_ _null_ ));
  DESCR("rewrite tsquery");

  DATA(insert OID = 3695 (  gtsquery_compress                PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ _null_ gtsquery_compress _null_ _null_ _null_ ));
*************** DESCR("(internal)");
*** 4644,4669 ****
  DATA(insert OID = 3741 (  thesaurus_lexize    PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_lexize _null_ _null_ _null_ ));
  DESCR("(internal)");

! DATA(insert OID = 3743 (  ts_headline    PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 25 "3734 25 3615 25" _null_ _null_ _null_ _null_ _null_ ts_headline_byid_opt _null_ _null_ _null_ ));
  DESCR("generate headline");
! DATA(insert OID = 3744 (  ts_headline    PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 25 "3734 25 3615" _null_ _null_ _null_ _null_ _null_ ts_headline_byid _null_ _null_ _null_ ));
  DESCR("generate headline");
! DATA(insert OID = 3754 (  ts_headline    PGNSP PGUID 12 1 0 0 0 f f f f t f s 3 0 25 "25 3615 25" _null_ _null_ _null_ _null_ _null_ ts_headline_opt _null_ _null_ _null_ ));
  DESCR("generate headline");
! DATA(insert OID = 3755 (  ts_headline    PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 25 "25 3615" _null_ _null_ _null_ _null_ _null_ ts_headline _null_ _null_ _null_ ));
  DESCR("generate headline");

! DATA(insert OID = 3745 (  to_tsvector        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 3614 "3734 25" _null_ _null_ _null_ _null_ _null_ to_tsvector_byid _null_ _null_ _null_ ));
  DESCR("transform to tsvector");
! DATA(insert OID = 3746 (  to_tsquery        PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 3615 "3734 25" _null_ _null_ _null_ _null_ _null_ to_tsquery_byid _null_ _null_ _null_ ));
  DESCR("make tsquery");
! DATA(insert OID = 3747 (  plainto_tsquery    PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 3615 "3734 25" _null_ _null_ _null_ _null_ _null_ plainto_tsquery_byid _null_ _null_ _null_ ));
  DESCR("transform to tsquery");
! DATA(insert OID = 3749 (  to_tsvector        PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 3614 "25" _null_ _null_ _null_ _null_ _null_ to_tsvector _null_ _null_ _null_ ));
  DESCR("transform to tsvector");
! DATA(insert OID = 3750 (  to_tsquery        PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 3615 "25" _null_ _null_ _null_ _null_ _null_ to_tsquery _null_ _null_ _null_ ));
  DESCR("make tsquery");
! DATA(insert OID = 3751 (  plainto_tsquery    PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 3615 "25" _null_ _null_ _null_ _null_ _null_ plainto_tsquery _null_ _null_ _null_ ));
  DESCR("transform to tsquery");

  DATA(insert OID = 3752 (  tsvector_update_trigger            PGNSP PGUID 12 1 0 0 0 f f f f f f v 0 0 2279 "" _null_ _null_ _null_ _null_ _null_ tsvector_update_trigger_byid _null_ _null_ _null_ ));
--- 4644,4669 ----
  DATA(insert OID = 3741 (  thesaurus_lexize    PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 2281 "2281 2281 2281 2281" _null_ _null_ _null_ _null_ _null_ thesaurus_lexize _null_ _null_ _null_ ));
  DESCR("(internal)");

! DATA(insert OID = 3743 (  ts_headline    PGNSP PGUID 12 100 0 0 0 f f f f t f i 4 0 25 "3734 25 3615 25" _null_ _null_ _null_ _null_ _null_ ts_headline_byid_opt _null_ _null_ _null_ ));
  DESCR("generate headline");
! DATA(insert OID = 3744 (  ts_headline    PGNSP PGUID 12 100 0 0 0 f f f f t f i 3 0 25 "3734 25 3615" _null_ _null_ _null_ _null_ _null_ ts_headline_byid _null_ _null_ _null_ ));
  DESCR("generate headline");
! DATA(insert OID = 3754 (  ts_headline    PGNSP PGUID 12 100 0 0 0 f f f f t f s 3 0 25 "25 3615 25" _null_ _null_ _null_ _null_ _null_ ts_headline_opt _null_ _null_ _null_ ));
  DESCR("generate headline");
! DATA(insert OID = 3755 (  ts_headline    PGNSP PGUID 12 100 0 0 0 f f f f t f s 2 0 25 "25 3615" _null_ _null_ _null_ _null_ _null_ ts_headline _null_ _null_ _null_ ));
  DESCR("generate headline");

! DATA(insert OID = 3745 (  to_tsvector        PGNSP PGUID 12 100 0 0 0 f f f f t f i 2 0 3614 "3734 25" _null_ _null_ _null_ _null_ _null_ to_tsvector_byid _null_ _null_ _null_ ));
  DESCR("transform to tsvector");
! DATA(insert OID = 3746 (  to_tsquery        PGNSP PGUID 12 100 0 0 0 f f f f t f i 2 0 3615 "3734 25" _null_ _null_ _null_ _null_ _null_ to_tsquery_byid _null_ _null_ _null_ ));
  DESCR("make tsquery");
! DATA(insert OID = 3747 (  plainto_tsquery    PGNSP PGUID 12 100 0 0 0 f f f f t f i 2 0 3615 "3734 25" _null_ _null_ _null_ _null_ _null_ plainto_tsquery_byid _null_ _null_ _null_ ));
  DESCR("transform to tsquery");
! DATA(insert OID = 3749 (  to_tsvector        PGNSP PGUID 12 100 0 0 0 f f f f t f s 1 0 3614 "25" _null_ _null_ _null_ _null_ _null_ to_tsvector _null_ _null_ _null_ ));
  DESCR("transform to tsvector");
! DATA(insert OID = 3750 (  to_tsquery        PGNSP PGUID 12 100 0 0 0 f f f f t f s 1 0 3615 "25" _null_ _null_ _null_ _null_ _null_ to_tsquery _null_ _null_ _null_ ));
  DESCR("make tsquery");
! DATA(insert OID = 3751 (  plainto_tsquery    PGNSP PGUID 12 100 0 0 0 f f f f t f s 1 0 3615 "25" _null_ _null_ _null_ _null_ _null_ plainto_tsquery _null_ _null_ _null_ ));
  DESCR("transform to tsquery");

  DATA(insert OID = 3752 (  tsvector_update_trigger            PGNSP PGUID 12 1 0 0 0 f f f f f f v 0 0 2279 "" _null_ _null_ _null_ _null_ _null_ tsvector_update_trigger_byid _null_ _null_ _null_ ));