Re: Supporting = operator in gin/gist_trgm_ops - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Supporting = operator in gin/gist_trgm_ops
Date
Msg-id CAPpHfduP3JPH-iVUTJRvPTP2mY4EPKEz4iJBcRHr7Z+RCCxFKQ@mail.gmail.com
In response to Re: Supporting = operator in gin/gist_trgm_ops  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Supporting = operator in gin/gist_trgm_ops
List pgsql-hackers
On Mon, Oct 26, 2020 at 7:38 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> Ah, yes this might lead to bad performance if the "fake wildcard"
> matches too many rows, but this shouldn't be a very common use case,
> and the only alternative for that might be to create trigrams for non
> alphanumerics characters.  I didn't try to do that because it would
> mean meaningful overhead for mainstream usage of pg_trgm, and would
> also mean on-disk format break.  In my opinion supporting = should be
> a best effort, especially for such corner cases.

It would be more efficient to generate trigrams for the equality
operator using generate_trgm() instead of generate_wildcard_trgm().  In
some cases it would generate more trigrams.  For instance, generate_trgm()
would generate '__a', '_ab', 'ab_' for '%ab%', while
generate_wildcard_trgm() would generate nothing.
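
To make the difference concrete, here is a rough Python sketch of the two
extraction strategies.  It is not the pg_trgm C code: the *_sketch function
names are hypothetical, only ASCII letters and digits count as word
characters, and backslash escapes in LIKE patterns are ignored.  The one
rule it tries to model is that the wildcard variant pads a word only on a
side that is not adjacent to '%' or '_'.

```python
import re

WILDCARDS = "%_"  # LIKE meta-characters; escapes are ignored in this sketch


def trigrams(padded):
    """Return every 3-character window of an already padded word."""
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def is_wildcard(ch):
    """True only for a real wildcard character (not for 'no neighbour')."""
    return ch != "" and ch in WILDCARDS


def generate_trgm_sketch(text):
    """Plain extraction: every word gets 2 leading and 1 trailing blank."""
    out = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        out |= trigrams("  " + word + " ")
    return out


def generate_wildcard_trgm_sketch(pattern):
    """Wildcard-aware extraction: pad only sides not touching a wildcard."""
    pattern = pattern.lower()
    out = set()
    for m in re.finditer(r"[a-z0-9]+", pattern):
        before = pattern[m.start() - 1] if m.start() > 0 else ""
        after = pattern[m.end()] if m.end() < len(pattern) else ""
        left = "" if is_wildcard(before) else "  "
        right = "" if is_wildcard(after) else " "
        out |= trigrams(left + m.group() + right)
    return out


def show(trgms):
    """Render blanks as '_' to match the notation used in this mail."""
    return sorted(t.replace(" ", "_") for t in trgms)


print(show(generate_trgm_sketch("%ab%")))           # ['__a', '_ab', 'ab_']
print(show(generate_wildcard_trgm_sketch("%ab%")))  # []
```

Under these assumptions the wildcard variant has nothing to look up in the
index for '%ab%', because 'ab' alone is shorter than a trigram and neither
side may be padded, while the plain variant yields three usable trigrams.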

Also, I wonder how our costing would work if there are multiple indices
on the same column.  We should clearly prefer btree over pg_trgm
gist/gin, and I believe our costing provides this.  But we should also
prefer btree_gist/btree_gin over pg_trgm gist/gin, and I'm not sure
our costing provides this, especially for gist.

------
Regards,
Alexander Korotkov


