Thread: pg_trgm and slow bitmap index scan plan

pg_trgm and slow bitmap index scan plan

From
Mathieu De Zutter
Date:
Hi all,

I've been trying to apply pg_tgrm for the search-function of my application. The database fits a few times in the available RAM, and is mostly read-only.
Plans, schema and configs in attachment. Postgresql version 9.1.4 on Debian.

When just searching in one table, it behaves perfectly here. When I put constraints on multiple connected tables (performance and performer), it takes some bad decisions. Somehow the planner thinks that an index scan on a trigram index (on a string) is as fast as an index scan on a btree of an int. Because of that, it will combine both index scans into an "AND" bitmap index scan. Since this is done in a nested loop, the performance gets very bad. The trigram index scan should not be repeated as it is relatively slow and always the same query.

When I disable bitmap scans, it will search on both tables and then hash everything together. This avoids launching the same index scan over and over again. This is much faster.

Since my database is mostly in memory, I guess I could safely disable bitmap scan (or at least for some query), since I understand that this kind of scan is often a way to have a better IO performance. There's little IO in my setup.
However, I'd rather get some help in fixing it right!

Thanks,

Mathieu
Attachment

Re: pg_trgm and slow bitmap index scan plan

From
Merlin Moncure
Date:
On Tue, Aug 28, 2012 at 2:39 AM, Mathieu De Zutter <mathieu@dezutter.org> wrote:
> Hi all,
>
> I've been trying to apply pg_tgrm for the search-function of my application.
> The database fits a few times in the available RAM, and is mostly read-only.
> Plans, schema and configs in attachment. Postgresql version 9.1.4 on Debian.
>
> When just searching in one table, it behaves perfectly here. When I put
> constraints on multiple connected tables (performance and performer), it
> takes some bad decisions. Somehow the planner thinks that an index scan on a
> trigram index (on a string) is as fast as an index scan on a btree of an
> int. Because of that, it will combine both index scans into an "AND" bitmap
> index scan. Since this is done in a nested loop, the performance gets very
> bad. The trigram index scan should not be repeated as it is relatively slow
> and always the same query.
>
> When I disable bitmap scans, it will search on both tables and then hash
> everything together. This avoids launching the same index scan over and over
> again. This is much faster.
>
> Since my database is mostly in memory, I guess I could safely disable bitmap
> scan (or at least for some query), since I understand that this kind of scan
> is often a way to have a better IO performance. There's little IO in my
> setup.
> However, I'd rather get some help in fixing it right!

Yeah -- gist_trgm_ops is expensive and the planner is not taking that
into account.  I wonder if operator classes (pg_opclass) should have a
planner influencing costing component.

merlin