Home > mailing lists

Thread: Optimizing Postgresql ILIKE while query

Optimizing Postgresql ILIKE while query

From

aman gupta

Date:

22 October 2018, 08:56:44

Hi Team,

Greetings for the day!!

Platform:

PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit

Issue:

We have the base table which contains 22M records and we created a view on top of it while querying the view with ILIKE clause it took 44 seconds and with LIKE Clause 20 Seconds

Query:

fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
select destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs from mmsuper.test_20m_view where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%';

Also attaching the comparison for both ILIKE and LIKE test performed.

Expectation:

How can we optimize our ILIKE query, since it is hardcoded in the application and we can't use any other keyword than ILIKE .

BR//

Aman Gupta

+918447611183

amangpt89@gmail.com

Attachment

Re: Optimizing Postgresql ILIKE while query

From

Pavel Stehule

Date:

22 October 2018, 09:00:51

po 22. 10. 2018 v 7:57 odesílatel aman gupta <amangpt89@gmail.com> napsal:

Hi Team,

Greetings for the day!!

Platform:

PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit

Issue:

We have the base table which contains 22M records and we created a view on top of it while querying the view with ILIKE clause it took 44 seconds and with LIKE Clause 20 Seconds

Query:

fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
select destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs from mmsuper.test_20m_view where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%';

Also attaching the comparison for both ILIKE and LIKE test performed.

Expectation:

How can we optimize our ILIKE query, since it is hardcoded in the application and we can't use any other keyword than ILIKE .

look on trigram index

https://www.postgresql.org/docs/11/static/pgtrgm.html

Regards

Pavel

BR//
Aman Gupta
+918447611183
amangpt89@gmail.com

Re: Optimizing Postgresql ILIKE while query

From

Pavel Stehule

Date:

22 October 2018, 09:00:51

po 22. 10. 2018 v 7:57 odesílatel aman gupta <amangpt89@gmail.com> napsal:

Hi Team,

Greetings for the day!!

Platform:

PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit

Issue:

We have the base table which contains 22M records and we created a view on top of it while querying the view with ILIKE clause it took 44 seconds and with LIKE Clause 20 Seconds

Query:

fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
select destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs from mmsuper.test_20m_view where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%';

Also attaching the comparison for both ILIKE and LIKE test performed.

Expectation:

How can we optimize our ILIKE query, since it is hardcoded in the application and we can't use any other keyword than ILIKE .

look on trigram index

https://www.postgresql.org/docs/11/static/pgtrgm.html

Regards

Pavel

BR//
Aman Gupta
+918447611183
amangpt89@gmail.com

Re: Optimizing Postgresql ILIKE while query

From

Alban Hertroys

Date:

22 October 2018, 10:23:35

> On 22 Oct 2018, at 7:56, aman gupta <amangpt89@gmail.com> wrote:
>
> Issue:
>
> We have the base table which contains 22M records and we created a view on top of it while querying the view with
ILIKEclause it took 44 seconds and with LIKE Clause 20 Seconds 
>
> Query:
>
> fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
> select
destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs
frommmsuper.test_20m_view  where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%'; 

Perhaps, when you have a question about timing, you shouldn't turn off the timing in the query plan? Now we can't see
wherethe time is spent. 

> <LIKE_Clause_ILIKE_Clause_Postgres_Response.txt>

That's all sequential scans that each remove a significant amount of rows. That probably costs a significant amount of
timeto do. 

It looks like you don't have any indices on the underlying table(s) at all. I'd start there and then look at the ILIKE
problemagain. By that time, Pavel's suggestion for a trigram index on that text field is probably spot-on. 

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Re: Optimizing Postgresql ILIKE while query

From

Alban Hertroys

Date:

22 October 2018, 10:23:35

> On 22 Oct 2018, at 7:56, aman gupta <amangpt89@gmail.com> wrote:
>
> Issue:
>
> We have the base table which contains 22M records and we created a view on top of it while querying the view with
ILIKEclause it took 44 seconds and with LIKE Clause 20 Seconds 
>
> Query:
>
> fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
> select
destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs
frommmsuper.test_20m_view  where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%'; 

Perhaps, when you have a question about timing, you shouldn't turn off the timing in the query plan? Now we can't see
wherethe time is spent. 

> <LIKE_Clause_ILIKE_Clause_Postgres_Response.txt>

That's all sequential scans that each remove a significant amount of rows. That probably costs a significant amount of
timeto do. 

It looks like you don't have any indices on the underlying table(s) at all. I'd start there and then look at the ILIKE
problemagain. By that time, Pavel's suggestion for a trigram index on that text field is probably spot-on. 

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Re: Optimizing Postgresql ILIKE while query

From

Scottix

Date:

22 October 2018, 18:22:34

Also leading wildcards can inhibit the use of indexes. Best to try to avoid LIKE queries similar to '%TERM'

On Mon, Oct 22, 2018 at 12:23 AM Alban Hertroys <haramrae@gmail.com> wrote:

> On 22 Oct 2018, at 7:56, aman gupta <amangpt89@gmail.com> wrote:
>
> Issue:
>
> We have the base table which contains 22M records and we created a view on top of it while querying the view with ILIKE clause it took 44 seconds and with LIKE Clause 20 Seconds
>
> Query:
>
> fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
> select destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs from mmsuper.test_20m_view where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%';

Perhaps, when you have a question about timing, you shouldn't turn off the timing in the query plan? Now we can't see where the time is spent.

> <LIKE_Clause_ILIKE_Clause_Postgres_Response.txt>

That's all sequential scans that each remove a significant amount of rows. That probably costs a significant amount of time to do.

It looks like you don't have any indices on the underlying table(s) at all. I'd start there and then look at the ILIKE problem again. By that time, Pavel's suggestion for a trigram index on that text field is probably spot-on.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Re: Optimizing Postgresql ILIKE while query

From

Scottix

Date:

22 October 2018, 18:22:34

Also leading wildcards can inhibit the use of indexes. Best to try to avoid LIKE queries similar to '%TERM'

On Mon, Oct 22, 2018 at 12:23 AM Alban Hertroys <haramrae@gmail.com> wrote:

> On 22 Oct 2018, at 7:56, aman gupta <amangpt89@gmail.com> wrote:
>
> Issue:
>
> We have the base table which contains 22M records and we created a view on top of it while querying the view with ILIKE clause it took 44 seconds and with LIKE Clause 20 Seconds
>
> Query:
>
> fm_db_custom_db=# EXPLAIN (ANALYZE, TIMING OFF)
> select destination,hostname,inputfilename,inputtime,logicalservername,outputfilename,outputtime,processinglink,source,totalinputbytes,totalinputcdrs,totaloutputbytes,totaloutputcdrs from mmsuper.test_20m_view where inputfilename ilike '%SDPOUTPUTCDR_4001_BLSDP09_ADM_4997_18-10-15-02549.ASN%';

Perhaps, when you have a question about timing, you shouldn't turn off the timing in the query plan? Now we can't see where the time is spent.

> <LIKE_Clause_ILIKE_Clause_Postgres_Response.txt>

That's all sequential scans that each remove a significant amount of rows. That probably costs a significant amount of time to do.

It looks like you don't have any indices on the underlying table(s) at all. I'd start there and then look at the ILIKE problem again. By that time, Pavel's suggestion for a trigram index on that text field is probably spot-on.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.