Custom Operator for citext LIKE predicates question - Mailing list pgsql-hackers
From | Efrain J. Berdecia |
---|---|
Subject | Custom Operator for citext LIKE predicates question |
Date | |
Msg-id | 1197272596.186456.1642049516576@mail.yahoo.com Whole thread Raw |
Responses |
Re: Custom Operator for citext LIKE predicates question
|
List | pgsql-hackers |
After attempting to use gin and gist indexes for our queries that run against citext columns, our team has come up with the following to make our queries run from 2 mins to 25ms;
CREATE EXTENSION pg_trgm
CREATE EXTENSION btree_gin --may not be needed, checking
CREATE OPERATOR CLASS gin_trgm_ops_ci_new
FOR TYPE citext USING gin
AS
OPERATOR 1 % (text, text),
FUNCTION 1 btint4cmp (int4, int4),
FUNCTION 2 gin_extract_value_trgm (text, internal),
FUNCTION 3 gin_extract_query_trgm (text, internal, int2, internal, internal, internal, internal),
FUNCTION 4 gin_trgm_consistent (internal,int2, text, int4, internal, internal, internal, internal),
STORAGE int4;
ALTER OPERATOR FAMILY gin_trgm_ops_ci_new USING gin ADD
OPERATOR 3 ~~ (citext, citext),
OPERATOR 4 ~~* (citext, citext);
ALTER OPERATOR FAMILY gin_trgm_ops_ci_new USING gin ADD
OPERATOR 7 %> (text, text),
FUNCTION 6 (text,text) gin_trgm_triconsistent (internal, int2, text, int4, internal, internal, internal);
Our question is, does anyone see any flaw on this?
Also, could this not be incorporated into postgres natively?
I'm posting the old and new explain plans;
New explain;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------
Aggregate (cost=874327.76..874327.77 rows=1 width=8) (actual time=21.952..21.954 rows=1 loops=1)
-> Nested Loop (cost=1620.95..874284.13 rows=17449 width=0) (actual time=6.259..21.948 rows=9 loops=1)
-> Bitmap Heap Scan on t775 b1 (cost=1620.39..525029.25 rows=45632 width=35) (actual time=6.212..8.189 rows=13 loops=1)
Recheck Cond: ((c240001002 ~~ 'smp%'::citext) OR (c200000020 ~~ 'smp%'::citext) OR (c200000001 ~~ 'smp%'::citext))
Rows Removed by Index Recheck: 259
Filter: ((c400079600 <> 'ABC_BUSINESSSERVICE'::citext) AND (c400127400 = 'ABC.ASSET'::citext) AND ((c1000000001 = 'Mrictton Global'::citext) OR (c1000000001 = 'ABCOpsMonitoring'::citext) OR (c100000
0001 = 'Mrictton'::citext) OR (c1000000001 = 'Mrictton EITTE'::citext) OR (c1000000001 = 'Mrictton Finance'::citext) OR (c1000000001 = 'Mrictton Generic Services and Support'::citext) OR (c1000000001 = 'Mrictton G
lobal'::citext) OR (c1000000001 = 'Mrictton Global Demo Solutions'::citext) OR (c1000000001 = 'Mrictton HR Direct'::citext) OR (c1000000001 = 'Mrictton Marketing and Communications'::citext) OR (c1000000001 = 'Eri
csson Master Data Management'::citext) OR (c1000000001 = 'Mrictton OHS'::citext) OR (c1000000001 = 'Mrictton Patents and Licensing'::citext) OR (c1000000001 = 'Mrictton Sales'::citext) OR (c1000000001 = 'Mrictton
Security'::citext) OR (c1000000001 = 'Mrictton Shared Services'::citext) OR (c1000000001 = 'Mrictton Sourcing'::citext) OR (c1000000001 = 'Mrictton Supply ROD'::citext) OR (c1000000001 = 'Mrictton SW Supply Operat
ions'::citext) OR (c1000000001 = 'Remedy,a ABC Software Company'::citext)) AND (c400079600 = ANY ('{ABC_DATABASE,ABC_ACCOUNT,ABC_MEDIA,ABC.CORE:ABC_CONCRETECOLLECTION,ABC_PACKAGE,ABC_BIOS,ABC_SYSTEMSOFTWARE,ABC_KEYBOARD,ABC_LAN,ABC_LOGICALSYSTEMCOMPONENT,ABC_LNSGROUP,ABC_PHYSICALLOCATION,ABC_FLOPPYDRIVE,ABC_DOCUMENT,ABC_BUSINESSSERVICE,ABC_DATABASESTORAGE,ABC_NETWORKPORT,ABC_VIRTUALSYSTEMENABLER,ABC_POINTINGDEVICE,ABC_PRINTER,ABC_SYSTEMRESOURCE,ABC_CONNECTIVITYSEGMENT,ABC.CORE:ABC_BUSINESSPROCESS,ABC_PROTOCOLENDPOINT,ABC_TRANSACTION,ABC_APPLICATIONINFRASTRUCTURE,ABC_SOFTWARESERVER,ABC_UPS,ABC_ACTIVITY,ABC_CDROMDRIVE,ABC.CORE:ABC_RASD,ABC_PRODUCT,ABC_REMOTEFILESYSTEM,ABC_IPENDPOINT,ABC_LOCALFILESYSTEM,ABC_APPLICATION,ABC_IPCONNECTIVITYSUBNET,ABC_CLUSTER,ABC_CHASSIS,ABC_WAN,ABC_PATCH,ABC_ADMINDOMAIN,ABC.CORE:ABC_RESOURCEPOOL,ABC_IPXCONNECTIVITYNETWORK,ABC_HARDWARESYSTEMCOMPONENT,ABC_FILESYSTEM,ABC_MONITOR,ABC_CONNECTIVITYGROUP,ABC_EQUIPMENT,ABC_MAINFRAME,ABC_RACK,ABC_OPERATINGSYSTEM,ABC_PROCESSOR,ABC_SHARE,ABC_LANENDPOINT,ABC_HARDWAREPACKAGE,ABC_TAPEDRIVE,ABC_COMMUNICATIONENDPOINT,ABC_APPLICATIONSYSTEM,ABC_CARD,ABC_DISKPARTITION,ABC.CORE:ABC_VIRTUALSYSTEMSETTINGDATA,ABC_MEMORY,ABC_NTDOMAIN,ABC_COMPUTERSYSTEM,ABC_DISKDRIVE,ABC_SERVICEOFFERINGINSTANCE,ABC_ROLE,ABC_APPLICATIONSERVICE}'::citext[])))
Rows Removed by Filter: 62
Heap Blocks: exact=313
-> BitmapOr (cost=1620.39..1620.39 rows=163489 width=0) (actual time=5.703..5.704 rows=0 loops=1)
-> Bitmap Index Scan on oto2 (cost=0.00..528.72 rows=54496 width=0) (actual time=0.724..0.724 rows=41 loops=1)
Index Cond: (c240001002 ~~ 'smp%'::citext)
-> Bitmap Index Scan on oto3 (cost=0.00..528.72 rows=54496 width=0) (actual time=4.852..4.852 rows=331 loops=1)
Index Cond: (c200000020 ~~ 'smp%'::citext)
-> Bitmap Index Scan on oto4 (cost=0.00..528.72 rows=54496 width=0) (actual time=0.127..0.127 rows=0 loops=1)
Index Cond: (c200000001 ~~ 'smp%'::citext)
-> Index Scan using i1279_0_400129200_t1279 on t1279 b2 (cost=0.56..7.64 rows=1 width=35) (actual time=1.057..1.058 rows=1 loops=13)
Index Cond: (c400129200 = b1.c400129200)
Filter: ((c7 <> 6) AND (c7 <> 8))
Rows Removed by Filter: 0
Planning Time: 2.478 ms
Execution Time: 22.059 ms
(21 rows)
Time: 26.510 ms
Old explain with slow plan;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------
Limit (cost=1926420.44..1926420.70 rows=102 width=1199) (actual time=16396.091..16569.194 rows=9 loops=1)
-> Sort (cost=1926420.44..1926458.76 rows=15326 width=1199) (actual time=16396.089..16569.190 rows=9 loops=1)
Sort Key: b1.c200000020 NULLS FIRST, ((concat((concat(b1.c1, '|'))::citext, COALESCE(b2.c1, ''::citext)))::citext)
Sort Method: quicksort Memory: 29kB
-> WindowAgg (cost=1000.56..1925832.51 rows=15326 width=1199) (actual time=16396.025..16569.138 rows=9 loops=1)
-> Gather (cost=1000.56..1925564.30 rows=15326 width=1191) (actual time=4288.742..16569.068 rows=9 loops=1)
Workers Planned: 6
Workers Launched: 6
-> Nested Loop (cost=0.56..1923031.70 rows=2554 width=1191) (actual time=9430.362..16387.794 rows=1 loops=7)
-> Parallel Seq Scan on t1279 b2 (cost=0.00..530806.15 rows=416134 width=910) (actual time=0.016..575.311 rows=353200 loops=7)
Filter: ((c7 <> 6) AND (c7 <> 8))
Rows Removed by Filter: 574840
-> Index Scan using efrain_test_ix_t775_2 on t775 b1 (cost=0.56..3.34 rows=1 width=316) (actual time=0.044..0.044 rows=0 loops=2472402)
Index Cond: ((c400129200 = b2.c400129200) AND (c400127400 = 'ABC.ASSET'::citext))
Filter: ((c400079600 <> 'ABC_BUSINESSSERVICE'::citext) AND ((c240001002 ~~ 'smp%'::citext) OR (c200000020 ~~ 'smp%'::citext) OR (c200000001 ~~ 'smp%'::citext)) AND ((c1000000001 =
'Mrictton Global'::citext) OR (c1000000001 = 'ABCOpsMonitoring'::citext) OR (c1000000001 = 'Mrictton'::citext) OR (c1000000001 = 'Mrictton EITTE'::citext) OR (c1000000001 = 'Mrictton Finance'::citext) OR (c1000000001 = 'Mrictton Generic Services and Support'::citext) OR (c1000000001 = 'Mrictton Global'::citext) OR (c1000000001 = 'Mrictton Global Demo Solutions'::citext) OR (c1000000001 = 'Mrictton HR Direct'::citext) OR (c1000000001 = 'Mrictton Marketing and Communications'::citext) OR (c1000000001 = 'Mrictton Master Data Management'::citext) OR (c1000000001 = 'Mrictton OHS'::citext) OR (c1000000001 = 'Mrictton Patents and Licensing'::citext) OR (c1000000001 = 'Mrictton Sales'::citext) OR (c1000000001 = 'Mrictton Security'::citext) OR (c1000000001 = 'Mrictton Shared Services'::citext) OR (c1000000001 = 'Mrictton Sourcing'::citext) OR (c1000000001 = 'Mrictton Supply ROD'::citext) OR (c1000000001 = 'Mrictton SW Supply Operations'::citext) OR (c1000000001 = 'Remedy,a ABC Software Company'::citext)) AND (c400079600 = ANY ('{ABC_DATABASE,ABC_ACCOUNT,ABC_MEDIA,ABC.CORE:ABC_CONCRETECOLLECTION,ABC_PACKAGE,ABC_BIOS,ABC_SYSTEMSOFTWARE,ABC_KEYBOARD,ABC_LAN,ABC_LOGICALSYSTEMCOMPONENT,ABC_LNSGROUP,ABC_PHYSICALLOCATION,ABC_FLOPPYDRIVE,ABC_DOCUMENT,ABC_BUSINESSSERVICE,ABC_DATABASESTORAGE,ABC_NETWORKPORT,ABC_VIRTUALSYSTEMENABLER,ABC_POINTINGDEVICE,ABC_PRINTER,ABC_SYSTEMRESOURCE,ABC_CONNECTIVITYSEGMENT,ABC.CORE:ABC_BUSINESSPROCESS,ABC_PROTOCOLENDPOINT,ABC_TRANSACTION,ABC_APPLICATIONINFRASTRUCTURE,ABC_SOFTWARESERVER,ABC_UPS,ABC_ACTIVITY,ABC_CDROMDRIVE,ABC.CORE:ABC_RASD,ABC_PRODUCT,ABC_REMOTEFILESYSTEM,ABC_IPENDPOINT,ABC_LOCALFILESYSTEM,ABC_APPLICATION,ABC_IPCONNECTIVITYSUBNET,ABC_CLUSTER,ABC_CHASSIS,ABC_WAN,ABC_PATCH,ABC_ADMINDOMAIN,ABC.CORE:ABC_RESOURCEPOOL,ABC_IPXCONNECTIVITYNETWORK,ABC_HARDWARESYSTEMCOMPONENT,ABC_FILESYSTEM,ABC_MONITOR,ABC_CONNECTIVITYGROUP,ABC_EQUIPMENT,ABC_MAINFRAME,ABC_RACK,ABC_OPERATINGSYSTEM,ABC_PROCESSOR,ABC_SHARE,ABC_LANENDPOINT,ABC_HARDWAREPACKAGE,ABC_TAPEDRIVE,ABC_COMMUNICATIONENDPOINT,ABC_APPLICATIONSYSTEM,ABC_CARD,ABC_DISKPARTITION,ABC.CORE:ABC_VIRTUALSYSTEMSETTINGDATA,ABC_MEMORY,
ABC_NTDOMAIN,ABC_COMPUTERSYSTEM,ABC_DISKDRIVE,ABC_SERVICEOFFERINGINSTANCE,ABC_ROLE,ABC_APPLICATIONSERVICE}'::citext[])))
Rows Removed by Filter: 1
Planning Time: 3.205 ms
Execution Time: 16569.351 ms
(18 rows)
Time: 16577.806 ms (00:16.578)
Products
PostgreSQL Community Edition
Product Version
PostgreSQL 12
Thanks.
pgsql-hackers by date: