Query performance going from Oracle to Postgres - Mailing list pgsql-general
From | Dirschel, Steve |
---|---|
Subject | Query performance going from Oracle to Postgres |
Date | |
Msg-id | DM6PR03MB43323851E8589A0F43972C91FAEFA@DM6PR03MB4332.namprd03.prod.outlook.com Whole thread Raw |
Responses |
Re: Query performance going from Oracle to Postgres
Re: Query performance going from Oracle to Postgres Re: Query performance going from Oracle to Postgres |
List | pgsql-general |
We are in the process of converting from Oracle to Postgres and I have a query that is using the same index in Postgres as is used in Oracle but in Postgres the query does 16x more buffer/logical reads. I’d like to understand why. The query is hitting a partitioned table but to simply things I changed it to hit a single partition. This partition has the same number of rows in Oracle as in Postgres.
Here is the Postgres query, partition definition, execution plan. I will also include the Oracle execution plan below in case anyone is interested.
explain (analyze, buffers)
select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from hist28.history_event_display_timestamp_20230301 historyeve0_ where historyeve0_.IS_DELETED=0
and historyeve0_.PRISM_GUID='i0accd6a20000018405f095ee669dc5b4'
and historyeve0_.PRODUCT_SID='CARSWELL.WESTLAW'
and (historyeve0_.EVENT_TYPE not in ('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent','coCitesSearchWithinQueryEvent'))
and (historyeve0_.PRODUCT_VIEW in ('DefaultProductView','TNPPlus','PLCUS','Indigo','IndigoPremium','INDIGOCA')
or historyeve0_.PRODUCT_VIEW is null)
and historyeve0_.CLIENT_ID='WLNCE_VNJXL7'
and (historyeve0_.DISPLAY_TIMESTAMP between '2022-03-01 00:00:00' and '2023-03-01 23:59:59.999');
Aggregate (cost=56.64..56.65 rows=1 width=8) (actual time=0.930..0.931 rows=1 loops=1)
Buffers: shared hit=341
-> Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on history_event_display_timestamp_20230301 historyeve0_ (cost=0.42..56.64 rows=1 width=33) (actual time=0.034..0.897 rows=332 loops=1)
Index Cond: (((prism_guid)::text = 'i0accd6a20000018405f095ee669dc5b4'::text) AND (display_timestamp >= '2022-03-01 00:00:00'::timestamp without time zone) AND (display_timestamp <= '2023-03-01 23:59:59.999'::timestamp without time zone) AND ((product_sid)::text = 'CARSWELL.WESTLAW'::text) AND (is_deleted = '0'::numeric) AND ((client_id)::text = 'WLNCE_VNJXL7'::text))
Filter: (((event_type)::text <> ALL ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent,coCitesSearchWithinQueryEvent}'::text[])) AND (((product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,IndigoPremium,INDIGOCA}'::text[])) OR (product_view IS NULL)))
Buffers: shared hit=341
Planning:
Buffers: shared hit=6
Planning Time: 0.266 ms
Execution Time: 0.965 ms
(10 rows)
*** 341 logical reads to find 332 rows.
Oracle will find the same 332 rows using the same index but in Oracle it only does 20 logical reads. I thought maybe the index was fragmented so I reindexed that index:
reindex index hist28.history_event_display_timesta_prism_guid_display_timestamp_idx1;
Plan after that:
Aggregate (cost=40.64..40.65 rows=1 width=8) (actual time=0.707..0.708 rows=1 loops=1)
Buffers: shared hit=328
-> Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on history_event_display_timestamp_20230301 historyeve0_ (cost=0.42..40.64 rows=1 width=33) (actual time=0.032..0.683 rows=332 loops=1)
Index Cond: (((prism_guid)::text = 'i0accd6a20000018405f095ee669dc5b4'::text) AND (display_timestamp >= '2022-03-01 00:00:00'::timestamp without time zone) AND (display_timestamp <= '2023-03-01 23:59:59.999'::timestamp without time zone) AND ((product_sid)::text = 'CARSWELL.WESTLAW'::text) AND (is_deleted = '0'::numeric) AND ((client_id)::text = 'WLNCE_VNJXL7'::text))
Filter: (((event_type)::text <> ALL ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent,coCitesSearchWithinQueryEvent}'::text[])) AND (((product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,IndigoPremium,INDIGOCA}'::text[])) OR (product_view IS NULL)))
Buffers: shared hit=328
Planning:
Buffers: shared hit=27
Planning Time: 0.321 ms
Execution Time: 0.741 ms
(10 rows)
Shared hit came down a little but is still 16x more than Oracle. The actual query will hit 12 partitions so the extra amount of work it does in Postgres adds up (note some data points will find significantly more rows than 332 so performance is much worse than shown here but this data point is used just to show the difference between Oracle and Postgres). The interesting part it is seems Postgres is doing 1 shared hit per row. I don’t know anyone on this list knows Postgres’s internal index implementations vs Oracle’s but is Postgres’s Index Scan operation significantly different than Oracle’s index range scan? There is something implemented much less efficiently here vs Oracle and I don’t know what I can do to get this query to perform more closer to Oracle.
Thanks in advance.
Here is the partition definition in Postgres:
\d hist28.history_event_display_timestamp_20230301
Table "hist28.history_event_display_timestamp_20230301"
Column | Type | Collation | Nullable | Default
------------------------+--------------------------------+-----------+----------+-----------------------------------------
display_timestamp | timestamp(6) without time zone | | not null |
history_event_sid | character varying(32) | | not null |
event_type | character varying(50) | | not null |
event_sub_type | character varying(100) | | |
content_type | character varying(100) | | |
related_entity_type | character varying(50) | | |
related_entity_id | character varying(256) | | |
prism_guid | character varying(50) | | not null |
client_id | character varying(250) | | |
matter_id | character varying(50) | | |
delimiter | character varying(20) | | |
session_id | character varying(35) | | not null |
is_deleted | numeric(1,0) | | not null | 0
created | timestamp(6) without time zone | | not null |
changed | timestamp(6) without time zone | | not null |
cobalt_event_timestamp | timestamp(6) without time zone | | |
product_sid | character varying(32) | | not null |
cobalt_event_id | character varying(256) | | |
unique_lookup_key | character varying(200) | | not null |
display_name | character varying(3000) | | not null |
event_payload | text | | |
sub_content_type | character varying(250) | | |
product_view | character varying(64) | | | 'DefaultProductView'::character varying
Partition of: hist28.history_event FOR VALUES FROM ('2023-02-01 00:00:00') TO ('2023-03-01 00:00:00')
Indexes:
"history_event_display_timestamp_20230301_pkey" PRIMARY KEY, btree (history_event_sid, display_timestamp)
"history_event_display_timest_prism_guid_display_timestamp_idx39" btree (prism_guid, display_timestamp, content_type, event_type, product_sid, is_deleted, related_entity_id)
"history_event_display_timesta_prism_guid_display_timestamp_idx1" UNIQUE, btree (prism_guid, display_timestamp, unique_lookup_key, event_type, session_id, related_entity_id, product_sid, is_deleted, client_
id, product_view)
"history_event_display_timesta_prism_guid_is_deleted_changed_idx" btree (prism_guid, is_deleted, changed)
"history_event_display_timestamp_20230301_changed_idx" btree (changed)
The index the query is using- history_event_display_timesta_prism_guid_display_timestamp_idx1- it is the same index being used in Oracle, just named differently in Oracle.
Here is the Oracle plan and execution statistics hitting the same partition:
select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from hist28.HISTORY_EVENT partition(DISPLAY_TIMESTAMP_20230301) historyeve0_ where historyeve0_.IS_DELETED=0
and historyeve0_.PRISM_GUID='i0accd6a20000018405f095ee669dc5b4'
and historyeve0_.PRODUCT_SID='CARSWELL.WESTLAW'
and (historyeve0_.EVENT_TYPE not in ('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent','coCitesSearchWithinQueryEvent'))
and (historyeve0_.PRODUCT_VIEW in ('DefaultProductView','TNPPlus','PLCUS','Indigo','IndigoPremium','INDIGOCA')
or historyeve0_.PRODUCT_VIEW is null)
and historyeve0_.CLIENT_ID='WLNCE_VNJXL7'
and (historyeve0_.DISPLAY_TIMESTAMP between to_timestamp('2022-03-01 00:00:00','YYYY-MM-DD HH24:MI:SS') and to_timestamp('2023-03-01 23:59:59','YYYY-MM-DD HH24:MI:SS'));
Plan hash value: 3361538278
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 20 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 20 |
|* 2 | FILTER | | 1 | | 332 |00:00:00.01 | 20 |
| 3 | PARTITION RANGE SINGLE| | 1 | 1 | 332 |00:00:00.01 | 20 |
|* 4 | INDEX RANGE SCAN | HISTORY_EVENT_IDX02 | 1 | 1 | 332 |00:00:00.01 | 20 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_TIMESTAMP(:SYS_B_16,:SYS_B_17)<=TO_TIMESTAMP(:SYS_B_18,:SYS_B_19))
4 - access("HISTORYEVE0_"."PRISM_GUID"=:SYS_B_01 AND
"HISTORYEVE0_"."DISPLAY_TIMESTAMP">=TO_TIMESTAMP(:SYS_B_16,:SYS_B_17) AND
"HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02 AND "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND
"HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND "HISTORYEVE0_"."DISPLAY_TIMESTAMP"<=TO_TIMESTAMP(:SYS_B_1
8,:SYS_B_19))
filter(("HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02
AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_03 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_04 AND
"HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_05 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_06 AND
"HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_07 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_08 AND
"HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND (INTERNAL_FUNCTION("HISTORYEVE0_"."PRODUCT_VIEW") OR
"HISTORYEVE0_"."PRODUCT_VIEW" IS NULL)))
It finds the same 332 rows but it does only 20 logical reads. The Oracle “access” step and the Postgres “Index Cond” columns used for index access appears to be the same for Postgres and Oracle.
Thanks
pgsql-general by date: