Thread: Query performance going from Oracle to Postgres
We are in the process of converting from Oracle to Postgres and I have a query that is using the same index in Postgres as is used in Oracle but in Postgres the query does 16x more buffer/logical reads. I’d like to understand why. The query is hitting a partitioned table but to simply things I changed it to hit a single partition. This partition has the same number of rows in Oracle as in Postgres.
Here is the Postgres query, partition definition, execution plan. I will also include the Oracle execution plan below in case anyone is interested.
explain (analyze, buffers)
select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from hist28.history_event_display_timestamp_20230301 historyeve0_ where historyeve0_.IS_DELETED=0
and historyeve0_.PRISM_GUID='i0accd6a20000018405f095ee669dc5b4'
and historyeve0_.PRODUCT_SID='CARSWELL.WESTLAW'
and (historyeve0_.EVENT_TYPE not in ('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent','coCitesSearchWithinQueryEvent'))
and (historyeve0_.PRODUCT_VIEW in ('DefaultProductView','TNPPlus','PLCUS','Indigo','IndigoPremium','INDIGOCA')
or historyeve0_.PRODUCT_VIEW is null)
and historyeve0_.CLIENT_ID='WLNCE_VNJXL7'
and (historyeve0_.DISPLAY_TIMESTAMP between '2022-03-01 00:00:00' and '2023-03-01 23:59:59.999');
Aggregate (cost=56.64..56.65 rows=1 width=8) (actual time=0.930..0.931 rows=1 loops=1)
Buffers: shared hit=341
-> Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on history_event_display_timestamp_20230301 historyeve0_ (cost=0.42..56.64 rows=1 width=33) (actual time=0.034..0.897 rows=332 loops=1)
Index Cond: (((prism_guid)::text = 'i0accd6a20000018405f095ee669dc5b4'::text) AND (display_timestamp >= '2022-03-01 00:00:00'::timestamp without time zone) AND (display_timestamp <= '2023-03-01 23:59:59.999'::timestamp without time zone) AND ((product_sid)::text = 'CARSWELL.WESTLAW'::text) AND (is_deleted = '0'::numeric) AND ((client_id)::text = 'WLNCE_VNJXL7'::text))
Filter: (((event_type)::text <> ALL ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent,coCitesSearchWithinQueryEvent}'::text[])) AND (((product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,IndigoPremium,INDIGOCA}'::text[])) OR (product_view IS NULL)))
Buffers: shared hit=341
Planning:
Buffers: shared hit=6
Planning Time: 0.266 ms
Execution Time: 0.965 ms
(10 rows)
*** 341 logical reads to find 332 rows.
Oracle will find the same 332 rows using the same index but in Oracle it only does 20 logical reads. I thought maybe the index was fragmented so I reindexed that index:
reindex index hist28.history_event_display_timesta_prism_guid_display_timestamp_idx1;
Plan after that:
Aggregate (cost=40.64..40.65 rows=1 width=8) (actual time=0.707..0.708 rows=1 loops=1)
Buffers: shared hit=328
-> Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on history_event_display_timestamp_20230301 historyeve0_ (cost=0.42..40.64 rows=1 width=33) (actual time=0.032..0.683 rows=332 loops=1)
Index Cond: (((prism_guid)::text = 'i0accd6a20000018405f095ee669dc5b4'::text) AND (display_timestamp >= '2022-03-01 00:00:00'::timestamp without time zone) AND (display_timestamp <= '2023-03-01 23:59:59.999'::timestamp without time zone) AND ((product_sid)::text = 'CARSWELL.WESTLAW'::text) AND (is_deleted = '0'::numeric) AND ((client_id)::text = 'WLNCE_VNJXL7'::text))
Filter: (((event_type)::text <> ALL ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent,coCitesSearchWithinQueryEvent}'::text[])) AND (((product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,IndigoPremium,INDIGOCA}'::text[])) OR (product_view IS NULL)))
Buffers: shared hit=328
Planning:
Buffers: shared hit=27
Planning Time: 0.321 ms
Execution Time: 0.741 ms
(10 rows)
Shared hit came down a little but is still 16x more than Oracle. The actual query will hit 12 partitions so the extra amount of work it does in Postgres adds up (note some data points will find significantly more rows than 332 so performance is much worse than shown here but this data point is used just to show the difference between Oracle and Postgres). The interesting part it is seems Postgres is doing 1 shared hit per row. I don’t know anyone on this list knows Postgres’s internal index implementations vs Oracle’s but is Postgres’s Index Scan operation significantly different than Oracle’s index range scan? There is something implemented much less efficiently here vs Oracle and I don’t know what I can do to get this query to perform more closer to Oracle.
Thanks in advance.
Here is the partition definition in Postgres:
\d hist28.history_event_display_timestamp_20230301
Table "hist28.history_event_display_timestamp_20230301"
Column | Type | Collation | Nullable | Default
------------------------+--------------------------------+-----------+----------+-----------------------------------------
display_timestamp | timestamp(6) without time zone | | not null |
history_event_sid | character varying(32) | | not null |
event_type | character varying(50) | | not null |
event_sub_type | character varying(100) | | |
content_type | character varying(100) | | |
related_entity_type | character varying(50) | | |
related_entity_id | character varying(256) | | |
prism_guid | character varying(50) | | not null |
client_id | character varying(250) | | |
matter_id | character varying(50) | | |
delimiter | character varying(20) | | |
session_id | character varying(35) | | not null |
is_deleted | numeric(1,0) | | not null | 0
created | timestamp(6) without time zone | | not null |
changed | timestamp(6) without time zone | | not null |
cobalt_event_timestamp | timestamp(6) without time zone | | |
product_sid | character varying(32) | | not null |
cobalt_event_id | character varying(256) | | |
unique_lookup_key | character varying(200) | | not null |
display_name | character varying(3000) | | not null |
event_payload | text | | |
sub_content_type | character varying(250) | | |
product_view | character varying(64) | | | 'DefaultProductView'::character varying
Partition of: hist28.history_event FOR VALUES FROM ('2023-02-01 00:00:00') TO ('2023-03-01 00:00:00')
Indexes:
"history_event_display_timestamp_20230301_pkey" PRIMARY KEY, btree (history_event_sid, display_timestamp)
"history_event_display_timest_prism_guid_display_timestamp_idx39" btree (prism_guid, display_timestamp, content_type, event_type, product_sid, is_deleted, related_entity_id)
"history_event_display_timesta_prism_guid_display_timestamp_idx1" UNIQUE, btree (prism_guid, display_timestamp, unique_lookup_key, event_type, session_id, related_entity_id, product_sid, is_deleted, client_
id, product_view)
"history_event_display_timesta_prism_guid_is_deleted_changed_idx" btree (prism_guid, is_deleted, changed)
"history_event_display_timestamp_20230301_changed_idx" btree (changed)
The index the query is using- history_event_display_timesta_prism_guid_display_timestamp_idx1- it is the same index being used in Oracle, just named differently in Oracle.
Here is the Oracle plan and execution statistics hitting the same partition:
select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from hist28.HISTORY_EVENT partition(DISPLAY_TIMESTAMP_20230301) historyeve0_ where historyeve0_.IS_DELETED=0
and historyeve0_.PRISM_GUID='i0accd6a20000018405f095ee669dc5b4'
and historyeve0_.PRODUCT_SID='CARSWELL.WESTLAW'
and (historyeve0_.EVENT_TYPE not in ('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent','coCitesSearchWithinQueryEvent'))
and (historyeve0_.PRODUCT_VIEW in ('DefaultProductView','TNPPlus','PLCUS','Indigo','IndigoPremium','INDIGOCA')
or historyeve0_.PRODUCT_VIEW is null)
and historyeve0_.CLIENT_ID='WLNCE_VNJXL7'
and (historyeve0_.DISPLAY_TIMESTAMP between to_timestamp('2022-03-01 00:00:00','YYYY-MM-DD HH24:MI:SS') and to_timestamp('2023-03-01 23:59:59','YYYY-MM-DD HH24:MI:SS'));
Plan hash value: 3361538278
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 20 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 20 |
|* 2 | FILTER | | 1 | | 332 |00:00:00.01 | 20 |
| 3 | PARTITION RANGE SINGLE| | 1 | 1 | 332 |00:00:00.01 | 20 |
|* 4 | INDEX RANGE SCAN | HISTORY_EVENT_IDX02 | 1 | 1 | 332 |00:00:00.01 | 20 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_TIMESTAMP(:SYS_B_16,:SYS_B_17)<=TO_TIMESTAMP(:SYS_B_18,:SYS_B_19))
4 - access("HISTORYEVE0_"."PRISM_GUID"=:SYS_B_01 AND
"HISTORYEVE0_"."DISPLAY_TIMESTAMP">=TO_TIMESTAMP(:SYS_B_16,:SYS_B_17) AND
"HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02 AND "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND
"HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND "HISTORYEVE0_"."DISPLAY_TIMESTAMP"<=TO_TIMESTAMP(:SYS_B_1
8,:SYS_B_19))
filter(("HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02
AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_03 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_04 AND
"HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_05 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_06 AND
"HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_07 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_08 AND
"HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND (INTERNAL_FUNCTION("HISTORYEVE0_"."PRODUCT_VIEW") OR
"HISTORYEVE0_"."PRODUCT_VIEW" IS NULL)))
It finds the same 332 rows but it does only 20 logical reads. The Oracle “access” step and the Postgres “Index Cond” columns used for index access appears to be the same for Postgres and Oracle.
Thanks
On Wed, 2023-09-06 at 20:06 +0000, Dirschel, Steve wrote: > We are in the process of converting from Oracle to Postgres and I have a query that is using > the same index in Postgres as is used in Oracle but in Postgres the query does 16x more > buffer/logical reads. I’d like to understand why. > > explain (analyze, buffers) > select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ > from hist28.history_event_display_timestamp_20230301 historyeve0_ where historyeve0_.IS_DELETED=0 > and historyeve0_.PRISM_GUID='i0accd6a20000018405f095ee669dc5b4' > and historyeve0_.PRODUCT_SID='CARSWELL.WESTLAW' > and (historyeve0_.EVENT_TYPE not in > ('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent','coCitesSearchWithinQueryE > vent')) > and (historyeve0_.PRODUCT_VIEW in ('DefaultProductView','TNPPlus','PLCUS','Indigo','IndigoPremium','INDIGOCA') > or historyeve0_.PRODUCT_VIEW is null) > and historyeve0_.CLIENT_ID='WLNCE_VNJXL7' > and (historyeve0_.DISPLAY_TIMESTAMP between '2022-03-01 00:00:00' and '2023-03-01 23:59:59.999'); > > Aggregate (cost=56.64..56.65 rows=1 width=8) (actual time=0.930..0.931 rows=1 loops=1) > Buffers:shared hit=341 > -> Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on history_event_display_timestamp_20230301historyeve0_ (cost=0.42..56.64 rows=1 width=33) (actual > time=0.034..0.897 rows=332 loops=1) > Index Cond: (((prism_guid)::text = 'i0accd6a20000018405f095ee669dc5b4'::text) AND (display_timestamp >= '2022-03-0100:00:00'::timestamp without time zone) AND (display_timestamp <= '2023- > 03-01 23:59:59.999'::timestamp without time zone) AND ((product_sid)::text = 'CARSWELL.WESTLAW'::text) AND (is_deleted= '0'::numeric) AND ((client_id)::text = 'WLNCE_VNJXL7'::text)) > Filter: (((event_type)::text <> ALL > ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent,coCitesSearchWithinQueryEvent}'::t > ext[])) AND (((product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,IndigoPremium,INDIGOCA}'::text[]))OR (product_view IS NULL))) > Buffers: shared hit=341 > Planning: > Buffers: shared hit=6 > Planning Time: 0.266 ms > Execution Time: 0.965 ms > (10 rows) > > *** 341 logical reads to find 332 rows. > > Plan hash value: 3361538278 > > ---------------------------------------------------------------------------------------------------------- > | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | > ---------------------------------------------------------------------------------------------------------- > | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 20 | > | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 20 | > |* 2 | FILTER | | 1 | | 332 |00:00:00.01 | 20 | > | 3 | PARTITION RANGE SINGLE| | 1 | 1 | 332 |00:00:00.01 | 20 | > |* 4 | INDEX RANGE SCAN |HISTORY_EVENT_IDX02 | 1 | 1 | 332 |00:00:00.01 | 20 | > ---------------------------------------------------------------------------------------------------------- > > Predicate Information (identified by operation id): > --------------------------------------------------- > > 2 - filter(TO_TIMESTAMP(:SYS_B_16,:SYS_B_17)<=TO_TIMESTAMP(:SYS_B_18,:SYS_B_19)) > 4 - access("HISTORYEVE0_"."PRISM_GUID"=:SYS_B_01 AND > "HISTORYEVE0_"."DISPLAY_TIMESTAMP">=TO_TIMESTAMP(:SYS_B_16,:SYS_B_17) AND > "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02 AND "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND > "HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND "HISTORYEVE0_"."DISPLAY_TIMESTAMP"<=TO_TIMESTAMP(:SYS_B_1 > 8,:SYS_B_19)) > filter(("HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02 > AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_03 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_04 AND > "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_05 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_06 AND > "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_07 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_08 AND > "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND (INTERNAL_FUNCTION("HISTORYEVE0_"."PRODUCT_VIEW") OR > "HISTORYEVE0_"."PRODUCT_VIEW" IS NULL))) > > It finds the same 332 rows but it does only 20 logical reads. In PostgreSQL, the 332 matching rows seem to me stored mostly on different pages, while they are better clustered in your Oracle table. If it is really very important for you, and the 57 milliseconds for the index scan is too much, you can reorganize the table with CLUSTER hist28.history_event_display_timestamp_20230301 USING history_event_display_timesta_prism_guid_display_timestamp_idx1; That should reduce the number of pages read. Yours, Laurenz Albe
On Thu, 7 Sept 2023 at 11:14, Dirschel, Steve <steve.dirschel@thomsonreuters.com> wrote: > select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from hist28.history_event_display_timestamp_20230301 historyeve0_where historyeve0_.IS_DELETED=0 > history_event_sid | character varying(32) | | not null | Do you get an Index Only Scan if you use count(*) instead of count(historyeve0_.HISTORY_EVENT_SID)? It seems that column is the only one missing from the index and that might be the only thing that's stopping PostgreSQL using an Index Only Scan. Perhaps Oracle realises that the column is NOT NULL so can do count(*) instead of counting the non-nulls from that column. That may allow it to Index Only Scan? I'm not versed in reading Oracle plans so don't know if it's possible to tell. David
On Wed, Sep 6, 2023 at 1:07 PM Dirschel, Steve <steve.dirschel@thomsonreuters.com> wrote: > Oracle will find the same 332 rows using the same index but in Oracle it only does 20 logical reads. I thought maybe theindex was fragmented so I reindexed that index: It seems likely that the problem here is that some of the predicates appear as so-called "Filter:" conditions, as opposed to true index quals. For reasons that aren't particularly fundamental, Postgres B-Trees cannot push down != (or <>) to the index level. Strangely enough, this is possible with IS NOT NULL. I'm working to improve things in this area. That difference is easy to see in the following example. The first query does far fewer buffer accesses than the. second query, despite being very similar: regression=# explain (analyze, buffers) select * from multi_test where a = 1 and b is not null; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Index Only Scan using multi_test_idx on multi_test (cost=0.29..50.10 rows=2191 width=8) (actual time=0.095..0.100 rows=14 loops=1) Index Cond: ((a = 1) AND (b IS NOT NULL)) Heap Fetches: 0 Buffers: shared hit=3 Planning Time: 0.056 ms Execution Time: 0.145 ms (6 rows) regression=# explain (analyze, buffers) select * from multi_test where a = 1 and b != 42; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Index Only Scan using multi_test_idx on multi_test (cost=0.29..222.57 rows=2191 width=8) (actual time=0.087..2.982 rows=14 loops=1) Index Cond: (a = 1) Filter: (b <> 42) Rows Removed by Filter: 10000 Heap Fetches: 0 Buffers: shared hit=11 Planning Time: 0.076 ms Execution Time: 3.204 ms (8 rows) (There are lots of index tuples matching (a, b) = "(1, NULL)" here, you see.) -- Peter Geoghegan
On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan <pg@bowt.ie> wrote: > It seems likely that the problem here is that some of the predicates > appear as so-called "Filter:" conditions, as opposed to true index > quals. hmm, if that were true we'd see "Rows Removed by Filter" in the explain analyze. I think all that's going on is that each tuple is on a different page and the heap accesses are just causing many buffers to be accessed. It seems to me that an IOS would likely fix that problem which is why I suggested count(*) instead of count(<not_null_column_thats_the_only_col_used_thats_not_in_the_index>) David
On Thu, Sep 7, 2023 at 3:48 AM David Rowley <dgrowleyml@gmail.com> wrote: > On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan <pg@bowt.ie> wrote: > > It seems likely that the problem here is that some of the predicates > > appear as so-called "Filter:" conditions, as opposed to true index > > quals. > > hmm, if that were true we'd see "Rows Removed by Filter" in the > explain analyze. That's why I hedged, with "seems likely". The problem with using filter conditions rather than true index quals isn't limited to the problem of extra heap accesses. It happened to be convenient to make my point that way, but that isn't particularly fundamental here. I deliberately chose to make my example involve an index-only scan (that doesn't require any heap accesses) for this reason. > I think all that's going on is that each tuple is on a different page > and the heap accesses are just causing many buffers to be accessed. This index is an absolute monstrosity. I find it far easier to believe that the real explanation is the one that Steve intuited: that there is an issue with the way that the physical data structures (which are more or less comparable in both systems) are accessed in Postgres. The index in question ("history_event_display_timesta_prism_guid_display_timestamp_idx1") has certain columns that are omitted from the query. These columns nevertheless appear between other columns that the query filters on. The leading two columns ("prism_guid" and "display_timestamp") are made into index quals by Postgres, but the next index column after those two ("unique_lookup_key") is omitted by the query, and so isn't an index qual. In fact *four* columns are omitted after that one. But, the very-low-order index column "product_sid" *does* appear in the query, and so also appears as Postgres index quals. There is every chance that the true underlying explanation is that Oracle is able to skip over significant parts of the index structure dynamically. In fact I'm practically certain that that's the case, since the "product_sid" column appears as an "access predicate", rather than as a "filter predicate". These terms are explained here: https://use-the-index-luke.com/sql/explain-plan/oracle/filter-predicates https://use-the-index-luke.com/sql/explain-plan/postgresql/filter-predicates How could "product_sid" be used as an "access predicate" given the omitted index columns? It seems very likely that parts of the index can be skipped in Oracle, but not in Postgres -- at least not yet. Like Markus Winand, I think that it's a real problem that EXPLAIN doesn't yet expose the difference between access predicates and filter predicates. Some of the index quals shown by EXPLAIN are marked SK_BT_REQFWD/SK_BT_REQBKWD by nbtree (meaning that they're what Oracle calls "access predicates"), while other are not (meaning that they're what Oracle calls "filter predicates"). That issue makes it far harder to spot these sorts of problems. > It seems to me that an IOS would likely fix that problem which is why I > suggested count(*) instead of > count(<not_null_column_thats_the_only_col_used_thats_not_in_the_index>) Probably, but why should we have to use an index-only scan? And what about my example, that already used one, and still showed a big disparity where there is no inherently reason why there had to be? My example didn't involve jumping to another part of the index because the problem seems to be more general than that. -- Peter Geoghegan
Thanks to all who replied on this issue. I overlooked the difference in the execution plan between Oracle and Postgres. Oracle in fact does an index range scan only with no table lookup. When I changed the query to be a count(*) Postgreswent from 332 logical reads to 19. Oracle did 20. Oracle did 20 regardless if it was count(*) or count(non-indexed-column). Regards Steve -----Original Message----- From: Peter Geoghegan <pg@bowt.ie> Sent: Thursday, September 7, 2023 8:22 PM To: David Rowley <dgrowleyml@gmail.com> Cc: Dirschel, Steve <steve.dirschel@thomsonreuters.com>; pgsql-general@lists.postgresql.org; Wong, Kam Fook (TR Technology)<kamfook.wong@thomsonreuters.com> Subject: [EXT] Re: Query performance going from Oracle to Postgres External Email: Use caution with links and attachments. On Thu, Sep 7, 2023 at 3:48 AM David Rowley <dgrowleyml@gmail.com> wrote: > On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan <pg@bowt.ie> wrote: > > It seems likely that the problem here is that some of the predicates > > appear as so-called "Filter:" conditions, as opposed to true index > > quals. > > hmm, if that were true we'd see "Rows Removed by Filter" in the > explain analyze. That's why I hedged, with "seems likely". The problem with using filter conditions rather than true index quals isn't limitedto the problem of extra heap accesses. It happened to be convenient to make my point that way, but that isn't particularlyfundamental here. I deliberately chose to make my example involve an index-only scan (that doesn't require any heap accesses) for this reason. > I think all that's going on is that each tuple is on a different page > and the heap accesses are just causing many buffers to be accessed. This index is an absolute monstrosity. I find it far easier to believe that the real explanation is the one that Steve intuited:that there is an issue with the way that the physical data structures (which are more or less comparable in bothsystems) are accessed in Postgres. The index in question ("history_event_display_timesta_prism_guid_display_timestamp_idx1") has certain columns that are omitted from the query. These columns nevertheless appear between other columns that the queryfilters on. The leading two columns ("prism_guid" and "display_timestamp") are made into index quals by Postgres, but the next indexcolumn after those two ("unique_lookup_key") is omitted by the query, and so isn't an index qual. In fact *four* columnsare omitted after that one. But, the very-low-order index column "product_sid" *does* appear in the query, and soalso appears as Postgres index quals. There is every chance that the true underlying explanation is that Oracle is able to skip over significant parts of the indexstructure dynamically. In fact I'm practically certain that that's the case, since the "product_sid" column appearsas an "access predicate", rather than as a "filter predicate". These terms are explained here: https://urldefense.com/v3/__https://use-the-index-luke.com/sql/explain-plan/oracle/filter-predicates__;!!GFN0sa3rsbfR8OLyAw!fZJXNap0mP7xLTcE_5unwlR5eDfBHgw2F5LTBDGtjV_btV6Zze0MSWnTHamU16Fmu-kII-FwrC4WK7WLP60$ https://urldefense.com/v3/__https://use-the-index-luke.com/sql/explain-plan/postgresql/filter-predicates__;!!GFN0sa3rsbfR8OLyAw!fZJXNap0mP7xLTcE_5unwlR5eDfBHgw2F5LTBDGtjV_btV6Zze0MSWnTHamU16Fmu-kII-FwrC4WCXw7ubs$ How could "product_sid" be used as an "access predicate" given the omitted index columns? It seems very likely that partsof the index can be skipped in Oracle, but not in Postgres -- at least not yet. Like Markus Winand, I think that it's a real problem that EXPLAIN doesn't yet expose the difference between access predicatesand filter predicates. Some of the index quals shown by EXPLAIN are marked SK_BT_REQFWD/SK_BT_REQBKWD by nbtree(meaning that they're what Oracle calls "access predicates"), while other are not (meaning that they're what Oraclecalls "filter predicates"). That issue makes it far harder to spot these sorts of problems. > It seems to me that an IOS would likely fix that problem which is why > I suggested count(*) instead of > count(<not_null_column_thats_the_only_col_used_thats_not_in_the_index> > ) Probably, but why should we have to use an index-only scan? And what about my example, that already used one, and still showeda big disparity where there is no inherently reason why there had to be? My example didn't involve jumping to anotherpart of the index because the problem seems to be more general than that. -- Peter Geoghegan