Expression push down from Join Node to below node. - Mailing list pgsql-hackers

From Shubhankar Anand Kulkarni
Subject Expression push down from Join Node to below node.
Date
Msg-id 196f702a715.7308025321701.1672670818480320441@zohocorp.com
Whole thread Raw
List pgsql-hackers

Hi Hackers,
 
While dealing with a few queries, I noticed that when the join expression (join clause) is used in projection as well, the expression will be computed twice.
For a better understanding, please take reference from the following example:


SELECT sensitive_data1, column1 FROM benchmark_encytion AS t1 LEFT JOIN ( SELECT aes256_cbc_decrypt( c1, '\x1234' :: bytea, '\x5678' :: bytea ) AS column1 FROM cipher ) AS t2 ON t1.sensitive_data1 = t2.column1;


As you can see in the above Query, the join clause involves the column which needs to be decrypted first and then joined with other table and then in projection we need the needed decrypted values to print as well, in this case the plan generated by the PG is as mentioned below
(refer to the image as well):


                                                                                                 
      QUERY PLAN                                                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Right Join  (cost=22.74..73.43 rows=1 width=65)
   Output: t1.sensitive_data1, aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea)
   Hash Cond: (aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea) = t1.sensitive_data1)
   ->  Foreign Scan on public.cipher  (cost=0.00..50.68 rows=1 width=49)
         Output: cipher.c1, cipher.c2, cipher.c3, cipher.c4, cipher.c5, cipher.c6
         CStore Dir: /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/116931
         CStore Table Size: 2424 kB
   ->  Hash  (cost=22.72..22.72 rows=1 width=33)
         Output: t1.sensitive_data1
         ->  Foreign Scan on public.benchmark_encytion t1  (cost=0.00..22.72 rows=1 width=33)
               Output: t1.sensitive_data1
               CStore Dir: /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/135230
               CStore Table Size: 1268 kB
Query Identifier: 1810637692808683603
(14 rows)



As seen in the plan, join clause uses aes256_cbc_decrypt funcExpr to join columns and we are selecting the same as projection from hasjJoin node resulting in computing the expr twice, which is very costly.
 
My doubt here is, while planing this join, why can't we parse the join clause and pass the expressions involved there to the respective scan nodes and use it above that wherever needed as a Var
In this particular case, we can push_down the expression (decrypt funcExpr) from join clause to the foreign scan of cipher table. Why have we not handled this case in PG?
Pls share your thoughts on the same, also pls correct me if my understanding is wrong here.


Sample_case_for_expr_push_down.png






Thanks and Regards.
Shubhankar  Kulkarni
ZLabs-CStore



Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add comment explaining why queryid is int64 in pg_stat_statements
Next
From: Amit Langote
Date:
Subject: Re: generic plans and "initial" pruning