Hi Hackers,
While dealing with a few queries, I noticed that when the join expression (join clause) is used in projection as well, the expression will be computed twice.
For a better understanding, please take reference from the following example:
SELECT sensitive_data1, column1 FROM benchmark_encytion AS t1 LEFT JOIN ( SELECT aes256_cbc_decrypt( c1, '\x1234' :: bytea, '\x5678' :: bytea ) AS column1 FROM cipher ) AS t2 ON t1.sensitive_data1 = t2.column1;
|
As you can see in the above Query, the join clause involves the column which needs to be decrypted first and then joined with other table and then in projection we need the needed decrypted values to print as well, in this case the plan generated by the PG is as mentioned below (refer to the image as well):
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Right Join (cost=22.74..73.43 rows=1 width=65)
Output: t1.sensitive_data1, aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea)
Hash Cond: (aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea) = t1.sensitive_data1)
-> Foreign Scan on public.cipher (cost=0.00..50.68 rows=1 width=49)
Output: cipher.c1, cipher.c2, cipher.c3, cipher.c4, cipher.c5, cipher.c6
CStore Dir: /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/116931
CStore Table Size: 2424 kB
-> Hash (cost=22.72..22.72 rows=1 width=33)
Output: t1.sensitive_data1
-> Foreign Scan on public.benchmark_encytion t1 (cost=0.00..22.72 rows=1 width=33)
Output: t1.sensitive_data1
CStore Dir: /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/135230
CStore Table Size: 1268 kB
Query Identifier: 1810637692808683603
(14 rows)
As seen in the plan, join clause uses aes256_cbc_decrypt funcExpr to join columns and we are selecting the same as projection from hasjJoin node resulting in computing the expr twice, which is very costly.
My doubt here is, while planing this join, why can't we parse the join clause and pass the expressions involved there to the respective scan nodes and use it above that wherever needed as a Var?
In this particular case, we can push_down the expression (decrypt funcExpr) from join clause to the foreign scan of cipher table. Why have we not handled this case in PG?
Pls share your thoughts on the same, also pls correct me if my understanding is wrong here.

Thanks and Regards.
Shubhankar Kulkarni
ZLabs-CStore