TEXT vs VARCHAR join qual push down diffrence, bug or expected? - Mailing list pgsql-hackers

From Jeevan Chalke
Subject TEXT vs VARCHAR join qual push down diffrence, bug or expected?
Date
Msg-id CAM2+6=X-_ZF26hsAkgQ_=vocNOZGwLLpJ-UiO4eYY47Q+s48uA@mail.gmail.com
Whole thread Raw
Responses Re: TEXT vs VARCHAR join qual push down diffrence, bug or expected?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

It is observed that, when we have one remote (huge) table and one local
(small) table and a join between them, then
 1. If the column type is text, then we push the join qual to the remote
    server, so that we will have less rows to fetch, and thus execution time
    is very less.
 2. If the column type is varchar, then we do not push the join qual to the
    remote server, resulting into large number of data fetch and thus
    execution time is very high.

Here is the EXPLAIN plan for such queries:

When VARCHAR column:

                                                                QUERY PLAN                                                                
-------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=100.15..4594935.73 rows=230 width=120) (actual time=0.490..291.339 rows=1 loops=1)
   Output: a.ename, d.dname
   Join Filter: ((a.deptno)::text = (d.deptno)::text)
   Rows Removed by Join Filter: 100099
   ->  Index Scan using emp2_pk on public.emp2 a  (cost=0.15..8.17 rows=1 width=76) (actual time=0.009..0.013 rows=1 loops=1)
         Output: a.empno, a.ename, a.deptno
         Index Cond: (a.empno = '7369'::numeric)
   ->  Foreign Scan on public.fdw_dept2 d  (cost=100.00..4594353.50 rows=45925 width=120) (actual time=0.466..274.990 rows=100100 loops=1)
         Output: d.deptno, d.dname
         Remote SQL: SELECT deptno, dname FROM public.dept2
 Planning time: 0.697 ms
 Execution time: 291.467 ms
(12 rows)



When TEXT column:

                                                          QUERY PLAN                                                         
------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=100.57..216.63 rows=238 width=120) (actual time=0.375..0.378 rows=1 loops=1)
   Output: a.ename, d.dname
   ->  Index Scan using emp3_pk on public.emp3 a  (cost=0.15..8.17 rows=1 width=70) (actual time=0.010..0.011 rows=1 loops=1)
         Output: a.empno, a.ename, a.deptno
         Index Cond: (a.empno = '7369'::numeric)
   ->  Foreign Scan on public.fdw_dept3 d  (cost=100.42..208.45 rows=1 width=114) (actual time=0.362..0.362 rows=1 loops=1)
         Output: d.deptno, d.dname
         Remote SQL: SELECT deptno, dname FROM public.dept3 WHERE (($1::text = deptno))
 Planning time: 1.220 ms
 Execution time: 0.498 ms
(10 rows)


Attached test script to reproduce this theory.

I have observed that, since we do not have an equality operator for VARCHAR
type, we convert VARCHAR to TEXT using RelabelType and use texteq operator
function.
However in foreign_expr_walker(), for T_RelabelType case, we have these
conditions which do not allow us push the qual to remote.

                /*
                 * RelabelType must not introduce a collation not derived from
                 * an input foreign Var.
                 */
                collation = r->resultcollid;
                if (collation == InvalidOid)
                    state = FDW_COLLATE_NONE;
                else if (inner_cxt.state == FDW_COLLATE_SAFE &&
                         collation == inner_cxt.collation)
                    state = FDW_COLLATE_SAFE;
                else
                    state = FDW_COLLATE_UNSAFE;


I guess, since we do push qual to remote in case of TEXT, we should do the
same for VARCHAR too.

Also given that RelabelType are just dummy wrapper for binary compatible
types, can we simply set collation and state from its inner context instead
on above check block. Like

                /*
                 * Since RelabelType represents a "dummy" type coercion between
                 * two binary-compatible datatypes, set collation and state got
                 * from the inner_cxt.
                 */
                collation = inner_cxt.collation;
                state = inner_cxt.state;

Inputs/Thought?


--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bug in numeric multiplication
Next
From: James Sewell
Date:
Subject: Streaming Replication clusters and load balancing