Reduce "Var IS [NOT] NULL" quals during constant folding - Mailing list pgsql-hackers

From BharatDB
Subject Reduce "Var IS [NOT] NULL" quals during constant folding
Date
Msg-id CAAh00ETEMEXntw1gxp=xP+4sqrz80tK1R4VEhTpqH9CJpxs-wA@mail.gmail.com
List pgsql-hackers

Subject: Contribution Interest – Bug Fix on Reducing "Var IS [NOT] NULL" Quals During Constant Folding

Dear Team,

I hope this message finds you well.

With reference to the ongoing conversation in the thread with message ID CAMbWs487wxHq0fUu+Cew7Y+n+wpY_B--PT-61syXrE40uZqErw,
I am writing to express my interest in contributing to the work
on fixing the bug related to reducing “Var IS [NOT] NULL” quals during constant folding. As part of my initial efforts, I have been exploring the planner code paths and prototyping an approach that gathers catalog information for base relations early in planning.

Specifically, I have implemented the following:

  • Preprocessing Relation RTEs

      • Added a new static function preprocess_relation_rtes() in src/backend/optimizer/plan/planner.c to collect attribute-level metadata (e.g., NOT NULL, generated columns, inheritance flags) at an early stage.

  • Hash Table for Attribute Info

      • Defined a new planner-local hash table (relattrinfo_htab) for storing per-attribute information.

      • Created a supporting header (src/include/optimizer/relattrinfo.h) and implementation file (src/backend/optimizer/util/relattrinfo.c).

  • Planner Integration

      • Initialized the attribute info hash table in subquery_planner().

      • Ensured preprocessing is invoked before pull_up_sublinks().

  • Optimization Step

      • Extended the constant folding path to check stored attribute metadata during IS [NOT] NULL qual simplification, allowing safe reduction to constant true/false where applicable.


1. Call preprocess_relation_rtes()

File: src/backend/optimizer/plan/planner.c, in subquery_planner().

  • Insert the new step before pull_up_sublinks().

Snippet:

    /*
     * Do preprocessing of RTEs before we pull up sublinks
     */
    preprocess_relation_rtes(root);


2. Create a new static function preprocess_relation_rtes() in planner.c:

File: src/backend/optimizer/plan/planner.c

Snippet:

/*
 * preprocess_relation_rtes
 *   Gather early catalog info for each base relation
 *   (NOT NULL attrs, generated cols, inheritance flags).
 *
 * The store_*_info() helpers are not shown in this message.
 */
static void
preprocess_relation_rtes(PlannerInfo *root)
{
    ListCell   *lc;

    foreach(lc, root->parse->rtable)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
        Relation    rel;
        TupleDesc   tupdesc;

        /* Only plain relations have catalog attributes to inspect */
        if (rte->rtekind != RTE_RELATION)
            continue;

        rel = table_open(rte->relid, AccessShareLock);
        tupdesc = RelationGetDescr(rel);

        /* Collect per-attribute info, skipping dropped columns */
        for (int i = 0; i < tupdesc->natts; i++)
        {
            Form_pg_attribute attr = TupleDescAttr(tupdesc, i);

            if (attr->attisdropped)
                continue;

            /* Attribute numbers are 1-based, hence i + 1 */
            if (attr->attnotnull)
                store_notnull_info(root, rte->relid, i + 1);

            if (attr->attgenerated)
                store_generated_info(root, rte->relid, i + 1);
        }

        /* Inheritance flag */
        if (rel->rd_rel->relhassubclass)
            store_inh_info(root, rte->relid);

        table_close(rel, AccessShareLock);
    }
}


3. File: src/include/nodes/pathnodes.h

Snippet:

typedef struct PlannerInfo
{
    /* ... existing fields omitted ... */

    HTAB *relattrinfo_htab; /* planner-local cache of attr metadata */
} PlannerInfo;


4. Create a new header.

Location: src/include/optimizer/relattrinfo.h

Snippet:

#ifndef RELATTRINFO_H
#define RELATTRINFO_H

#include "postgres.h"
#include "utils/hsearch.h"

/* key for hash table */
typedef struct RelAttrInfoKey
{
    Oid relid;          /* relation OID */
    AttrNumber attno;   /* attribute number (1-based) */
} RelAttrInfoKey;

/* value stored in hash table */
typedef struct RelAttrInfoEntry
{
    RelAttrInfoKey key;
    bool notnull;
    bool generated;
} RelAttrInfoEntry;

extern HTAB *create_relattrinfo_htab(void);

#endif /* RELATTRINFO_H */


5. Create a new file.

Location: src/backend/optimizer/util/relattrinfo.c

Snippet:

#include "postgres.h"

#include "optimizer/relattrinfo.h"

HTAB *
create_relattrinfo_htab(void)
{
    HASHCTL ctl;

    memset(&ctl, 0, sizeof(ctl));
    ctl.keysize = sizeof(RelAttrInfoKey);
    ctl.entrysize = sizeof(RelAttrInfoEntry);

    return hash_create("Planner relattr info cache",
                       128, /* initial size */
                       &ctl,
                       HASH_ELEM | HASH_BLOBS);
}
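
One detail worth keeping in mind with HASH_BLOBS: keys are hashed and compared as raw bytes over keysize, and a struct holding a 4-byte OID followed by a 2-byte attribute number will usually carry compiler-inserted padding. The standalone sketch below shows the usual precaution of zeroing the whole key before filling it in. The Demo* types are stand-ins for Oid and AttrNumber so the example compiles without PostgreSQL headers; all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for PostgreSQL's Oid and AttrNumber (assumed widths). */
typedef uint32_t DemoOid;
typedef int16_t DemoAttrNumber;

typedef struct DemoKey
{
    DemoOid     relid;
    DemoAttrNumber attno;
    /* the compiler typically inserts 2 bytes of padding here */
} DemoKey;

/* Build a key whose padding bytes are deterministically zero. */
static void
demo_key_init(DemoKey *key, DemoOid relid, DemoAttrNumber attno)
{
    assert(attno > 0);
    memset(key, 0, sizeof(DemoKey));
    key->relid = relid;
    key->attno = attno;
}

/* Bytewise equality, mirroring how a blob-keyed table compares keys. */
static int
demo_key_equal(const DemoKey *a, const DemoKey *b)
{
    return memcmp(a, b, sizeof(DemoKey)) == 0;
}
```

Without the memset, two keys with identical relid and attno could still differ in their padding bytes, and a lookup could miss an entry that was stored earlier.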


6. File: src/backend/optimizer/plan/planner.c in subquery_planner() (this runs before most rewrites).

Snippet:

/* Create hash table for early attr info */
root->relattrinfo_htab = create_relattrinfo_htab();

/* Collect early info before pull_up_sublinks */
preprocess_relation_rtes(root);



7. Created a hash table

File: src/backend/optimizer/plan/planner.c

Example:

typedef struct RelAttrInfoKey
{
    Oid relid;
    AttrNumber attno;
} RelAttrInfoKey;

typedef struct RelAttrInfoEntry
{
    RelAttrInfoKey key;
    bool notnull;
    bool generated;
} RelAttrInfoEntry;

Initialize it in subquery_planner():

root->relattrinfo_htab = create_relattrinfo_htab();


8. Constant folding

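if (IsA(arg, Var))
{
    Var *var = (Var *) arg;

    /*
     * The cache is keyed by relation OID, but var->varno is a range-table
     * index, so map it to its RTE first.  Skip outer-level Vars and
     * system or whole-row attributes.
     */
    if (var->varlevelsup == 0 && var->varattno > 0)
    {
        RangeTblEntry *rte = rt_fetch(var->varno, root->parse->rtable);

        if (rte->rtekind == RTE_RELATION &&
            lookup_notnull_info(root, rte->relid, var->varattno))
            return (Expr *) makeBoolConst(ntest->nulltesttype == IS_NOT_NULL,
                                          false);
    }
}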
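
To make the intended folding rule explicit, here is a small standalone model of the decision, separate from the planner code. It assumes one extra condition that the snippet above does not check: a column declared NOT NULL can still produce NULLs when its relation sits on the nullable side of an outer join, so the qual should be kept in that case. The enum, function, and parameter names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible outcomes when simplifying "Var IS [NOT] NULL" (hypothetical). */
typedef enum DemoFoldResult
{
    DEMO_KEEP_QUAL,     /* cannot prove anything; leave the qual alone */
    DEMO_CONST_TRUE,    /* qual is always true */
    DEMO_CONST_FALSE    /* qual is always false */
} DemoFoldResult;

/*
 * Decide how to fold a null test on a column.
 *
 * col_not_null:          column has a NOT NULL constraint
 * nulled_by_outer_join:  assumption: the Var may be forced to NULL by an
 *                        outer join above its relation
 * is_not_null_test:      true for IS NOT NULL, false for IS NULL
 */
static DemoFoldResult
demo_fold_nulltest(bool col_not_null,
                   bool nulled_by_outer_join,
                   bool is_not_null_test)
{
    if (!col_not_null || nulled_by_outer_join)
        return DEMO_KEEP_QUAL;

    return is_not_null_test ? DEMO_CONST_TRUE : DEMO_CONST_FALSE;
}

/* Usage example: the four interesting cases. */
static void
demo_fold_examples(void)
{
    assert(demo_fold_nulltest(true, false, true) == DEMO_CONST_TRUE);
    assert(demo_fold_nulltest(true, false, false) == DEMO_CONST_FALSE);
    assert(demo_fold_nulltest(false, false, true) == DEMO_KEEP_QUAL);
    assert(demo_fold_nulltest(true, true, false) == DEMO_KEEP_QUAL);
}
```

The asymmetry is deliberate: only a proven NOT NULL column permits folding, and every uncertain case falls back to keeping the original qual.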

With this groundwork, my plan is to continue refining the approach, validating correctness against inheritance cases, and ensuring compliance with planner invariants. I would also like to engage with reviewers to align this with the broader design direction.

Please let me know how best I can proceed with submitting my patch for review and contributing further to this fix.

Thank you for your time and consideration.


Best Regards,

Soumya

