Use zero for nullness estimates of system attributes - Mailing list pgsql-hackers

From Edmund Horner
Subject Use zero for nullness estimates of system attributes
Date
Msg-id CAMyN-kCa3BFUFrCTtQeprxTU1anCd3Pua7zXstGCKq4pXgjukw@mail.gmail.com
Whole thread Raw
Responses Re: Use zero for nullness estimates of system attributes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I added some code to selfuncs.c to estimate the selectivity of CTID, including nullness, in my ongoing attempt to add TID range scans [1].  And as Tom pointed out [2], no system attribute can be null, so we might as well handle them all.

That's what the attached patch does.
I observed a few interesting things with outer join selectivity:

While system attributes aren't NULL in the table, they can be in queries such as:

    SELECT *
    FROM a LEFT JOIN b ON a.id = b.id
    WHERE b.ctid IS NULL;

And the patch does affect the estimates for such plans.  But it's just replacing one hardcoded nullness (0.005) for another (0.0), which seems no worse than the original.

I was a bit concerned that with, for example,

    CREATE TABLE a (id INTEGER);
    INSERT INTO a SELECT * FROM generate_series(1,1000);
    ANALYZE a;
    CREATE TABLE b (id INTEGER, id2 INTEGER);
    INSERT INTO b SELECT *, * FROM generate_series(1,10);
    ANALYZE b;

    EXPLAIN ANALYZE
    SELECT * FROM a LEFT JOIN b ON a.id = b.id
    WHERE b.ctid IS NULL;

you get a row estimate of 1 (vs the actual 990).  It's not specific to system attributes.  Plain left-join selectivity calculation doesn't seem to take into account the join selectivity, while anti-join calculation does.

I do not think this affects the usefulness of the present patch, but maybe it's something we could improve.

Finally: I thought about introducing a macro to attnum.h:

    /*
    * AttrNumberIsForSystemAttr
    * True iff the attribute number corresponds to a system attribute.
    */
    #define AttrNumberIsForSystemAttr(attributeNumber) \
     ((bool) ((attributeNumber) < 0))

But there's a zillion places that could be changed to use it, so I haven't in this version of the patch.

Edmund


Attachment

pgsql-hackers by date:

Previous
From: Chapman Flack
Date:
Subject: Re: PostgreSQL vs SQL/XML Standards
Next
From: Pavel Stehule
Date:
Subject: Re: proposal - plpgsql unique statement id