Thread: Getting NOT NULL constraint from pg_attribute

Getting NOT NULL constraint from pg_attribute

From
Wu Ivy
Date:
Hi developers,

I’m currently building a Postgres C extension that fetch data from a Postgres table.
Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The result rows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc. 
In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). I can get this information by calling:

TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columns that do have the NOT NULL constraint.

Any idea of why is it happening? 
Thanks in advance!

Best,
Ivy

Re: Getting NOT NULL constraint from pg_attribute

From
Tom Lane
Date:
Wu Ivy <ivywuyzl@gmail.com> writes:
> I’m currently building a Postgres C extension that fetch data from a Postgres table.
> Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The
resultrows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc.  
> In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain).
Ican get this information by calling: 

>     TupleDesc tupdesc = SPI_tuptable->tupdesc;
>     bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
> However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for
columnsthat do have the NOT NULL constraint. 

The output columns of a SELECT query are never marked nullable, regardless
of what the source data was.

            regards, tom lane


Re: Getting NOT NULL constraint from pg_attribute

From
Wu Ivy
Date:
Hi tom,

Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on the definition of attnotnull. Can you give me a example in which the tupleTable is can be marked nullable?
Also, is there any other ways to get nullability of each column while getting the data from SPI_cursor_fetch? The only way I can think is to call another separate command to query the table schema, but it will be in a separate transaction in that case.

Thank you again!
Best,
Ivy
On Aug 17, 2018, at 6:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Wu Ivy <ivywuyzl@gmail.com> writes:
I’m currently building a Postgres C extension that fetch data from a Postgres table.
Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The result rows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc.
In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). I can get this information by calling:

TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columns that do have the NOT NULL constraint.

The output columns of a SELECT query are never marked nullable, regardless
of what the source data was.

regards, tom lane

Re: Getting NOT NULL constraint from pg_attribute

From
"David G. Johnston"
Date:
On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:
Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on the definition of attnotnull. Can you give me a example in which the tupleTable is can be marked nullable?
Also, is there any other ways to get nullability of each column while getting the data from SPI_cursor_fetch? The only way I can think is to call another separate command to query the table schema, but it will be in a separate transaction in that case.

Basically the nullability property is used by the planner for optimization during the joining of physical tables.  As soon as you try outputting columns the ability to enforce not null goes away because of, in particular, outer joins.  While some changes could maybe be made the cost-benefit to do so doesn't seem favorable.

David J.

Re: Getting NOT NULL constraint from pg_attribute

From
Wu Ivy
Date:
Thanks for the response. Really appreciate it!

Regards,
Ivy

2018-08-20 10:40 GMT-07:00 David G. Johnston <david.g.johnston@gmail.com>:
On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:
Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on the definition of attnotnull. Can you give me a example in which the tupleTable is can be marked nullable?
Also, is there any other ways to get nullability of each column while getting the data from SPI_cursor_fetch? The only way I can think is to call another separate command to query the table schema, but it will be in a separate transaction in that case.

Basically the nullability property is used by the planner for optimization during the joining of physical tables.  As soon as you try outputting columns the ability to enforce not null goes away because of, in particular, outer joins.  While some changes could maybe be made the cost-benefit to do so doesn't seem favorable.

David J.


Re: Getting NOT NULL constraint from pg_attribute

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:
>> Why are SELECT query never marked nullable?

> Basically the nullability property is used by the planner for optimization
> during the joining of physical tables.  As soon as you try outputting
> columns the ability to enforce not null goes away because of, in
> particular, outer joins.  While some changes could maybe be made the
> cost-benefit to do so doesn't seem favorable.

A further thought on this is that really it's a historical accident that
the elements of tuple descriptors are exactly pg_attribute rows.  There
are a *whole lot* of fields in pg_attribute that aren't especially
relevant to tuple sets generated on-the-fly within a query, and typically
won't get filled with anything except default values.  The only fields
that really mean a lot for a dynamic tuple set are the data type and
values derived from that, and in some usages the column name.

[ wanders away wondering if it'd be worth our time to design a new,
more compact TupleDesc struct without the meaningless fields ... ]

            regards, tom lane


Re: Getting NOT NULL constraint from pg_attribute

From
Andres Freund
Date:
On 2018-08-23 11:04:30 -0400, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:
> >> Why are SELECT query never marked nullable?
> 
> > Basically the nullability property is used by the planner for optimization
> > during the joining of physical tables.  As soon as you try outputting
> > columns the ability to enforce not null goes away because of, in
> > particular, outer joins.  While some changes could maybe be made the
> > cost-benefit to do so doesn't seem favorable.
> 
> A further thought on this is that really it's a historical accident that
> the elements of tuple descriptors are exactly pg_attribute rows.  There
> are a *whole lot* of fields in pg_attribute that aren't especially
> relevant to tuple sets generated on-the-fly within a query, and typically
> won't get filled with anything except default values.  The only fields
> that really mean a lot for a dynamic tuple set are the data type and
> values derived from that, and in some usages the column name.

And arguably there's a fair bit of redundancy in pg_attribute, just
because it's convenient for tupledescs.  Given that pg_attribute very
commonly is the largest catalog table by far, that very well could use
some attention.  Without tupdescs in mind, there's really not much point
for pg_attribute to repeat a good portion of pg_type again, for example,
nor is attcacheoff really meaningful.


> [ wanders away wondering if it'd be worth our time to design a new,
> more compact TupleDesc struct without the meaningless fields ... ]

Yes, I think it'd would be.

Greetings,

Andres Freund


Re: Getting NOT NULL constraint from pg_attribute

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> And arguably there's a fair bit of redundancy in pg_attribute, just
> because it's convenient for tupledescs.  Given that pg_attribute very
> commonly is the largest catalog table by far, that very well could use
> some attention.  Without tupdescs in mind, there's really not much point
> for pg_attribute to repeat a good portion of pg_type again, for example,
> nor is attcacheoff really meaningful.

Agreed about attcacheoff, but I'm less sure that we can drop the
"redundant" info copied from pg_type.  The sticking point there is
that somebody could drop a column, then drop the type the column had,
but you still need to be able to skip over values in that column.
So at least attlen and attalign are not removable.

            regards, tom lane