Thread: Getting NOT NULL constraint from pg_attribute
Hi developers,
I’m currently building a Postgres C extension that fetch data from a Postgres table.
Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The result rows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc.
In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). I can get this information by calling:
TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columns that do have the NOT NULL constraint.
Any idea of why is it happening?
Thanks in advance!
Best,
Ivy
Wu Ivy <ivywuyzl@gmail.com> writes: > I’m currently building a Postgres C extension that fetch data from a Postgres table. > Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The resultrows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc. > In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). Ican get this information by calling: > TupleDesc tupdesc = SPI_tuptable->tupdesc; > bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull; > However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columnsthat do have the NOT NULL constraint. The output columns of a SELECT query are never marked nullable, regardless of what the source data was. regards, tom lane
Hi tom,
Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on the definition of attnotnull. Can you give me a example in which the tupleTable is can be marked nullable?
Also, is there any other ways to get nullability of each column while getting the data from SPI_cursor_fetch? The only way I can think is to call another separate command to query the table schema, but it will be in a separate transaction in that case.
Thank you again!
Best,
Ivy
On Aug 17, 2018, at 6:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:Wu Ivy <ivywuyzl@gmail.com> writes:I’m currently building a Postgres C extension that fetch data from a Postgres table.
Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The result rows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc.
In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). I can get this information by calling:TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columns that do have the NOT NULL constraint.
The output columns of a SELECT query are never marked nullable, regardless
of what the source data was.
regards, tom lane
On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:
Thanks for the quick respond.Why are SELECT query never marked nullable? For nullable columns, when I call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on the definition of attnotnull. Can you give me a example in which the tupleTable is can be marked nullable?Also, is there any other ways to get nullability of each column while getting the data from SPI_cursor_fetch? The only way I can think is to call another separate command to query the table schema, but it will be in a separate transaction in that case.
Basically the nullability property is used by the planner for optimization during the joining of physical tables. As soon as you try outputting columns the ability to enforce not null goes away because of, in particular, outer joins. While some changes could maybe be made the cost-benefit to do so doesn't seem favorable.
David J.
Thanks for the response. Really appreciate it!
Regards,
Ivy
2018-08-20 10:40 GMT-07:00 David G. Johnston <david.g.johnston@gmail.com>:
On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:Thanks for the quick respond.Why are SELECT query never marked nullable? For nullable columns, when I call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on the definition of attnotnull. Can you give me a example in which the tupleTable is can be marked nullable?Also, is there any other ways to get nullability of each column while getting the data from SPI_cursor_fetch? The only way I can think is to call another separate command to query the table schema, but it will be in a separate transaction in that case.Basically the nullability property is used by the planner for optimization during the joining of physical tables. As soon as you try outputting columns the ability to enforce not null goes away because of, in particular, outer joins. While some changes could maybe be made the cost-benefit to do so doesn't seem favorable.David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote: >> Why are SELECT query never marked nullable? > Basically the nullability property is used by the planner for optimization > during the joining of physical tables. As soon as you try outputting > columns the ability to enforce not null goes away because of, in > particular, outer joins. While some changes could maybe be made the > cost-benefit to do so doesn't seem favorable. A further thought on this is that really it's a historical accident that the elements of tuple descriptors are exactly pg_attribute rows. There are a *whole lot* of fields in pg_attribute that aren't especially relevant to tuple sets generated on-the-fly within a query, and typically won't get filled with anything except default values. The only fields that really mean a lot for a dynamic tuple set are the data type and values derived from that, and in some usages the column name. [ wanders away wondering if it'd be worth our time to design a new, more compact TupleDesc struct without the meaningless fields ... ] regards, tom lane
On 2018-08-23 11:04:30 -0400, Tom Lane wrote: > "David G. Johnston" <david.g.johnston@gmail.com> writes: > > On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote: > >> Why are SELECT query never marked nullable? > > > Basically the nullability property is used by the planner for optimization > > during the joining of physical tables. As soon as you try outputting > > columns the ability to enforce not null goes away because of, in > > particular, outer joins. While some changes could maybe be made the > > cost-benefit to do so doesn't seem favorable. > > A further thought on this is that really it's a historical accident that > the elements of tuple descriptors are exactly pg_attribute rows. There > are a *whole lot* of fields in pg_attribute that aren't especially > relevant to tuple sets generated on-the-fly within a query, and typically > won't get filled with anything except default values. The only fields > that really mean a lot for a dynamic tuple set are the data type and > values derived from that, and in some usages the column name. And arguably there's a fair bit of redundancy in pg_attribute, just because it's convenient for tupledescs. Given that pg_attribute very commonly is the largest catalog table by far, that very well could use some attention. Without tupdescs in mind, there's really not much point for pg_attribute to repeat a good portion of pg_type again, for example, nor is attcacheoff really meaningful. > [ wanders away wondering if it'd be worth our time to design a new, > more compact TupleDesc struct without the meaningless fields ... ] Yes, I think it'd would be. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > And arguably there's a fair bit of redundancy in pg_attribute, just > because it's convenient for tupledescs. Given that pg_attribute very > commonly is the largest catalog table by far, that very well could use > some attention. Without tupdescs in mind, there's really not much point > for pg_attribute to repeat a good portion of pg_type again, for example, > nor is attcacheoff really meaningful. Agreed about attcacheoff, but I'm less sure that we can drop the "redundant" info copied from pg_type. The sticking point there is that somebody could drop a column, then drop the type the column had, but you still need to be able to skip over values in that column. So at least attlen and attalign are not removable. regards, tom lane