Getting NOT NULL constraint from pg_attribute

Started by Wu Ivyover 7 years ago8 messages
#1Wu Ivy
ivywuyzl@gmail.com

Hi developers,

I’m currently building a Postgres C extension that fetch data from a Postgres table.
Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The result rows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc.
In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). I can get this information by calling:

TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columns that do have the NOT NULL constraint.

Any idea of why is it happening?
Thanks in advance!

Best,
Ivy

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Wu Ivy (#1)
Re: Getting NOT NULL constraint from pg_attribute

Wu Ivy <ivywuyzl@gmail.com> writes:

I’m currently building a Postgres C extension that fetch data from a Postgres table.
Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The result rows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc.
In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). I can get this information by calling:

TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columns that do have the NOT NULL constraint.

The output columns of a SELECT query are never marked nullable, regardless
of what the source data was.

regards, tom lane

#3Wu Ivy
ivywuyzl@gmail.com
In reply to: Tom Lane (#2)
Re: Getting NOT NULL constraint from pg_attribute

Hi tom,

Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too clear on the definition of attnotnull. Can you give me a example in which the tupleTable is can be marked nullable?
Also, is there any other ways to get nullability of each column while getting the data from SPI_cursor_fetch? The only way I can think is to call another separate command to query the table schema, but it will be in a separate transaction in that case.

Thank you again!
Best,
Ivy

Show quoted text

On Aug 17, 2018, at 6:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Wu Ivy <ivywuyzl@gmail.com> writes:

I’m currently building a Postgres C extension that fetch data from a Postgres table.
Since the table can be large, in order to prevent memory overrun, I use SPI_cursor_fetch to fetch chunks of data. The result rows are saved in SPITupleTable* SPI_tuptable and attributes are saved in SPI_tuptable->tupdesc.
In order to process my data, I need to get information of column nullability (whether column has NOT NULL constrain). I can get this information by calling:

TupleDesc tupdesc = SPI_tuptable->tupdesc;
bool is_nullable = TupleDescAttr(tupdesc, column_num - 1) -> attnotnull;
However, the result (is_nullable) is always 0, meaning the column does not have NOT NULLl constraint, even for columns that do have the NOT NULL constraint.

The output columns of a SELECT query are never marked nullable, regardless
of what the source data was.

regards, tom lane

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Wu Ivy (#3)
Re: Getting NOT NULL constraint from pg_attribute

On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:

Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I
call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too
clear on the definition of *attnotnull*. Can you give me a example in
which the tupleTable is can be marked nullable?
Also, is there any other ways to get nullability of each column while
getting the data from SPI_cursor_fetch? The only way I can think is to call
another separate command to query the table schema, but it will be in a
separate transaction in that case.

Basically the nullability property is used by the planner for optimization
during the joining of physical tables. As soon as you try outputting
columns the ability to enforce not null goes away because of, in
particular, outer joins. While some changes could maybe be made the
cost-benefit to do so doesn't seem favorable.

David J.

#5Wu Ivy
ivywuyzl@gmail.com
In reply to: David G. Johnston (#4)
Re: Getting NOT NULL constraint from pg_attribute

Thanks for the response. Really appreciate it!

Regards,
Ivy

2018-08-20 10:40 GMT-07:00 David G. Johnston <david.g.johnston@gmail.com>:

Show quoted text

On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:

Thanks for the quick respond.
Why are SELECT query never marked nullable? For nullable columns, when I
call SPI_getvalue(), the result (in char*) is NULL. I don’t think I’m too
clear on the definition of *attnotnull*. Can you give me a example in
which the tupleTable is can be marked nullable?
Also, is there any other ways to get nullability of each column while
getting the data from SPI_cursor_fetch? The only way I can think is to call
another separate command to query the table schema, but it will be in a
separate transaction in that case.

Basically the nullability property is used by the planner for optimization
during the joining of physical tables. As soon as you try outputting
columns the ability to enforce not null goes away because of, in
particular, outer joins. While some changes could maybe be made the
cost-benefit to do so doesn't seem favorable.

David J.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#4)
Re: Getting NOT NULL constraint from pg_attribute

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:

Why are SELECT query never marked nullable?

Basically the nullability property is used by the planner for optimization
during the joining of physical tables. As soon as you try outputting
columns the ability to enforce not null goes away because of, in
particular, outer joins. While some changes could maybe be made the
cost-benefit to do so doesn't seem favorable.

A further thought on this is that really it's a historical accident that
the elements of tuple descriptors are exactly pg_attribute rows. There
are a *whole lot* of fields in pg_attribute that aren't especially
relevant to tuple sets generated on-the-fly within a query, and typically
won't get filled with anything except default values. The only fields
that really mean a lot for a dynamic tuple set are the data type and
values derived from that, and in some usages the column name.

[ wanders away wondering if it'd be worth our time to design a new,
more compact TupleDesc struct without the meaningless fields ... ]

regards, tom lane

#7Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#6)
Re: Getting NOT NULL constraint from pg_attribute

On 2018-08-23 11:04:30 -0400, Tom Lane wrote:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:

Why are SELECT query never marked nullable?

Basically the nullability property is used by the planner for optimization
during the joining of physical tables. As soon as you try outputting
columns the ability to enforce not null goes away because of, in
particular, outer joins. While some changes could maybe be made the
cost-benefit to do so doesn't seem favorable.

A further thought on this is that really it's a historical accident that
the elements of tuple descriptors are exactly pg_attribute rows. There
are a *whole lot* of fields in pg_attribute that aren't especially
relevant to tuple sets generated on-the-fly within a query, and typically
won't get filled with anything except default values. The only fields
that really mean a lot for a dynamic tuple set are the data type and
values derived from that, and in some usages the column name.

And arguably there's a fair bit of redundancy in pg_attribute, just
because it's convenient for tupledescs. Given that pg_attribute very
commonly is the largest catalog table by far, that very well could use
some attention. Without tupdescs in mind, there's really not much point
for pg_attribute to repeat a good portion of pg_type again, for example,
nor is attcacheoff really meaningful.

[ wanders away wondering if it'd be worth our time to design a new,
more compact TupleDesc struct without the meaningless fields ... ]

Yes, I think it'd would be.

Greetings,

Andres Freund

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: Getting NOT NULL constraint from pg_attribute

Andres Freund <andres@anarazel.de> writes:

And arguably there's a fair bit of redundancy in pg_attribute, just
because it's convenient for tupledescs. Given that pg_attribute very
commonly is the largest catalog table by far, that very well could use
some attention. Without tupdescs in mind, there's really not much point
for pg_attribute to repeat a good portion of pg_type again, for example,
nor is attcacheoff really meaningful.

Agreed about attcacheoff, but I'm less sure that we can drop the
"redundant" info copied from pg_type. The sticking point there is
that somebody could drop a column, then drop the type the column had,
but you still need to be able to skip over values in that column.
So at least attlen and attalign are not removable.

regards, tom lane