Increase value of OUTER_VAR

Started by Andrei Lepikhovabout 5 years ago29 messageshackers
Jump to latest
#1Andrei Lepikhov
lepihov@gmail.com

Hi,

Playing with a large value of partitions I caught the limit with 65000
table entries in a query plan:

if (IS_SPECIAL_VARNO(list_length(glob->finalrtable)))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("too many range table entries")));

Postgres works well with so many partitions.
The constants INNER_VAR, OUTER_VAR, INDEX_VAR are used as values of the
variable 'var->varno' of integer type. As I see, they were introduced
with commit 1054097464 authored by Marc G. Fournier, in 1996.
Value 65000 was relevant to the size of the int type at that time.

Maybe we will change these values to INT_MAX? (See the patch in attachment).

--
regards,
Andrey Lepikhov
Postgres Professional

Attachments:

0001-Set-values-of-special-varnos-to-the-upper-bound-of-t.patchtext/plain; charset=UTF-8; name=0001-Set-values-of-special-varnos-to-the-upper-bound-of-t.patch; x-mac-creator=0; x-mac-type=0Download+3-4
#2David Rowley
dgrowleyml@gmail.com
In reply to: Andrei Lepikhov (#1)
Re: Increase value of OUTER_VAR

On Wed, 3 Mar 2021 at 21:29, Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote:

Playing with a large value of partitions I caught the limit with 65000
table entries in a query plan:

if (IS_SPECIAL_VARNO(list_length(glob->finalrtable)))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("too many range table entries")));

Postgres works well with so many partitions.
The constants INNER_VAR, OUTER_VAR, INDEX_VAR are used as values of the
variable 'var->varno' of integer type. As I see, they were introduced
with commit 1054097464 authored by Marc G. Fournier, in 1996.
Value 65000 was relevant to the size of the int type at that time.

Maybe we will change these values to INT_MAX? (See the patch in attachment).

I don't really see any reason not to increase these a bit, but I'd
rather we kept them at some realistic maximum rather than all-out went
to INT_MAX.

I imagine a gap was left between 65535 and 65000 to allow space for
more special varno in the future. We did get INDEX_VAR since then, so
it seems like it was probably a good idea to leave a gap.

The problem I see what going close to INT_MAX is that the ERROR you
mention is unlikely to work correctly since a list_length() will never
get close to having INT_MAX elements before palloc() would exceed
MaxAllocSize for the elements array.

Something like 1 million seems like a more realistic limit to me.
That might still be on the high side, but it'll likely mean we'd not
need to revisit this for quite a while.

David

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Rowley (#2)
Re: Increase value of OUTER_VAR

On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 3 Mar 2021 at 21:29, Andrey Lepikhov <a.lepikhov@postgrespro.ru> wrote:

Playing with a large value of partitions I caught the limit with 65000
table entries in a query plan:

if (IS_SPECIAL_VARNO(list_length(glob->finalrtable)))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("too many range table entries")));

Postgres works well with so many partitions.
The constants INNER_VAR, OUTER_VAR, INDEX_VAR are used as values of the
variable 'var->varno' of integer type. As I see, they were introduced
with commit 1054097464 authored by Marc G. Fournier, in 1996.
Value 65000 was relevant to the size of the int type at that time.

Maybe we will change these values to INT_MAX? (See the patch in attachment).

I don't really see any reason not to increase these a bit, but I'd
rather we kept them at some realistic maximum rather than all-out went
to INT_MAX.

I imagine a gap was left between 65535 and 65000 to allow space for
more special varno in the future. We did get INDEX_VAR since then, so
it seems like it was probably a good idea to leave a gap.

The problem I see what going close to INT_MAX is that the ERROR you
mention is unlikely to work correctly since a list_length() will never
get close to having INT_MAX elements before palloc() would exceed
MaxAllocSize for the elements array.

Something like 1 million seems like a more realistic limit to me.
That might still be on the high side, but it'll likely mean we'd not
need to revisit this for quite a while.

+1

Also, I got reminded of this discussion from not so long ago:

/messages/by-id/16302-e45634e2c0e34e97@postgresql.org

--
Amit Langote
EDB: http://www.enterprisedb.com

#4Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Langote (#3)
Re: Increase value of OUTER_VAR

On Wed, Mar 3, 2021 at 4:57 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml@gmail.com> wrote:

Something like 1 million seems like a more realistic limit to me.
That might still be on the high side, but it'll likely mean we'd not
need to revisit this for quite a while.

+1

Also, I got reminded of this discussion from not so long ago:

/messages/by-id/16302-e45634e2c0e34e97@postgresql.org

+1

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#3)
Re: Increase value of OUTER_VAR

Amit Langote <amitlangote09@gmail.com> writes:

Also, I got reminded of this discussion from not so long ago:

/messages/by-id/16302-e45634e2c0e34e97@postgresql.org

Yeah. Nobody seems to have pursued Peter's idea of changing the magic
values to small negative ones, but that seems like a nicer idea than
arguing over what large positive value is large enough.

(Having said that, I remain pretty dubious that we're anywhere near
getting any real-world use out of such a change.)

regards, tom lane

#6Andrei Lepikhov
lepihov@gmail.com
In reply to: Julien Rouhaud (#4)
Re: Increase value of OUTER_VAR

On 3/3/21 12:52, Julien Rouhaud wrote:

On Wed, Mar 3, 2021 at 4:57 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml@gmail.com> wrote:

Something like 1 million seems like a more realistic limit to me.
That might still be on the high side, but it'll likely mean we'd not
need to revisit this for quite a while.

+1

Also, I got reminded of this discussion from not so long ago:

/messages/by-id/16302-e45634e2c0e34e97@postgresql.org

Thank you

+1

Ok. I changed the value to 1 million and explained this decision in the
comment.
This issue caused by two cases:
1. Range partitioning on a timestamp column.
2. Hash partitioning.
Users use range distribution by timestamp because they want to insert
new data quickly and analyze entire set of data.
Also, in some discussions, I see Oracle users discussing issues with
more than 1e5 partitions.

--
regards,
Andrey Lepikhov
Postgres Professional

Attachments:

0001-Increase-values-of-special-varnos-to-1-million.patchtext/plain; charset=UTF-8; name=0001-Increase-values-of-special-varnos-to-1-million.patch; x-mac-creator=0; x-mac-type=0Download+3-4
#7Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Andrei Lepikhov (#6)
Re: Increase value of OUTER_VAR

On 3/4/21 8:43 AM, Andrey Lepikhov wrote:

On 3/3/21 12:52, Julien Rouhaud wrote:

On Wed, Mar 3, 2021 at 4:57 PM Amit Langote <amitlangote09@gmail.com>
wrote:

On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml@gmail.com>
wrote:

Something like 1 million seems like a more realistic limit to me.
That might still be on the high side, but it'll likely mean we'd not
need to revisit this for quite a while.

+1

Also, I got reminded of this discussion from not so long ago:

/messages/by-id/16302-e45634e2c0e34e97@postgresql.org

Thank you

+1

Ok. I changed the value to 1 million and explained this decision in the
comment.

IMO just bumping up the constants from ~65k to 1M is a net loss, for
most users. We add this to bitmapsets, which means we're using ~8kB with
the current values, but this jumps to 128kB with this higher value. This
also means bms_next_member etc. have to walk much more memory, which is
bound to have some performance impact for everyone.

Switching to small negative values is a much better idea, but it's going
to be more invasive - we'll have to offset the values in the bitmapsets,
or we'll have to invent a new bitmapset variant that can store negative
values directly (e.g. by keeping two separate bitmaps internally, one
for negative and one for positive values). But that complicates other
stuff too (e.g. bms_next_member now returns -1 to signal "end").

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#7)
Re: Increase value of OUTER_VAR

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

IMO just bumping up the constants from ~65k to 1M is a net loss, for
most users. We add this to bitmapsets, which means we're using ~8kB with
the current values, but this jumps to 128kB with this higher value. This
also means bms_next_member etc. have to walk much more memory, which is
bound to have some performance impact for everyone.

Hmm, do we really have any places that include OUTER_VAR etc in
bitmapsets? They shouldn't appear in relid sets, for sure.
I agree though that if they did, this would have bad performance
consequences.

I still think the negative-special-values approach is better.
If there are any places that that would break, we'd find out about
it in short order, rather than having a silent performance lossage.

regards, tom lane

#9Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#8)
Re: Increase value of OUTER_VAR

On 3/4/21 4:16 PM, Tom Lane wrote:

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

IMO just bumping up the constants from ~65k to 1M is a net loss, for
most users. We add this to bitmapsets, which means we're using ~8kB with
the current values, but this jumps to 128kB with this higher value. This
also means bms_next_member etc. have to walk much more memory, which is
bound to have some performance impact for everyone.

Hmm, do we really have any places that include OUTER_VAR etc in
bitmapsets? They shouldn't appear in relid sets, for sure.
I agree though that if they did, this would have bad performance
consequences.

Hmmm, I don't know. I mostly assumed that if I do pull_varnos() it would
include those values. But maybe that's not supposed to happen.

I still think the negative-special-values approach is better.
If there are any places that that would break, we'd find out about
it in short order, rather than having a silent performance lossage.

OK

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#9)
Re: Increase value of OUTER_VAR

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 3/4/21 4:16 PM, Tom Lane wrote:

Hmm, do we really have any places that include OUTER_VAR etc in
bitmapsets? They shouldn't appear in relid sets, for sure.
I agree though that if they did, this would have bad performance
consequences.

Hmmm, I don't know. I mostly assumed that if I do pull_varnos() it would
include those values. But maybe that's not supposed to happen.

But (IIRC) those varnos are never used till setrefs.c fixes up the plan
to replace normal Vars with references to lower plan nodes' outputs.
I'm not sure why anyone would be doing pull_varnos() after that;
it would not give very meaningful results.

regards, tom lane

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#10)
Re: Increase value of OUTER_VAR

Just as a proof of concept, I tried the attached, and it passes
check-world. So if there's anyplace trying to stuff OUTER_VAR and
friends into bitmapsets, it's pretty far off the beaten track.

The main loose ends that'd have to be settled seem to be:

(1) What data type do we want Var.varno to be declared as? In the
previous thread, Robert opined that plain "int" isn't a good choice,
but I'm not sure I agree. There's enough "int" for rangetable indexes
all over the place that it'd be a fool's errand to try to make it
uniformly something different.

(2) Does that datatype change need to propagate anywhere besides
what I touched here? I did not make any effort to search for
other places.

regards, tom lane

Attachments:

remove-64k-rangetable-limit-wip.patchtext/x-diff; charset=us-ascii; name=remove-64k-rangetable-limit-wip.patchDownload+7-17
#12Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#11)
Re: Increase value of OUTER_VAR

On 04.03.21 20:01, Tom Lane wrote:

Just as a proof of concept, I tried the attached, and it passes
check-world. So if there's anyplace trying to stuff OUTER_VAR and
friends into bitmapsets, it's pretty far off the beaten track.

The main loose ends that'd have to be settled seem to be:

(1) What data type do we want Var.varno to be declared as? In the
previous thread, Robert opined that plain "int" isn't a good choice,
but I'm not sure I agree. There's enough "int" for rangetable indexes
all over the place that it'd be a fool's errand to try to make it
uniformly something different.

int seems fine.

(2) Does that datatype change need to propagate anywhere besides
what I touched here? I did not make any effort to search for
other places.

I think

Var.varnosyn
CurrentOfExpr.cvarno

should also have their type changed.

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#12)
Re: Increase value of OUTER_VAR

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 04.03.21 20:01, Tom Lane wrote:

(2) Does that datatype change need to propagate anywhere besides
what I touched here? I did not make any effort to search for
other places.

I think

Var.varnosyn
CurrentOfExpr.cvarno

should also have their type changed.

Agreed as to CurrentOfExpr.cvarno. But I think the entire point of
varnosyn is that it saves the original rangetable reference and
*doesn't* get overwritten with OUTER_VAR etc. So that one is a
different animal, and I'm inclined to leave it as Index.

regards, tom lane

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#13)
Re: Increase value of OUTER_VAR

On 06.03.21 15:59, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 04.03.21 20:01, Tom Lane wrote:

(2) Does that datatype change need to propagate anywhere besides
what I touched here? I did not make any effort to search for
other places.

I think

Var.varnosyn
CurrentOfExpr.cvarno

should also have their type changed.

Agreed as to CurrentOfExpr.cvarno. But I think the entire point of
varnosyn is that it saves the original rangetable reference and
*doesn't* get overwritten with OUTER_VAR etc. So that one is a
different animal, and I'm inclined to leave it as Index.

Can we move forward with this?

I suppose there was still some uncertainty about whether all the places
that need changing have been identified, but do we have a better idea
how to find them?

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#14)
Re: Increase value of OUTER_VAR

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

Can we move forward with this?

I suppose there was still some uncertainty about whether all the places
that need changing have been identified, but do we have a better idea
how to find them?

We could just push the change and see what happens. But I was thinking
more in terms of doing that early in the v15 cycle. I remain skeptical
that we need a near-term fix.

regards, tom lane

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#15)
Re: Increase value of OUTER_VAR

I wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

Can we move forward with this?

We could just push the change and see what happens. But I was thinking
more in terms of doing that early in the v15 cycle. I remain skeptical
that we need a near-term fix.

To make sure we don't forget, I added an entry to the next CF for this.

regards, tom lane

#17Andrei Lepikhov
lepihov@gmail.com
In reply to: Tom Lane (#16)
Re: Increase value of OUTER_VAR

On 4/8/21 8:13 AM, Tom Lane wrote:

I wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

Can we move forward with this?

We could just push the change and see what happens. But I was thinking
more in terms of doing that early in the v15 cycle. I remain skeptical
that we need a near-term fix.

To make sure we don't forget, I added an entry to the next CF for this.

Thanks for your efforts.

I tried to dive deeper: replace ROWID_VAR with -4 and explicitly change
types of varnos in the description of functions that can only work with
special varnos.
Use cases of OUTER_VAR looks simple (i guess). Use cases of INNER_VAR is
more complex because of the map_variable_attnos(). It is needed to
analyze how negative value of INNER_VAR can affect on this function.

INDEX_VAR causes potential problem:
in ExecInitForeignScan() and ExecInitForeignScan() we do
tlistvarno = INDEX_VAR;

here tlistvarno has non-negative type.

ROWID_VAR caused two problems in the check-world tests:
set_pathtarget_cost_width():
if (var->varno < root->simple_rel_array_size)
{
RelOptInfo *rel = root->simple_rel_array[var->varno];
...

and

replace_nestloop_params_mutator():
if (!bms_is_member(var->varno, root->curOuterRels))

I skipped this problems to see other weak points, but check-world
couldn't find another.

--
regards,
Andrey Lepikhov
Postgres Professional

Attachments:

0001-remove-64k-rangetable-limit-wip.patchtext/x-patch; charset=UTF-8; name=0001-remove-64k-rangetable-limit-wip.patchDownload+23-31
#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrei Lepikhov (#17)
Re: Increase value of OUTER_VAR

Here's a more fleshed-out version of this patch. I ran around and
fixed all the places where INNER_VAR etc. were being assigned directly to
a variable or parameter of type Index, and also grepped for 'Index.*varno'
to find suspicious declarations. (I didn't change every last instance
of the latter though; just places that could possibly be looking at
post-setrefs.c Vars.)

I concluded that we don't really need to change the type of
CurrentOfExpr.cvarno, because that's never set to a special value.

The main thing I remain concerned about is whether there are more
places like set_pathtarget_cost_width(), where we could be making
an inequality comparison on "varno" that would now be wrong.
I tried to catch this by enabling -Wsign-compare and -Wsign-conversion,
but that produced so many thousands of uninteresting warnings that
I soon gave up. I'm not sure there's any good way to catch remaining
places like that except to commit the patch and wait for trouble
reports.

So I'm inclined to propose pushing this and seeing what happens.

regards, tom lane

Attachments:

remove-64k-rangetable-limit-1.patchtext/x-diff; charset=us-ascii; name=remove-64k-rangetable-limit-1.patchDownload+48-57
#19David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#18)
Re: Increase value of OUTER_VAR

On Sat, 3 Jul 2021 at 06:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

So I'm inclined to propose pushing this and seeing what happens.

Is this really sane?

As much as I would like to see the 65k limit removed, I just have
reservations about fixing it in this way. Even if we get all the
cases fixed in core, there's likely a whole bunch of extensions
that'll have bugs as a result of this for many years to come.

"git grep \sIndex\s -- *.[ch] | wc -l" is showing me 77 matches in the
Citus code. That's not the only extension that uses the planner hook.

I'm really just not sure it's worth all the dev hours fixing the
fallout. To me, it seems much safer to jump bump 65k up to 1m. It'll
be a while before anyone complains about that.

It's also not that great to see the number of locations that you
needed to add run-time checks for negative varnos. That's not going to
come for free.

David

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#19)
Re: Increase value of OUTER_VAR

David Rowley <dgrowleyml@gmail.com> writes:

Is this really sane?

As much as I would like to see the 65k limit removed, I just have
reservations about fixing it in this way. Even if we get all the
cases fixed in core, there's likely a whole bunch of extensions
that'll have bugs as a result of this for many years to come.

Maybe. I'm not that concerned about planner hacking: almost all of
the planner is only concerned with pre-setrefs.c representations and
will never see these values. Still, the fact that we had to inject
a couple of explicit IS_SPECIAL_VARNO tests is a bit worrisome.
(I'm more surprised really that noplace in the executor needed it.)
FWIW, experience with those places says that such bugs will be
exposed immediately; it's not like they'd lurk undetected "for years".

You might argue that the int-vs-Index declaration changes are
something that would be much harder to get right, but in reality
those are almost entirely cosmetic. We could make them completely
so by changing the macro to

#define IS_SPECIAL_VARNO(varno) ((int) (varno) < 0)

so that it'd still do the right thing when applied to a variable
declared as Index. (In the light of morning, I'm not sure why
I didn't do that already.) But we've always been extremely
cavalier about whether RT indexes should be declared as int or
Index, so I felt that standardizing on the former was actually
a good side-effect of the patch.

Anyway, to address your point more directly: as I recall, the main
objection to just increasing the values of these constants was the
fear that it'd bloat bitmapsets containing these values. Now on
the one hand, this patch has proven that noplace in the core code
does that today. On the other hand, there's no certainty that
someone might not try to do that tomorrow (if we don't fix it as
per this patch); or extensions might be doing so.

I'm really just not sure it's worth all the dev hours fixing the
fallout. To me, it seems much safer to jump bump 65k up to 1m. It'll
be a while before anyone complains about that.

TBH, if we're to approach it that way, I'd be inclined to go for
broke and raise the values to ~2B. Then (a) we'll be shut of the
problem pretty much permanently, and (b) if someone does try to
make a bitmapset containing these values, hopefully they'll see
performance bad enough to expose the issue immediately.

It's also not that great to see the number of locations that you
needed to add run-time checks for negative varnos. That's not going to
come for free.

Since the test is just "< 0", I pretty much disbelieve that argument.
There are only two such places in the patch, and neither of them
are *that* performance-sensitive.

Anyway, the raise-the-values solution does have the advantage of
being a four-liner, so I can live with it if that's the consensus.
But I do think this way is cleaner in the long run, and I doubt
the argument that it'll create any hard-to-detect bugs.

regards, tom lane

#21Andrei Lepikhov
lepihov@gmail.com
In reply to: Tom Lane (#18)
#22Aleksander Alekseev
aleksander@timescale.com
In reply to: Andrei Lepikhov (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Aleksander Alekseev (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrei Lepikhov (#21)
#25Andrei Lepikhov
lepihov@gmail.com
In reply to: Tom Lane (#23)
#26Aleksander Alekseev
aleksander@timescale.com
In reply to: Andrei Lepikhov (#25)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrei Lepikhov (#25)
#28Andrei Lepikhov
lepihov@gmail.com
In reply to: Aleksander Alekseev (#26)
#29Aleksander Alekseev
aleksander@timescale.com
In reply to: Andrei Lepikhov (#28)