"New" bug?? Serious - crashes backend.

Started by ryanover 25 years ago22 messages
#1ryan
ryan@bel.bc.ca
1 attachment(s)

============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================

Your name : Ryan Rawson
Your email address : ryan@bel.bc.ca

System Configuration
---------------------
Architecture (example: Intel Pentium) :
Intel Pentium II

Operating System (example: Linux 2.0.26 ELF) :
Linux 2.2.16 ELF Debian/2.2

PostgreSQL version (example: PostgreSQL-7.1):
7.0 release 1
7.0.2

Compiler used (example: gcc 2.8.0) :
unknown

Please enter a FULL description of your problem:
------------------------------------------------
I have a database which reliable crashes the database system. When you
try to do insert/update on the table 'machines' the backend crashes with
what I think is signal 11.

Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------

I'm attaching a file which builds the database which crashes the
backend.

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
unknown.

-ryan

--
Ryan Rawson
System Administrator
Binary Environments Ltd.
ryan@bel.bc.ca

Attachments:

be.dump.july.6text/plain; charset=us-ascii; name=be.dump.july.6Download
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: ryan (#1)
Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

ryan <ryan@bel.bc.ca> writes:

I have a database which reliable crashes the database system. When you
try to do insert/update on the table 'machines' the backend crashes with
what I think is signal 11.

The problem seems to be that you have a foreign-key trigger on
'machines' which refers to a non-existent primary-key relation:

CREATE CONSTRAINT TRIGGER "siteid" AFTER INSERT OR UPDATE ON
"machines" NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE
PROCEDURE "RI_FKey_check_ins" ('siteid', 'machines', 'site',
'UNSPECIFIED', 'sites', 'siteid');

The 'site' argument is the name of the referenced table, and you
have no table named site.

There are at least two bugs here: the immediate cause of the crash
is lack of a check for heap_openr() failure in the RI trigger code,
but a larger question is why the system let you drop a table that
is the target of a referential integrity check (which I assume is
what you did to get into this state).

Anyway, dropping the siteid trigger, as well as any others that
refer to gone tables, ought to get you out of trouble for now.
Meanwhile the foreign-key boys have some work to do ...

regards, tom lane

#3Noname
JanWieck@t-online.de
In reply to: Tom Lane (#2)
Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

Tom Lane wrote:

There are at least two bugs here: the immediate cause of the crash
is lack of a check for heap_openr() failure in the RI trigger code,

Exactly where is that check missing (if it still is)?

but a larger question is why the system let you drop a table that
is the target of a referential integrity check (which I assume is
what you did to get into this state).

For me too.

Anyway, dropping the siteid trigger, as well as any others that
refer to gone tables, ought to get you out of trouble for now.
Meanwhile the foreign-key boys have some work to do ...

That's exactly the purpose of pg_trigger.tgconstrrelid, which
is filled with the opposite relations Oid for constraint
triggers. In RelationRemoveTriggers(), which is called
during DROP TABLE, theres a scan for it. That's where the

DROP TABLE implicitly drops referential ...

NOTICE message comes from. So I wonder how he got into that
state?

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#3)
Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

JanWieck@t-online.de (Jan Wieck) writes:

Tom Lane wrote:

There are at least two bugs here: the immediate cause of the crash
is lack of a check for heap_openr() failure in the RI trigger code,

Exactly where is that check missing (if it still is)?

The heap_openr calls with NoLock --- the way heap_open[r] are set up
is that there's an elog on open failure iff you request a lock, but
if you don't then you have to check for a NULL return explicitly.
Perhaps this coding convention is too error-prone and ought to be
changed to have two different routine names, say "heap_open[r]"
and "heap_open[r]_noerr". Opinions anyone?

I had a note to myself that ri_triggers' use of NoLock was probably
a bug anyway. Shouldn't it be acquiring *some* kind of lock on the
referenced relation? Else someone might be deleting it out from
under you.

but a larger question is why the system let you drop a table that
is the target of a referential integrity check (which I assume is
what you did to get into this state).

For me too.

What about renaming as opposed to dropping? Since the triggers are set
up to use names rather than OIDs, seems like they are vulnerable to a
rename. Maybe they should be using table OIDs in their parameter lists.
(That'd make pg_dump's life harder however...)

regards, tom lane

#5Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#4)
Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

JanWieck@t-online.de (Jan Wieck) writes:

Tom Lane wrote:

There are at least two bugs here: the immediate cause of the crash
is lack of a check for heap_openr() failure in the RI trigger code,

Exactly where is that check missing (if it still is)?

The heap_openr calls with NoLock --- the way heap_open[r] are set up
is that there's an elog on open failure iff you request a lock, but
if you don't then you have to check for a NULL return explicitly.
Perhaps this coding convention is too error-prone and ought to be
changed to have two different routine names, say "heap_open[r]"
and "heap_open[r]_noerr". Opinions anyone?

We already have heap_open and heap_openr. Seems another is too hard.
Better to give them a parameter to control it. The API is confusing enough.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Perhaps this coding convention is too error-prone and ought to be
changed to have two different routine names, say "heap_open[r]"
and "heap_open[r]_noerr". Opinions anyone?

We already have heap_open and heap_openr. Seems another is too hard.
Better to give them a parameter to control it. The API is confusing
enough.

It is confusing, but we have here graphic evidence that the way it's
currently done is confusing. In a quick check, I found several other
cases of the same error that must have crept in over the past year or
so. So I'm now convinced that we'd better change the API of these
routines to make it crystal-clear whether you are getting a check for
open failure or not.

I like a different routine name better than a check-or-no-check
parameter. If you invoke the no-check case then you *MUST* have a check
for failure return --- forgetting to do this is exactly the problem.
So I think it should be harder to get at the no-check case, and you
should have to write something that reminds you that the routine is not
checking for you. Thus "heap_open_noerr" (I'm not particularly wedded
to that suffix, though, if anyone has a better idea for what to call
it). A parameter would only be useful if the same calling code might
reasonably do different things at different times --- but either there's
a check following the call, or there's not.

regards, tom lane

#7Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#6)
Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

I like a different routine name better than a check-or-no-check
parameter. If you invoke the no-check case then you *MUST* have a check
for failure return --- forgetting to do this is exactly the problem.
So I think it should be harder to get at the no-check case, and you
should have to write something that reminds you that the routine is not
checking for you. Thus "heap_open_noerr" (I'm not particularly wedded
to that suffix, though, if anyone has a better idea for what to call
it). A parameter would only be useful if the same calling code might
reasonably do different things at different times --- but either there's
a check following the call, or there's not.

OK.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#8Noname
JanWieck@t-online.de
In reply to: Tom Lane (#4)
Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

Tom Lane wrote:

but a larger question is why the system let you drop a table that
is the target of a referential integrity check (which I assume is
what you did to get into this state).

For me too.

What about renaming as opposed to dropping? Since the triggers are set
up to use names rather than OIDs, seems like they are vulnerable to a
rename. Maybe they should be using table OIDs in their parameter lists.
(That'd make pg_dump's life harder however...)

That at least shows how he might have gotten there. And yes,
they need to either keep track of renamings or use OID's.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#9Stephan Szabo
sszabo@kick.com
In reply to: Noname (#3)
Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

but a larger question is why the system let you drop a table that
is the target of a referential integrity check (which I assume is
what you did to get into this state).

For me too.

Anyway, dropping the siteid trigger, as well as any others that
refer to gone tables, ought to get you out of trouble for now.
Meanwhile the foreign-key boys have some work to do ...

That's exactly the purpose of pg_trigger.tgconstrrelid, which
is filled with the opposite relations Oid for constraint
triggers. In RelationRemoveTriggers(), which is called
during DROP TABLE, theres a scan for it. That's where the

DROP TABLE implicitly drops referential ...

NOTICE message comes from. So I wonder how he got into that
state?

I don't know in his case, but I think you could get into this state
from a partial restore from pg_dump. If you restore one of the
two tables, and create the constraint trigger for the RI_FKey_check_ins
but the other table doesn't really exist, it will crash. I just tried it on
a 7.0.2 system by making a table with an int and then defining the
check_ins trigger manually with create constraint trigger with a bad
referenced table.

Also, I realized something else that is a little wierd. When a temporary
table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?

#10Noname
JanWieck@t-online.de
In reply to: Stephan Szabo (#9)
Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

Stephan Szabo wrote:

Also, I realized something else that is a little wierd. When a temporary
table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?

Outch - that hurts. Haven't checked it yet, but from what I
have in memory it should be a possibility to violate
constraints.

Damned.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#11Stephan Szabo
sszabo@kick.com
In reply to: Noname (#10)
Re: Foreign key bugs + other problems

Stephan Szabo wrote:

Also, I realized something else that is a little wierd. When a

temporary

table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?

Outch - that hurts. Haven't checked it yet, but from what I
have in memory it should be a possibility to violate
constraints.

Yeah, I realized it when I was going in on AlterTableAddConstraint that I
need to prevent constraints referencing temporary tables on permanent
tables, and then I realized that the shadowing is a problem. Also, this is
a problem for other users too, what about people who log things to
other tables via rules and triggers? At least you can't shadow the
system catalogs :-)

I think that schemas might help, if you assume that at creation time of a
rule or constraint you must qualify any tables being used in a way that
prevents misunderstandings, since temporary tables live in a different
system defined schema assuming that schema.tablename is not
shadowed, only the unadorned tablename. In the FK case, I think
that the idea of moving to keeping oids would probably be the way
to go (that way the table is very explicitly defined as a particular one).
Not that this will help right now since I don't think we can make an
SPI request that will handle it.

Or, you might be able to make a case that you CANNOT shadow a table
that is referenced by a constraint (due to the permanent table constraints
cannot reference a temporary table restriction). Since the creation of
the temp table would break the restriction, it is illegal.

-----

In an unrelated problem. Well, I was thinking, well, maybe we could for
this transaction during the execution of the trigger, rename the temp table
and then rename it back. Noone else should see the change (right?) because
we're not comitted and that user isn't doing anything else while we're
checking.
However, this tickles another problem. It seems that you can't rename a
temp
table. In fact it does something bad:

create table z (a int);
create temp table z (a int);
alter table z rename to zz;
select * from z;
ERROR: relation 'z' does not exist
select * from zz;
- 0 rows
\q
<enter again>
select * from z;
NOTICE: mdopen: couldn't open z: No such file or directory
NOTICE: RelationIdBuildRelation: smgropen(z): Input/output error
NOTICE: mdopen: couldn't open z: No such file or directory
NOTICE: mdopen: couldn't open z: No such file or directory
NOTICE: mdopen: couldn't open z: No such file or directory
NOTICE: mdopen: couldn't open z: No such file or directory
ERROR: cannot open relation z
select * from zz;
- 0 rows

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#8)
Re: [HACKERS] Foreign key bugs (Re: "New" bug?? Serious - crashes backend.)

JanWieck@t-online.de (Jan Wieck) writes:

Tom Lane wrote:

What about renaming as opposed to dropping? Since the triggers are set
up to use names rather than OIDs, seems like they are vulnerable to a
rename. Maybe they should be using table OIDs in their parameter lists.
(That'd make pg_dump's life harder however...)

That at least shows how he might have gotten there. And yes,
they need to either keep track of renamings or use OID's.

I got mail from Ryan earlier admitting that he'd hand-edited a dump file
and reloaded it, so that was how his triggers got out of sync with the
table names. But still, it sounds like we need to fix the RENAME case.

regards, tom lane

#13Noname
JanWieck@t-online.de
In reply to: Stephan Szabo (#11)
Re: Foreign key bugs + other problems

Stephan Szabo wrote:

Stephan Szabo wrote:

Also, I realized something else that is a little wierd. When a

temporary

table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?

Outch - that hurts. Haven't checked it yet, but from what I
have in memory it should be a possibility to violate
constraints.

Yeah, I realized it when I was going in on AlterTableAddConstraint that I
need to prevent constraints referencing temporary tables on permanent
tables, and then I realized that the shadowing is a problem. Also, this is
a problem for other users too, what about people who log things to
other tables via rules and triggers? At least you can't shadow the
system catalogs :-)

I think triggers are in general problematic in this context.
They usually use SPI and name the tables in the querystring.
If a trigger uses saved plans (as RI does as much as
possible), the problem is gone if the trigger has been
invoked once and prepared all the plans. But if it's invoked
the first time while a temp table exists, it'll do the wrong
things and save the wrong plan for future invocations.

Rules aren't affected, because they refer to tables by OID
allways.

[...]

Or, you might be able to make a case that you CANNOT shadow a table
that is referenced by a constraint (due to the permanent table constraints
cannot reference a temporary table restriction). Since the creation of
the temp table would break the restriction, it is illegal.

Good point. What does the standard say about it? Can a table,
referred to by foreign key constraints or referential
actions, be shadowed by a temp table?

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#9)
Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

"Stephan Szabo" <sszabo@kick.com> writes:

Also, I realized something else that is a little wierd. When a temporary
table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?

Seems to me it should certainly be going to the permanent table, which
is another argument in favor of making the link via OID not table name.
The existing code will get this wrong.

regards, tom lane

#15Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Noname (#13)
Re: Foreign key bugs + other problems

Or, you might be able to make a case that you CANNOT shadow a table
that is referenced by a constraint (due to the permanent table constraints
cannot reference a temporary table restriction). Since the creation of
the temp table would break the restriction, it is illegal.

Good point. What does the standard say about it? Can a table,
referred to by foreign key constraints or referential
actions, be shadowed by a temp table?

Maybe we should prevent shadowing of a table that is used as a foreign
key.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#16Stephan Szabo
sszabo@kick.com
In reply to: Noname (#13)
Re: Foreign key bugs + other problems

Rules aren't affected, because they refer to tables by OID
always.

Ah, that's good. I hadn't tried that variation because I don't really
use rules that much other than select rules for views.

Or, you might be able to make a case that you CANNOT shadow a table
that is referenced by a constraint (due to the permanent table

constraints

cannot reference a temporary table restriction). Since the creation of
the temp table would break the restriction, it is illegal.

Good point. What does the standard say about it? Can a table,
referred to by foreign key constraints or referential
actions, be shadowed by a temp table?

Well, that's the question. I don't see anything in the spec saying that it
can't precisely. It's a question of wording on the rules about the
constraint.

11.8 Syntax rule 4 Case a)
If the referencing table is a permanent base table, then the referenced
table
shall be a persistant base table.

So, if you shadowed the table by a temp table, would this no longer be true
(and therefore the create is illegal) or are you supposed to imply that you
still reference the persistant base table despite the fact it is shadowed?
I'd
guess the latter because it's a syntax rule (I had thought it was a general
rule until I went to look it up).

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#11)
Re: Foreign key bugs + other problems

"Stephan Szabo" <sszabo@kick.com> writes:

However, this tickles another problem. It seems that you can't rename
a temp table. In fact it does something bad:

Are you using current CVS? I thought I'd fixed that recently.

regards, tom lane

#18Noname
JanWieck@t-online.de
In reply to: Tom Lane (#14)
Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

Tom Lane wrote:

"Stephan Szabo" <sszabo@kick.com> writes:

Also, I realized something else that is a little wierd. When a temporary
table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?

Seems to me it should certainly be going to the permanent table, which
is another argument in favor of making the link via OID not table name.
The existing code will get this wrong.

But even if the trigger knows the OID of the table to query,
can it prepare a plan to do so via SPI? I think no.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#19Stephan Szabo
sszabo@kick.com
In reply to: Noname (#3)
Re: Foreign key bugs (Re: [BUGS] "New" bug?? Serious - crashesbackend.)

"Stephan Szabo" <sszabo@kick.com> writes:

Also, I realized something else that is a little wierd. When a

temporary

table shadows a permanent table that you've made a foreign key reference
to, which table should it be going to check the constraint?

Seems to me it should certainly be going to the permanent table, which
is another argument in favor of making the link via OID not table name.
The existing code will get this wrong.

Can I force the SPI query that's being generated to use the permanent
table rather than the shadowed table when they have the same name?
If not, then storing the oid isn't sufficient without moving away from
SPI. I do agree that storing the oids is a good idea (and am planning to
change it unless someone comes up with a compelling reason not to)
since the only way via something like SPI that I can think of is once
we have schemas, using schemaname.tablename which may not be
shadowed by the temp table and it'll just be easier for everyone
involved if we store the oid.

#20Stephan Szabo
sszabo@kick.com
In reply to: Noname (#10)
Re: Foreign key bugs + other problems

The machine I tried it on is only a 7.0.2 system. My CVS system
is at home on a unconnected dialup, so I can't try it from work.
I'll try it when I get back tonight.

Show quoted text

"Stephan Szabo" <sszabo@kick.com> writes:

However, this tickles another problem. It seems that you can't rename
a temp table. In fact it does something bad:

Are you using current CVS? I thought I'd fixed that recently.

#21Noname
JanWieck@t-online.de
In reply to: Bruce Momjian (#15)
Re: Foreign key bugs + other problems

Bruce Momjian wrote:

Or, you might be able to make a case that you CANNOT shadow a table
that is referenced by a constraint (due to the permanent table constraints
cannot reference a temporary table restriction). Since the creation of
the temp table would break the restriction, it is illegal.

Good point. What does the standard say about it? Can a table,
referred to by foreign key constraints or referential
actions, be shadowed by a temp table?

Maybe we should prevent shadowing of a table that is used as a foreign
key.

And open another DOS attack possibility? Each user could
potentially create a permanent table of a name used as a temp
table somewhere inside of an application. Then create another
referencing it and boom - the application goes down. Not that
good IMHO.

I think we need to hand OID's to the RI triggers and have
some special syntax to tell the parser that a table or
attribute is named by OID in the actual query.

Will cause alot of headaches for the next pg_dump/restore
cycle.

(Bruce: still waiting for your call)

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#22Mike Mascari
mascarm@mascari.com
In reply to: Noname (#21)
Re: Foreign key bugs + other problems

Jan Wieck wrote:

Maybe we should prevent shadowing of a table that is used as a foreign
key.

And open another DOS attack possibility? Each user could
potentially create a permanent table of a name used as a temp
table somewhere inside of an application. Then create another
referencing it and boom - the application goes down. Not that
good IMHO.

I think we need to hand OID's to the RI triggers and have
some special syntax to tell the parser that a table or
attribute is named by OID in the actual query.

Will cause alot of headaches for the next pg_dump/restore
cycle.

(Bruce: still waiting for your call)

Jan

Just curious, but couldn't this also be done in the TRUNCATE
TABLE case? Where the consensus was to disallow TRUNCATE TABLE on
a table whose attribute was the foreign key of another relation?
It seems that proper implementation of SCHEMA and a finer
granularity of GRANT/REVOKE is need to address these issues.

Mike Mascari