BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

Started by Joe Van Dykalmost 14 years ago9 messagesbugs
Jump to latest
#1Joe Van Dyk
joe@tanga.com

The following bug has been logged on the website:

Bug reference: 6699
Logged by: Joe Van Dyk
Email address: joe@tanga.com
PostgreSQL version: 9.1.4
Operating system: OSX
Description:

$ pg_restore -O -j 4 ~/tanga.dump -d tanga_dev_full_backup

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 417; 1259 66296 VIEW
site_channels monkey
pg_restore: [archiver (db)] could not execute query: ERROR: column
"channels.start_at" must appear in the GROUP BY clause or be used in an
aggregate function
LINE 2: SELECT channels.id, channels.start_at, channels.end_at, ...
^
Command was: CREATE VIEW site_channels AS
SELECT channels.id, channels.start_at, channels.end_at, channels.title,
channels.descriptio...

site_channels view definition:

View definition:
SELECT channels.id, channels.start_at, channels.end_at, channels.title
FROM channels
LEFT JOIN channels_products cp ON cp.channel_id = channels.id
LEFT JOIN buyable_products bp ON bp.id = cp.product_id
GROUP BY channels.id;

channels.id is a primary key.

#2Ryan Kelly
rpkelly22@gmail.com
In reply to: Joe Van Dyk (#1)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

On Tue, Jun 19, 2012 at 07:49:20PM +0000, joe@tanga.com wrote:

The following bug has been logged on the website:

Bug reference: 6699
Logged by: Joe Van Dyk
Email address: joe@tanga.com
PostgreSQL version: 9.1.4
Operating system: OSX
Description:

$ pg_restore -O -j 4 ~/tanga.dump -d tanga_dev_full_backup

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 417; 1259 66296 VIEW
site_channels monkey
pg_restore: [archiver (db)] could not execute query: ERROR: column
"channels.start_at" must appear in the GROUP BY clause or be used in an
aggregate function
LINE 2: SELECT channels.id, channels.start_at, channels.end_at, ...
^
Command was: CREATE VIEW site_channels AS
SELECT channels.id, channels.start_at, channels.end_at, channels.title,
channels.descriptio...

site_channels view definition:

View definition:
SELECT channels.id, channels.start_at, channels.end_at, channels.title
FROM channels
LEFT JOIN channels_products cp ON cp.channel_id = channels.id
LEFT JOIN buyable_products bp ON bp.id = cp.product_id
GROUP BY channels.id;

channels.id is a primary key.

Attached is a test case to reproduce the problem, courtesy of the
original reporter.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

-Ryan

Attachments:

gistfile1.txttext/plain; charset=us-asciiDownload
#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Ryan Kelly (#2)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

Excerpts from Ryan Kelly's message of mar jun 19 16:20:58 -0400 2012:

On Tue, Jun 19, 2012 at 07:49:20PM +0000, joe@tanga.com wrote:

View definition:
SELECT channels.id, channels.start_at, channels.end_at, channels.title
FROM channels
LEFT JOIN channels_products cp ON cp.channel_id = channels.id
LEFT JOIN buyable_products bp ON bp.id = cp.product_id
GROUP BY channels.id;

channels.id is a primary key.

Attached is a test case to reproduce the problem, courtesy of the
original reporter.

The reason this doesn't work is that the primary key is not defined
until later in the restore process.

I think the fix is to make the view dependant on the primary key in the
dump file.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

Alvaro Herrera <alvherre@commandprompt.com> writes:

Excerpts from Ryan Kelly's message of mar jun 19 16:20:58 -0400 2012:

On Tue, Jun 19, 2012 at 07:49:20PM +0000, joe@tanga.com wrote:

SELECT channels.id, channels.start_at, channels.end_at, channels.title
FROM channels
LEFT JOIN channels_products cp ON cp.channel_id = channels.id
LEFT JOIN buyable_products bp ON bp.id = cp.product_id
GROUP BY channels.id;

The reason this doesn't work is that the primary key is not defined
until later in the restore process.

I think the fix is to make the view dependant on the primary key in the
dump file.

Hmm ... check_functional_grouping does add the PK's OID to the query's
constraintDeps list. Apparently we're losing that dependency knowledge
somewhere between the parser and pg_dump?

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#4)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

I wrote:

Hmm ... check_functional_grouping does add the PK's OID to the query's
constraintDeps list. Apparently we're losing that dependency knowledge
somewhere between the parser and pg_dump?

I looked into this a bit. The dependency does exist in pg_depend, but
it is shown as a dependency from the view's _RETURN rule to the
constraint, not as a dependency of the view itself. Using the
simplified example

create table t1 (f1 int primary key, f2 text);
create view v1 as select * from t1 group by f1;

what you get from "pg_dump -Fc | pg_restore -l -v" is

; Selected TOC Entries:
;
1923; 0 0 ENCODING - ENCODING
1924; 0 0 STDSTRINGS - STDSTRINGS
1925; 1262 41967 DATABASE - refbug postgres
5; 2615 2200 SCHEMA - public postgres
1926; 0 0 COMMENT - SCHEMA public postgres
; depends on: 5
1927; 0 0 ACL - public postgres
; depends on: 5
170; 3079 11727 EXTENSION - plpgsql
1928; 0 0 COMMENT - EXTENSION plpgsql
; depends on: 170
168; 1259 41968 TABLE public t1 postgres
; depends on: 5
1921; 2606 41975 CONSTRAINT public t1_pkey postgres
; depends on: 168 168
169; 1259 41976 VIEW public v1 postgres
; depends on: 1919 5
1922; 0 41968 TABLE DATA public t1 postgres
; depends on: 168

So the view is shown as depending on "object 1919", which is nowhere to
be seen, because it is the _RETURN rule which did not get dumped
separately. There is therefore no way at all for pg_restore to know
that the view has to be restored after the constraint. (pg_dump does
know that, since it was working with full dependency info, which is why
the constraint comes first in the dump order. But the info isn't
exposed where pg_restore can see it.)

Clearly, this is a bug in the way pg_dump emits dependency info.
It never mattered before, but parallel pg_restore really needs accurate
dependencies.

We could possibly hack something for the special case of rules, but
I don't think this would be the last time we hear about this type of
issue. I'm inclined to think that the best solution would be to add
generic logic to pg_dump that "looks through" any dependency references
to objects that are not going to be dumped, and replaces them with the
IDs of any objects they depend on that are going to be dumped.

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#5)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

I wrote:

We could possibly hack something for the special case of rules, but
I don't think this would be the last time we hear about this type of
issue. I'm inclined to think that the best solution would be to add
generic logic to pg_dump that "looks through" any dependency references
to objects that are not going to be dumped, and replaces them with the
IDs of any objects they depend on that are going to be dumped.

I wrote a trial patch that does things that way (attached) and it
appears to work and solve the problem. However, it's not committable
as-is because it breaks the build of pg_restore:

ld: Unsatisfied symbols:
findObjectByDumpId (code)

since findObjectByDumpId is in common.c which isn't linked into
pg_restore. I guess a brute force solution would be to link it,
but probably we ought to think about refactoring the code to avoid
that. I'm not coming up with any very nice ideas about exactly how
to refactor, though. Thoughts?

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#5)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

I wrote:

168; 1259 41968 TABLE public t1 postgres
; depends on: 5
1921; 2606 41975 CONSTRAINT public t1_pkey postgres
; depends on: 168 168
169; 1259 41976 VIEW public v1 postgres
; depends on: 1919 5
1922; 0 41968 TABLE DATA public t1 postgres
; depends on: 168

BTW, there's another pretty unpleasant thing going on here, which is
that the t1_pkey constraint is getting hoisted up to before t1's table
data because it is a dependency of v1. That means the index will be
created before the data is loaded, which is not what we want.

Parallel pg_restore actually has a hack that should work around that,
namely repoint_table_dependencies(). That doesn't help for plain serial
restores though. I'm thinking we really ought to bite the bullet and do
something comparable to repoint_table_dependencies() at an appropriate
point in pg_dump, so that the dependencies are sane to begin with.

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

I wrote:

We could possibly hack something for the special case of rules, but
I don't think this would be the last time we hear about this type of
issue. I'm inclined to think that the best solution would be to add
generic logic to pg_dump that "looks through" any dependency references
to objects that are not going to be dumped, and replaces them with the
IDs of any objects they depend on that are going to be dumped.

I wrote a trial patch that does things that way (attached) and it
appears to work and solve the problem. However, it's not committable
as-is because it breaks the build of pg_restore:

ld: Unsatisfied symbols:
findObjectByDumpId (code)

since findObjectByDumpId is in common.c which isn't linked into
pg_restore. I guess a brute force solution would be to link it,
but probably we ought to think about refactoring the code to avoid
that.

Actually, that brute force solution doesn't work at all, because
common.c contains many call-backs into pg_dump.c, and we certainly
don't want to link pg_dump.c into pg_restore. I thought for a little
bit about breaking common.c into two pieces, but am not excited about
doing major refactoring on the spur of the moment. (Mind you, this
code probably *needs* wholesale refactoring; I just don't want to do
it as part of a bug fix.)

Another problem I noticed after further thought is that in a partial or
data-only dump, this patch could rearrange dependencies that pg_restore
has hardwired knowledge of, such as the dependency of a TABLE DATA item
on its table. I'm not certain that that would break anything, but it
seems pretty darn dangerous. I think that we don't want to rearrange
any dependencies that are "built in" by pg_dump itself rather than being
obtained from pg_depend --- this includes the TABLE DATA case as well
as COMMENT, ACL, and SECURITY LABEL items' dependencies on their parent
objects.

So attached is another try that puts the work into pg_dump.c. There are
a couple things I don't especially like, but don't see an easy way to
improve on:

* This code would be the first in pg_dump.c that looks through Archive
to ArchiveHandle. Although we've muttered before that that distinction
is useless, doing this here still seems a bit out of place. But we
can't put it in pg_backup_archiver.c because of the findObjectByDumpId
dependence, so I see no better answer.

* To handle the "built in" dependency problem mentioned above, I
introduced a convention that only "built in" dependencies should be
explicitly passed to ArchiveEntry(); everything with regular
dependencies should pass NULL/0, and then we build the real dependencies
at the very end of the archive construction process. This is annoying
because it's easy to do the wrong thing: if you forget and use the old
style of passing the DumpableObject's dependencies to ArchiveEntry, it
will *appear* to work, and only break when/if one of these indirection
cases comes up. A less error-prone convention would be nice, but I
can't think of a good one offhand.

Another thing that might be a showstopper is that this will only work
as-is in HEAD and 9.2, because only since commit
4317e0246c645f60c39e6572644cff1cb03b4c65 do we have explicit advance
marking of which TOC entries are going to get dumped. If we want to
fix this problem this way in older branches, we're going to have to
back-port parts of that commit. I don't actually see any alternative
way to fix it, though. How concerned are we about making this work
in released branches?

Comments?

regards, tom lane

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#8)
Re: BUG #6699: pg_restore with -j -- doesn't restore view that groups by primary key

I wrote:

We could possibly hack something for the special case of rules, but
I don't think this would be the last time we hear about this type of
issue.

BTW, after comparing the dependencies emitted by this patch to those of
HEAD for the regression database, I am convinced that indeed view rules
are not the only issue; we have got a boatload of problems of this ilk,
for instance:

* objects dependent on array types link to an array-type entry that is
not in the dump, not to the base type that is

* composite types link to their underlying "DO_TABLE" entry (ie, the
pg_class entry for the composite type) and thus miss any dependencies
on, say, column data types

* objects dependent on extension members have dangling links; notably,
every single plpgsql function fails to have a visible dependency on the
plpgsql extension

* foreign key constraints have dependencies on PK indexes, but if the PK
index was declared as a constraint, it's not visible under its own ID in
the dump, so the dependency dangles.

The proposed patch fixes all these cases. I am not sure whether any of
the first three are hazards for parallel pg_restore, but the last one
is: AFAICS, it will result in all such FK constraints being restored
serially at the end, because the parallel loop never manages to clear
all of their dependencies. Surprising nobody's complained about that.

regards, tom lane