combining semi-duplicate rows

Started by hjenkinsover 18 years ago2 messagesgeneral

hjenkins@uvic.ca

over 18 years ago

Hello,

I have a table of rows which partially duplicate one another and need to
be merged and moved into a table with a primary key.

As an additional complication, some of the duplicates contain different
information, ex.:
schema1.datatable: key1 None None 3 4
schema2.datatable: key1 1 2 7 None

desired result:
schema1.datatable: key1 1 2 3 4

I looked for a specific function that would do this sort of merge and
failed. So I tried:

UPDATE schema1.datatable SET schema1.datatable.field1 =
schema2.datatable.field1 FROM schema2.datatable
WHERE schema2.datatable.keyfield = schema1.datatable.keyfield
AND schema1.datatable.field1 = None;

Which is suboptimal because I'd need a command for each field, but it
would be a start. However, the schema names are not recognised. I get
error messages to the effect that "cross-database references are not
implemented" or "relation "schema1/2" does not exist.Even the much simpler

SELECT DISTINCT schema2.datatable INTO schema1.datatable;

...gives me these messages. Qualifying right up to the database level
produces "improper qualified name (too many dotted names)".

I'm pretty sure that this isn't a capitalization/quoting problem as
described in the FAQ. Is it not possible to use these functions between
schemas? Or am I misusing the functions in a more basic way?

The problem is somewhat similar to this one:
http://archives.postgresql.org/pgsql-sql/2007-02/msg00055.php

Namely,

Regards,
H.Jenkins

Tom Lane

tgl@sss.pgh.pa.us

over 18 years ago

In reply to: hjenkins (#1)

Re: combining semi-duplicate rows

"hjenkins" <hjenkins@uvic.ca> writes:

So I tried:

UPDATE schema1.datatable SET schema1.datatable.field1 =
schema2.datatable.field1 FROM schema2.datatable
WHERE schema2.datatable.keyfield = schema1.datatable.keyfield
AND schema1.datatable.field1 = None;

Which is suboptimal because I'd need a command for each field, but it
would be a start. However, the schema names are not recognised. I get
error messages to the effect that "cross-database references are not
implemented" or "relation "schema1/2" does not exist.

The target column of a SET clause can't be qualified with the relation
name; it would introduce ambiguity in the case of composite-type fields,
and it's useless anyway since the target relation was already given.
Your example works for me (syntactically at least) as

regression=# UPDATE schema1.datatable SET field1 =
schema2.datatable.field1 FROM schema2.datatable
WHERE schema2.datatable.keyfield = schema1.datatable.keyfield
AND schema1.datatable.field1 = 'None';
UPDATE 0

Personally, though, I'd use some aliases to improve readability and
forestall the onset of carpal tunnel syndrome:

regression=# UPDATE schema1.datatable t SET field1 =
s.field1 FROM schema2.datatable s
WHERE s.keyfield = t.keyfield
AND t.field1 = 'None';
UPDATE 0

regards, tom lane