Subtle pg_dump problem...

Started by Christopher Kings-Lynneover 21 years ago18 messages
#1Christopher Kings-Lynne
chriskl@familyhealth.com.au

I have a table with a tsearch2 index on it. Now, I have all the
tsearch2 stuff installed into a 'contrib' schema. I have had to change
the default database schema to include the contrib schema as
behind-the-scenes, tsearch2 looks for its tables, and cannot find them
even if the function itself is schema-qualfified. This might well be a
tsearc2 bug.

Anyway, this means the table is dumped like this:

SET SESSION AUTHORIZATION 'auadmin';

SET search_path = public, pg_catalog;

COPY ...

Which give this error upon restoring:

ERROR: relation "pg_ts_cfg" does not exist
CONTEXT: COPY food_categories, line 1: "79 102 Vegetables,
Salads & Legumes \N 'legum':3 'salad':2 'veget':1"

It's because the search_path needs to be like this for it to work:

SET search_path = public, contrib, pg_catalog;

Chris

#2Oleg Bartunov
oleg@sai.msu.su
In reply to: Christopher Kings-Lynne (#1)
Re: Subtle pg_dump problem...

Thanks Christopher,
we'll look into the issue.

Oleg
On Fri, 7 May 2004, Christopher Kings-Lynne wrote:

I have a table with a tsearch2 index on it. Now, I have all the
tsearch2 stuff installed into a 'contrib' schema. I have had to change
the default database schema to include the contrib schema as
behind-the-scenes, tsearch2 looks for its tables, and cannot find them
even if the function itself is schema-qualfified. This might well be a
tsearc2 bug.

Anyway, this means the table is dumped like this:

SET SESSION AUTHORIZATION 'auadmin';

SET search_path = public, pg_catalog;

COPY ...

Which give this error upon restoring:

ERROR: relation "pg_ts_cfg" does not exist
CONTEXT: COPY food_categories, line 1: "79 102 Vegetables,
Salads & Legumes \N 'legum':3 'salad':2 'veget':1"

It's because the search_path needs to be like this for it to work:

SET search_path = public, contrib, pg_catalog;

Chris

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#3Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Oleg Bartunov (#2)
Re: Subtle pg_dump problem...

Did you guys find any solution to this in the end?

Chris

Oleg Bartunov wrote:

Show quoted text

Thanks Christopher,
we'll look into the issue.

Oleg
On Fri, 7 May 2004, Christopher Kings-Lynne wrote:

I have a table with a tsearch2 index on it. Now, I have all the
tsearch2 stuff installed into a 'contrib' schema. I have had to change
the default database schema to include the contrib schema as
behind-the-scenes, tsearch2 looks for its tables, and cannot find them
even if the function itself is schema-qualfified. This might well be a
tsearc2 bug.

Anyway, this means the table is dumped like this:

SET SESSION AUTHORIZATION 'auadmin';

SET search_path = public, pg_catalog;

COPY ...

Which give this error upon restoring:

ERROR: relation "pg_ts_cfg" does not exist
CONTEXT: COPY food_categories, line 1: "79 102 Vegetables,
Salads & Legumes \N 'legum':3 'salad':2 'veget':1"

It's because the search_path needs to be like this for it to work:

SET search_path = public, contrib, pg_catalog;

Chris

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#4Oleg Bartunov
oleg@sai.msu.su
In reply to: Christopher Kings-Lynne (#3)
Re: Subtle pg_dump problem...

On Wed, 12 May 2004, Christopher Kings-Lynne wrote:

Did you guys find any solution to this in the end?

Not yet. Could you send me a simple test suite ? I'm not
experienced with schema and don't know how to load tsearch2 into
specific schema.

Oleg

Chris

Oleg Bartunov wrote:

Thanks Christopher,
we'll look into the issue.

Oleg
On Fri, 7 May 2004, Christopher Kings-Lynne wrote:

I have a table with a tsearch2 index on it. Now, I have all the
tsearch2 stuff installed into a 'contrib' schema. I have had to change
the default database schema to include the contrib schema as
behind-the-scenes, tsearch2 looks for its tables, and cannot find them
even if the function itself is schema-qualfified. This might well be a
tsearc2 bug.

Anyway, this means the table is dumped like this:

SET SESSION AUTHORIZATION 'auadmin';

SET search_path = public, pg_catalog;

COPY ...

Which give this error upon restoring:

ERROR: relation "pg_ts_cfg" does not exist
CONTEXT: COPY food_categories, line 1: "79 102 Vegetables,
Salads & Legumes \N 'legum':3 'salad':2 'veget':1"

It's because the search_path needs to be like this for it to work:

SET search_path = public, contrib, pg_catalog;

Chris

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#5Oleg Bartunov
oleg@sai.msu.su
In reply to: Christopher Kings-Lynne (#3)
Re: Subtle pg_dump problem...

Christopher, I don't quite understand the problem
Did you move pg_ts_* table to schema 'contrib' ?
What functions you schema-qualified and how ?
It's always required to set search_path properly.

Oleg
On Wed, 12 May 2004, Christopher Kings-Lynne wrote:

Did you guys find any solution to this in the end?

Chris

Oleg Bartunov wrote:

Thanks Christopher,
we'll look into the issue.

Oleg
On Fri, 7 May 2004, Christopher Kings-Lynne wrote:

I have a table with a tsearch2 index on it. Now, I have all the
tsearch2 stuff installed into a 'contrib' schema. I have had to change
the default database schema to include the contrib schema as
behind-the-scenes, tsearch2 looks for its tables, and cannot find them
even if the function itself is schema-qualfified. This might well be a
tsearc2 bug.

Anyway, this means the table is dumped like this:

SET SESSION AUTHORIZATION 'auadmin';

SET search_path = public, pg_catalog;

COPY ...

Which give this error upon restoring:

ERROR: relation "pg_ts_cfg" does not exist
CONTEXT: COPY food_categories, line 1: "79 102 Vegetables,
Salads & Legumes \N 'legum':3 'salad':2 'veget':1"

It's because the search_path needs to be like this for it to work:

SET search_path = public, contrib, pg_catalog;

Chris

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#6Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Oleg Bartunov (#5)
Re: Subtle pg_dump problem...

OK, I'll try to explain it better.

1. Tsearch2 requires access to several tables.

2. You can edit the tsearch2.sql script and change the "set schema =
..." to contrib.

3. You load all the tsearch2 objects into contrib.

4. You create a table in the public schema with a column of type
contrib.vector, and a trigger of contrib.tsearch2.

5. You pg_dump that table, you get:

SET search_path = public, pg_catalog;

COPY ...

(Because the table is in the public schema)

6. However, it is now not possible to restore the sql script as it was
dumped, as you get this error:

ERROR: relation "pg_ts_cfg" does not exist

7. You get this error because the tsearch2 code depends on the current
search path, and since contrib is not in the search path, the restore fails.

8. This problem occurs because tsearch2 is dependent on the current
user's search_path. Instead, it should be independent of the current
user's search path, and instead try to find its configuration tables in
the same schema in which the vector type or the tsearch2 trigger
function resides.

This assumes that the user has installed all the tsearch2 objects into
the same schema, which I think is reasonable.

This problem will occur for anyone who has multiple schemas and tries to
create vector columns in tables that refer to the vector type in another
schema.

Does that make sense?

Chris

#7Oleg Bartunov
oleg@sai.msu.su
In reply to: Christopher Kings-Lynne (#6)
Re: Subtle pg_dump problem...

Christopher,

On Wed, 12 May 2004, Christopher Kings-Lynne wrote:

OK, I'll try to explain it better.

1. Tsearch2 requires access to several tables.

2. You can edit the tsearch2.sql script and change the "set schema =
..." to contrib.

Aha, this is what I thought about.

3. You load all the tsearch2 objects into contrib.

createdb qq
psql qq -c "create schema contrib"
psql qq < tsearch2_contrib.sql

4. You create a table in the public schema with a column of type
contrib.vector, and a trigger of contrib.tsearch2.

qq=# create table test ( a text, fts contrib.tsvector);
CREATE TABLE

5. You pg_dump that table, you get:

SET search_path = public, pg_catalog;

COPY ...

(Because the table is in the public schema)

done.

6. However, it is now not possible to restore the sql script as it was
dumped, as you get this error:

ERROR: relation "pg_ts_cfg" does not exist

No problem,

megera@mira:~/app/pgsql/tsearch2/test_scheme$ createdb qq
CREATE DATABASE
megera@mira:~/app/pgsql/tsearch2/test_scheme$ psql qq -c "create schema contrib"
CREATE SCHEMA
psql qq < ./tsearch2_contrib.sql
psql qq < ./test.dump
SET
SET
SET
SET
CREATE TABLE

But I get error later:

qq=# insert into test(a) values( 'the hot dog');
INSERT 3478544 1
qq=# update test set fts=contrib.to_tsvector(a);
ERROR: relation "pg_ts_cfg" does not exist

after setting proper search_path it worked:

qq=# set search_path to public,contrib;
SET
qq=# update test set fts=contrib.to_tsvector(a);
UPDATE 1

7. You get this error because the tsearch2 code depends on the current
search path, and since contrib is not in the search path, the restore fails.

8. This problem occurs because tsearch2 is dependent on the current
user's search_path. Instead, it should be independent of the current
user's search path, and instead try to find its configuration tables in
the same schema in which the vector type or the tsearch2 trigger
function resides.

This assumes that the user has installed all the tsearch2 objects into
the same schema, which I think is reasonable.

This problem will occur for anyone who has multiple schemas and tries to
create vector columns in tables that refer to the vector type in another
schema.

Does that make sense?

Sorry, I don't see the problem. I just pg_dump whole db and recreated without
any problem. For working with tsearch2 I should set correct search_path,
but what's wrong with this ?

pg_dump qq > qq.dump
dropdb qq
createdb qq
psql qq < qq.dump
qq=# set search_path to public,contrib;
SET
qq=# update test set fts=contrib.to_tsvector(a);
UPDATE 1

works like a charm :)

One remark:

I applied regprocedure_7.4.patch.gz to be able dump/restore
without issue with OIDs. Upgrading existed tsearch2 installation
should be easy ( thanks Andrew for his script ):
Actually, for playing with schema I added set search_path = contrib; to his script.
Original script is available from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/regprocedure_update.sql

psql qq < regprocedure_update.sql ( first line is "set search_path = contrib;")

now, database qq could be dumped/restored without problem.

Chris

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#8Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Oleg Bartunov (#7)
Re: Subtle pg_dump problem...

6. However, it is now not possible to restore the sql script as it was
dumped, as you get this error:

ERROR: relation "pg_ts_cfg" does not exist

No problem,

megera@mira:~/app/pgsql/tsearch2/test_scheme$ createdb qq
CREATE DATABASE
megera@mira:~/app/pgsql/tsearch2/test_scheme$ psql qq -c "create schema contrib"
CREATE SCHEMA
psql qq < ./tsearch2_contrib.sql
psql qq < ./test.dump
SET
SET
SET
SET
CREATE TABLE

Is that because you didn't insert any data into the table before dumping
it? You will get the same error that follows:

But I get error later:

qq=# insert into test(a) values( 'the hot dog');
INSERT 3478544 1
qq=# update test set fts=contrib.to_tsvector(a);
ERROR: relation "pg_ts_cfg" does not exist

after setting proper search_path it worked:

qq=# set search_path to public,contrib;
SET
qq=# update test set fts=contrib.to_tsvector(a);
UPDATE 1

My point is that if you pg_dump a table that has data in it, pg_dump
will set yoru search_path for you, and so the restore will fail.

pg_dump qq > qq.dump
dropdb qq
createdb qq
psql qq < qq.dump
qq=# set search_path to public,contrib;
SET
qq=# update test set fts=contrib.to_tsvector(a);
UPDATE 1

works like a charm :)

I bet you don't have any data in the table.

Chris

#9Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Oleg Bartunov (#7)
Re: Subtle pg_dump problem...

No problem,

Actually, I did some more testing and I properly understand the problem
now - and it won't happen in the general restoring case.

What fails is if you "pg_dump -a" to just dump the DATA from a table
containing a tsearch2 trigger that is in a different schema.

Then you delete all the rows from the table.

Then you try to execute the sql script created from pg_dump to restore
the data.

It will fail because the sql script will automatically set the
search_path to public, pg_catalog. And then as the COPY command inserts
each row, it will fail immediately as the tsearch2 trigger will not be
able to find its config table.

Does that make sense?

Chris

#10Oleg Bartunov
oleg@sai.msu.su
In reply to: Christopher Kings-Lynne (#8)
Re: Subtle pg_dump problem...

Christopher,

here is a cut'n paste from test script (patch applied):

dropdb qq
createdb qq
psql qq -c "create schema contrib;"
psql qq < tsearch2_contrib-2.sql
psql qq -c "create table test ( a text, fts contrib.tsvector);"
psql qq -c "insert into test(a) values ('I hit a dog');"
psql qq -c "set search_path = public,contrib; update test set fts = to_tsvector(a);"
pg_dump qq > qq.dump

There's certainly one record and after restoring I could use tsearch2
as usual (of course, setting search_path properly).

Oleg
On Wed, 12 May 2004, Christopher Kings-Lynne wrote:

6. However, it is now not possible to restore the sql script as it was
dumped, as you get this error:

ERROR: relation "pg_ts_cfg" does not exist

No problem,

megera@mira:~/app/pgsql/tsearch2/test_scheme$ createdb qq
CREATE DATABASE
megera@mira:~/app/pgsql/tsearch2/test_scheme$ psql qq -c "create schema contrib"
CREATE SCHEMA
psql qq < ./tsearch2_contrib.sql
psql qq < ./test.dump
SET
SET
SET
SET
CREATE TABLE

Is that because you didn't insert any data into the table before dumping
it? You will get the same error that follows:

But I get error later:

qq=# insert into test(a) values( 'the hot dog');
INSERT 3478544 1
qq=# update test set fts=contrib.to_tsvector(a);
ERROR: relation "pg_ts_cfg" does not exist

after setting proper search_path it worked:

qq=# set search_path to public,contrib;
SET
qq=# update test set fts=contrib.to_tsvector(a);
UPDATE 1

My point is that if you pg_dump a table that has data in it, pg_dump
will set yoru search_path for you, and so the restore will fail.

pg_dump qq > qq.dump
dropdb qq
createdb qq
psql qq < qq.dump
qq=# set search_path to public,contrib;
SET
qq=# update test set fts=contrib.to_tsvector(a);
UPDATE 1

works like a charm :)

I bet you don't have any data in the table.

Chris

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#11Oleg Bartunov
oleg@sai.msu.su
In reply to: Christopher Kings-Lynne (#9)
Re: Subtle pg_dump problem...

On Wed, 12 May 2004, Christopher Kings-Lynne wrote:

No problem,

Actually, I did some more testing and I properly understand the problem
now - and it won't happen in the general restoring case.

What fails is if you "pg_dump -a" to just dump the DATA from a table
containing a tsearch2 trigger that is in a different schema.

Then you delete all the rows from the table.

Then you try to execute the sql script created from pg_dump to restore
the data.

It will fail because the sql script will automatically set the
search_path to public, pg_catalog. And then as the COPY command inserts
each row, it will fail immediately as the tsearch2 trigger will not be
able to find its config table.

Does that make sense?

Hmm, what other hackers thinks ? This is not just a tsearch2 problem,
it could happens with any such kind of things, like defining user defined
type in one scheme, using it in another, dumping separate data.
Could pg_dump be enough smart to set search_path properly ?

Chris

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oleg Bartunov (#11)
Re: Subtle pg_dump problem...

Oleg Bartunov <oleg@sai.msu.su> writes:

Hmm, what other hackers thinks ? This is not just a tsearch2 problem,
it could happens with any such kind of things, like defining user defined
type in one scheme, using it in another, dumping separate data.
Could pg_dump be enough smart to set search_path properly ?

It could not. I think the fundamental point here is that it is a real
bad idea for the tsearch routines to make any assumptions about the
current search path. What I would suggest is that the internal objects
used by the tsearch routines (such as pg_ts_cfg) should be required to
live in a specific schema ("tsearch2" seems like a good name) and that
all the internal references inside the tsearch functions should be fully
qualified names.

You could perhaps make this private schema name be selectable at the
time tsearch is built ... but I'm not sure it's worth the trouble.

regards, tom lane

#13Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Tom Lane (#12)
Re: Subtle pg_dump problem...

It could not. I think the fundamental point here is that it is a real
bad idea for the tsearch routines to make any assumptions about the
current search path. What I would suggest is that the internal objects
used by the tsearch routines (such as pg_ts_cfg) should be required to
live in a specific schema ("tsearch2" seems like a good name) and that
all the internal references inside the tsearch functions should be fully
qualified names.

I think a better solution is to change tsearch2 to have two assumptions:

1. All tsearch2 objects will be loaded in the same schema, name not
important.

2. When an object foo is called and needs to refer to another object
bar, it should assume that bar exists in the same schema as foo, and NOT
in the current search_path.

Chris

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#13)
Re: Subtle pg_dump problem...

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

2. When an object foo is called and needs to refer to another object
bar, it should assume that bar exists in the same schema as foo, and NOT
in the current search_path.

That would be great if a C function could find out what schema it had
been declared in, but I don't think it can readily do so.

regards, tom lane

#15Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Tom Lane (#14)
Re: Subtle pg_dump problem...

That would be great if a C function could find out what schema it had
been declared in, but I don't think it can readily do so.

There's no context information available to it at all? Even if you go
contrib.tsearch2 qualfication?

How about making it so that the default context for functions is their
own schema? :)

Chris

#16Oleg Bartunov
oleg@sai.msu.su
In reply to: Tom Lane (#14)
Re: Subtle pg_dump problem...

On Wed, 12 May 2004, Tom Lane wrote:

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

2. When an object foo is called and needs to refer to another object
bar, it should assume that bar exists in the same schema as foo, and NOT
in the current search_path.

That would be great if a C function could find out what schema it had
been declared in, but I don't think it can readily do so.

TODO candidate ?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#17Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Oleg Bartunov (#16)
Re: Subtle pg_dump problem...

That would be great if a C function could find out what schema it had
been declared in, but I don't think it can readily do so.

TODO candidate ?

Seems like it would be a good thing.

Chris

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#17)
Re: Subtle pg_dump problem...

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

That would be great if a C function could find out what schema it had
been declared in, but I don't think it can readily do so.

TODO candidate ?

Seems like it would be a good thing.

I take that back: you can find it out if you really want to. You can
get your own function OID from the fcinfo->flinfo struct, look that up
in pg_proc, and get the pronamespace field. If you want the schema
name, and not just its OID, that'll cost you a second cache lookup.
Not too many lines of code, though it might be wise to fix things so you
need not repeat this each time through the function.

regards, tom lane