BUG #15552: Unexpected error in COPY to a foreign table in a transaction
The following bug has been logged on the website:
Bug reference: 15552
Logged by: Luis M Carril
Email address: luis.carril@swarm64.com
PostgreSQL version: 11.1
Operating system: Ubuntu 16.04
Description:
Hi,
Postgres throws a "could not open file" error when inside a transaction
we create a foreign table and copy data into it.
Reproduction (code based on tests in postgres_fdw test suite):
-----------------------
CREATE EXTENSION postgres_fdw;
CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER
postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
END;
$d$;
CREATE USER MAPPING FOR public SERVER testserver1
OPTIONS (user 'value', password 'value');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
create table loct1 (a int check (a in (1)), b text);
begin;
create foreign table remp1 (a int check (a in (1)), b text) server loopback
options (table_name 'loct1');
copy remp1 from stdin delimiter ',';
1,f
\.
-----------------------
Observed behavior:
ERROR: could not open file "base/16385/16460": No such file or
directory
-----------------------
For what I saw, the error is triggered when synchronizing the heap in
CopyFrom (backend/commands/copy.c:2890), see the following stacktrace (when
setting a breakpoint at errcode_for_file_access():
#0 errcode_for_file_access () at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/utils/error/elog.c:600
#1 0x00005636212d33d2 in mdopen (reln=<optimized out>,
forknum=forknum@entry=MAIN_FORKNUM, behavior=behavior@entry=1)
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:606
#2 0x00005636212d3961 in mdopen (behavior=1, forknum=MAIN_FORKNUM,
reln=0x563622ba8a88) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:922
#3 mdnblocks (reln=0x563622ba8a88, forknum=MAIN_FORKNUM) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:875
#4 0x00005636212d39b9 in mdimmedsync (reln=0x563622ba8a88,
forknum=MAIN_FORKNUM) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:1033
#5 0x000056362103257c in heap_sync (rel=0x7f0e3e681aa8) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/access/heap/heapam.c:9408
#6 0x0000563621115e89 in CopyFrom (cstate=cstate@entry=0x563622ba2040) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/commands/copy.c:2890
#7 0x000056362111629b in DoCopy (pstate=pstate@entry=0x563622ad08c0,
stmt=stmt@entry=0x563622a9c0c8, stmt_location=0, stmt_len=59,
processed=processed@entry=0x7fff92a85aa0)
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/commands/copy.c:992
#8 0x00005636212dfe95 in standard_ProcessUtility (pstmt=0x563622a9c198,
queryString=0x563622a9b460 "COPY test_table (col1, col2, col3) FROM STDIN
DELIMITER ',';", context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
queryEnv=0x0, dest=0x563622a9c518, completionTag=0x7fff92a86480 "") at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/tcop/utility.c:551
#9 0x00007f0f5b027c89 in Db::Psql::ProcessUtilityHook
(pstmt=0x563622a9c198, queryString=0x563622a9b460 "COPY test_table (col1,
col2, col3) FROM STDIN DELIMITER ',';", context=PROCESS_UTILITY_TOPLEVEL,
paramListInfo=0x0, queryEnvironment=0x0, destReceiver=0x563622a9c518,
completionTag=0x7fff92a86480 "") at
/home/luis/main-dev/db/psql/src/utility_hook.cpp:1024
#10 0x00005636212dc9c9 in PortalRunUtility (portal=0x563622b290a0,
pstmt=0x563622a9c198, isTopLevel=<optimized out>, setHoldSnapshot=<optimized
out>, dest=0x563622a9c518, completionTag=0x7fff92a86480 "")
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/tcop/pquery.c:1178
#11 0x00005636212dd4f8 in PortalRunMulti
(portal=portal@entry=0x563622b290a0, isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false,
dest=dest@entry=0x563622a9c518,
altdest=altdest@entry=0x563622a9c518,
completionTag=completionTag@entry=0x7fff92a86480 "") at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/tcop/pquery.c:1331
#12 0x00005636212de265 in PortalRun (portal=portal@entry=0x563622b290a0,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
run_once=run_once@entry=true, dest=dest@entry=0x563622a9c518,
altdest=altdest@entry=0x563622a9c518, completionTag=0x7fff92a86480 "")
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/tcop/pquery.c:799
#13 0x00005636212d9d21 in exec_simple_query (query_string=0x563622a9b460
"COPY test_table (col1, col2, col3) FROM STDIN DELIMITER ',';")
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/tcop/postgres.c:1145
#14 0x00005636212db24b in PostgresMain (argc=<optimized out>,
argv=argv@entry=0x563622add4f8, dbname=0x563622add3a8 "docker-user",
username=<optimized out>)
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/tcop/postgres.c:4182
#15 0x0000563620fec961 in BackendRun (port=0x563622ad17f0) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/postmaster/postmaster.c:4361
#16 BackendStartup (port=0x563622ad17f0) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/postmaster/postmaster.c:4033
#17 ServerLoop () at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/postmaster/postmaster.c:1706
#18 0x0000563621266054 in PostmasterMain (argc=5, argv=<optimized out>) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/postmaster/postmaster.c:1379
#19 0x0000563620fedd05 in main (argc=5, argv=0x563622a95f80) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/main/main.c:228
(gdb) f 6
#6 0x0000563621115e89 in CopyFrom (cstate=cstate@entry=0x563622ba2040) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/commands/copy.c:2890
If someone gives me a hint on the expected behavior here, I would gladly
submit a patch myself.
Cheers
Luis M. Carril
Hi,
On 2018/12/14 19:42, PG Bug reporting form wrote:
The following bug has been logged on the website:
Bug reference: 15552
Logged by: Luis M Carril
Email address: luis.carril@swarm64.com
PostgreSQL version: 11.1
Operating system: Ubuntu 16.04
Description:Hi,
Postgres throws a "could not open file" error when inside a transaction
we create a foreign table and copy data into it.Reproduction (code based on tests in postgres_fdw test suite):
-----------------------
CREATE EXTENSION postgres_fdw;CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER
postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
END;
$d$;CREATE USER MAPPING FOR public SERVER testserver1
OPTIONS (user 'value', password 'value');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;create table loct1 (a int check (a in (1)), b text);
begin;
create foreign table remp1 (a int check (a in (1)), b text) server loopback
options (table_name 'loct1');
copy remp1 from stdin delimiter ',';
1,f
\.
-----------------------Observed behavior:
ERROR: could not open file "base/16385/16460": No such file or
directory-----------------------
Thanks for the report. I can reproduce this both in 11 stable branch and
in HEAD.
For what I saw, the error is triggered when synchronizing the heap in
CopyFrom (backend/commands/copy.c:2890), see the following stacktrace (when
setting a breakpoint at errcode_for_file_access():#0 errcode_for_file_access () at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/utils/error/elog.c:600
#1 0x00005636212d33d2 in mdopen (reln=<optimized out>,
forknum=forknum@entry=MAIN_FORKNUM, behavior=behavior@entry=1)
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:606
#2 0x00005636212d3961 in mdopen (behavior=1, forknum=MAIN_FORKNUM,
reln=0x563622ba8a88) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:922
#3 mdnblocks (reln=0x563622ba8a88, forknum=MAIN_FORKNUM) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:875
#4 0x00005636212d39b9 in mdimmedsync (reln=0x563622ba8a88,
forknum=MAIN_FORKNUM) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:1033
#5 0x000056362103257c in heap_sync (rel=0x7f0e3e681aa8) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/access/heap/heapam.c:9408
#6 0x0000563621115e89 in CopyFrom (cstate=cstate@entry=0x563622ba2040) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/commands/copy.c:2890
[ ... ]
If someone gives me a hint on the expected behavior here, I would gladly
submit a patch myself.
heap_sync should only be called for relations that actually have files to
sync, which isn't true for foreign tables. So, a simple relkind check
before calling heap_sync() in CopyFrom would suffice I think. Although,
we might also need such a check higher up in CopyFrom where some
optimizations that are specific to "heap" relations are enabled. For
example, this piece of code:
if (cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId))
{
hi_options |= HEAP_INSERT_SKIP_FSM;
if (!XLogIsNeeded())
hi_options |= HEAP_INSERT_SKIP_WAL;
}
Note here that we check that the relation being copied into is not a
partitioned table, because a partitioned table doesn't have storage itself.
Thanks,
Amit
Hi,
heap_sync should only be called for relations that actually have files to
sync, which isn't true for foreign tables. So, a simple relkind check
before calling heap_sync() in CopyFrom would suffice I think. Although,
we might also need such a check higher up in CopyFrom where some
optimizations that are specific to "heap" relations are enabled. For
example, this piece of code:
thanks for the input, so it seems that is enough with adding the check as you suggested:
if (cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
cstate->rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId))
{
hi_options |= HEAP_INSERT_SKIP_FSM;
if (!XLogIsNeeded())
hi_options |= HEAP_INSERT_SKIP_WAL;
}
Should a regression test be added (at postgres_fdw.sql)? If yes, how should be implemented, as max_wal_senders and wal_level need to be changed (to 0 and minimal) to trigger the bug, which is only possible on server start.
Cheers,
Dr. Luis M. Carril Rodríguez
Senior Software Engineer
luis.carril@swarm64.com<mailto:lisa@swarm64.com>
Swarm64 AS
Parkveien 41 B | 0258 Oslo | Norway
Registered at Brønnøysundregistrene in Norway under Org.-Number 911 662 787
CEO/Geschäftsführer (Daglig Leder): Dr. Karsten Rönner; Chairman/Vorsitzender (Styrets Leder): Dr. Sverre Munck
Swarm64 AS Zweigstelle Hive
Ullsteinstr. 120 | 12109 Berlin | Germany
Registered at Amtsgericht Charlottenburg - HRB 154382 B
[cid:9642fa37-86ca-4a3b-af9b-3a297ed0bc4b]
________________________________
From: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Sent: Monday, December 17, 2018 3:18:14 AM
To: Luis Carril; pgsql-bugs@lists.postgresql.org; PG Bug reporting form
Subject: Re: BUG #15552: Unexpected error in COPY to a foreign table in a transaction
Hi,
On 2018/12/14 19:42, PG Bug reporting form wrote:
The following bug has been logged on the website:
Bug reference: 15552
Logged by: Luis M Carril
Email address: luis.carril@swarm64.com
PostgreSQL version: 11.1
Operating system: Ubuntu 16.04
Description:Hi,
Postgres throws a "could not open file" error when inside a transaction
we create a foreign table and copy data into it.Reproduction (code based on tests in postgres_fdw test suite):
-----------------------
CREATE EXTENSION postgres_fdw;CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER
postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
END;
$d$;CREATE USER MAPPING FOR public SERVER testserver1
OPTIONS (user 'value', password 'value');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;create table loct1 (a int check (a in (1)), b text);
begin;
create foreign table remp1 (a int check (a in (1)), b text) server loopback
options (table_name 'loct1');
copy remp1 from stdin delimiter ',';
1,f
\.
-----------------------Observed behavior:
ERROR: could not open file "base/16385/16460": No such file or
directory-----------------------
Thanks for the report. I can reproduce this both in 11 stable branch and
in HEAD.
For what I saw, the error is triggered when synchronizing the heap in
CopyFrom (backend/commands/copy.c:2890), see the following stacktrace (when
setting a breakpoint at errcode_for_file_access():#0 errcode_for_file_access () at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/utils/error/elog.c:600
#1 0x00005636212d33d2 in mdopen (reln=<optimized out>,
forknum=forknum@entry=MAIN_FORKNUM, behavior=behavior@entry=1)
at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:606
#2 0x00005636212d3961 in mdopen (behavior=1, forknum=MAIN_FORKNUM,
reln=0x563622ba8a88) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:922
#3 mdnblocks (reln=0x563622ba8a88, forknum=MAIN_FORKNUM) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:875
#4 0x00005636212d39b9 in mdimmedsync (reln=0x563622ba8a88,
forknum=MAIN_FORKNUM) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/storage/smgr/md.c:1033
#5 0x000056362103257c in heap_sync (rel=0x7f0e3e681aa8) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/access/heap/heapam.c:9408
#6 0x0000563621115e89 in CopyFrom (cstate=cstate@entry=0x563622ba2040) at
/build/postgresql-11-9gVVK7/postgresql-11-11.1/build/../src/backend/commands/copy.c:2890
[ ... ]
If someone gives me a hint on the expected behavior here, I would gladly
submit a patch myself.
heap_sync should only be called for relations that actually have files to
sync, which isn't true for foreign tables. So, a simple relkind check
before calling heap_sync() in CopyFrom would suffice I think. Although,
we might also need such a check higher up in CopyFrom where some
optimizations that are specific to "heap" relations are enabled. For
example, this piece of code:
if (cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId))
{
hi_options |= HEAP_INSERT_SKIP_FSM;
if (!XLogIsNeeded())
hi_options |= HEAP_INSERT_SKIP_WAL;
}
Note here that we check that the relation being copied into is not a
partitioned table, because a partitioned table doesn't have storage itself.
Thanks,
Amit
Hi,
On 2018/12/17 22:12, Luis Carril wrote:
Hi,
heap_sync should only be called for relations that actually have files to
sync, which isn't true for foreign tables. So, a simple relkind check
before calling heap_sync() in CopyFrom would suffice I think. Although,
we might also need such a check higher up in CopyFrom where some
optimizations that are specific to "heap" relations are enabled. For
example, this piece of code:thanks for the input, so it seems that is enough with adding the check as you suggested:
if (cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
cstate->rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId))
{
hi_options |= HEAP_INSERT_SKIP_FSM;
if (!XLogIsNeeded())
hi_options |= HEAP_INSERT_SKIP_WAL;
}
I think that would do the trick.
Should a regression test be added (at postgres_fdw.sql)? If yes, how should be implemented, as max_wal_senders and wal_level need to be changed (to 0 and minimal) to trigger the bug, which is only possible on server start.
I noticed that too. As you say, it is not possible to exercise a test we
might add for this with `make check`, because it runs with a fixed value
of wal_level (= replica). But it is possible with `make installcheck` on
a cluster that's running with wal_level = minimal. Maybe, something like
this:
+ -- Test that COPY FROM works correctly when the foreign table is created in
+ -- the same transaction
+ create table test_copy_same_txn_loc (f1 int, f2 text);
+ alter table test_copy_same_txn_loc set (autovacuum_enabled = 'false');
+ begin;
+ create foreign table test_copy_same_txn_rem (f1 int, f2 text) server
loopback options(table_name 'test_copy_same_txn_loc');
+ copy test_copy_same_txn_rem from stdin;
+ ERROR: could not open file "base/19888/20234": No such file or directory
+ commit;
+ drop table test_copy_same_txn_loc;
+ drop foreign table test_copy_same_txn_rem;
+ ERROR: foreign table "test_copy_same_txn_rem" does not exist
-- ===================================================================
-- test IMPORT FOREIGN SCHEMA
-- ===================================================================
From the above output, your patch will make "ERROR: could not open file
xxx" go away.
Thanks,
Amit
On Tue, Dec 18, 2018 at 12:24:54PM +0900, Amit Langote wrote:
I noticed that too. As you say, it is not possible to exercise a test we
might add for this with `make check`, because it runs with a fixed value
of wal_level (= replica). But it is possible with `make installcheck` on
a cluster that's running with wal_level = minimal. Maybe, something like
this:[...]
From the above output, your patch will make "ERROR: could not open file
xxx" go away.
Last time we discussed about adding a test case in this area, we came up
with a TAP test, so this could apply here as well:
/messages/by-id/f20181114.124736.206988673.horiguchi.kyotaro@lab.ntt.co.jp
What scares me a lot about complicating this code is that we are not
seeing the end of it with this so-said optimization in removing the
relfilenode like that... There are other fancy cases where the failure
can happen (please see patch 0001).
--
Michael
On 2018/12/18 14:04, Michael Paquier wrote:
On Tue, Dec 18, 2018 at 12:24:54PM +0900, Amit Langote wrote:
I noticed that too. As you say, it is not possible to exercise a test we
might add for this with `make check`, because it runs with a fixed value
of wal_level (= replica). But it is possible with `make installcheck` on
a cluster that's running with wal_level = minimal. Maybe, something like
this:[...]
From the above output, your patch will make "ERROR: could not open file
xxx" go away.Last time we discussed about adding a test case in this area, we came up
with a TAP test, so this could apply here as well:
/messages/by-id/f20181114.124736.206988673.horiguchi.kyotaro@lab.ntt.co.jpWhat scares me a lot about complicating this code is that we are not
seeing the end of it with this so-said optimization in removing the
relfilenode like that... There are other fancy cases where the failure
can happen (please see patch 0001).
The link you shared is broken, so I couldn't read that email to understand
the relation of relfilenode removal optimization to the bug here or how a
TAP test was deployed for it.
To clarify, the bug in this case is that the optimization to skip writing
WAL if a "heap" relation being copied into is created in same
sub-transaction is mistakenly applied to foreign tables.
Thanks,
Amit
On Tue, Dec 18, 2018 at 02:51:10PM +0900, Amit Langote wrote:
The link you shared is broken, so I couldn't read that email to understand
the relation of relfilenode removal optimization to the bug here or how a
TAP test was deployed for it.
Oops. And here you go:
/messages/by-id/20181114.124736.206988673.horiguchi.kyotaro@lab.ntt.co.jp
--
Michael
On 2018/12/18 15:02, Michael Paquier wrote:
On Tue, Dec 18, 2018 at 02:51:10PM +0900, Amit Langote wrote:
The link you shared is broken, so I couldn't read that email to understand
the relation of relfilenode removal optimization to the bug here or how a
TAP test was deployed for it.Oops. And here you go:
/messages/by-id/20181114.124736.206988673.horiguchi.kyotaro@lab.ntt.co.jp
Thanks. I looked at:
/messages/by-id/attachment/65997/v4-0001-TAP-test-for-copy-truncation-optimization.patch
and I now get it. Adding a TAP test to src/test/recovery might make
sense, but I'm not completely sure. In this case, we don't really want to
see if the WAL-skipping optimization works correctly but that it's
correctly *disabled* for foreign tables.
Thanks,
Amit
Hi Luis and Amit,
Thanks for the report, Luis! Thanks for the discussion, Amit!
(2018/12/18 12:24), Amit Langote wrote:
On 2018/12/17 22:12, Luis Carril wrote:
heap_sync should only be called for relations that actually have files to
sync, which isn't true for foreign tables. So, a simple relkind check
before calling heap_sync() in CopyFrom would suffice I think. Although,
we might also need such a check higher up in CopyFrom where some
optimizations that are specific to "heap" relations are enabled. For
example, this piece of code:thanks for the input, so it seems that is enough with adding the check as you suggested:
if (cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE&&
cstate->rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE&&
(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId))
{
hi_options |= HEAP_INSERT_SKIP_FSM;
if (!XLogIsNeeded())
hi_options |= HEAP_INSERT_SKIP_WAL;
}I think that would do the trick.
I think so too.
FDWs would not look at heap_insert options, so another option would be
to 1) leave that options as-is for foreign tables and 2) if the target
is a foreign table, just skip heap_sync at the bottom of CopyFrom, or
just return without doing anything in heap_sync.
Best regards,
Etsuro Fujita
Hi everyone,
thanks for the input and the discussion, attached you can find the patch along a TAP test as Michael suggested.
Would you like something else there?
Another question about the procedure: should I submit the patch directly to the commitfest or should it be published first in psql-hacking?
Cheers
Luis M Carril
Attachments:
no_heap_opt_fdw_copy.patchtext/x-patch; name=no_heap_opt_fdw_copy.patchDownload+80-0
Fujita-san,
On 2018/12/18 21:48, Etsuro Fujita wrote:
FDWs would not look at heap_insert options, so another option would be to
1) leave that options as-is for foreign tables and 2) if the target is a
foreign table, just skip heap_sync at the bottom of CopyFrom, or just
return without doing anything in heap_sync.
I'd considered 2, but 1 might be better because it both fixes the problem
at hand and doesn't lead to misleadingly setting heap_insert options.
About adding guards in heap_sync itself to make sure that it becomes a
no-op for non-heap relations, I think that would make sense too.
Although, I wonder why it doesn't return without doing anything already,
given that it has this:
heap_sync(Relation rel)
{
/* non-WAL-logged tables never need fsync */
if (!RelationNeedsWAL(rel))
return;
where,
#define RelationNeedsWAL(relation) \
((relation)->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT)
Maybe, we never paid enough attention to the semantics of repersistence
property of foreign tables, though I admit that that's not something we
should set out to fix on this thread.
Thanks,
Amit
On Wed, Dec 19, 2018 at 09:44:42AM +0900, Amit Langote wrote:
About adding guards in heap_sync itself to make sure that it becomes a
no-op for non-heap relations, I think that would make sense too.
Although, I wonder why it doesn't return without doing anything already,
given that it has this:heap_sync(Relation rel)
{
/* non-WAL-logged tables never need fsync */
if (!RelationNeedsWAL(rel))
return;
I think that you should be careful here as we want heap_sync to remain a
rather low-level routine. For example:
/messages/by-id/20180919214858.65bwponiuqb3rnn2@alap3.anarazel.de
--
Michael
Hi,
On 2018/12/18 22:41, Luis Carril wrote:
Hi everyone,
thanks for the input and the discussion, attached you can find the patch along a TAP test as Michael suggested.
Would you like something else there?
Thanks for updating the patch. Some comments:
diff --git a/contrib/postgres_fdw/t/001_copy_same_txn_rem.pl
b/contrib/postgres_fdw/t/001_copy_same_txn_rem.pl
Not sure that's a good name to give to the test file, although since I'm
not greatly enthusiastic about adding a TAP test for this, I don't have a
good suggestion. Maybe, choose a more generic name? Michael, any
suggestions?
+ CREATE SERVER testserver FOREIGN DATA WRAPPER postgres_fdw;
...
+ CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
These two lines are not really necessary.
@@ -2401,6 +2401,7 @@ CopyFrom(CopyState cstate)
^I */
^I/* createSubid is creation check, newRelfilenodeSubid is truncation
check */
^Iif (cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
+^I cstate->rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
^I^I(cstate->rel->rd_createSubid != InvalidSubTransactionId ||
^I^I cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId))
^I{
It seems you've used spaces to indent the newly added line. Please,
change that to a tab like neighboring lines.
Another question about the procedure: should I submit the patch directly to the commitfest or should it be published first in psql-hacking?
Generally, it makes sense to create a CF entry *after* you've sent an
email containing the patch, so that you can link to it immediately from
the entry. Since you've posted the patch here, you can create the CF
entry under the topic Bug Fixes. You'll need a PostgreSQL community
account to do that.
Thanks,
Amit
On 2018/12/19 10:19, Michael Paquier wrote:
On Wed, Dec 19, 2018 at 09:44:42AM +0900, Amit Langote wrote:
About adding guards in heap_sync itself to make sure that it becomes a
no-op for non-heap relations, I think that would make sense too.
Although, I wonder why it doesn't return without doing anything already,
given that it has this:heap_sync(Relation rel)
{
/* non-WAL-logged tables never need fsync */
if (!RelationNeedsWAL(rel))
return;I think that you should be careful here as we want heap_sync to remain a
rather low-level routine. For example:
/messages/by-id/20180919214858.65bwponiuqb3rnn2@alap3.anarazel.de
I agree. Though, Andres also said this:
=====
All the other callers of heap_sync don't care about partitioned
tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE.
Or rather, it should assert the expected relkinds?
======
which is what I was thinking. Instead of specifically preventing
partitioned tables, or foreign tables, or views, we could assert that only
relations having heap files are passed.
Thanks,
Amit
On 2018/12/19 10:24, Amit Langote wrote:
which is what I was thinking. Instead of specifically preventing
partitioned tables, or foreign tables, or views, we could assert that only
relations having heap files are passed.
Sorry, that's not what I'd said in my last email. I'd said we should add
guards so that it becomes a no-op for unsupported relkinds, which is not
the same thing as the above, so maybe we shouldn't do that. We should fix
the callers so that heap_sync is called only for heap relations. So, the
patch posted by Luis is on a good path.
Thanks,
Amit
On Wed, Dec 19, 2018 at 10:19:58AM +0900, Amit Langote wrote:
Not sure that's a good name to give to the test file, although since I'm
not greatly enthusiastic about adding a TAP test for this, I don't have a
good suggestion. Maybe, choose a more generic name? Michael, any
suggestions?
I have looked at the proposal seriously myself, and adding a TAP test
for that edge case is overdoing it. So why not just adding a test to
postgres_fdw.sql? The failure can be triggered with installcheck and
wal_level =3D minimal. Not everybody tests with non-default options, but
some of us do.
@@ -2401,6 +2401,7 @@ CopyFrom(CopyState cstate) ^I */ ^I/* createSubid is creation check, newRelfilenodeSubid is truncation check */ ^Iif (cstate->rel->rd_rel->relkind !=3D RELKIND_PARTITIONED_TABLE && +^I cstate->rel->rd_rel->relkind !=3D RELKIND_FOREIGN_TABLE && ^I^I(cstate->rel->rd_createSubid !=3D InvalidSubTransactionId || ^I^I cstate->rel->rd_newRelfilenodeSubid !=3D InvalidSubTransactionId)) ^I{ =20 It seems you've used spaces to indent the newly added line. Please, change that to a tab like neighboring lines.
Indeed.
Attached is a more simple, updated, patch which adds a new test and a
comment. I would rather not play with the semantics of heap_sync() on
the back branches as well as on a thread dealing with a bug about COPY
with foreign tables. Such discussions deserve a larger audience on
-hackers.
it is a bit funny to see COPY FREEZE working for foreign tables
actually, but perhaps this has some use-cases for some FDWs, so I'd
rather not touch it.
--
Michael
Attachments:
no_heap_opt_fdw_copy-v4.patchtext/x-diff; charset=us-asciiDownload+34-0
On 2018/12/19 11:38, Michael Paquier wrote:
On Wed, Dec 19, 2018 at 10:19:58AM +0900, Amit Langote wrote:
Not sure that's a good name to give to the test file, although since I'm
not greatly enthusiastic about adding a TAP test for this, I don't have a
good suggestion. Maybe, choose a more generic name? Michael, any
suggestions?I have looked at the proposal seriously myself, and adding a TAP test
for that edge case is overdoing it. So why not just adding a test to
postgres_fdw.sql? The failure can be triggered with installcheck and
wal_level =3D minimal. Not everybody tests with non-default options, but
some of us do.
Yeah, I said that upthread.
Attached is a more simple, updated, patch which adds a new test and a
comment. I would rather not play with the semantics of heap_sync() on
the back branches as well as on a thread dealing with a bug about COPY
with foreign tables. Such discussions deserve a larger audience on
-hackers.
+1
Patch looks good, by the way.
it is a bit funny to see COPY FREEZE working for foreign tables
actually, but perhaps this has some use-cases for some FDWs, so I'd
rather not touch it.
Hmm, note that we don't pass CopyState->freeze option to the FDW drivers
today, though we may in the future when we add an *actual* COPY interface
to FDWs. Wouldn't it be a good idea to prevent specifying the FREEZE
option for foreign tables as COPY targets until then? Of course, that's
something for -hackers to consider.
Thanks,
Amit
On Wed, Dec 19, 2018 at 11:52:09AM +0900, Amit Langote wrote:
Hmm, note that we don't pass CopyState->freeze option to the FDW drivers
today, though we may in the future when we add an *actual* COPY interface
to FDWs. Wouldn't it be a good idea to prevent specifying the FREEZE
option for foreign tables as COPY targets until then? Of course, that's
something for -hackers to consider.
No idea if that will happen. One argument for doing nothing yet is that
this command does not fail now so changing it may cause compatibility
problems.
--
Michael
On 2018/12/19 12:51, Michael Paquier wrote:
On Wed, Dec 19, 2018 at 11:52:09AM +0900, Amit Langote wrote:
Hmm, note that we don't pass CopyState->freeze option to the FDW drivers
today, though we may in the future when we add an *actual* COPY interface
to FDWs. Wouldn't it be a good idea to prevent specifying the FREEZE
option for foreign tables as COPY targets until then? Of course, that's
something for -hackers to consider.No idea if that will happen.
It may happen sooner that you may think. :)
Currently, COPY into foreign tables uses APIs used by INSERT, so one row
at a time interface. It'd certainly be better to allow FDWs to be able to
do bulk transfer. I think Fujita-san has said he wants to work on that:
/messages/by-id/5AB4F190.1000804@lab.ntt.co.jp
One argument for doing nothing yet is that
this command does not fail now so changing it may cause compatibility
problems.
Just FYI, COPYing into foreign tables is only supported as of PG 11.
Thanks,
Amit
On Wed, Dec 19, 2018 at 01:01:01PM +0900, Amit Langote wrote:
It may happen sooner that you may think. :)
Currently, COPY into foreign tables uses APIs used by INSERT, so one row
at a time interface. It'd certainly be better to allow FDWs to be able to
do bulk transfer. I think Fujita-san has said he wants to work on that:
That's interesting.
One argument for doing nothing yet is that
this command does not fail now so changing it may cause compatibility
problems.Just FYI, COPYing into foreign tables is only supported as of PG 11.
Which has already been released.. Anyway.
--
Michael