Add pg_file_sync() to adminpack

Started by Fujii Masaoabout 6 years ago37 messages
#1Fujii Masao
masao.fujii@gmail.com
1 attachment(s)

Hi,

I'd like to propose to add pg_file_sync() function into contrib/adminpack.
This function fsyncs the specified file or directory named by its argument.
IMO this is useful, for example, when you want to fsync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability. Thought?

Regards,

--
Fujii Masao

Attachments:

pg_file_sync_v1.patchapplication/octet-stream; name=pg_file_sync_v1.patchDownload
diff --git a/contrib/adminpack/Makefile b/contrib/adminpack/Makefile
index 9a464b11bc..630fea7726 100644
--- a/contrib/adminpack/Makefile
+++ b/contrib/adminpack/Makefile
@@ -7,7 +7,8 @@ OBJS = \
 PG_CPPFLAGS = -I$(libpq_srcdir)
 
 EXTENSION = adminpack
-DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql
+DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql\
+	adminpack--2.0--2.1.sql
 PGFILEDESC = "adminpack - support functions for pgAdmin"
 
 REGRESS = adminpack
diff --git a/contrib/adminpack/adminpack--2.0--2.1.sql b/contrib/adminpack/adminpack--2.0--2.1.sql
new file mode 100644
index 0000000000..b190d4f3ac
--- /dev/null
+++ b/contrib/adminpack/adminpack--2.0--2.1.sql
@@ -0,0 +1,17 @@
+/* contrib/adminpack/adminpack--2.0--2.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION adminpack UPDATE TO '2.1'" to load this file. \quit
+
+/* ***********************************************
+ * Administrative functions for PostgreSQL
+ * *********************************************** */
+
+/* generic file access functions */
+
+CREATE OR REPLACE FUNCTION pg_catalog.pg_file_sync(text)
+RETURNS boolean
+AS 'MODULE_PATHNAME', 'pg_file_sync'
+LANGUAGE C VOLATILE STRICT;
+
+REVOKE EXECUTE ON FUNCTION pg_catalog.pg_file_sync(text) FROM PUBLIC;
diff --git a/contrib/adminpack/adminpack.c b/contrib/adminpack/adminpack.c
index e88f0a7a16..0cea2fe164 100644
--- a/contrib/adminpack/adminpack.c
+++ b/contrib/adminpack/adminpack.c
@@ -43,6 +43,7 @@ PG_MODULE_MAGIC;
 
 PG_FUNCTION_INFO_V1(pg_file_write);
 PG_FUNCTION_INFO_V1(pg_file_write_v1_1);
+PG_FUNCTION_INFO_V1(pg_file_sync);
 PG_FUNCTION_INFO_V1(pg_file_rename);
 PG_FUNCTION_INFO_V1(pg_file_rename_v1_1);
 PG_FUNCTION_INFO_V1(pg_file_unlink);
@@ -215,6 +216,30 @@ pg_file_write_internal(text *file, text *data, bool replace)
 	return (count);
 }
 
+/* ------------------------------------
+ * pg_file_sync
+ *
+ * We REVOKE EXECUTE on the function from PUBLIC.
+ * Users can then grant access to it based on their policies.
+ */
+Datum
+pg_file_sync(PG_FUNCTION_ARGS)
+{
+	char	   *filename;
+	struct stat	fst;
+
+	filename = convert_and_check_filename(PG_GETARG_TEXT_PP(0), false);
+
+	if (stat(filename, &fst) < 0)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not stat file \"%s\": %m", filename)));
+
+	fsync_fname(filename, S_ISDIR(fst.st_mode));
+
+	PG_RETURN_BOOL(true);
+}
+
 /* ------------------------------------
  * pg_file_rename - old version
  *
diff --git a/contrib/adminpack/adminpack.control b/contrib/adminpack/adminpack.control
index 12569dcdd7..ae35d22156 100644
--- a/contrib/adminpack/adminpack.control
+++ b/contrib/adminpack/adminpack.control
@@ -1,6 +1,6 @@
 # adminpack extension
 comment = 'administrative functions for PostgreSQL'
-default_version = '2.0'
+default_version = '2.1'
 module_pathname = '$libdir/adminpack'
 relocatable = false
 schema = pg_catalog
diff --git a/contrib/adminpack/expected/adminpack.out b/contrib/adminpack/expected/adminpack.out
index 8747ac69a2..a72e29d657 100644
--- a/contrib/adminpack/expected/adminpack.out
+++ b/contrib/adminpack/expected/adminpack.out
@@ -56,6 +56,21 @@ RESET ROLE;
 REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+ pg_file_sync 
+--------------
+ t
+(1 row)
+
+SELECT pg_file_sync('global'); -- sync directory
+ pg_file_sync 
+--------------
+ t
+(1 row)
+
+SELECT pg_file_sync('test_file2'); -- not there
+ERROR:  could not stat file "test_file2": No such file or directory
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
  pg_file_rename 
@@ -142,6 +157,8 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 SELECT pg_file_write('test_file0', 'test0', false);
 ERROR:  permission denied for function pg_file_write
+SELECT pg_file_sync('test_file0');
+ERROR:  permission denied for function pg_file_sync
 SELECT pg_file_rename('test_file0', 'test_file0');
 ERROR:  permission denied for function pg_file_rename
 CONTEXT:  SQL function "pg_file_rename" statement 1
diff --git a/contrib/adminpack/sql/adminpack.sql b/contrib/adminpack/sql/adminpack.sql
index 1525f0a82b..6543fca8c1 100644
--- a/contrib/adminpack/sql/adminpack.sql
+++ b/contrib/adminpack/sql/adminpack.sql
@@ -29,6 +29,11 @@ REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
 
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+SELECT pg_file_sync('global'); -- sync directory
+SELECT pg_file_sync('test_file2'); -- not there
+
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
 SELECT pg_read_file('test_file1');  -- not there
@@ -58,6 +63,7 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 
 SELECT pg_file_write('test_file0', 'test0', false);
+SELECT pg_file_sync('test_file0');
 SELECT pg_file_rename('test_file0', 'test_file0');
 SELECT pg_file_unlink('test_file0');
 SELECT pg_logdir_ls();
diff --git a/doc/src/sgml/adminpack.sgml b/doc/src/sgml/adminpack.sgml
index 2655417366..f787a6d514 100644
--- a/doc/src/sgml/adminpack.sgml
+++ b/doc/src/sgml/adminpack.sgml
@@ -43,6 +43,13 @@
       Write, or append to, a text file
      </entry>
     </row>
+    <row>
+     <entry><function>pg_catalog.pg_file_sync(filename text)</function></entry>
+     <entry><type>boolean</type></entry>
+     <entry>
+      Sync a file or directory
+     </entry>
+    </row>
     <row>
      <entry><function>pg_catalog.pg_file_rename(oldname text, newname text <optional>, archivename text</optional>)</function></entry>
      <entry><type>boolean</type></entry>
@@ -79,6 +86,15 @@
   Returns the number of bytes written.
  </para>
 
+ <indexterm>
+  <primary>pg_file_sync</primary>
+ </indexterm>
+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  Returns true on success,
+  an error is thrown otherwise (e.g., the specified file is not present).
+ </para>
+
  <indexterm>
   <primary>pg_file_rename</primary>
  </indexterm>
#2Julien Rouhaud
rjuju123@gmail.com
In reply to: Fujii Masao (#1)
Re: Add pg_file_sync() to adminpack

On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao <masao.fujii@gmail.com> wrote:

Hi,

I'd like to propose to add pg_file_sync() function into contrib/adminpack.
This function fsyncs the specified file or directory named by its argument.
IMO this is useful, for example, when you want to fsync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability. Thought?

+1, that seems like a useful wrapper. Looking at existing functions,
I see that there's a pg_file_rename() in adminpack, but it doesn't use
durable_rename nor does it try to perform any fsync. Same for
pg_file_unlink vs. durable_unlink. It's probably worth fixing that at
the same time?

#3Arthur Zakirov
zaartur@gmail.com
In reply to: Julien Rouhaud (#2)
Re: Add pg_file_sync() to adminpack

Hello,

On 2019/12/25 23:12, Julien Rouhaud wrote:

On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao <masao.fujii@gmail.com> wrote:

Hi,

I'd like to propose to add pg_file_sync() function into contrib/adminpack.
This function fsyncs the specified file or directory named by its argument.
IMO this is useful, for example, when you want to fsync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability. Thought?

+1 too. I have a thought, but maybe it is just a nitpicking.

pg_file_sync() calls fsync_fname() function from fd.c. And I think it
might bring problems because fsync_fname() uses data_sync_elevel() to
get elevel. As a result if data_sync_retry GUC is false fsync_fname()
might raise PANIC message.

It isn't case if a file doesn't exist. But if there are no permissions
on the file:

PANIC: could not open file "testfile": Permissions denied
server closed the connection unexpectedly

It could be fixed by implementing a function like
pg_file_sync_internal() or by making the function fsync_fname_ext()
external.

+1, that seems like a useful wrapper. Looking at existing functions,
I see that there's a pg_file_rename() in adminpack, but it doesn't use
durable_rename nor does it try to perform any fsync. Same for
pg_file_unlink vs. durable_unlink. It's probably worth fixing that at
the same time?

I think it might be a different patch.

--
Arthur

#4Michael Paquier
michael@paquier.xyz
In reply to: Arthur Zakirov (#3)
Re: Add pg_file_sync() to adminpack

On Mon, Jan 06, 2020 at 03:20:13PM +0900, Arthur Zakirov wrote:

It isn't case if a file doesn't exist. But if there are no permissions on
the file:

PANIC: could not open file "testfile": Permissions denied
server closed the connection unexpectedly

It could be fixed by implementing a function like pg_file_sync_internal() or
by making the function fsync_fname_ext() external.

The patch uses stat() to make sure that the file exists and has no
issues. Though it could be a problem with any kind of TOCTOU-like
issues (looking at you, Windows, for ENOPERM), so I agree that it
would make more sense to use pg_fsync() here with a fd opened first.
--
Michael

#5Fujii Masao
masao.fujii@gmail.com
In reply to: Michael Paquier (#4)
1 attachment(s)
Re: Add pg_file_sync() to adminpack

On Mon, Jan 6, 2020 at 3:42 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jan 06, 2020 at 03:20:13PM +0900, Arthur Zakirov wrote:

It isn't case if a file doesn't exist. But if there are no permissions on
the file:

PANIC: could not open file "testfile": Permissions denied
server closed the connection unexpectedly

It could be fixed by implementing a function like pg_file_sync_internal() or
by making the function fsync_fname_ext() external.

The patch uses stat() to make sure that the file exists and has no
issues. Though it could be a problem with any kind of TOCTOU-like
issues (looking at you, Windows, for ENOPERM), so I agree that it
would make more sense to use pg_fsync() here with a fd opened first.

I agree that it's not good for pg_file_sync() to cause a PANIC.
I updated the patch so that pg_file_sync() uses fsync_fname_ext()
instead of fsync_fname() as Arthur suggested.

It's one of ideas to make pg_file_sync() open the file and directly call
pg_fsync(). But fsync_fname_ext() has already such code and
I'd like to avoid the code duplication.

Attached is the updated version of the patch.

Regards,

--
Fujii Masao

Attachments:

pg_file_sync_v2.patchapplication/octet-stream; name=pg_file_sync_v2.patchDownload
diff --git a/contrib/adminpack/Makefile b/contrib/adminpack/Makefile
index 9a464b11bc..630fea7726 100644
--- a/contrib/adminpack/Makefile
+++ b/contrib/adminpack/Makefile
@@ -7,7 +7,8 @@ OBJS = \
 PG_CPPFLAGS = -I$(libpq_srcdir)
 
 EXTENSION = adminpack
-DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql
+DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql\
+	adminpack--2.0--2.1.sql
 PGFILEDESC = "adminpack - support functions for pgAdmin"
 
 REGRESS = adminpack
diff --git a/contrib/adminpack/adminpack--2.0--2.1.sql b/contrib/adminpack/adminpack--2.0--2.1.sql
new file mode 100644
index 0000000000..b190d4f3ac
--- /dev/null
+++ b/contrib/adminpack/adminpack--2.0--2.1.sql
@@ -0,0 +1,17 @@
+/* contrib/adminpack/adminpack--2.0--2.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION adminpack UPDATE TO '2.1'" to load this file. \quit
+
+/* ***********************************************
+ * Administrative functions for PostgreSQL
+ * *********************************************** */
+
+/* generic file access functions */
+
+CREATE OR REPLACE FUNCTION pg_catalog.pg_file_sync(text)
+RETURNS boolean
+AS 'MODULE_PATHNAME', 'pg_file_sync'
+LANGUAGE C VOLATILE STRICT;
+
+REVOKE EXECUTE ON FUNCTION pg_catalog.pg_file_sync(text) FROM PUBLIC;
diff --git a/contrib/adminpack/adminpack.c b/contrib/adminpack/adminpack.c
index 710f4ea32d..ba22be2022 100644
--- a/contrib/adminpack/adminpack.c
+++ b/contrib/adminpack/adminpack.c
@@ -43,6 +43,7 @@ PG_MODULE_MAGIC;
 
 PG_FUNCTION_INFO_V1(pg_file_write);
 PG_FUNCTION_INFO_V1(pg_file_write_v1_1);
+PG_FUNCTION_INFO_V1(pg_file_sync);
 PG_FUNCTION_INFO_V1(pg_file_rename);
 PG_FUNCTION_INFO_V1(pg_file_rename_v1_1);
 PG_FUNCTION_INFO_V1(pg_file_unlink);
@@ -215,6 +216,30 @@ pg_file_write_internal(text *file, text *data, bool replace)
 	return (count);
 }
 
+/* ------------------------------------
+ * pg_file_sync
+ *
+ * We REVOKE EXECUTE on the function from PUBLIC.
+ * Users can then grant access to it based on their policies.
+ */
+Datum
+pg_file_sync(PG_FUNCTION_ARGS)
+{
+	char	   *filename;
+	struct stat	fst;
+
+	filename = convert_and_check_filename(PG_GETARG_TEXT_PP(0), false);
+
+	if (stat(filename, &fst) < 0)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not stat file \"%s\": %m", filename)));
+
+	fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
+
+	PG_RETURN_BOOL(true);
+}
+
 /* ------------------------------------
  * pg_file_rename - old version
  *
diff --git a/contrib/adminpack/adminpack.control b/contrib/adminpack/adminpack.control
index 12569dcdd7..ae35d22156 100644
--- a/contrib/adminpack/adminpack.control
+++ b/contrib/adminpack/adminpack.control
@@ -1,6 +1,6 @@
 # adminpack extension
 comment = 'administrative functions for PostgreSQL'
-default_version = '2.0'
+default_version = '2.1'
 module_pathname = '$libdir/adminpack'
 relocatable = false
 schema = pg_catalog
diff --git a/contrib/adminpack/expected/adminpack.out b/contrib/adminpack/expected/adminpack.out
index 8747ac69a2..a72e29d657 100644
--- a/contrib/adminpack/expected/adminpack.out
+++ b/contrib/adminpack/expected/adminpack.out
@@ -56,6 +56,21 @@ RESET ROLE;
 REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+ pg_file_sync 
+--------------
+ t
+(1 row)
+
+SELECT pg_file_sync('global'); -- sync directory
+ pg_file_sync 
+--------------
+ t
+(1 row)
+
+SELECT pg_file_sync('test_file2'); -- not there
+ERROR:  could not stat file "test_file2": No such file or directory
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
  pg_file_rename 
@@ -142,6 +157,8 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 SELECT pg_file_write('test_file0', 'test0', false);
 ERROR:  permission denied for function pg_file_write
+SELECT pg_file_sync('test_file0');
+ERROR:  permission denied for function pg_file_sync
 SELECT pg_file_rename('test_file0', 'test_file0');
 ERROR:  permission denied for function pg_file_rename
 CONTEXT:  SQL function "pg_file_rename" statement 1
diff --git a/contrib/adminpack/sql/adminpack.sql b/contrib/adminpack/sql/adminpack.sql
index 1525f0a82b..6543fca8c1 100644
--- a/contrib/adminpack/sql/adminpack.sql
+++ b/contrib/adminpack/sql/adminpack.sql
@@ -29,6 +29,11 @@ REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
 
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+SELECT pg_file_sync('global'); -- sync directory
+SELECT pg_file_sync('test_file2'); -- not there
+
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
 SELECT pg_read_file('test_file1');  -- not there
@@ -58,6 +63,7 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 
 SELECT pg_file_write('test_file0', 'test0', false);
+SELECT pg_file_sync('test_file0');
 SELECT pg_file_rename('test_file0', 'test_file0');
 SELECT pg_file_unlink('test_file0');
 SELECT pg_logdir_ls();
diff --git a/doc/src/sgml/adminpack.sgml b/doc/src/sgml/adminpack.sgml
index 2655417366..f787a6d514 100644
--- a/doc/src/sgml/adminpack.sgml
+++ b/doc/src/sgml/adminpack.sgml
@@ -43,6 +43,13 @@
       Write, or append to, a text file
      </entry>
     </row>
+    <row>
+     <entry><function>pg_catalog.pg_file_sync(filename text)</function></entry>
+     <entry><type>boolean</type></entry>
+     <entry>
+      Sync a file or directory
+     </entry>
+    </row>
     <row>
      <entry><function>pg_catalog.pg_file_rename(oldname text, newname text <optional>, archivename text</optional>)</function></entry>
      <entry><type>boolean</type></entry>
@@ -79,6 +86,15 @@
   Returns the number of bytes written.
  </para>
 
+ <indexterm>
+  <primary>pg_file_sync</primary>
+ </indexterm>
+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  Returns true on success,
+  an error is thrown otherwise (e.g., the specified file is not present).
+ </para>
+
  <indexterm>
   <primary>pg_file_rename</primary>
  </indexterm>
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index fa79b45f63..b5f4df6a48 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -319,7 +319,6 @@ static void pre_sync_fname(const char *fname, bool isdir, int elevel);
 static void datadir_fsync_fname(const char *fname, bool isdir, int elevel);
 static void unlink_if_exists_fname(const char *fname, bool isdir, int elevel);
 
-static int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 static int	fsync_parent_path(const char *fname, int elevel);
 
 
@@ -3376,7 +3375,7 @@ unlink_if_exists_fname(const char *fname, bool isdir, int elevel)
  *
  * Returns 0 if the operation succeeded, -1 otherwise.
  */
-static int
+int
 fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel)
 {
 	int			fd;
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index c6ce7eacf2..51e2ece3c9 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -145,6 +145,7 @@ extern int	pg_fsync_writethrough(int fd);
 extern int	pg_fdatasync(int fd);
 extern void pg_flush_data(int fd, off_t offset, off_t amount);
 extern void fsync_fname(const char *fname, bool isdir);
+extern int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 extern int	durable_rename(const char *oldfile, const char *newfile, int loglevel);
 extern int	durable_unlink(const char *fname, int loglevel);
 extern int	durable_link_or_rename(const char *oldfile, const char *newfile, int loglevel);
#6Fujii Masao
masao.fujii@gmail.com
In reply to: Julien Rouhaud (#2)
Re: Add pg_file_sync() to adminpack

On Wed, Dec 25, 2019 at 11:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao <masao.fujii@gmail.com> wrote:

Hi,

I'd like to propose to add pg_file_sync() function into contrib/adminpack.
This function fsyncs the specified file or directory named by its argument.
IMO this is useful, for example, when you want to fsync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability. Thought?

+1, that seems like a useful wrapper. Looking at existing functions,
I see that there's a pg_file_rename() in adminpack, but it doesn't use
durable_rename nor does it try to perform any fsync. Same for
pg_file_unlink vs. durable_unlink. It's probably worth fixing that at
the same time?

I don't think that's a bug. I'm not sure if every users of those functions
need durable rename and unlink at the expense of performance.
So IMO it's better to add new argument like "durable" to those functions
and durable_rename or _unlink is used only if it's true.

Regards,

--
Fujii Masao

#7Julien Rouhaud
rjuju123@gmail.com
In reply to: Fujii Masao (#6)
Re: Add pg_file_sync() to adminpack

On Thu, Jan 9, 2020 at 7:43 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Dec 25, 2019 at 11:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao <masao.fujii@gmail.com> wrote:

Hi,

I'd like to propose to add pg_file_sync() function into contrib/adminpack.
This function fsyncs the specified file or directory named by its argument.
IMO this is useful, for example, when you want to fsync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability. Thought?

+1, that seems like a useful wrapper. Looking at existing functions,
I see that there's a pg_file_rename() in adminpack, but it doesn't use
durable_rename nor does it try to perform any fsync. Same for
pg_file_unlink vs. durable_unlink. It's probably worth fixing that at
the same time?

I don't think that's a bug. I'm not sure if every users of those functions
need durable rename and unlink at the expense of performance.
So IMO it's better to add new argument like "durable" to those functions
and durable_rename or _unlink is used only if it's true.

It's probably a POLA violation. I'm pretty sure that most people
using those functions would expect that a successful call to
pg_file_unlink() mean that the file cannot raise from the dead even
with certain unlikely circumstances, at least I'd expect so. If
performance is a problem here, I'd rather have a new wrapper with a
sync flag that defaults to true so it's possible to disable it if
needed instead of calling a different function. That being said, I
agree with Arthur, it should be handled in a different patch.

#8Julien Rouhaud
rjuju123@gmail.com
In reply to: Fujii Masao (#5)
Re: Add pg_file_sync() to adminpack

On Thu, Jan 9, 2020 at 7:31 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Mon, Jan 6, 2020 at 3:42 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jan 06, 2020 at 03:20:13PM +0900, Arthur Zakirov wrote:

It isn't case if a file doesn't exist. But if there are no permissions on
the file:

PANIC: could not open file "testfile": Permissions denied
server closed the connection unexpectedly

It could be fixed by implementing a function like pg_file_sync_internal() or
by making the function fsync_fname_ext() external.

The patch uses stat() to make sure that the file exists and has no
issues. Though it could be a problem with any kind of TOCTOU-like
issues (looking at you, Windows, for ENOPERM), so I agree that it
would make more sense to use pg_fsync() here with a fd opened first.

I agree that it's not good for pg_file_sync() to cause a PANIC.
I updated the patch so that pg_file_sync() uses fsync_fname_ext()
instead of fsync_fname() as Arthur suggested.

It's one of ideas to make pg_file_sync() open the file and directly call
pg_fsync(). But fsync_fname_ext() has already such code and
I'd like to avoid the code duplication.

This looks good to me.

Attached is the updated version of the patch.

+    <row>
+     <entry><function>pg_catalog.pg_file_sync(filename text)</function></entry>
+     <entry><type>boolean</type></entry>
+     <entry>
+      Sync a file or directory
+     </entry>
+    </row>

"Flush to disk" looks better than "sync" here.

+/* ------------------------------------
+ * pg_file_sync
+ *
+ * We REVOKE EXECUTE on the function from PUBLIC.
+ * Users can then grant access to it based on their policies.
+ */

I think that pg_write_server_files should be allowed to call that
function by default.

#9Stephen Frost
sfrost@snowman.net
In reply to: Julien Rouhaud (#7)
Re: Add pg_file_sync() to adminpack

Greetings,

* Julien Rouhaud (rjuju123@gmail.com) wrote:

On Thu, Jan 9, 2020 at 7:43 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Dec 25, 2019 at 11:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao <masao.fujii@gmail.com> wrote:

I'd like to propose to add pg_file_sync() function into contrib/adminpack.
This function fsyncs the specified file or directory named by its argument.
IMO this is useful, for example, when you want to fsync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability. Thought?

+1, that seems like a useful wrapper. Looking at existing functions,
I see that there's a pg_file_rename() in adminpack, but it doesn't use
durable_rename nor does it try to perform any fsync. Same for
pg_file_unlink vs. durable_unlink. It's probably worth fixing that at
the same time?

I don't think that's a bug. I'm not sure if every users of those functions
need durable rename and unlink at the expense of performance.
So IMO it's better to add new argument like "durable" to those functions
and durable_rename or _unlink is used only if it's true.

It's probably a POLA violation. I'm pretty sure that most people
using those functions would expect that a successful call to
pg_file_unlink() mean that the file cannot raise from the dead even
with certain unlikely circumstances, at least I'd expect so. If
performance is a problem here, I'd rather have a new wrapper with a
sync flag that defaults to true so it's possible to disable it if
needed instead of calling a different function. That being said, I
agree with Arthur, it should be handled in a different patch.

Why would you expect that when it isn't the case for the filesystem
itself..? I agree with Fujii on this- you should have to explicitly ask
for us to do more than the equivilant filesystem-level operation. We
shouldn't be forcing that on you.

Thanks,

Stephen

#10Julien Rouhaud
rjuju123@gmail.com
In reply to: Stephen Frost (#9)
Re: Add pg_file_sync() to adminpack

On Thu, Jan 9, 2020 at 6:16 PM Stephen Frost <sfrost@snowman.net> wrote:

Greetings,

* Julien Rouhaud (rjuju123@gmail.com) wrote:

On Thu, Jan 9, 2020 at 7:43 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Dec 25, 2019 at 11:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao <masao.fujii@gmail.com> wrote:

I'd like to propose to add pg_file_sync() function into contrib/adminpack.
This function fsyncs the specified file or directory named by its argument.
IMO this is useful, for example, when you want to fsync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability. Thought?

+1, that seems like a useful wrapper. Looking at existing functions,
I see that there's a pg_file_rename() in adminpack, but it doesn't use
durable_rename nor does it try to perform any fsync. Same for
pg_file_unlink vs. durable_unlink. It's probably worth fixing that at
the same time?

I don't think that's a bug. I'm not sure if every users of those functions
need durable rename and unlink at the expense of performance.
So IMO it's better to add new argument like "durable" to those functions
and durable_rename or _unlink is used only if it's true.

It's probably a POLA violation. I'm pretty sure that most people
using those functions would expect that a successful call to
pg_file_unlink() mean that the file cannot raise from the dead even
with certain unlikely circumstances, at least I'd expect so. If
performance is a problem here, I'd rather have a new wrapper with a
sync flag that defaults to true so it's possible to disable it if
needed instead of calling a different function. That being said, I
agree with Arthur, it should be handled in a different patch.

Why would you expect that when it isn't the case for the filesystem
itself..?

Just a usual habit with durable property.

I agree with Fujii on this- you should have to explicitly ask
for us to do more than the equivilant filesystem-level operation. We
shouldn't be forcing that on you.

I just checked other somehow related cases and saw that for instance
COPY TO doesn't flush data either, so it's indeed the expected
behavior.

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#10)
Re: Add pg_file_sync() to adminpack

Julien Rouhaud <rjuju123@gmail.com> writes:

On Thu, Jan 9, 2020 at 6:16 PM Stephen Frost <sfrost@snowman.net> wrote:

Why would you expect that when it isn't the case for the filesystem
itself..?

Just a usual habit with durable property.

I tend to agree with Stephen on this, mainly because the point of
these adminpack functions is to expose filesystem access. If these
functions were more "database-y" and less "filesystem-y", I'd agree
with trying to impose database-like consistency requirements.

We don't have to expose every wart of the filesystem semantics
--- for example, it would be reasonable to make pg_file_sync()
Do The Right Thing when applied to a directory, even if the
particular platform we're on doesn't behave sanely for that.
But having fsync separated from write is a pretty fundamental part
of most filesystems' semantics, so we ought not try to hide that.

regards, tom lane

#12Fujii Masao
masao.fujii@gmail.com
In reply to: Julien Rouhaud (#8)
1 attachment(s)
Re: Add pg_file_sync() to adminpack

On Thu, Jan 9, 2020 at 10:39 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Thu, Jan 9, 2020 at 7:31 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Mon, Jan 6, 2020 at 3:42 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jan 06, 2020 at 03:20:13PM +0900, Arthur Zakirov wrote:

It isn't case if a file doesn't exist. But if there are no permissions on
the file:

PANIC: could not open file "testfile": Permissions denied
server closed the connection unexpectedly

It could be fixed by implementing a function like pg_file_sync_internal() or
by making the function fsync_fname_ext() external.

The patch uses stat() to make sure that the file exists and has no
issues. Though it could be a problem with any kind of TOCTOU-like
issues (looking at you, Windows, for ENOPERM), so I agree that it
would make more sense to use pg_fsync() here with a fd opened first.

I agree that it's not good for pg_file_sync() to cause a PANIC.
I updated the patch so that pg_file_sync() uses fsync_fname_ext()
instead of fsync_fname() as Arthur suggested.

It's one of ideas to make pg_file_sync() open the file and directly call
pg_fsync(). But fsync_fname_ext() has already such code and
I'd like to avoid the code duplication.

This looks good to me.

Attached is the updated version of the patch.

+    <row>
+     <entry><function>pg_catalog.pg_file_sync(filename text)</function></entry>
+     <entry><type>boolean</type></entry>
+     <entry>
+      Sync a file or directory
+     </entry>
+    </row>

"Flush to disk" looks better than "sync" here.

I changed the doc that way. Thanks for the review!

I think that pg_write_server_files should be allowed to call that
function by default.

But pg_write_server_files users are not allowed to execute
other functions like pg_file_write() by default. So doing that
change only for pg_file_sync() looks strange to me.

Regards,

--
Fujii Masao

Attachments:

pg_file_sync_v3.patchapplication/octet-stream; name=pg_file_sync_v3.patchDownload
diff --git a/contrib/adminpack/Makefile b/contrib/adminpack/Makefile
index 9a464b11bc..630fea7726 100644
--- a/contrib/adminpack/Makefile
+++ b/contrib/adminpack/Makefile
@@ -7,7 +7,8 @@ OBJS = \
 PG_CPPFLAGS = -I$(libpq_srcdir)
 
 EXTENSION = adminpack
-DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql
+DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql\
+	adminpack--2.0--2.1.sql
 PGFILEDESC = "adminpack - support functions for pgAdmin"
 
 REGRESS = adminpack
diff --git a/contrib/adminpack/adminpack--2.0--2.1.sql b/contrib/adminpack/adminpack--2.0--2.1.sql
new file mode 100644
index 0000000000..b190d4f3ac
--- /dev/null
+++ b/contrib/adminpack/adminpack--2.0--2.1.sql
@@ -0,0 +1,17 @@
+/* contrib/adminpack/adminpack--2.0--2.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION adminpack UPDATE TO '2.1'" to load this file. \quit
+
+/* ***********************************************
+ * Administrative functions for PostgreSQL
+ * *********************************************** */
+
+/* generic file access functions */
+
+CREATE OR REPLACE FUNCTION pg_catalog.pg_file_sync(text)
+RETURNS boolean
+AS 'MODULE_PATHNAME', 'pg_file_sync'
+LANGUAGE C VOLATILE STRICT;
+
+REVOKE EXECUTE ON FUNCTION pg_catalog.pg_file_sync(text) FROM PUBLIC;
diff --git a/contrib/adminpack/adminpack.c b/contrib/adminpack/adminpack.c
index 710f4ea32d..ba22be2022 100644
--- a/contrib/adminpack/adminpack.c
+++ b/contrib/adminpack/adminpack.c
@@ -43,6 +43,7 @@ PG_MODULE_MAGIC;
 
 PG_FUNCTION_INFO_V1(pg_file_write);
 PG_FUNCTION_INFO_V1(pg_file_write_v1_1);
+PG_FUNCTION_INFO_V1(pg_file_sync);
 PG_FUNCTION_INFO_V1(pg_file_rename);
 PG_FUNCTION_INFO_V1(pg_file_rename_v1_1);
 PG_FUNCTION_INFO_V1(pg_file_unlink);
@@ -215,6 +216,30 @@ pg_file_write_internal(text *file, text *data, bool replace)
 	return (count);
 }
 
+/* ------------------------------------
+ * pg_file_sync
+ *
+ * We REVOKE EXECUTE on the function from PUBLIC.
+ * Users can then grant access to it based on their policies.
+ */
+Datum
+pg_file_sync(PG_FUNCTION_ARGS)
+{
+	char	   *filename;
+	struct stat	fst;
+
+	filename = convert_and_check_filename(PG_GETARG_TEXT_PP(0), false);
+
+	if (stat(filename, &fst) < 0)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not stat file \"%s\": %m", filename)));
+
+	fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
+
+	PG_RETURN_BOOL(true);
+}
+
 /* ------------------------------------
  * pg_file_rename - old version
  *
diff --git a/contrib/adminpack/adminpack.control b/contrib/adminpack/adminpack.control
index 12569dcdd7..ae35d22156 100644
--- a/contrib/adminpack/adminpack.control
+++ b/contrib/adminpack/adminpack.control
@@ -1,6 +1,6 @@
 # adminpack extension
 comment = 'administrative functions for PostgreSQL'
-default_version = '2.0'
+default_version = '2.1'
 module_pathname = '$libdir/adminpack'
 relocatable = false
 schema = pg_catalog
diff --git a/contrib/adminpack/expected/adminpack.out b/contrib/adminpack/expected/adminpack.out
index 8747ac69a2..a72e29d657 100644
--- a/contrib/adminpack/expected/adminpack.out
+++ b/contrib/adminpack/expected/adminpack.out
@@ -56,6 +56,21 @@ RESET ROLE;
 REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+ pg_file_sync 
+--------------
+ t
+(1 row)
+
+SELECT pg_file_sync('global'); -- sync directory
+ pg_file_sync 
+--------------
+ t
+(1 row)
+
+SELECT pg_file_sync('test_file2'); -- not there
+ERROR:  could not stat file "test_file2": No such file or directory
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
  pg_file_rename 
@@ -142,6 +157,8 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 SELECT pg_file_write('test_file0', 'test0', false);
 ERROR:  permission denied for function pg_file_write
+SELECT pg_file_sync('test_file0');
+ERROR:  permission denied for function pg_file_sync
 SELECT pg_file_rename('test_file0', 'test_file0');
 ERROR:  permission denied for function pg_file_rename
 CONTEXT:  SQL function "pg_file_rename" statement 1
diff --git a/contrib/adminpack/sql/adminpack.sql b/contrib/adminpack/sql/adminpack.sql
index 1525f0a82b..6543fca8c1 100644
--- a/contrib/adminpack/sql/adminpack.sql
+++ b/contrib/adminpack/sql/adminpack.sql
@@ -29,6 +29,11 @@ REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
 
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+SELECT pg_file_sync('global'); -- sync directory
+SELECT pg_file_sync('test_file2'); -- not there
+
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
 SELECT pg_read_file('test_file1');  -- not there
@@ -58,6 +63,7 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 
 SELECT pg_file_write('test_file0', 'test0', false);
+SELECT pg_file_sync('test_file0');
 SELECT pg_file_rename('test_file0', 'test_file0');
 SELECT pg_file_unlink('test_file0');
 SELECT pg_logdir_ls();
diff --git a/doc/src/sgml/adminpack.sgml b/doc/src/sgml/adminpack.sgml
index 2655417366..b244d02b8a 100644
--- a/doc/src/sgml/adminpack.sgml
+++ b/doc/src/sgml/adminpack.sgml
@@ -43,6 +43,13 @@
       Write, or append to, a text file
      </entry>
     </row>
+    <row>
+     <entry><function>pg_catalog.pg_file_sync(filename text)</function></entry>
+     <entry><type>boolean</type></entry>
+     <entry>
+      Flush a file or directory to disk
+     </entry>
+    </row>
     <row>
      <entry><function>pg_catalog.pg_file_rename(oldname text, newname text <optional>, archivename text</optional>)</function></entry>
      <entry><type>boolean</type></entry>
@@ -79,6 +86,15 @@
   Returns the number of bytes written.
  </para>
 
+ <indexterm>
+  <primary>pg_file_sync</primary>
+ </indexterm>
+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  Returns true on success,
+  an error is thrown otherwise (e.g., the specified file is not present).
+ </para>
+
  <indexterm>
   <primary>pg_file_rename</primary>
  </indexterm>
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index fa79b45f63..b5f4df6a48 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -319,7 +319,6 @@ static void pre_sync_fname(const char *fname, bool isdir, int elevel);
 static void datadir_fsync_fname(const char *fname, bool isdir, int elevel);
 static void unlink_if_exists_fname(const char *fname, bool isdir, int elevel);
 
-static int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 static int	fsync_parent_path(const char *fname, int elevel);
 
 
@@ -3376,7 +3375,7 @@ unlink_if_exists_fname(const char *fname, bool isdir, int elevel)
  *
  * Returns 0 if the operation succeeded, -1 otherwise.
  */
-static int
+int
 fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel)
 {
 	int			fd;
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index c6ce7eacf2..51e2ece3c9 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -145,6 +145,7 @@ extern int	pg_fsync_writethrough(int fd);
 extern int	pg_fdatasync(int fd);
 extern void pg_flush_data(int fd, off_t offset, off_t amount);
 extern void fsync_fname(const char *fname, bool isdir);
+extern int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 extern int	durable_rename(const char *oldfile, const char *newfile, int loglevel);
 extern int	durable_unlink(const char *fname, int loglevel);
 extern int	durable_link_or_rename(const char *oldfile, const char *newfile, int loglevel);
#13Michael Paquier
michael@paquier.xyz
In reply to: Fujii Masao (#12)
Re: Add pg_file_sync() to adminpack

On Fri, Jan 10, 2020 at 06:50:12PM +0900, Fujii Masao wrote:

I changed the doc that way. Thanks for the review!

+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  Returns true on success,
+  an error is thrown otherwise (e.g., the specified file is not present).
+ </para>
What's the point of having a function that returns a boolean if it
just returns true all the time?  Wouldn't it be better to have a set
of semantics closer to the unlink() part, where the call of stat()
fails with an ERROR for (errno != ENOENT) and the fsync call returns
false with a WARNING?
+SELECT pg_file_sync('global'); -- sync directory
+ pg_file_sync
+--------------
+ t
+(1 row)
installcheck deployments may not like that.
--
Michael
#14Julien Rouhaud
rjuju123@gmail.com
In reply to: Fujii Masao (#12)
Re: Add pg_file_sync() to adminpack

On Fri, Jan 10, 2020 at 10:50 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Thu, Jan 9, 2020 at 10:39 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

I think that pg_write_server_files should be allowed to call that
function by default.

But pg_write_server_files users are not allowed to execute
other functions like pg_file_write() by default. So doing that
change only for pg_file_sync() looks strange to me.

Ah indeed. I'm wondering if that's an oversight of the original
default role patch or voluntary.

#15Fujii Masao
masao.fujii@gmail.com
In reply to: Michael Paquier (#13)
Re: Add pg_file_sync() to adminpack

On Fri, Jan 10, 2020 at 8:16 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Jan 10, 2020 at 06:50:12PM +0900, Fujii Masao wrote:

I changed the doc that way. Thanks for the review!

Thanks for the review!

+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  Returns true on success,
+  an error is thrown otherwise (e.g., the specified file is not present).
+ </para>
What's the point of having a function that returns a boolean if it
just returns true all the time?  Wouldn't it be better to have a set
of semantics closer to the unlink() part, where the call of stat()
fails with an ERROR for (errno != ENOENT) and the fsync call returns
false with a WARNING?

I'm not sure if returning false with WARNING only in some error cases
is really good idea or not. At least for me, it's more intuitive to
return true on success and emit an ERROR otherwise. I'd like to hear
more opinions about this. Also if returning true on success is rather
confusing, we can change its return type to void.

+SELECT pg_file_sync('global'); -- sync directory
+ pg_file_sync
+--------------
+ t
+(1 row)
installcheck deployments may not like that.

Could you elaborate why? But if it's not good to sync the existing directory
in the regression test, we may need to give up testing the sync of directory.
Another idea is to add another function like pg_mkdir() into adminpack
and use the directory that we newly created by using that function,
for the test. Or better idea?

Regards,

--
Fujii Masao

#16Artur Zakirov
zaartur@gmail.com
In reply to: Fujii Masao (#15)
Re: Add pg_file_sync() to adminpack

Hello,

On Sat, Jan 11, 2020 at 2:12 AM Fujii Masao <masao.fujii@gmail.com> wrote:

+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  Returns true on success,
+  an error is thrown otherwise (e.g., the specified file is not present).
+ </para>
What's the point of having a function that returns a boolean if it
just returns true all the time?  Wouldn't it be better to have a set
of semantics closer to the unlink() part, where the call of stat()
fails with an ERROR for (errno != ENOENT) and the fsync call returns
false with a WARNING?

I'm not sure if returning false with WARNING only in some error cases
is really good idea or not. At least for me, it's more intuitive to
return true on success and emit an ERROR otherwise. I'd like to hear
more opinions about this. Also if returning true on success is rather
confusing, we can change its return type to void.

I think it would be more consistent to pg_file_unlink().

Other functions throw an ERROR and return a number or set of records
except pg_file_rename(), which in some cases throws a WARNING and
returns a boolean result.

--
Arthur

#17Michael Paquier
michael@paquier.xyz
In reply to: Fujii Masao (#15)
Re: Add pg_file_sync() to adminpack

On Sat, Jan 11, 2020 at 02:12:15AM +0900, Fujii Masao wrote:

I'm not sure if returning false with WARNING only in some error cases
is really good idea or not. At least for me, it's more intuitive to
return true on success and emit an ERROR otherwise. I'd like to hear
more opinions about this. Also if returning true on success is rather
confusing, we can change its return type to void.

An advantage of not issuing an ERROR if that when working on a list of
files (for example a WITH RECURSIVE on the whole data directory?), you
can then know which files could not be synced instead of seeing one
ERROR about one file, while being unsure about the state of the
others.

Could you elaborate why? But if it's not good to sync the existing directory
in the regression test, we may need to give up testing the sync of directory.
Another idea is to add another function like pg_mkdir() into adminpack
and use the directory that we newly created by using that function,
for the test. Or better idea?

We should avoid potentially costly tests in any regression scenario if
we have a way to do so. I like your idea of having a pg_mkdir(), that
feels more natural to have as there is already pg_file_write().
--
Michael

#18Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#17)
Re: Add pg_file_sync() to adminpack

On Mon, Jan 13, 2020 at 2:46 PM Michael Paquier <michael@paquier.xyz> wrote:

On Sat, Jan 11, 2020 at 02:12:15AM +0900, Fujii Masao wrote:

I'm not sure if returning false with WARNING only in some error cases
is really good idea or not. At least for me, it's more intuitive to
return true on success and emit an ERROR otherwise. I'd like to hear
more opinions about this. Also if returning true on success is rather
confusing, we can change its return type to void.

An advantage of not issuing an ERROR if that when working on a list of
files (for example a WITH RECURSIVE on the whole data directory?), you
can then know which files could not be synced instead of seeing one
ERROR about one file, while being unsure about the state of the
others.

Actually, can't it create a security hazard, for instance if you call
pg_file_sync() on a heap file and the calls errors out, since it's
bypassing data_sync_retry?

#19Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#18)
Re: Add pg_file_sync() to adminpack

On Mon, Jan 13, 2020 at 03:39:32PM +0100, Julien Rouhaud wrote:

Actually, can't it create a security hazard, for instance if you call
pg_file_sync() on a heap file and the calls errors out, since it's
bypassing data_sync_retry?

Are you mistaking security with durability here? By default, the
function proposed is only executable by a superuser, so that's not
really a security concern.. But I agree that failing to detect a
PANIC on a fsync for a sensitive Postgres file could lead to
corruptions. That's why we PANIC these days.
--
Michael

#20Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#19)
Re: Add pg_file_sync() to adminpack

On Tue, Jan 14, 2020 at 7:18 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jan 13, 2020 at 03:39:32PM +0100, Julien Rouhaud wrote:

Actually, can't it create a security hazard, for instance if you call
pg_file_sync() on a heap file and the calls errors out, since it's
bypassing data_sync_retry?

Are you mistaking security with durability here?

Yes, data durability sorry.

By default, the
function proposed is only executable by a superuser, so that's not
really a security concern.. But I agree that failing to detect a
PANIC on a fsync for a sensitive Postgres file could lead to
corruptions. That's why we PANIC these days.

Exactly. My concern is that some superuser may not be aware that
pg_file_sync could actually corrupt data, so there should be a big red
warning explaining that.

#21Atsushi Torikoshi
atorik@gmail.com
In reply to: Fujii Masao (#15)
Re: Add pg_file_sync() to adminpack

Hello,

On Sut, Jan 11, 2020 at 2:12 Fujii Masao <masao.fujii@gmail.com>:
I'm not sure if returning false with WARNING only in some error cases
is really good idea or not. At least for me, it's more intuitive to
return true on success and emit an ERROR otherwise. I'd like to hear
more opinions about this.

+1.
As a user, I expect these adminpack functions to do similar behaviors
to the corresponding system calls.
System calls for flushing data to disk(fsync on Linux and FlushFileBuffers
on Windows) return different codes on success and failure, and when it
fails we can get error messages. So it seems straightforward for me to
'return true on success and emit an ERROR otherwise'.

On Thu, Jan 9, 2020 at 10:39 PM Julien Rouhaud <rjuju123@gmail.com>

wrote:

I think that pg_write_server_files should be allowed to call that
function by default.

But pg_write_server_files users are not allowed to execute
other functions like pg_file_write() by default. So doing that
change only for pg_file_sync() looks strange to me.

Ah indeed. I'm wondering if that's an oversight of the original
default role patch or voluntary.

It's not directly related to the patch, but as far as I read the
manual below, I expected pg_write_server_files users could execute
adminpack functions.

| Table 21.1 Default Roles
| pg_write_server_files: Allow writing to files in any location the
database can access on the server with COPY and other file-access functions.

--
Atsushi Torikoshi

#22Julien Rouhaud
rjuju123@gmail.com
In reply to: Atsushi Torikoshi (#21)
Re: Add pg_file_sync() to adminpack

On Tue, Jan 14, 2020 at 4:08 PM Atsushi Torikoshi <atorik@gmail.com> wrote:

On Sut, Jan 11, 2020 at 2:12 Fujii Masao <masao.fujii@gmail.com>:

But pg_write_server_files users are not allowed to execute
other functions like pg_file_write() by default. So doing that
change only for pg_file_sync() looks strange to me.

Ah indeed. I'm wondering if that's an oversight of the original
default role patch or voluntary.

It's not directly related to the patch, but as far as I read the
manual below, I expected pg_write_server_files users could execute
adminpack functions.

| Table 21.1 Default Roles
| pg_write_server_files: Allow writing to files in any location the database can access on the server with COPY and other file-access functions.

Actually, pg_write_server_files has enough privileges to execute those
functions anywhere on the FS as far as C code is concerned, provided
that the user running postgres daemon is allowed to (see
convert_and_check_filename), but won't be allowed to do so by default
as it won't have EXECUTE privilege on the functions.

#23Stephen Frost
sfrost@snowman.net
In reply to: Julien Rouhaud (#14)
Re: Add pg_file_sync() to adminpack

Greetings,

* Julien Rouhaud (rjuju123@gmail.com) wrote:

On Fri, Jan 10, 2020 at 10:50 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Thu, Jan 9, 2020 at 10:39 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

I think that pg_write_server_files should be allowed to call that
function by default.

But pg_write_server_files users are not allowed to execute
other functions like pg_file_write() by default. So doing that
change only for pg_file_sync() looks strange to me.

Ah indeed. I'm wondering if that's an oversight of the original
default role patch or voluntary.

It was intentional.

Thanks,

Stephen

#24Julien Rouhaud
rjuju123@gmail.com
In reply to: Stephen Frost (#23)
Re: Add pg_file_sync() to adminpack

Le mar. 14 janv. 2020 à 22:57, Stephen Frost <sfrost@snowman.net> a écrit :

Greetings,

* Julien Rouhaud (rjuju123@gmail.com) wrote:

On Fri, Jan 10, 2020 at 10:50 AM Fujii Masao <masao.fujii@gmail.com>

wrote:

On Thu, Jan 9, 2020 at 10:39 PM Julien Rouhaud <rjuju123@gmail.com>

wrote:

I think that pg_write_server_files should be allowed to call that
function by default.

But pg_write_server_files users are not allowed to execute
other functions like pg_file_write() by default. So doing that
change only for pg_file_sync() looks strange to me.

Ah indeed. I'm wondering if that's an oversight of the original
default role patch or voluntary.

It was intentional.

ok, thanks for the clarification.

Show quoted text
#25Atsushi Torikoshi
atorik@gmail.com
In reply to: Julien Rouhaud (#22)
Re: Add pg_file_sync() to adminpack

On Wed, Jan 15, 2020 at 1:49 Julien Rouhaud <rjuju123@gmail.com>:

Actually, pg_write_server_files has enough privileges to execute those
functions anywhere on the FS as far as C code is concerned, provided
that the user running postgres daemon is allowed to (see
convert_and_check_filename), but won't be allowed to do so by default
as it won't have EXECUTE privilege on the functions.

I see, thanks for your explanation.

--
Regards,
Atsushi Torikoshi

#26Robert Haas
robertmhaas@gmail.com
In reply to: Atsushi Torikoshi (#21)
Re: Add pg_file_sync() to adminpack

On Tue, Jan 14, 2020 at 10:08 AM Atsushi Torikoshi <atorik@gmail.com> wrote:

fails we can get error messages. So it seems straightforward for me to
'return true on success and emit an ERROR otherwise'.

I agree with reporting errors using ERROR, but it seems to me that we
ought to then make the function return 'void' rather than 'bool'.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#27Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#26)
Re: Add pg_file_sync() to adminpack

On Thu, Jan 16, 2020 at 09:51:24AM -0500, Robert Haas wrote:

On Tue, Jan 14, 2020 at 10:08 AM Atsushi Torikoshi <atorik@gmail.com> wrote:

fails we can get error messages. So it seems straightforward for me to
'return true on success and emit an ERROR otherwise'.

I agree with reporting errors using ERROR, but it seems to me that we
ought to then make the function return 'void' rather than 'bool'.

Yeah, that should be either ERROR and return a void result, or issue a
WARNING/ERROR (depending on the code path, maybe PANIC?) with a
boolean status returned if there is a WARNING. Mixing both concepts
with an ERROR all the time and always a true status is just weird,
because you know that if no errors are raised then the status will be
always true. So there is no point to have a boolean status to begin
with.
--
Michael

#28Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Michael Paquier (#27)
1 attachment(s)
Re: Add pg_file_sync() to adminpack

On 2020/01/17 13:36, Michael Paquier wrote:

On Thu, Jan 16, 2020 at 09:51:24AM -0500, Robert Haas wrote:

On Tue, Jan 14, 2020 at 10:08 AM Atsushi Torikoshi <atorik@gmail.com> wrote:

fails we can get error messages. So it seems straightforward for me to
'return true on success and emit an ERROR otherwise'.

I agree with reporting errors using ERROR, but it seems to me that we
ought to then make the function return 'void' rather than 'bool'.

Yeah, that should be either ERROR and return a void result, or issue a
WARNING/ERROR (depending on the code path, maybe PANIC?) with a
boolean status returned if there is a WARNING. Mixing both concepts
with an ERROR all the time and always a true status is just weird,
because you know that if no errors are raised then the status will be
always true. So there is no point to have a boolean status to begin
with.

OK, so our consensus is to return void on success and throw an error
otherwise. Attached is the updated version of the patch.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachments:

pg_file_sync_v4.patchtext/plain; charset=UTF-8; name=pg_file_sync_v4.patch; x-mac-creator=0; x-mac-type=0Download
diff --git a/contrib/adminpack/Makefile b/contrib/adminpack/Makefile
index 9a464b11bc..630fea7726 100644
--- a/contrib/adminpack/Makefile
+++ b/contrib/adminpack/Makefile
@@ -7,7 +7,8 @@ OBJS = \
 PG_CPPFLAGS = -I$(libpq_srcdir)
 
 EXTENSION = adminpack
-DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql
+DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql\
+	adminpack--2.0--2.1.sql
 PGFILEDESC = "adminpack - support functions for pgAdmin"
 
 REGRESS = adminpack
diff --git a/contrib/adminpack/adminpack--2.0--2.1.sql b/contrib/adminpack/adminpack--2.0--2.1.sql
new file mode 100644
index 0000000000..1c6712e816
--- /dev/null
+++ b/contrib/adminpack/adminpack--2.0--2.1.sql
@@ -0,0 +1,17 @@
+/* contrib/adminpack/adminpack--2.0--2.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION adminpack UPDATE TO '2.1'" to load this file. \quit
+
+/* ***********************************************
+ * Administrative functions for PostgreSQL
+ * *********************************************** */
+
+/* generic file access functions */
+
+CREATE OR REPLACE FUNCTION pg_catalog.pg_file_sync(text)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_file_sync'
+LANGUAGE C VOLATILE STRICT;
+
+REVOKE EXECUTE ON FUNCTION pg_catalog.pg_file_sync(text) FROM PUBLIC;
diff --git a/contrib/adminpack/adminpack.c b/contrib/adminpack/adminpack.c
index 710f4ea32d..7b5a531e08 100644
--- a/contrib/adminpack/adminpack.c
+++ b/contrib/adminpack/adminpack.c
@@ -43,6 +43,7 @@ PG_MODULE_MAGIC;
 
 PG_FUNCTION_INFO_V1(pg_file_write);
 PG_FUNCTION_INFO_V1(pg_file_write_v1_1);
+PG_FUNCTION_INFO_V1(pg_file_sync);
 PG_FUNCTION_INFO_V1(pg_file_rename);
 PG_FUNCTION_INFO_V1(pg_file_rename_v1_1);
 PG_FUNCTION_INFO_V1(pg_file_unlink);
@@ -215,6 +216,30 @@ pg_file_write_internal(text *file, text *data, bool replace)
 	return (count);
 }
 
+/* ------------------------------------
+ * pg_file_sync
+ *
+ * We REVOKE EXECUTE on the function from PUBLIC.
+ * Users can then grant access to it based on their policies.
+ */
+Datum
+pg_file_sync(PG_FUNCTION_ARGS)
+{
+	char	   *filename;
+	struct stat	fst;
+
+	filename = convert_and_check_filename(PG_GETARG_TEXT_PP(0), false);
+
+	if (stat(filename, &fst) < 0)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not stat file \"%s\": %m", filename)));
+
+	fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
+
+	PG_RETURN_VOID();
+}
+
 /* ------------------------------------
  * pg_file_rename - old version
  *
diff --git a/contrib/adminpack/adminpack.control b/contrib/adminpack/adminpack.control
index 12569dcdd7..ae35d22156 100644
--- a/contrib/adminpack/adminpack.control
+++ b/contrib/adminpack/adminpack.control
@@ -1,6 +1,6 @@
 # adminpack extension
 comment = 'administrative functions for PostgreSQL'
-default_version = '2.0'
+default_version = '2.1'
 module_pathname = '$libdir/adminpack'
 relocatable = false
 schema = pg_catalog
diff --git a/contrib/adminpack/expected/adminpack.out b/contrib/adminpack/expected/adminpack.out
index 8747ac69a2..5738b0f6c4 100644
--- a/contrib/adminpack/expected/adminpack.out
+++ b/contrib/adminpack/expected/adminpack.out
@@ -56,6 +56,21 @@ RESET ROLE;
 REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+ pg_file_sync 
+--------------
+ 
+(1 row)
+
+SELECT pg_file_sync('pg_stat'); -- sync directory
+ pg_file_sync 
+--------------
+ 
+(1 row)
+
+SELECT pg_file_sync('test_file2'); -- not there
+ERROR:  could not stat file "test_file2": No such file or directory
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
  pg_file_rename 
@@ -142,6 +157,8 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 SELECT pg_file_write('test_file0', 'test0', false);
 ERROR:  permission denied for function pg_file_write
+SELECT pg_file_sync('test_file0');
+ERROR:  permission denied for function pg_file_sync
 SELECT pg_file_rename('test_file0', 'test_file0');
 ERROR:  permission denied for function pg_file_rename
 CONTEXT:  SQL function "pg_file_rename" statement 1
diff --git a/contrib/adminpack/sql/adminpack.sql b/contrib/adminpack/sql/adminpack.sql
index 1525f0a82b..918d0bdc65 100644
--- a/contrib/adminpack/sql/adminpack.sql
+++ b/contrib/adminpack/sql/adminpack.sql
@@ -29,6 +29,11 @@ REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
 
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+SELECT pg_file_sync('pg_stat'); -- sync directory
+SELECT pg_file_sync('test_file2'); -- not there
+
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
 SELECT pg_read_file('test_file1');  -- not there
@@ -58,6 +63,7 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 
 SELECT pg_file_write('test_file0', 'test0', false);
+SELECT pg_file_sync('test_file0');
 SELECT pg_file_rename('test_file0', 'test_file0');
 SELECT pg_file_unlink('test_file0');
 SELECT pg_logdir_ls();
diff --git a/doc/src/sgml/adminpack.sgml b/doc/src/sgml/adminpack.sgml
index 2655417366..21bd4e5dd5 100644
--- a/doc/src/sgml/adminpack.sgml
+++ b/doc/src/sgml/adminpack.sgml
@@ -43,6 +43,13 @@
       Write, or append to, a text file
      </entry>
     </row>
+    <row>
+     <entry><function>pg_catalog.pg_file_sync(filename text)</function></entry>
+     <entry><type>void</type></entry>
+     <entry>
+      Flush a file or directory to disk
+     </entry>
+    </row>
     <row>
      <entry><function>pg_catalog.pg_file_rename(oldname text, newname text <optional>, archivename text</optional>)</function></entry>
      <entry><type>boolean</type></entry>
@@ -79,6 +86,15 @@
   Returns the number of bytes written.
  </para>
 
+ <indexterm>
+  <primary>pg_file_sync</primary>
+ </indexterm>
+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  An error is thrown
+  on failure (e.g., the specified file is not present).
+ </para>
+
  <indexterm>
   <primary>pg_file_rename</primary>
  </indexterm>
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index fa79b45f63..b5f4df6a48 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -319,7 +319,6 @@ static void pre_sync_fname(const char *fname, bool isdir, int elevel);
 static void datadir_fsync_fname(const char *fname, bool isdir, int elevel);
 static void unlink_if_exists_fname(const char *fname, bool isdir, int elevel);
 
-static int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 static int	fsync_parent_path(const char *fname, int elevel);
 
 
@@ -3376,7 +3375,7 @@ unlink_if_exists_fname(const char *fname, bool isdir, int elevel)
  *
  * Returns 0 if the operation succeeded, -1 otherwise.
  */
-static int
+int
 fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel)
 {
 	int			fd;
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index c6ce7eacf2..51e2ece3c9 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -145,6 +145,7 @@ extern int	pg_fsync_writethrough(int fd);
 extern int	pg_fdatasync(int fd);
 extern void pg_flush_data(int fd, off_t offset, off_t amount);
 extern void fsync_fname(const char *fname, bool isdir);
+extern int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 extern int	durable_rename(const char *oldfile, const char *newfile, int loglevel);
 extern int	durable_unlink(const char *fname, int loglevel);
 extern int	durable_link_or_rename(const char *oldfile, const char *newfile, int loglevel);
#29Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Michael Paquier (#17)
Re: Add pg_file_sync() to adminpack

On 2020/01/13 22:46, Michael Paquier wrote:

On Sat, Jan 11, 2020 at 02:12:15AM +0900, Fujii Masao wrote:

I'm not sure if returning false with WARNING only in some error cases
is really good idea or not. At least for me, it's more intuitive to
return true on success and emit an ERROR otherwise. I'd like to hear
more opinions about this. Also if returning true on success is rather
confusing, we can change its return type to void.

An advantage of not issuing an ERROR if that when working on a list of
files (for example a WITH RECURSIVE on the whole data directory?), you
can then know which files could not be synced instead of seeing one
ERROR about one file, while being unsure about the state of the
others.

Could you elaborate why? But if it's not good to sync the existing directory
in the regression test, we may need to give up testing the sync of directory.
Another idea is to add another function like pg_mkdir() into adminpack
and use the directory that we newly created by using that function,
for the test. Or better idea?

We should avoid potentially costly tests in any regression scenario if
we have a way to do so. I like your idea of having a pg_mkdir(), that
feels more natural to have as there is already pg_file_write().

BTW, in the latest patch that I posted upthread, I changed
the directory to sync for the test from "global" to "pg_stat"
because pg_stat is empty while the server is running,
and syncing it would not be so costly.
Introduing pg_mkdir() (maybe pg_rmdir() would be also necessary)
is an idea, but it's better to do that as a separate patch
if it's really necessary.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#30Arthur Zakirov
zaartur@gmail.com
In reply to: Fujii Masao (#28)
Re: Add pg_file_sync() to adminpack

On 2020/01/17 16:05, Fujii Masao wrote:

On 2020/01/17 13:36, Michael Paquier wrote:

Yeah, that should be either ERROR and return a void result, or issue a
WARNING/ERROR (depending on the code path, maybe PANIC?) with a
boolean status returned if there is a WARNING.� Mixing both concepts
with an ERROR all the time and always a true status is just weird,
because you know that if no errors are raised then the status will be
always true.� So there is no point to have a boolean status to begin
with.

OK, so our consensus is to return void on success and throw an error
otherwise. Attached is the updated version of the patch.

Thank you for the new version!

It is compiled and passes the tests. There is the documentation and it
is built too without an error.

It seems that consensus about the returned type was reached and I marked
the patch as "Ready for Commiter".

--
Arthur

#31Michael Paquier
michael@paquier.xyz
In reply to: Arthur Zakirov (#30)
Re: Add pg_file_sync() to adminpack

On Fri, Jan 24, 2020 at 01:28:29PM +0900, Arthur Zakirov wrote:

It is compiled and passes the tests. There is the documentation and it is
built too without an error.

It seems that consensus about the returned type was reached and I marked the
patch as "Ready for Commiter".

+       fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
One comment here: should we warn better users in the docs that a fsync
failule will not trigger a PANIC here?  Here, fsync failure on heap
file => ERROR => potential data corruption.
--
Michael
#32Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#31)
Re: Add pg_file_sync() to adminpack

On Fri, Jan 24, 2020 at 6:56 AM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Jan 24, 2020 at 01:28:29PM +0900, Arthur Zakirov wrote:

It is compiled and passes the tests. There is the documentation and it is
built too without an error.

It seems that consensus about the returned type was reached and I marked the
patch as "Ready for Commiter".

+ fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
One comment here: should we warn better users in the docs that a fsync
failule will not trigger a PANIC here? Here, fsync failure on heap
file => ERROR => potential data corruption.

Definitely yes.

#33Arthur Zakirov
zaartur@gmail.com
In reply to: Michael Paquier (#31)
Re: Add pg_file_sync() to adminpack

On 2020/01/24 14:56, Michael Paquier wrote:

On Fri, Jan 24, 2020 at 01:28:29PM +0900, Arthur Zakirov wrote:

It is compiled and passes the tests. There is the documentation and it is
built too without an error.

It seems that consensus about the returned type was reached and I marked the
patch as "Ready for Commiter".

+ fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
One comment here: should we warn better users in the docs that a fsync
failule will not trigger a PANIC here? Here, fsync failure on heap
file => ERROR => potential data corruption.

Ah, true. It is possible to add couple sentences that pg_file_sync()
doesn't depend on data_sync_retry GUC and doesn't raise a PANIC even for
database files.

--
Arthur

#34Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Arthur Zakirov (#33)
1 attachment(s)
Re: Add pg_file_sync() to adminpack

On 2020/01/24 15:38, Arthur Zakirov wrote:

On 2020/01/24 14:56, Michael Paquier wrote:

On Fri, Jan 24, 2020 at 01:28:29PM +0900, Arthur Zakirov wrote:

It is compiled and passes the tests. There is the documentation and
it is
built too without an error.

It seems that consensus about the returned type was reached and I
marked the
patch as "Ready for Commiter".

+������ fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
One comment here: should we warn better users in the docs that a fsync
failule will not trigger a PANIC here?� Here, fsync failure on heap
file => ERROR => potential data corruption.

Ah, true. It is possible to add couple sentences that pg_file_sync()
doesn't depend on data_sync_retry GUC and doesn't raise a PANIC even for
database files.

Thanks all for the review!

So, what about the attached patch?
In the patch, I added the following note to the doc.

--------------------
Note that
<xref linkend="guc-data-sync-retry"/> has no effect on this function,
and therefore a PANIC-level error will not be raised even on failure to
flush database files.
--------------------

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachments:

pg_file_sync_v5.patchtext/plain; charset=UTF-8; name=pg_file_sync_v5.patch; x-mac-creator=0; x-mac-type=0Download
diff --git a/contrib/adminpack/Makefile b/contrib/adminpack/Makefile
index 9a464b11bc..630fea7726 100644
--- a/contrib/adminpack/Makefile
+++ b/contrib/adminpack/Makefile
@@ -7,7 +7,8 @@ OBJS = \
 PG_CPPFLAGS = -I$(libpq_srcdir)
 
 EXTENSION = adminpack
-DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql
+DATA = adminpack--1.0.sql adminpack--1.0--1.1.sql adminpack--1.1--2.0.sql\
+	adminpack--2.0--2.1.sql
 PGFILEDESC = "adminpack - support functions for pgAdmin"
 
 REGRESS = adminpack
diff --git a/contrib/adminpack/adminpack--2.0--2.1.sql b/contrib/adminpack/adminpack--2.0--2.1.sql
new file mode 100644
index 0000000000..1c6712e816
--- /dev/null
+++ b/contrib/adminpack/adminpack--2.0--2.1.sql
@@ -0,0 +1,17 @@
+/* contrib/adminpack/adminpack--2.0--2.1.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION adminpack UPDATE TO '2.1'" to load this file. \quit
+
+/* ***********************************************
+ * Administrative functions for PostgreSQL
+ * *********************************************** */
+
+/* generic file access functions */
+
+CREATE OR REPLACE FUNCTION pg_catalog.pg_file_sync(text)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_file_sync'
+LANGUAGE C VOLATILE STRICT;
+
+REVOKE EXECUTE ON FUNCTION pg_catalog.pg_file_sync(text) FROM PUBLIC;
diff --git a/contrib/adminpack/adminpack.c b/contrib/adminpack/adminpack.c
index 710f4ea32d..7b5a531e08 100644
--- a/contrib/adminpack/adminpack.c
+++ b/contrib/adminpack/adminpack.c
@@ -43,6 +43,7 @@ PG_MODULE_MAGIC;
 
 PG_FUNCTION_INFO_V1(pg_file_write);
 PG_FUNCTION_INFO_V1(pg_file_write_v1_1);
+PG_FUNCTION_INFO_V1(pg_file_sync);
 PG_FUNCTION_INFO_V1(pg_file_rename);
 PG_FUNCTION_INFO_V1(pg_file_rename_v1_1);
 PG_FUNCTION_INFO_V1(pg_file_unlink);
@@ -215,6 +216,30 @@ pg_file_write_internal(text *file, text *data, bool replace)
 	return (count);
 }
 
+/* ------------------------------------
+ * pg_file_sync
+ *
+ * We REVOKE EXECUTE on the function from PUBLIC.
+ * Users can then grant access to it based on their policies.
+ */
+Datum
+pg_file_sync(PG_FUNCTION_ARGS)
+{
+	char	   *filename;
+	struct stat	fst;
+
+	filename = convert_and_check_filename(PG_GETARG_TEXT_PP(0), false);
+
+	if (stat(filename, &fst) < 0)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not stat file \"%s\": %m", filename)));
+
+	fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
+
+	PG_RETURN_VOID();
+}
+
 /* ------------------------------------
  * pg_file_rename - old version
  *
diff --git a/contrib/adminpack/adminpack.control b/contrib/adminpack/adminpack.control
index 12569dcdd7..ae35d22156 100644
--- a/contrib/adminpack/adminpack.control
+++ b/contrib/adminpack/adminpack.control
@@ -1,6 +1,6 @@
 # adminpack extension
 comment = 'administrative functions for PostgreSQL'
-default_version = '2.0'
+default_version = '2.1'
 module_pathname = '$libdir/adminpack'
 relocatable = false
 schema = pg_catalog
diff --git a/contrib/adminpack/expected/adminpack.out b/contrib/adminpack/expected/adminpack.out
index 8747ac69a2..5738b0f6c4 100644
--- a/contrib/adminpack/expected/adminpack.out
+++ b/contrib/adminpack/expected/adminpack.out
@@ -56,6 +56,21 @@ RESET ROLE;
 REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+ pg_file_sync 
+--------------
+ 
+(1 row)
+
+SELECT pg_file_sync('pg_stat'); -- sync directory
+ pg_file_sync 
+--------------
+ 
+(1 row)
+
+SELECT pg_file_sync('test_file2'); -- not there
+ERROR:  could not stat file "test_file2": No such file or directory
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
  pg_file_rename 
@@ -142,6 +157,8 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 SELECT pg_file_write('test_file0', 'test0', false);
 ERROR:  permission denied for function pg_file_write
+SELECT pg_file_sync('test_file0');
+ERROR:  permission denied for function pg_file_sync
 SELECT pg_file_rename('test_file0', 'test_file0');
 ERROR:  permission denied for function pg_file_rename
 CONTEXT:  SQL function "pg_file_rename" statement 1
diff --git a/contrib/adminpack/sql/adminpack.sql b/contrib/adminpack/sql/adminpack.sql
index 1525f0a82b..918d0bdc65 100644
--- a/contrib/adminpack/sql/adminpack.sql
+++ b/contrib/adminpack/sql/adminpack.sql
@@ -29,6 +29,11 @@ REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
 DROP ROLE regress_user1;
 
+-- sync
+SELECT pg_file_sync('test_file1'); -- sync file
+SELECT pg_file_sync('pg_stat'); -- sync directory
+SELECT pg_file_sync('test_file2'); -- not there
+
 -- rename file
 SELECT pg_file_rename('test_file1', 'test_file2');
 SELECT pg_read_file('test_file1');  -- not there
@@ -58,6 +63,7 @@ CREATE USER regress_user1;
 SET ROLE regress_user1;
 
 SELECT pg_file_write('test_file0', 'test0', false);
+SELECT pg_file_sync('test_file0');
 SELECT pg_file_rename('test_file0', 'test_file0');
 SELECT pg_file_unlink('test_file0');
 SELECT pg_logdir_ls();
diff --git a/doc/src/sgml/adminpack.sgml b/doc/src/sgml/adminpack.sgml
index 2655417366..977073f7c8 100644
--- a/doc/src/sgml/adminpack.sgml
+++ b/doc/src/sgml/adminpack.sgml
@@ -43,6 +43,13 @@
       Write, or append to, a text file
      </entry>
     </row>
+    <row>
+     <entry><function>pg_catalog.pg_file_sync(filename text)</function></entry>
+     <entry><type>void</type></entry>
+     <entry>
+      Flush a file or directory to disk
+     </entry>
+    </row>
     <row>
      <entry><function>pg_catalog.pg_file_rename(oldname text, newname text <optional>, archivename text</optional>)</function></entry>
      <entry><type>boolean</type></entry>
@@ -79,6 +86,18 @@
   Returns the number of bytes written.
  </para>
 
+ <indexterm>
+  <primary>pg_file_sync</primary>
+ </indexterm>
+ <para>
+  <function>pg_file_sync</function> fsyncs the specified file or directory
+  named by <parameter>filename</parameter>.  An error is thrown
+  on failure (e.g., the specified file is not present). Note that
+  <xref linkend="guc-data-sync-retry"/> has no effect on this function,
+  and therefore a PANIC-level error will not be raised even on failure to
+  flush database files.
+ </para>
+
  <indexterm>
   <primary>pg_file_rename</primary>
  </indexterm>
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index fa79b45f63..b5f4df6a48 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -319,7 +319,6 @@ static void pre_sync_fname(const char *fname, bool isdir, int elevel);
 static void datadir_fsync_fname(const char *fname, bool isdir, int elevel);
 static void unlink_if_exists_fname(const char *fname, bool isdir, int elevel);
 
-static int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 static int	fsync_parent_path(const char *fname, int elevel);
 
 
@@ -3376,7 +3375,7 @@ unlink_if_exists_fname(const char *fname, bool isdir, int elevel)
  *
  * Returns 0 if the operation succeeded, -1 otherwise.
  */
-static int
+int
 fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel)
 {
 	int			fd;
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index c6ce7eacf2..51e2ece3c9 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -145,6 +145,7 @@ extern int	pg_fsync_writethrough(int fd);
 extern int	pg_fdatasync(int fd);
 extern void pg_flush_data(int fd, off_t offset, off_t amount);
 extern void fsync_fname(const char *fname, bool isdir);
+extern int	fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
 extern int	durable_rename(const char *oldfile, const char *newfile, int loglevel);
 extern int	durable_unlink(const char *fname, int loglevel);
 extern int	durable_link_or_rename(const char *oldfile, const char *newfile, int loglevel);
#35Julien Rouhaud
rjuju123@gmail.com
In reply to: Fujii Masao (#34)
Re: Add pg_file_sync() to adminpack

On Fri, Jan 24, 2020 at 8:20 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/01/24 15:38, Arthur Zakirov wrote:

On 2020/01/24 14:56, Michael Paquier wrote:

On Fri, Jan 24, 2020 at 01:28:29PM +0900, Arthur Zakirov wrote:

It is compiled and passes the tests. There is the documentation and
it is
built too without an error.

It seems that consensus about the returned type was reached and I
marked the
patch as "Ready for Commiter".

+ fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
One comment here: should we warn better users in the docs that a fsync
failule will not trigger a PANIC here? Here, fsync failure on heap
file => ERROR => potential data corruption.

Ah, true. It is possible to add couple sentences that pg_file_sync()
doesn't depend on data_sync_retry GUC and doesn't raise a PANIC even for
database files.

Thanks all for the review!

So, what about the attached patch?
In the patch, I added the following note to the doc.

--------------------
Note that
<xref linkend="guc-data-sync-retry"/> has no effect on this function,
and therefore a PANIC-level error will not be raised even on failure to
flush database files.
--------------------

We should explicitly mention that this can cause corruption. How about:

--------------------
Note that
<xref linkend="guc-data-sync-retry"/> has no effect on this function,
and therefore a PANIC-level error will not be raised even on failure to
flush database files. If that happens, the underlying database
objects may be corrupted.
--------------------

#36Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Julien Rouhaud (#35)
Re: Add pg_file_sync() to adminpack

On 2020/01/24 16:56, Julien Rouhaud wrote:

On Fri, Jan 24, 2020 at 8:20 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/01/24 15:38, Arthur Zakirov wrote:

On 2020/01/24 14:56, Michael Paquier wrote:

On Fri, Jan 24, 2020 at 01:28:29PM +0900, Arthur Zakirov wrote:

It is compiled and passes the tests. There is the documentation and
it is
built too without an error.

It seems that consensus about the returned type was reached and I
marked the
patch as "Ready for Commiter".

+ fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
One comment here: should we warn better users in the docs that a fsync
failule will not trigger a PANIC here? Here, fsync failure on heap
file => ERROR => potential data corruption.

Ah, true. It is possible to add couple sentences that pg_file_sync()
doesn't depend on data_sync_retry GUC and doesn't raise a PANIC even for
database files.

Thanks all for the review!

So, what about the attached patch?
In the patch, I added the following note to the doc.

--------------------
Note that
<xref linkend="guc-data-sync-retry"/> has no effect on this function,
and therefore a PANIC-level error will not be raised even on failure to
flush database files.
--------------------

We should explicitly mention that this can cause corruption. How about:

--------------------
Note that
<xref linkend="guc-data-sync-retry"/> has no effect on this function,
and therefore a PANIC-level error will not be raised even on failure to
flush database files. If that happens, the underlying database
objects may be corrupted.
--------------------

IMO that's overkill. If we really need such mention for pg_file_sync(),
we also need to add it for other functions like pg_read_file(),
pg_stat_file(), etc. But, again, which looks overkill.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#37Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Fujii Masao (#36)
Re: Add pg_file_sync() to adminpack

On 2020/01/24 17:08, Fujii Masao wrote:

On 2020/01/24 16:56, Julien Rouhaud wrote:

On Fri, Jan 24, 2020 at 8:20 AM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:

On 2020/01/24 15:38, Arthur Zakirov wrote:

On 2020/01/24 14:56, Michael Paquier wrote:

On Fri, Jan 24, 2020 at 01:28:29PM +0900, Arthur Zakirov wrote:

It is compiled and passes the tests. There is the documentation and
it is
built too without an error.

It seems that consensus about the returned type was reached and I
marked the
patch as "Ready for Commiter".

+       fsync_fname_ext(filename, S_ISDIR(fst.st_mode), false, ERROR);
One comment here: should we warn better users in the docs that a fsync
failule will not trigger a PANIC here?  Here, fsync failure on heap
file => ERROR => potential data corruption.

Ah, true. It is possible to add couple sentences that pg_file_sync()
doesn't depend on data_sync_retry GUC and doesn't raise a PANIC even
for
database files.

Thanks all for the review!

So, what about the attached patch?
In the patch, I added the following note to the doc.

--------------------
Note that
<xref linkend="guc-data-sync-retry"/> has no effect on this function,
and therefore a PANIC-level error will not be raised even on failure to
flush database files.
--------------------

We should explicitly mention that this can cause corruption.  How  about:

--------------------
Note that
<xref linkend="guc-data-sync-retry"/> has no effect on this function,
and therefore a PANIC-level error will not be raised even on failure to
flush database files.  If that happens, the underlying database
objects may be corrupted.
--------------------

IMO that's overkill. If we really need such mention for pg_file_sync(),
we also need to add it for other functions like pg_read_file(),
pg_stat_file(), etc. But, again, which looks overkill.

I pushed the v5 of the patch. Thanks all for reviewing the patch!
If the current document is not good yet, let's keep discussing that!

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters