Inconvenience of pg_read_binary_file()
If I want to read a file that I'm not sure of the existence but I want
to read the whole file if exists, I would call
pg_read_binary_file('path', 0, -1, true) but unfortunately this
doesn't work.
Does it make sense to change the function so as to accept the
parameter specification above? Or the arguments could be ('path',
null, null, true) but (0,-1) is simpler considering the
characteristics of the function.
(We could also rearrange the the parameter order as "filename,
missing_ok, offset, length" but that is simply confusing..)
If it is, pg_read_file() is worth receive the same modification and
I'll post the version containing doc part.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
pg_read_binary_file_modify.patchtext/x-patch; charset=us-asciiDownload
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 2bf5219256..219203be1c 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -331,10 +331,10 @@ pg_read_binary_file(PG_FUNCTION_ARGS)
seek_offset = PG_GETARG_INT64(1);
bytes_to_read = PG_GETARG_INT64(2);
- if (bytes_to_read < 0)
+ if (bytes_to_read < -1)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("requested length cannot be negative")));
+ errmsg("invalid requested length")));
}
if (PG_NARGS() >= 4)
missing_ok = PG_GETARG_BOOL(3);
On Tue, Jun 07, 2022 at 04:05:20PM +0900, Kyotaro Horiguchi wrote:
If I want to read a file that I'm not sure of the existence but I want
to read the whole file if exists, I would call
pg_read_binary_file('path', 0, -1, true) but unfortunately this
doesn't work.
Yeah, the "normal" cases that I have seen in the past just used an
extra call to pg_stat_file() to retrieve the size of the file before
reading it, but arguably it does not help if the file gets extended
between the stat() call and the read() call (we don't need to care
about this case with pg_rewind that has been the reason why the
missing_ok argument was introduced first, for one, as file extensions
don't matter as we'd replay from the LSN point where the rewind
began, adding the new blocks at replay).
There is also an argument for supporting negative values rather than
just -1. For example -2 could mean to read all the file except the
last byte. Or you could have an extra flavor, as of
pg_read_file(text, bool) to read the whole by default. Supporting
just -1 as special value for the amount of data to read would be
confusing IMO. So I would tend to choose for a set of arguments based
on (text, bool).
--
Michael
At Tue, 7 Jun 2022 16:33:53 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Tue, Jun 07, 2022 at 04:05:20PM +0900, Kyotaro Horiguchi wrote:
If I want to read a file that I'm not sure of the existence but I want
to read the whole file if exists, I would call
pg_read_binary_file('path', 0, -1, true) but unfortunately this
doesn't work.Yeah, the "normal" cases that I have seen in the past just used an
extra call to pg_stat_file() to retrieve the size of the file before
reading it, but arguably it does not help if the file gets extended
between the stat() call and the read() call (we don't need to care
about this case with pg_rewind that has been the reason why the
missing_ok argument was introduced first, for one, as file extensions
don't matter as we'd replay from the LSN point where the rewind
began, adding the new blocks at replay).
Sure.
There is also an argument for supporting negative values rather than
just -1. For example -2 could mean to read all the file except the
last byte. Or you could have an extra flavor, as of
pg_read_file(text, bool) to read the whole by default. Supporting
just -1 as special value for the amount of data to read would be
confusing IMO. So I would tend to choose for a set of arguments based
on (text, bool).
I'm not sure about the negative length smaller than -1, since I don't
find an apprpriate offset that represents (last byte + 1).
pg_read_file(text, bool) makes sense to me, but it doesn't seem like
to be able to share C function with other variations.
pg_read_binary_file() need to accept some out-of-range value for
offset or length to signal that offset and length are not specified.
In the attached pg_read(_binary)_file_all() is modifiedf so that they
have the different body from pg_read(_binary)_file().
(function comments needs to be edited and docs are needed)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
pg_read_binary_file_modify_2.patchtext/x-patch; charset=us-asciiDownload
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 2bf5219256..62e16b014e 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -349,7 +349,6 @@ pg_read_binary_file(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}
-
/*
* Wrapper functions for the 1 and 3 argument variants of pg_read_file_v2()
* and pg_read_binary_file().
@@ -367,7 +366,21 @@ pg_read_file_off_len(PG_FUNCTION_ARGS)
Datum
pg_read_file_all(PG_FUNCTION_ARGS)
{
- return pg_read_file_v2(fcinfo);
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ bool missing_ok = false;
+ char *filename;
+ text *result;
+
+ if (PG_NARGS() >= 2)
+ missing_ok = PG_GETARG_BOOL(1);
+
+ filename = convert_and_check_filename(filename_t);
+
+ result = read_text_file(filename, 0, -1, missing_ok);
+ if (result)
+ PG_RETURN_TEXT_P(result);
+ else
+ PG_RETURN_NULL();
}
Datum
@@ -379,7 +392,21 @@ pg_read_binary_file_off_len(PG_FUNCTION_ARGS)
Datum
pg_read_binary_file_all(PG_FUNCTION_ARGS)
{
- return pg_read_binary_file(fcinfo);
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ bool missing_ok = false;
+ char *filename;
+ bytea *result;
+
+ if (PG_NARGS() >= 2)
+ missing_ok = PG_GETARG_BOOL(1);
+
+ filename = convert_and_check_filename(filename_t);
+
+ result = read_binary_file(filename, 0, -1, missing_ok);
+ if (result)
+ PG_RETURN_BYTEA_P(result);
+ else
+ PG_RETURN_NULL();
}
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571a33..e180286749 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6423,6 +6423,9 @@
{ oid => '3826', descr => 'read text from a file',
proname => 'pg_read_file', provolatile => 'v', prorettype => 'text',
proargtypes => 'text', prosrc => 'pg_read_file_all' },
+{ oid => '8530', descr => 'read text from a file',
+ proname => 'pg_read_file', provolatile => 'v', prorettype => 'text',
+ proargtypes => 'text bool', prosrc => 'pg_read_file_all' },
{ oid => '3827', descr => 'read bytea from a file',
proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
proargtypes => 'text int8 int8', prosrc => 'pg_read_binary_file_off_len' },
@@ -6432,6 +6435,9 @@
{ oid => '3828', descr => 'read bytea from a file',
proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
proargtypes => 'text', prosrc => 'pg_read_binary_file_all' },
+{ oid => '8529', descr => 'read bytea from a file',
+ proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
+ proargtypes => 'text bool', prosrc => 'pg_read_binary_file_all' },
{ oid => '2625', descr => 'list all files in a directory',
proname => 'pg_ls_dir', prorows => '1000', proretset => 't',
provolatile => 'v', prorettype => 'text', proargtypes => 'text',
At Tue, 07 Jun 2022 17:29:31 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
pg_read_file(text, bool) makes sense to me, but it doesn't seem like
to be able to share C function with other variations.
pg_read_binary_file() need to accept some out-of-range value for
offset or length to signal that offset and length are not specified.
In this version all the polypmorphic variations share the same body
function. I tempted to add tail-reading feature but it would be
another feature.
(function comments needs to be edited and docs are needed)
- Simplified the implementation (by complexifying argument handling..).
- REVOKEd EXECUTE from the new functions.
- Edited the signature of the two functions.
- pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok boolean ]] ) → text + pg_read_file ( filename text [, offset bigint, length bigint ] [, missing_ok boolean ] ) → text
And registered this to the next CF.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v3-0001-Add-an-argument-variation-to-pg_read_file.patchtext/x-patch; charset=us-asciiDownload
From 5d344fb56cded75e22cb4e56bcf1c10a31f5d4bb Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Thu, 30 Jun 2022 10:30:35 +0900
Subject: [PATCH v3] Add an argument variation to pg_read_file
Let the functions pg_read_file and pg_read_binary_file have the
argument variation of f(filename, missing_ok) so that the functions
can read the whole file tolerating the file to be missing.
---
doc/src/sgml/func.sgml | 4 +--
src/backend/catalog/system_functions.sql | 4 +++
src/backend/utils/adt/genfile.c | 33 +++++++++++++++++++-----
src/include/catalog/pg_proc.dat | 6 +++++
4 files changed, 39 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7b652460a1..34d76b74e4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28593,7 +28593,7 @@ SELECT pg_size_pretty(sum(pg_relation_size(relid))) AS total_size
<indexterm>
<primary>pg_read_file</primary>
</indexterm>
- <function>pg_read_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional></optional> )
+ <function>pg_read_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> </optional> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional> )
<returnvalue>text</returnvalue>
</para>
<para>
@@ -28618,7 +28618,7 @@ SELECT pg_size_pretty(sum(pg_relation_size(relid))) AS total_size
<indexterm>
<primary>pg_read_binary_file</primary>
</indexterm>
- <function>pg_read_binary_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional></optional> )
+ <function>pg_read_binary_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> </optional> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional> )
<returnvalue>bytea</returnvalue>
</para>
<para>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 73da687d5d..30a048f6b0 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -659,12 +659,16 @@ REVOKE EXECUTE ON FUNCTION pg_ls_tmpdir(oid) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_file(text,boolean) FROM public;
+
REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint,boolean) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,boolean) FROM public;
+
REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint,boolean) FROM public;
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 2bf5219256..1a09431a79 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -278,6 +278,9 @@ pg_read_file(PG_FUNCTION_ARGS)
*
* No superuser check done here- instead privileges are handled by the
* GRANT system.
+ *
+ * The two-argument variation of this function supposes the second argument as
+ * to be a Boolean.
*/
Datum
pg_read_file_v2(PG_FUNCTION_ARGS)
@@ -290,7 +293,9 @@ pg_read_file_v2(PG_FUNCTION_ARGS)
text *result;
/* handle optional arguments */
- if (PG_NARGS() >= 3)
+ if (PG_NARGS() == 2)
+ missing_ok = PG_GETARG_BOOL(1);
+ else if (PG_NARGS() >= 3)
{
seek_offset = PG_GETARG_INT64(1);
bytes_to_read = PG_GETARG_INT64(2);
@@ -299,9 +304,10 @@ pg_read_file_v2(PG_FUNCTION_ARGS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("requested length cannot be negative")));
+
+ if (PG_NARGS() >= 4)
+ missing_ok = PG_GETARG_BOOL(3);
}
- if (PG_NARGS() >= 4)
- missing_ok = PG_GETARG_BOOL(3);
filename = convert_and_check_filename(filename_t);
@@ -326,6 +332,8 @@ pg_read_binary_file(PG_FUNCTION_ARGS)
bytea *result;
/* handle optional arguments */
+ if (PG_NARGS() == 2)
+ missing_ok = PG_GETARG_BOOL(1);
if (PG_NARGS() >= 3)
{
seek_offset = PG_GETARG_INT64(1);
@@ -335,9 +343,10 @@ pg_read_binary_file(PG_FUNCTION_ARGS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("requested length cannot be negative")));
+
+ if (PG_NARGS() >= 4)
+ missing_ok = PG_GETARG_BOOL(3);
}
- if (PG_NARGS() >= 4)
- missing_ok = PG_GETARG_BOOL(3);
filename = convert_and_check_filename(filename_t);
@@ -351,7 +360,7 @@ pg_read_binary_file(PG_FUNCTION_ARGS)
/*
- * Wrapper functions for the 1 and 3 argument variants of pg_read_file_v2()
+ * Wrapper functions for the 1 to 3 argument variants of pg_read_file_v2()
* and pg_read_binary_file().
*
* These are necessary to pass the sanity check in opr_sanity, which checks
@@ -364,6 +373,12 @@ pg_read_file_off_len(PG_FUNCTION_ARGS)
return pg_read_file_v2(fcinfo);
}
+Datum
+pg_read_file_all_missing(PG_FUNCTION_ARGS)
+{
+ return pg_read_file_v2(fcinfo);
+}
+
Datum
pg_read_file_all(PG_FUNCTION_ARGS)
{
@@ -376,6 +391,12 @@ pg_read_binary_file_off_len(PG_FUNCTION_ARGS)
return pg_read_binary_file(fcinfo);
}
+Datum
+pg_read_binary_file_all_missing(PG_FUNCTION_ARGS)
+{
+ return pg_read_binary_file(fcinfo);
+}
+
Datum
pg_read_binary_file_all(PG_FUNCTION_ARGS)
{
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571a33..84be83610a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6423,6 +6423,9 @@
{ oid => '3826', descr => 'read text from a file',
proname => 'pg_read_file', provolatile => 'v', prorettype => 'text',
proargtypes => 'text', prosrc => 'pg_read_file_all' },
+{ oid => '8025', descr => 'read text from a file',
+ proname => 'pg_read_file', provolatile => 'v', prorettype => 'text',
+ proargtypes => 'text bool', prosrc => 'pg_read_file_all_missing' },
{ oid => '3827', descr => 'read bytea from a file',
proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
proargtypes => 'text int8 int8', prosrc => 'pg_read_binary_file_off_len' },
@@ -6432,6 +6435,9 @@
{ oid => '3828', descr => 'read bytea from a file',
proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
proargtypes => 'text', prosrc => 'pg_read_binary_file_all' },
+{ oid => '8026', descr => 'read bytea from a file',
+ proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
+ proargtypes => 'text bool', prosrc => 'pg_read_binary_file_all_missing' },
{ oid => '2625', descr => 'list all files in a directory',
proname => 'pg_ls_dir', prorows => '1000', proretset => 't',
provolatile => 'v', prorettype => 'text', proargtypes => 'text',
--
2.31.1
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
- Simplified the implementation (by complexifying argument handling..).
- REVOKEd EXECUTE from the new functions.
- Edited the signature of the two functions.
- pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok boolean ]] ) $B"*(B text + pg_read_file ( filename text [, offset bigint, length bigint ] [, missing_ok boolean ] ) $B"*(B text
I'm okay with allowing this variant of the functions. Since there's
no implicit cast between bigint and bool, plus the fact that you
can't give just offset without length, there shouldn't be much risk
of confusion as to which variant to invoke.
I don't really like the implementation style though. That mess of
PG_NARGS tests is illegible code already and this makes it worse.
I think it'd be way cleaner to have all the PG_GETARG calls in the
bottom SQL-callable functions (which are already one-per-signature)
and then pass them on to a common function that has an ordinary C
call signature, along the lines of
static Datum
pg_read_file_common(text *filename_t,
int64 seek_offset, int64 bytes_to_read,
bool read_to_eof, bool missing_ok)
{
if (read_to_eof)
bytes_to_read = -1; // or just Assert that it's -1 ?
else if (bytes_to_read < 0)
ereport(...);
...
}
Datum
pg_read_file_off_len(PG_FUNCTION_ARGS)
{
text *filename_t = PG_GETARG_TEXT_PP(0);
int64 seek_offset = PG_GETARG_INT64(1);
int64 bytes_to_read = PG_GETARG_INT64(2);
return pg_read_file_common(filename_t, seek_offset, bytes_to_read,
false, false);
}
regards, tom lane
Thanks for taking a look!
At Thu, 28 Jul 2022 16:22:17 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
- Simplified the implementation (by complexifying argument handling..).
- REVOKEd EXECUTE from the new functions.
- Edited the signature of the two functions.- pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok boolean ]] ) → text + pg_read_file ( filename text [, offset bigint, length bigint ] [, missing_ok boolean ] ) → textI'm okay with allowing this variant of the functions. Since there's
no implicit cast between bigint and bool, plus the fact that you
can't give just offset without length, there shouldn't be much risk
of confusion as to which variant to invoke.
Grad to hear that.
I don't really like the implementation style though. That mess of
PG_NARGS tests is illegible code already and this makes it worse.
Ah..., I have to admit that I faintly felt that feeling while on it...
I think it'd be way cleaner to have all the PG_GETARG calls in the
bottom SQL-callable functions (which are already one-per-signature)
and then pass them on to a common function that has an ordinary C
call signature, along the lines ofstatic Datum
pg_read_file_common(text *filename_t,
int64 seek_offset, int64 bytes_to_read,
bool read_to_eof, bool missing_ok)
{
if (read_to_eof)
bytes_to_read = -1; // or just Assert that it's -1 ?
I prefer assertion since that parameter cannot be passed by users.
else if (bytes_to_read < 0)
ereport(...);
...
}
This function cannot return NULL directly. Without the ability to
return NULL, it is pointless for the function to return Datum. In the
attached the function returns text*.
Datum
pg_read_file_off_len(PG_FUNCTION_ARGS)
{
text *filename_t = PG_GETARG_TEXT_PP(0);
int64 seek_offset = PG_GETARG_INT64(1);
int64 bytes_to_read = PG_GETARG_INT64(2);return pg_read_file_common(filename_t, seek_offset, bytes_to_read,
false, false);
As the result this function need to return NULL or TEXT_P according to
the returned value from pg_read_file_common.
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(ret);
}
# I'm tempted to call read_text_file() directly from each SQL functions..
Please find the attached. I added some regression tests for both
pg_read_file() and pg_read_binary_file().
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v4-0001-Add-an-argument-variation-to-pg_read_file.patchtext/x-patch; charset=us-asciiDownload
From 0edfc4f411b9b1fa5731c5f28d6225256f7077e1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Thu, 30 Jun 2022 10:30:35 +0900
Subject: [PATCH v4] Add an argument variation to pg_read_file
Let the functions pg_read_file and pg_read_binary_file have the
argument variation of f(filename, missing_ok) so that the functions
can read the whole file tolerating the file to be missing.
---
doc/src/sgml/func.sgml | 4 +-
src/backend/catalog/system_functions.sql | 4 +
src/backend/utils/adt/genfile.c | 205 +++++++++++++------
src/include/catalog/pg_proc.dat | 11 +-
src/test/regress/expected/misc_functions.out | 77 +++++++
src/test/regress/sql/misc_functions.sql | 31 +++
6 files changed, 260 insertions(+), 72 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 21f8ab73e2..49ebf6dea9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28636,7 +28636,7 @@ SELECT pg_size_pretty(sum(pg_relation_size(relid))) AS total_size
<indexterm>
<primary>pg_read_file</primary>
</indexterm>
- <function>pg_read_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional></optional> )
+ <function>pg_read_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> </optional> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional> )
<returnvalue>text</returnvalue>
</para>
<para>
@@ -28661,7 +28661,7 @@ SELECT pg_size_pretty(sum(pg_relation_size(relid))) AS total_size
<indexterm>
<primary>pg_read_binary_file</primary>
</indexterm>
- <function>pg_read_binary_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional></optional> )
+ <function>pg_read_binary_file</function> ( <parameter>filename</parameter> <type>text</type> <optional>, <parameter>offset</parameter> <type>bigint</type>, <parameter>length</parameter> <type>bigint</type> </optional> <optional>, <parameter>missing_ok</parameter> <type>boolean</type> </optional> )
<returnvalue>bytea</returnvalue>
</para>
<para>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 73da687d5d..30a048f6b0 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -659,12 +659,16 @@ REVOKE EXECUTE ON FUNCTION pg_ls_tmpdir(oid) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_file(text,boolean) FROM public;
+
REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint,boolean) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text) FROM public;
+REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,boolean) FROM public;
+
REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint) FROM public;
REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint,boolean) FROM public;
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 2bf5219256..7ac6ede42d 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -278,81 +278,49 @@ pg_read_file(PG_FUNCTION_ARGS)
*
* No superuser check done here- instead privileges are handled by the
* GRANT system.
+ *
+ * If read_to_eof is true, bytes_to_read must be -1, otherwise negative values
+ * are not allowed for bytes_to_read.
*/
-Datum
-pg_read_file_v2(PG_FUNCTION_ARGS)
+static text*
+pg_read_file_common(text *filename_t, int64 seek_offset, int64 bytes_to_read,
+ bool read_to_eof, bool missing_ok)
{
- text *filename_t = PG_GETARG_TEXT_PP(0);
- int64 seek_offset = 0;
- int64 bytes_to_read = -1;
- bool missing_ok = false;
- char *filename;
- text *result;
-
- /* handle optional arguments */
- if (PG_NARGS() >= 3)
- {
- seek_offset = PG_GETARG_INT64(1);
- bytes_to_read = PG_GETARG_INT64(2);
-
- if (bytes_to_read < 0)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("requested length cannot be negative")));
- }
- if (PG_NARGS() >= 4)
- missing_ok = PG_GETARG_BOOL(3);
-
- filename = convert_and_check_filename(filename_t);
-
- result = read_text_file(filename, seek_offset, bytes_to_read, missing_ok);
- if (result)
- PG_RETURN_TEXT_P(result);
- else
- PG_RETURN_NULL();
+ if (read_to_eof)
+ Assert(bytes_to_read == -1);
+ else if (bytes_to_read < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested length cannot be negative")));
+
+ return read_text_file(convert_and_check_filename(filename_t),
+ seek_offset, bytes_to_read, missing_ok);
}
/*
- * Read a section of a file, returning it as bytea
+ * Read a section of a file, returning it as bytea. Parameters are interpreted
+ * and restricted the same with pg_read_file_common().
*/
-Datum
-pg_read_binary_file(PG_FUNCTION_ARGS)
+static bytea*
+pg_read_binary_file_common(text *filename_t,
+ int64 seek_offset, int64 bytes_to_read,
+ bool read_to_eof, bool missing_ok)
{
- text *filename_t = PG_GETARG_TEXT_PP(0);
- int64 seek_offset = 0;
- int64 bytes_to_read = -1;
- bool missing_ok = false;
- char *filename;
- bytea *result;
-
- /* handle optional arguments */
- if (PG_NARGS() >= 3)
- {
- seek_offset = PG_GETARG_INT64(1);
- bytes_to_read = PG_GETARG_INT64(2);
-
- if (bytes_to_read < 0)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("requested length cannot be negative")));
- }
- if (PG_NARGS() >= 4)
- missing_ok = PG_GETARG_BOOL(3);
-
- filename = convert_and_check_filename(filename_t);
-
- result = read_binary_file(filename, seek_offset,
- bytes_to_read, missing_ok);
- if (result)
- PG_RETURN_BYTEA_P(result);
- else
- PG_RETURN_NULL();
+ if (read_to_eof)
+ Assert(bytes_to_read == -1);
+ else if (bytes_to_read < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested length cannot be negative")));
+
+ return read_binary_file(convert_and_check_filename(filename_t),
+ seek_offset, bytes_to_read, missing_ok);
}
/*
- * Wrapper functions for the 1 and 3 argument variants of pg_read_file_v2()
- * and pg_read_binary_file().
+ * Wrapper functions for the variants of SQL functions pg_read_file() and
+ * pg_read_binary_file().
*
* These are necessary to pass the sanity check in opr_sanity, which checks
* that all built-in functions that share the implementing C function take
@@ -361,25 +329,126 @@ pg_read_binary_file(PG_FUNCTION_ARGS)
Datum
pg_read_file_off_len(PG_FUNCTION_ARGS)
{
- return pg_read_file_v2(fcinfo);
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ int64 seek_offset = PG_GETARG_INT64(1);
+ int64 bytes_to_read = PG_GETARG_INT64(2);
+ text *ret;
+
+ ret = pg_read_file_common(filename_t, seek_offset, bytes_to_read,
+ false, false);
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(ret);
+}
+
+Datum
+pg_read_file_off_len_missing(PG_FUNCTION_ARGS)
+{
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ int64 seek_offset = PG_GETARG_INT64(1);
+ int64 bytes_to_read = PG_GETARG_INT64(2);
+ bool missing_ok = PG_GETARG_BOOL(3);
+ text *ret;
+
+ ret = pg_read_file_common(filename_t, seek_offset, bytes_to_read,
+ false, missing_ok);
+
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(ret);
}
Datum
pg_read_file_all(PG_FUNCTION_ARGS)
{
- return pg_read_file_v2(fcinfo);
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ text *ret;
+
+ ret = pg_read_file_common(filename_t, 0, -1, true, false);
+
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(ret);
+}
+
+Datum
+pg_read_file_all_missing(PG_FUNCTION_ARGS)
+{
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ bool missing_ok = PG_GETARG_BOOL(1);
+ text *ret;
+
+ ret = pg_read_file_common(filename_t, 0, -1, true, missing_ok);
+
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(ret);
}
Datum
pg_read_binary_file_off_len(PG_FUNCTION_ARGS)
{
- return pg_read_binary_file(fcinfo);
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ int64 seek_offset = PG_GETARG_INT64(1);
+ int64 bytes_to_read = PG_GETARG_INT64(2);
+ text *ret;
+
+ ret = pg_read_binary_file_common(filename_t, seek_offset, bytes_to_read,
+ false, false);
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_BYTEA_P(ret);
+}
+
+Datum
+pg_read_binary_file_off_len_missing(PG_FUNCTION_ARGS)
+{
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ int64 seek_offset = PG_GETARG_INT64(1);
+ int64 bytes_to_read = PG_GETARG_INT64(2);
+ bool missing_ok = PG_GETARG_BOOL(3);
+ text *ret;
+
+ ret = pg_read_binary_file_common(filename_t, seek_offset, bytes_to_read,
+ false, missing_ok);
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_BYTEA_P(ret);
}
Datum
pg_read_binary_file_all(PG_FUNCTION_ARGS)
{
- return pg_read_binary_file(fcinfo);
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ text *ret;
+
+ ret = pg_read_binary_file_common(filename_t, 0, -1, true, false);
+
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_BYTEA_P(ret);
+}
+
+Datum
+pg_read_binary_file_all_missing(PG_FUNCTION_ARGS)
+{
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+ bool missing_ok = PG_GETARG_BOOL(1);
+ text *ret;
+
+ ret = pg_read_binary_file_common(filename_t, 0, -1, true, missing_ok);
+
+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_BYTEA_P(ret);
}
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2e41f4d9e8..63392d24ff 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6415,7 +6415,7 @@
proargtypes => 'text int8 int8', prosrc => 'pg_read_file_off_len' },
{ oid => '3293', descr => 'read text from a file',
proname => 'pg_read_file', provolatile => 'v', prorettype => 'text',
- proargtypes => 'text int8 int8 bool', prosrc => 'pg_read_file_v2' },
+ proargtypes => 'text int8 int8 bool', prosrc => 'pg_read_file_off_len_missing' },
{ oid => '4100',
descr => 'read text from a file - old version for adminpack 1.0',
proname => 'pg_read_file_old', provolatile => 'v', prorettype => 'text',
@@ -6423,15 +6423,22 @@
{ oid => '3826', descr => 'read text from a file',
proname => 'pg_read_file', provolatile => 'v', prorettype => 'text',
proargtypes => 'text', prosrc => 'pg_read_file_all' },
+{ oid => '8025', descr => 'read text from a file',
+ proname => 'pg_read_file', provolatile => 'v', prorettype => 'text',
+ proargtypes => 'text bool', prosrc => 'pg_read_file_all_missing' },
{ oid => '3827', descr => 'read bytea from a file',
proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
proargtypes => 'text int8 int8', prosrc => 'pg_read_binary_file_off_len' },
{ oid => '3295', descr => 'read bytea from a file',
proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
- proargtypes => 'text int8 int8 bool', prosrc => 'pg_read_binary_file' },
+ proargtypes => 'text int8 int8 bool',
+ prosrc => 'pg_read_binary_file_off_len_missing' },
{ oid => '3828', descr => 'read bytea from a file',
proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
proargtypes => 'text', prosrc => 'pg_read_binary_file_all' },
+{ oid => '8026', descr => 'read bytea from a file',
+ proname => 'pg_read_binary_file', provolatile => 'v', prorettype => 'bytea',
+ proargtypes => 'text bool', prosrc => 'pg_read_binary_file_all_missing' },
{ oid => '2625', descr => 'list all files in a directory',
proname => 'pg_ls_dir', prorows => '1000', proretset => 't',
provolatile => 'v', prorettype => 'text', proargtypes => 'text',
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 01d1ad0b9a..07e9348f97 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -372,6 +372,83 @@ select count(*) >= 0 as ok from pg_ls_archive_statusdir();
t
(1 row)
+-- pg_read_file()
+\pset null <null>
+select substr(pg_read_file('pg_hba.conf'), 1, 30);
+ substr
+--------------------------------
+ # PostgreSQL Client Authentica
+(1 row)
+
+select pg_read_file('pg_hba.conf', 0, 30);
+ pg_read_file
+--------------------------------
+ # PostgreSQL Client Authentica
+(1 row)
+
+-- Test missing_ok
+select pg_read_file('does not exist'); -- error
+ERROR: could not open file "does not exist" for reading: No such file or directory
+select pg_read_file('does not exist', false); -- error
+ERROR: could not open file "does not exist" for reading: No such file or directory
+select pg_read_file('does not exist', 0, 30, false); -- error
+ERROR: could not open file "does not exist" for reading: No such file or directory
+select pg_read_file('does not exist', true); -- ok
+ pg_read_file
+--------------
+ <null>
+(1 row)
+
+select pg_read_file('does not exist', 0, 30, true); -- ok
+ pg_read_file
+--------------
+ <null>
+(1 row)
+
+-- Test invalid argument
+select pg_read_file('does not exist', 0, -1); -- error
+ERROR: requested length cannot be negative
+select pg_read_file('does not exist', 0, -1, true); -- error
+ERROR: requested length cannot be negative
+-- pg_read_binary_file()
+select substr(pg_read_binary_file('pg_hba.conf'), 1, 30);
+ substr
+----------------------------------------------------------------
+ \x2320506f737467726553514c20436c69656e742041757468656e74696361
+(1 row)
+
+select pg_read_binary_file('pg_hba.conf', 0, 30);
+ pg_read_binary_file
+----------------------------------------------------------------
+ \x2320506f737467726553514c20436c69656e742041757468656e74696361
+(1 row)
+
+-- Test missing_ok
+select pg_read_binary_file('does not exist'); -- error
+ERROR: could not open file "does not exist" for reading: No such file or directory
+select pg_read_binary_file('does not exist', false); -- error
+ERROR: could not open file "does not exist" for reading: No such file or directory
+select pg_read_binary_file('does not exist', 0, 30, false); -- error
+ERROR: could not open file "does not exist" for reading: No such file or directory
+select pg_read_binary_file('does not exist', true); -- ok
+ pg_read_binary_file
+---------------------
+ <null>
+(1 row)
+
+select pg_read_binary_file('does not exist', 0, 30, true); -- ok
+ pg_read_binary_file
+---------------------
+ <null>
+(1 row)
+
+-- Test invalid argument
+select pg_read_binary_file('does not exist', 0, -1); -- error
+ERROR: requested length cannot be negative
+select pg_read_binary_file('does not exist', 0, -1, true); -- error
+ERROR: requested length cannot be negative
+\pset null
+-- pg_ls_dir()
select * from (select pg_ls_dir('.') a) a where a = 'base' limit 1;
a
------
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 072fc36a1f..8f5251abbb 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -123,6 +123,35 @@ from (select pg_ls_waldir() w) ss where length((w).name) = 24 limit 1;
select count(*) >= 0 as ok from pg_ls_archive_statusdir();
+-- pg_read_file()
+\pset null <null>
+select substr(pg_read_file('pg_hba.conf'), 1, 30);
+select pg_read_file('pg_hba.conf', 0, 30);
+-- Test missing_ok
+select pg_read_file('does not exist'); -- error
+select pg_read_file('does not exist', false); -- error
+select pg_read_file('does not exist', 0, 30, false); -- error
+select pg_read_file('does not exist', true); -- ok
+select pg_read_file('does not exist', 0, 30, true); -- ok
+-- Test invalid argument
+select pg_read_file('does not exist', 0, -1); -- error
+select pg_read_file('does not exist', 0, -1, true); -- error
+
+-- pg_read_binary_file()
+select substr(pg_read_binary_file('pg_hba.conf'), 1, 30);
+select pg_read_binary_file('pg_hba.conf', 0, 30);
+-- Test missing_ok
+select pg_read_binary_file('does not exist'); -- error
+select pg_read_binary_file('does not exist', false); -- error
+select pg_read_binary_file('does not exist', 0, 30, false); -- error
+select pg_read_binary_file('does not exist', true); -- ok
+select pg_read_binary_file('does not exist', 0, 30, true); -- ok
+-- Test invalid argument
+select pg_read_binary_file('does not exist', 0, -1); -- error
+select pg_read_binary_file('does not exist', 0, -1, true); -- error
+\pset null
+
+-- pg_ls_dir()
select * from (select pg_ls_dir('.') a) a where a = 'base' limit 1;
-- Test missing_ok (second argument)
select pg_ls_dir('does not exist', false, false); -- error
@@ -133,6 +162,8 @@ select count(*) = 1 as dot_found
select count(*) = 1 as dot_found
from pg_ls_dir('.', false, false) as ls where ls = '.';
+
+
select * from (select (pg_timezone_names()).name) ptn where name='UTC' limit 1;
select count(*) > 0 from
--
2.31.1
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
Please find the attached. I added some regression tests for both
pg_read_file() and pg_read_binary_file().
Yeah, I definitely find this way cleaner even if it's a bit more verbose.
I think that the PG_RETURN_NULL code paths are not reachable in the
wrappers that don't have missing_ok. I concur with your decision
to write them all the same, though.
Pushed after some fooling with the docs and test cases. (Notably,
I do not think we can assume that pg_hba.conf exists in $PGDATA; some
installations keep it elsewhere. I used postgresql.auto.conf instead.)
regards, tom lane
On Fri, Jul 29, 2022 at 03:44:25PM -0400, Tom Lane wrote:
Pushed after some fooling with the docs and test cases. (Notably,
I do not think we can assume that pg_hba.conf exists in $PGDATA; some
installations keep it elsewhere. I used postgresql.auto.conf instead.)
Are you sure that this last part is a good idea? We don't force the
creation of postgresql.auto.conf when starting a server, so this
impacts the portability of the tests with installcheck if one decides
to remove it from the data folder, and it sounds plausible to me that
some distributions do exactly that..
I guess that you could rely on config_file or hba_file instead.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Fri, Jul 29, 2022 at 03:44:25PM -0400, Tom Lane wrote:
Pushed after some fooling with the docs and test cases. (Notably,
I do not think we can assume that pg_hba.conf exists in $PGDATA; some
installations keep it elsewhere. I used postgresql.auto.conf instead.)
Are you sure that this last part is a good idea? We don't force the
creation of postgresql.auto.conf when starting a server, so this
impacts the portability of the tests with installcheck if one decides
to remove it from the data folder, and it sounds plausible to me that
some distributions do exactly that..
Hm. I considered reading PG_VERSION instead, or postmaster.pid.
PG_VERSION would be a very boring test case, but it should certainly
be present in $PGDATA.
regards, tom lane
On Fri, Jul 29, 2022 at 11:35:36PM -0400, Tom Lane wrote:
Hm. I considered reading PG_VERSION instead, or postmaster.pid.
PG_VERSION would be a very boring test case, but it should certainly
be present in $PGDATA.
PG_VERSION would be simpler. Looking at postmaster.pid would require
a lookup at external_pid_file, and as it is not set by default you
would need to add some extra logic in the tests where
external_pid_file = NULL <=> PGDATA/postmaster.pid.
--
Michael
On 2022-Jul-30, Michael Paquier wrote:
PG_VERSION would be simpler. Looking at postmaster.pid would require
a lookup at external_pid_file, and as it is not set by default you
would need to add some extra logic in the tests where
external_pid_file = NULL <=> PGDATA/postmaster.pid.
Hmm, no? as I recall external_pid_file is an *additional* PID file; it
doesn't supplant postmaster.pid.
postmaster.opts is also an option.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"E pur si muove" (Galileo Galilei)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2022-Jul-30, Michael Paquier wrote:
PG_VERSION would be simpler. Looking at postmaster.pid would require
a lookup at external_pid_file, and as it is not set by default you
would need to add some extra logic in the tests where
external_pid_file = NULL <=> PGDATA/postmaster.pid.
Hmm, no? as I recall external_pid_file is an *additional* PID file; it
doesn't supplant postmaster.pid.
Right. postmaster.pid absolutely should be there if the postmaster
is up (and if it ain't, you're going to have lots of other difficulty
in running the regression tests...). It doesn't feel quite as static
as PG_VERSION, though.
regards, tom lane
At Sat, 30 Jul 2022 10:24:39 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2022-Jul-30, Michael Paquier wrote:
PG_VERSION would be simpler. Looking at postmaster.pid would require
a lookup at external_pid_file, and as it is not set by default you
would need to add some extra logic in the tests where
external_pid_file = NULL <=> PGDATA/postmaster.pid.Hmm, no? as I recall external_pid_file is an *additional* PID file; it
doesn't supplant postmaster.pid.Right. postmaster.pid absolutely should be there if the postmaster
is up (and if it ain't, you're going to have lots of other difficulty
in running the regression tests...). It doesn't feel quite as static
as PG_VERSION, though.
Thanks for committing it. Also the revised test (being suggested by
Michael) looks good.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center