pg_execute_from_file, patch v10

Started by Dimitri Fontaineabout 15 years ago27 messages
#1Dimitri Fontaine
dimitri@2ndQuadrant.fr
1 attachment(s)

Hi,

The other infrastructure patch that has been mark ready for commit then
commented further upon by Tom is $subject, even if the function provided
as been renamed to pg_execute_sql_file().

Please find attached the newer version that fixes Tom concerns, removing
the VARIADIC forms of the functions (those placeholders idea).

The git tree already contains a fixed extension code, but the next patch
for that one will have to wait some more (psql refactoring).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachments:

pg_execute_from_file.v10.patchtext/x-patchDownload
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 14449,14466 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
        </row>
        <row>
         <entry>
!         <literal><function>pg_read_file(<parameter>filename</> <type>text</>, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</>)</function></literal>
         </entry>
         <entry><type>text</type></entry>
         <entry>Return the contents of a text file</entry>
        </row>
        <row>
         <entry>
          <literal><function>pg_stat_file(<parameter>filename</> <type>text</>)</function></literal>
         </entry>
         <entry><type>record</type></entry>
         <entry>Return information about a file</entry>
        </row>
       </tbody>
      </tgroup>
     </table>
--- 14449,14488 ----
        </row>
        <row>
         <entry>
!         <literal><function>pg_read_file(<parameter>filename</> <type>text</>, <parameter>offset</> <type>bigint</> [, <parameter>length</> <type>bigint</>])</function></literal>
         </entry>
         <entry><type>text</type></entry>
         <entry>Return the contents of a text file</entry>
        </row>
        <row>
         <entry>
+         <literal><function>pg_read_binary_file(<parameter>filename</> <type>text</>, <parameter>offset</> <type>bigint</> [, <parameter>length</> <type>bigint</>])</function></literal>
+        </entry>
+        <entry><type>bytea</type></entry>
+        <entry>Return the contents of a file</entry>
+       </row>
+       <row>
+        <entry>
          <literal><function>pg_stat_file(<parameter>filename</> <type>text</>)</function></literal>
         </entry>
         <entry><type>record</type></entry>
         <entry>Return information about a file</entry>
        </row>
+       <row>
+        <entry>
+         <literal><function>pg_execute_sql_string(<parameter>sql</> <type>text</>) )</function></literal>
+        </entry>
+        <entry><type>void</type></entry>
+        <entry>Execute the given string as <acronym>SQL</> commands.</entry>
+       </row>
+       <row>
+        <entry>
+         <literal><function>pg_execute_sql_file(<parameter>filename</> <type>text</> [, <parameter>encoding</parameter> <type>name</type>]) )</function></literal>
+        </entry>
+        <entry><type>void</type></entry>
+        <entry>Execute contents of the given file as <acronym>SQL</> commands,
+        expected in either database encoding or given encoding.</entry>
+       </row>
       </tbody>
      </tgroup>
     </table>
***************
*** 14482,14487 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 14504,14533 ----
      at the given <parameter>offset</>, returning at most <parameter>length</>
      bytes (less if the end of file is reached first).  If <parameter>offset</>
      is negative, it is relative to the end of the file.
+     When the parameter <parameter>length</> is omitted,
+     <function>pg_read_file</> reads until the end of the file.
+     The part of a file must be a valid text in the server encoding.
+    </para>
+ 
+    <indexterm>
+     <primary>pg_read_binary_file</primary>
+    </indexterm>
+    <para>
+     <function>pg_read_binary_file</> returns part of a file as like as
+     <function>pg_read_file</>, but the result is a bytea value.
+ <programlisting>
+ SELECT convert_from(pg_read_binary_file('postgresql.conf', -69), 'utf8');
+                                  convert_from                                  
+ -------------------------------------------------------------------------------
+  #custom_variable_classes = ''           # list of custom variable class names+
+  
+ (1 row)
+ </programlisting>
+    </para>
+    <para>
+     When the parameter <parameter>length</> is
+     omited, <function>pg_read_binary_file</> reads until the end of the
+     file.
     </para>
  
     <indexterm>
***************
*** 14499,14504 **** SELECT (pg_stat_file('filename')).modification;
--- 14545,14582 ----
  </programlisting>
     </para>
  
+    <indexterm>
+     <primary>pg_execute_sql_string</primary>
+    </indexterm>
+    <para>
+     <function>pg_execute_sql_string</> makes the server execute the given string
+     as <acronym>SQL</> commands.
+     The script may contain placeholders that will be replaced by the optional
+     parameters, that are pairs of variable names and values. Here's an example:
+ <programlisting>
+ SELECT pg_execute_sql_string('CREATE SCHEMA utils');
+  pg_execute_sql_string 
+ -----------------------
+  
+ (1 row)
+ 
+ SELECT oid, * from pg_namespace where nspname = 'utils';
+   oid  | nspname | nspowner | nspacl 
+ -------+---------+----------+--------
+  16387 | utils   |       10 | 
+ (2 rows)
+ </programlisting>
+    </para>
+ 
+    <indexterm>
+     <primary>pg_execute_sql_file</primary>
+    </indexterm>
+    <para>
+     <function>pg_execute_sql_file</> makes the server execute contents in a file
+     as <acronym>SQL</> commands. You can give an encoding name of the file to the
+     function. If omitted, the server encoding is used.
+    </para>
+ 
     <para>
      The functions shown in <xref linkend="functions-advisory-locks"> manage
      advisory locks.  For details about proper use of these functions, see
*** a/src/backend/utils/adt/genfile.c
--- b/src/backend/utils/adt/genfile.c
***************
*** 21,31 ****
--- 21,33 ----
  #include <dirent.h>
  
  #include "catalog/pg_type.h"
+ #include "executor/spi.h"
  #include "funcapi.h"
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
  #include "postmaster/syslogger.h"
  #include "storage/fd.h"
+ #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/memutils.h"
  #include "utils/timestamp.h"
***************
*** 80,104 **** convert_and_check_filename(text *arg)
  
  
  /*
!  * Read a section of a file, returning it as text
   */
! Datum
! pg_read_file(PG_FUNCTION_ARGS)
  {
! 	text	   *filename_t = PG_GETARG_TEXT_P(0);
! 	int64		seek_offset = PG_GETARG_INT64(1);
! 	int64		bytes_to_read = PG_GETARG_INT64(2);
! 	char	   *buf;
! 	size_t		nbytes;
! 	FILE	   *file;
! 	char	   *filename;
  
  	if (!superuser())
  		ereport(ERROR,
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  				 (errmsg("must be superuser to read files"))));
  
! 	filename = convert_and_check_filename(filename_t);
  
  	if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
  		ereport(ERROR,
--- 82,126 ----
  
  
  /*
!  * Read a section of a file, returning it as bytea
!  *
!  * Will read all the file if given a nagative bytes_to_read, and a negative
!  * offset is applied from the end of the file (SEEK_END)
   */
! static bytea *
! read_binary_file(const char *filename, int64 offset, int64 bytes_to_read)
  {
! 	FILE       *file;
! 	bytea      *buf;
! 	int64       nbytes;
  
+ 	/* Only superuser can read files. */
  	if (!superuser())
  		ereport(ERROR,
  				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  				 (errmsg("must be superuser to read files"))));
  
! 	if (bytes_to_read >= 0)
! 		nbytes = bytes_to_read;
! 	else if (offset < 0)
! 		nbytes = -offset;
! 	else
! 	{
! 		struct stat fst;
! 
! 		if (stat(filename, &fst) < 0)
! 			ereport(ERROR,
! 					(errcode_for_file_access(),
! 					 errmsg("could not stat file \"%s\": %m", filename)));
! 
! 		nbytes = fst.st_size;
! 	}
! 
! 	/* not sure why anyone thought that int64 length was a good idea */
! 	if (nbytes > (MaxAllocSize - VARHDRSZ))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length too large")));
  
  	if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
  		ereport(ERROR,
***************
*** 106,146 **** pg_read_file(PG_FUNCTION_ARGS)
  				 errmsg("could not open file \"%s\" for reading: %m",
  						filename)));
  
! 	if (fseeko(file, (off_t) seek_offset,
! 			   (seek_offset >= 0) ? SEEK_SET : SEEK_END) != 0)
  		ereport(ERROR,
  				(errcode_for_file_access(),
  				 errmsg("could not seek in file \"%s\": %m", filename)));
  
! 	if (bytes_to_read < 0)
  		ereport(ERROR,
  				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length cannot be negative")));
  
! 	/* not sure why anyone thought that int64 length was a good idea */
! 	if (bytes_to_read > (MaxAllocSize - VARHDRSZ))
  		ereport(ERROR,
  				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length too large")));
  
! 	buf = palloc((Size) bytes_to_read + VARHDRSZ);
  
! 	nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
  
! 	if (ferror(file))
  		ereport(ERROR,
! 				(errcode_for_file_access(),
! 				 errmsg("could not read file \"%s\": %m", filename)));
! 
! 	/* Make sure the input is valid */
! 	pg_verifymbstr(VARDATA(buf), nbytes, false);
  
! 	SET_VARSIZE(buf, nbytes + VARHDRSZ);
  
! 	FreeFile(file);
! 	pfree(filename);
  
! 	PG_RETURN_TEXT_P(buf);
  }
  
  /*
--- 128,270 ----
  				 errmsg("could not open file \"%s\" for reading: %m",
  						filename)));
  
! 	if (fseeko(file, (off_t) offset,
! 			   (offset >= 0) ? SEEK_SET : SEEK_END) != 0)
  		ereport(ERROR,
  				(errcode_for_file_access(),
  				 errmsg("could not seek in file \"%s\": %m", filename)));
  
! 	buf = (bytea *) palloc((Size) nbytes + VARHDRSZ);
! 	nbytes = fread(VARDATA(buf), 1, (size_t) nbytes, file);
! 
! 	if (ferror(file))
! 		ereport(ERROR,
! 				(errcode_for_file_access(),
! 				 errmsg("could not read file \"%s\": %m", filename)));
! 
! 	FreeFile(file);
! 
! 	SET_VARSIZE(buf, nbytes + VARHDRSZ);
! 
! 	return buf;
! }
! 
! /*
!  * In addition to read_binary_file, verify the contents are encoded in the
!  * database encoding.
!  */
! static text *
! read_text_file(const char *filename, int64 seek_offset, int64 bytes_to_read)
! {
! 	bytea *content = read_binary_file(filename, seek_offset, bytes_to_read);
! 
! 	/* Make sure the input is valid */
! 	pg_verifymbstr(VARDATA(content), VARSIZE(content) - VARHDRSZ, false);
! 
! 	/* Abuse knowledge that we're bytea and text are both varlena */
! 	return (text *) content;
! }
! 
! /*
!  * read given file and return the content converted to given encoding
!  */
! char *
! read_text_file_with_endoding(const char *filename, const char *encoding_name)
! {
! 	int			src_encoding;
! 	int			dest_encoding = GetDatabaseEncoding();
! 	bytea      *content;
! 	const char *src_str;
! 	char	   *dest_str;
! 	int			len;
! 
! 	/* use database encoding if not given */
! 	if (encoding_name == NULL)
! 		src_encoding = dest_encoding;
! 	else
! 		src_encoding = pg_char_to_encoding(encoding_name);
! 
! 	if (src_encoding < 0)
  		ereport(ERROR,
  				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("invalid source encoding name \"%s\"",
! 						encoding_name)));
! 
! 	/* superuser checks will be done in reading files */
! 	content = read_binary_file(filename, 0, -1);
! 
! 	/* make sure that source string is valid */
! 	len = VARSIZE_ANY_EXHDR(content);
! 	src_str = VARDATA_ANY(content);
! 	pg_verify_mbstr_len(src_encoding, src_str, len, false);
! 
! 	/* convert the encoding to the database encoding */
! 	dest_str = (char *) pg_do_encoding_conversion(
! 				(unsigned char *) src_str, len, src_encoding, dest_encoding);
! 	if (dest_str != src_str)
! 		return dest_str;
! 	else
! 		return text_to_cstring((text *)content);
! }
  
! /*
!  * Read a section of a file, returning its content as text
!  */
! Datum
! pg_read_file(PG_FUNCTION_ARGS)
! {
! 	char       *filename = text_to_cstring(PG_GETARG_TEXT_PP(0));
! 	int64		seek_offset = PG_GETARG_INT64(1);
! 	int64		bytes_to_read = PG_GETARG_INT64(2);
! 
! 	if (bytes_to_read < 0)
  		ereport(ERROR,
  				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length cannot be negative")));
! 
! 	PG_RETURN_TEXT_P(read_text_file(filename, seek_offset, bytes_to_read));
! }
! 
! /*
!  * Read till the end of a file, returning its content as text
!  */
! Datum
! pg_read_whole_file(PG_FUNCTION_ARGS)
! {
! 	char       *filename = text_to_cstring(PG_GETARG_TEXT_PP(0));
! 	int64		seek_offset = PG_GETARG_INT64(1);
  
! 	PG_RETURN_TEXT_P(read_text_file(filename, seek_offset, -1));
! }
  
! /*
!  * Read a section of a file, returning its content as bytea
!  */
! Datum
! pg_read_binary_file(PG_FUNCTION_ARGS)
! {
! 	char       *filename = text_to_cstring(PG_GETARG_TEXT_PP(0));
! 	int64		seek_offset = PG_GETARG_INT64(1);
! 	int64		bytes_to_read = PG_GETARG_INT64(2);
  
! 	if (bytes_to_read < 0)
  		ereport(ERROR,
! 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length cannot be negative")));
  
! 	PG_RETURN_BYTEA_P(read_binary_file(filename, seek_offset, bytes_to_read));
! }
  
! /*
!  * Read till the end of a file, returning its content as bytea
!  */
! Datum
! pg_read_whole_binary_file(PG_FUNCTION_ARGS)
! {
! 	char       *filename = text_to_cstring(PG_GETARG_TEXT_PP(0));
! 	int64		seek_offset = PG_GETARG_INT64(1);
  
! 	PG_RETURN_BYTEA_P(read_binary_file(filename, seek_offset, -1));
  }
  
  /*
***************
*** 149,155 **** pg_read_file(PG_FUNCTION_ARGS)
  Datum
  pg_stat_file(PG_FUNCTION_ARGS)
  {
! 	text	   *filename_t = PG_GETARG_TEXT_P(0);
  	char	   *filename;
  	struct stat fst;
  	Datum		values[6];
--- 273,279 ----
  Datum
  pg_stat_file(PG_FUNCTION_ARGS)
  {
! 	text	   *filename_t = PG_GETARG_TEXT_PP(0);
  	char	   *filename;
  	struct stat fst;
  	Datum		values[6];
***************
*** 234,240 **** pg_ls_dir(PG_FUNCTION_ARGS)
  		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
  
  		fctx = palloc(sizeof(directory_fctx));
! 		fctx->location = convert_and_check_filename(PG_GETARG_TEXT_P(0));
  
  		fctx->dirdesc = AllocateDir(fctx->location);
  
--- 358,364 ----
  		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
  
  		fctx = palloc(sizeof(directory_fctx));
! 		fctx->location = convert_and_check_filename(PG_GETARG_TEXT_PP(0));
  
  		fctx->dirdesc = AllocateDir(fctx->location);
  
***************
*** 264,266 **** pg_ls_dir(PG_FUNCTION_ARGS)
--- 388,466 ----
  
  	SRF_RETURN_DONE(funcctx);
  }
+ 
+ /*
+  * Execute given SQL string.
+  */
+ void
+ execute_sql_string(const char *filename, const char *sql)
+ {
+ 	/*
+ 	 * We abuse some internal knowledge from spi.h here. As we don't know
+ 	 * which queries are going to get executed, we don't know what to expect
+ 	 * as an OK return code from SPI_execute().  We assume that
+ 	 * SPI_OK_CONNECT, SPI_OK_FINISH and SPI_OK_FETCH are quite improbable,
+ 	 * though, and the errors are negatives.  So a valid return code is
+ 	 * considered to be SPI_OK_UTILITY or anything from there.
+ 	 */
+ 	if (SPI_connect() != SPI_OK_CONNECT)
+ 		elog(ERROR, "SPI_connect failed");
+ 
+ 	if (SPI_execute(sql, false, 0) < SPI_OK_UTILITY)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATA_EXCEPTION),
+ 				 (filename != NULL
+ 				  ? errmsg("could not execute sql file: '%s'", filename)
+ 				  : errmsg("could not execute sql string"))));
+ 
+ 	if (SPI_finish() != SPI_OK_FINISH)
+ 		elog(ERROR, "SPI_finish failed");
+ }
+ 
+ /*
+  * Execute given SQL file's content in given encoding
+  */
+ void
+ execute_sql_file(const char *filename, const char *encoding_name)
+ {
+ 	execute_sql_string(filename,
+ 					   read_text_file_with_endoding(filename, encoding_name));
+ }
+ 
+ /*
+  * Execute SQL string.
+  */
+ Datum
+ pg_execute_sql_string(PG_FUNCTION_ARGS)
+ {
+ 	text	   *sql = PG_GETARG_TEXT_PP(0);
+ 
+ 	execute_sql_string(NULL, text_to_cstring(sql));
+ 	PG_RETURN_VOID();
+ }
+ 
+ /*
+  * Read a file, and execute the contents as SQL commands.
+  */
+ Datum
+ pg_execute_sql_file(PG_FUNCTION_ARGS)
+ {
+ 	char       *filename = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ 
+ 	execute_sql_file(filename, NULL);
+ 	PG_RETURN_VOID();
+ }
+ 
+ /*
+  * In addition to pg_execute_sql_file, convert the encoding of the contents
+  * to the database encoding.
+  */
+ Datum
+ pg_convert_and_execute_sql_file(PG_FUNCTION_ARGS)
+ {
+ 	char       *filename = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ 	char	   *encoding_name = NameStr(*PG_GETARG_NAME(1));
+ 
+ 	execute_sql_file(filename, encoding_name);
+ 	PG_RETURN_VOID();
+ }
*** a/src/backend/utils/adt/varlena.c
--- b/src/backend/utils/adt/varlena.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "miscadmin.h"
  #include "parser/scansup.h"
  #include "regex/regex.h"
+ #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/bytea.h"
  #include "utils/lsyscache.h"
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 3399,3412 **** DESCR("reload configuration files");
  DATA(insert OID = 2622 ( pg_rotate_logfile		PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ ));
  DESCR("rotate log file");
  
! DATA(insert OID = 2623 ( pg_stat_file		PGNSP PGUID 12 1 0 0 f f f t f v 1 0 2249 "25" "{25,20,1184,1184,1184,1184,16}" "{i,o,o,o,o,o,o}" "{filename,size,access,modification,change,creation,isdir}" _null_ pg_stat_file _null_ _null_ _null_ ));
  DESCR("return file information");
! DATA(insert OID = 2624 ( pg_read_file		PGNSP PGUID 12 1 0 0 f f f t f v 3 0 25 "25 20 20" _null_ _null_ _null_ _null_ pg_read_file _null_ _null_ _null_ ));
  DESCR("read text from a file");
! DATA(insert OID = 2625 ( pg_ls_dir			PGNSP PGUID 12 1 1000 0 f f f t t v 1 0 25 "25" _null_ _null_ _null_ _null_ pg_ls_dir _null_ _null_ _null_ ));
  DESCR("list all files in a directory");
! DATA(insert OID = 2626 ( pg_sleep			PGNSP PGUID 12 1 0 0 f f f t f v 1 0 2278 "701" _null_ _null_ _null_ _null_ pg_sleep _null_ _null_ _null_ ));
  DESCR("sleep for the specified time in seconds");
  
  DATA(insert OID = 2971 (  text				PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "16" _null_ _null_ _null_ _null_ booltext _null_ _null_ _null_ ));
  DESCR("convert boolean to text");
--- 3399,3424 ----
  DATA(insert OID = 2622 ( pg_rotate_logfile		PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ ));
  DESCR("rotate log file");
  
! DATA(insert OID = 2623 ( pg_stat_file			PGNSP PGUID 12 1 0 0 f f f t f v 1 0 2249 "25" "{25,20,1184,1184,1184,1184,16}" "{i,o,o,o,o,o,o}" "{filename,size,access,modification,change,creation,isdir}" _null_ pg_stat_file _null_ _null_ _null_ ));
  DESCR("return file information");
! DATA(insert OID = 2624 ( pg_read_file			PGNSP PGUID 12 1 0 0 f f f t f v 3 0 25 "25 20 20" _null_ _null_ _null_ _null_ pg_read_file _null_ _null_ _null_ ));
  DESCR("read text from a file");
! DATA(insert OID = 3935 ( pg_read_file			PGNSP PGUID 12 1 0 0 f f f t f v 2 0 25 "25 20" _null_ _null_ _null_ _null_ pg_read_whole_file _null_ _null_ _null_ ));
! DESCR("read text from a file");
! DATA(insert OID = 2625 ( pg_ls_dir				PGNSP PGUID 12 1 1000 0 f f f t t v 1 0 25 "25" _null_ _null_ _null_ _null_ pg_ls_dir _null_ _null_ _null_ ));
  DESCR("list all files in a directory");
! DATA(insert OID = 2626 ( pg_sleep				PGNSP PGUID 12 1 0 0 f f f t f v 1 0 2278 "701" _null_ _null_ _null_ _null_ pg_sleep _null_ _null_ _null_ ));
  DESCR("sleep for the specified time in seconds");
+ DATA(insert OID = 3930 ( pg_read_binary_file	PGNSP PGUID 12 1 0 0 f f f t f v 3 0 17 "25 20 20" _null_ _null_ _null_ _null_ pg_read_binary_file _null_ _null_ _null_ ));
+ DESCR("read bytea from a file");
+ DATA(insert OID = 3936 ( pg_read_binary_file	PGNSP PGUID 12 1 0 0 f f f t f v 2 0 17 "25 20" _null_ _null_ _null_ _null_ pg_read_whole_binary_file _null_ _null_ _null_ ));
+ DESCR("read bytea from a file");
+ DATA(insert OID = 3932 ( pg_execute_sql_string	PGNSP PGUID 12 1 0 0 f f f t f v 1 0 2278 "25" _null_ _null_ _null_ _null_ pg_execute_sql_string _null_ _null_ _null_ ));
+ DESCR("execute queries read from a string");
+ DATA(insert OID = 3927 ( pg_execute_sql_file	PGNSP PGUID 12 1 0 0 f f f t f v 1 0 2278 "25" _null_ _null_ _null_ _null_ pg_execute_sql_file _null_ _null_ _null_ ));
+ DESCR("execute queries read from a file");
+ DATA(insert OID = 3934 ( pg_execute_sql_file	PGNSP PGUID 12 1 0 0 f f f t f v 2 0 2278 "25 19" _null_ _null_ _null_ _null_ pg_convert_and_execute_sql_file _null_ _null_ _null_ ));
+ DESCR("execute queries read from a file");
  
  DATA(insert OID = 2971 (  text				PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "16" _null_ _null_ _null_ _null_ booltext _null_ _null_ _null_ ));
  DESCR("convert boolean to text");
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
***************
*** 442,448 **** extern Datum pg_relation_filepath(PG_FUNCTION_ARGS);
--- 442,455 ----
  /* genfile.c */
  extern Datum pg_stat_file(PG_FUNCTION_ARGS);
  extern Datum pg_read_file(PG_FUNCTION_ARGS);
+ extern Datum pg_read_whole_file(PG_FUNCTION_ARGS);
+ extern Datum pg_read_binary_file(PG_FUNCTION_ARGS);
+ extern Datum pg_read_whole_binary_file(PG_FUNCTION_ARGS);
  extern Datum pg_ls_dir(PG_FUNCTION_ARGS);
+ extern Datum replace_placeholders(PG_FUNCTION_ARGS);
+ extern Datum pg_execute_sql_string(PG_FUNCTION_ARGS);
+ extern Datum pg_execute_sql_file(PG_FUNCTION_ARGS);
+ extern Datum pg_convert_and_execute_sql_file(PG_FUNCTION_ARGS);
  
  /* misc.c */
  extern Datum current_database(PG_FUNCTION_ARGS);
*** /dev/null
--- b/src/include/utils/genfile.h
***************
*** 0 ****
--- 1,28 ----
+ /*--------------------------------------------------------------------
+  * genfile.h
+  *
+  * External declarations pertaining to backend/utils/misc/genfile.c
+  *
+  * Copyright (c) 2000-2010, PostgreSQL Global Development Group
+  *
+  * src/include/utils/genfile.h
+  *--------------------------------------------------------------------
+  */
+ #ifndef GENFILE_H
+ #define GENFILE_H
+ 
+ #include <dirent.h>
+ #include "utils/array.h"
+ 
+ typedef struct
+ {
+ 	char	   *location;
+ 	DIR		   *dirdesc;
+ } directory_fctx;
+ 
+ char *read_text_file_with_endoding(const char *filename,
+ 										const char *encoding_name);
+ void execute_sql_string(const char *filename, const char *sql);
+ void execute_sql_file(const char *filename, const char *encoding_name);
+ 
+ #endif   /* GENFILE_H */
*** a/src/test/regress/expected/strings.out
--- b/src/test/regress/expected/strings.out
***************
*** 1603,1605 **** SELECT encode(overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 5
--- 1603,1616 ----
   Th\000o\x02\x03
  (1 row)
  
+ --
+ -- test execute
+ --
+ SELECT pg_execute_sql_string($do$ DO $$BEGIN RAISE INFO 'foo'; END;$$ $do$);
+ INFO:  foo
+ CONTEXT:  SQL statement " DO $$BEGIN RAISE INFO 'foo'; END;$$ "
+  pg_execute_sql_string 
+ -----------------------
+  
+ (1 row)
+ 
*** a/src/test/regress/sql/strings.sql
--- b/src/test/regress/sql/strings.sql
***************
*** 552,554 **** SELECT btrim(E'\\000trim\\000'::bytea, ''::bytea);
--- 552,559 ----
  SELECT encode(overlay(E'Th\\000omas'::bytea placing E'Th\\001omas'::bytea from 2),'escape');
  SELECT encode(overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 8),'escape');
  SELECT encode(overlay(E'Th\\000omas'::bytea placing E'\\002\\003'::bytea from 5 for 3),'escape');
+ 
+ --
+ -- test execute
+ --
+ SELECT pg_execute_sql_string($do$ DO $$BEGIN RAISE INFO 'foo'; END;$$ $do$);
#2Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Dimitri Fontaine (#1)
Re: pg_execute_from_file, patch v10

On Sun, Dec 12, 2010 at 06:08, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

The other infrastructure patch that has been mark ready for commit then
commented further upon by Tom is $subject, even if the function provided
as been renamed to pg_execute_sql_file().

Please find attached the newer version that fixes Tom concerns, removing
the VARIADIC forms of the functions (those placeholders idea).

I think the version is almost OK, but I have a couple of comments:
- Why do you need directory_fctx in genfile.h ?
- It might be reasonable to have 3 and 1 arguments version of pg_read_file.
i.e, (path, offset, size) and (path). Two args version (path, offset)
doesn't seem to be so useful. In addition, CREATE EXTENSION will always
call it with offset=0, no?
- We don't need some of added #include "utils/array.h" anymore.

--
Itagaki Takahiro

#3Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Itagaki Takahiro (#2)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

I think the version is almost OK, but I have a couple of comments:
- Why do you need directory_fctx in genfile.h ?

I then use it in extension.c, this way:

typedef struct extension_fctx
{
directory_fctx dir;
ExtensionList *installed;
} extension_fctx;

- It might be reasonable to have 3 and 1 arguments version of pg_read_file.
i.e, (path, offset, size) and (path). Two args version (path, offset)
doesn't seem to be so useful. In addition, CREATE EXTENSION will always
call it with offset=0, no?

Depending on the 'relocatable' property, we now do either of those calls:

execute_sql_file(get_extension_absolute_path(control->script),
pg_encoding_to_char(encoding));

read_text_file_with_endoding(filename,
pg_encoding_to_char(encoding));

So we're using the internal forms only here, and we can propose whatever
API we find best. Reading through the end of the file seems common
enough, but I agree I would prefer reading the whole file here if I had
to pick only one.

- We don't need some of added #include "utils/array.h" anymore.

Ah yes, true.

Do you want another patch version from me?
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#4Robert Haas
robertmhaas@gmail.com
In reply to: Dimitri Fontaine (#3)
Re: pg_execute_from_file, patch v10

On Mon, Dec 13, 2010 at 9:36 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:

Do you want another patch version from me?

I'm looking at this patch and I'm confused. Why do we need this at
all? pg_read_binary_file() seems like it might be useful to somebody,
but I don't see what it has to do with extensions. And the rest of
this doesn't appear to provide any new functionality. The extension
mechanism hardly needs SQL-callable functions.

As a side note, this comment almost makes sense, but not quite:

+ /* Abuse knowledge that we're bytea and text are both varlena */

...but my real question is why any of this is necessary at all and
what it has to do with extensions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#4)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 10:53, Robert Haas <robertmhaas@gmail.com> wrote:

I'm looking at this patch and I'm confused.  Why do we need this at
all?  pg_read_binary_file() seems like it might be useful to somebody,
but I don't see what it has to do with extensions.  And the rest of
this doesn't appear to provide any new functionality.  The extension
mechanism hardly needs SQL-callable functions.

Hmm, I've expected that the EXTENSION patch would use the SQL functions
like as SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but
it actually uses internal functions and nested DirectFunctionCalls.
So, the most important part of this patch is allowing to read any
files in the server file system. The current pg_read_file() allows
to read only files under $PGDATA and pg_log.

However, the interface of current pg_read_file() is mis-designed
to read files in multi-byte encoding because
1. The encoding must be same with the server encoding.
2. Users need to specify correct offset in the file
not to split multi-byte characters.
So, it'd be better to improve pg_read_file() aside from EXTENSION anyway.
I think pg_read_whole_binary_file() is one of the solutions for the issue.

--
Itagaki Takahiro

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Itagaki Takahiro (#5)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

On Tue, Dec 14, 2010 at 10:53, Robert Haas <robertmhaas@gmail.com> wrote:

I'm looking at this patch and I'm confused.  Why do we need this at
all?  pg_read_binary_file() seems like it might be useful to somebody,
but I don't see what it has to do with extensions.  And the rest of
this doesn't appear to provide any new functionality.  The extension
mechanism hardly needs SQL-callable functions.

Hmm, I've expected that the EXTENSION patch would use the SQL functions
like as SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but
it actually uses internal functions and nested DirectFunctionCalls.
So, the most important part of this patch is allowing to read any
files in the server file system. The current pg_read_file() allows
to read only files under $PGDATA and pg_log.

Has anyone thought twice about the security implications of that?
Not to mention that in most cases, the very last thing we want is to
have to specify an exact full path?

I think we'd be better off insisting that the extension files be under
sharedir or some such place.

In any case, I concur with what I gather Robert is thinking, which is
that there is no good reason to be exposing any of this at the SQL level.

regards, tom lane

#7Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#5)
Re: pg_execute_from_file, patch v10

On Mon, Dec 13, 2010 at 9:41 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

Hmm, I've expected that the EXTENSION patch would use the SQL functions
like as SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but
it actually uses internal functions and nested DirectFunctionCalls.
So, the most important part of this patch is allowing to read any
files in the server file system. The current pg_read_file() allows
to read only files under $PGDATA and pg_log.

As Tom says, this is clearly not going to fly on security grounds.

However, the interface of current pg_read_file() is mis-designed
to read files in multi-byte encoding because
 1. The encoding must be same with the server encoding.
 2. Users need to specify correct offset in the file
    not to split multi-byte characters.
So, it'd be better to improve pg_read_file() aside from EXTENSION anyway.
I think pg_read_whole_binary_file() is one of the solutions for the issue.

I don't have any problem with a separate patch to try to improve some
of these issues, but this is supposedly part of the extensions work,
yet (1) most of what's here has little to do with extensions and (2)
extensions don't need this stuff exposed at the SQL level anyway. I'm
inclined to mark this patch as Returned with Feedback. The portions
of this patch that are trying to fix multi-byte encoding issues can be
submitted as a separate patch that just does that. Whatever material
here is relevant to extensions can either be resubmitted with all of
these other things taken out, or just get absorbed into the main
extensions patch if (as I suspect) there isn't enough there to warrant
a separate patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#7)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 12:02, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Dec 13, 2010 at 9:41 PM, Itagaki Takahiro

So, the most important part of this patch is allowing to read any
files in the server file system. The current pg_read_file() allows
to read only files under $PGDATA and pg_log.

As Tom says, this is clearly not going to fly on security grounds.

If it's a security hole, lo_import() should be also a hole
because we can use lo_import() and SELECT * FROM pg_largeobject
for the same purpose...

I don't have any problem with a separate patch to try to improve some
of these issues, but this is supposedly part of the extensions work,
yet (1) most of what's here has little to do with extensions and (2)
extensions don't need this stuff exposed at the SQL level anyway. I'm
inclined to mark this patch as Returned with Feedback.

If so, I'm not sure why we need to split the EXTENSION patch into sub pieces.
In my understanding, we did it because the sub pieces are also useful in
standalone. The requirement for the pieces was changed and extended in
discussions, but I hope the change will not be the reason to reject the patch.

--
Itagaki Takahiro

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Itagaki Takahiro (#8)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

On Tue, Dec 14, 2010 at 12:02, Robert Haas <robertmhaas@gmail.com> wrote:

As Tom says, this is clearly not going to fly on security grounds.

If it's a security hole, lo_import() should be also a hole
because we can use lo_import() and SELECT * FROM pg_largeobject
for the same purpose...

lo_import is superuser-only. If we design this feature so that it will
forever have to be superuser-only, to get a behavior that I think we
don't even *want*, I believe we're making a serious error.

regards, tom lane

#10Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#8)
Re: pg_execute_from_file, patch v10

On Mon, Dec 13, 2010 at 10:21 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

I don't have any problem with a separate patch to try to improve some
of these issues, but this is supposedly part of the extensions work,
yet (1) most of what's here has little to do with extensions and (2)
extensions don't need this stuff exposed at the SQL level anyway.  I'm
inclined to mark this patch as Returned with Feedback.

If so, I'm not sure why we need to split the EXTENSION patch into sub pieces.
In my understanding, we did it because the sub pieces are also useful in
standalone. The requirement for the pieces was changed and extended in
discussions, but I hope the change will not be the reason to reject the patch.

Well, I think it is best when a patch has just one purpose. This
seems to be sort of an odd hodge-podge of things.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Tom Lane (#9)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 12:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:

lo_import is superuser-only.  If we design this feature so that it will
forever have to be superuser-only, to get a behavior that I think we
don't even *want*, I believe we're making a serious error.

CREATE EXTENSION and pg_read_file() is also superuser-only, no?

--
Itagaki Takahiro

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Itagaki Takahiro (#11)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

On Tue, Dec 14, 2010 at 12:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:

lo_import is superuser-only.  If we design this feature so that it will
forever have to be superuser-only, to get a behavior that I think we
don't even *want*, I believe we're making a serious error.

CREATE EXTENSION and pg_read_file() is also superuser-only, no?

CREATE EXTENSION will be superuser to start with, no doubt, but I think
we'll someday want to allow it to database owners, just as happened with
CREATE LANGUAGE. Let's not build it on top of operations that
inherently involve security problems, especially when there's no need
to.

regards, tom lane

#13Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#6)
Re: pg_execute_from_file, patch v10

Tom Lane <tgl@sss.pgh.pa.us> writes:

Has anyone thought twice about the security implications of that?
Not to mention that in most cases, the very last thing we want is to
have to specify an exact full path?

Well, the security is left same as before, superuser only. And Itagaki
showed that superuser are allowed to read any file anywhere already, so
we didn't change anything here.

I think we'd be better off insisting that the extension files be under
sharedir or some such place.

That's the case, but the rework of genfile.c is more general than just
support for extension, or I wouldn't have been asked for a separate
patch, would I?

In any case, I concur with what I gather Robert is thinking, which is
that there is no good reason to be exposing any of this at the SQL level.

That used to be done this way, you know, in versions between 0 and 6 of
the patch. Starting at version 7, the underlyiong facilities have been
splitted and exposed, because of the file encoding and server encoding
issues reported by Itagaki.

I propose that more than 2 of you guys get in agreement on what the good
specs are and wake me up after that so that I spawn the right version of
the patch, and if necessary, revise it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#14Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#12)
Re: pg_execute_from_file, patch v10

Tom Lane <tgl@sss.pgh.pa.us> writes:

CREATE EXTENSION will be superuser to start with, no doubt, but I think
we'll someday want to allow it to database owners, just as happened with
CREATE LANGUAGE. Let's not build it on top of operations that
inherently involve security problems, especially when there's no need
to.

That boils down to moving the superuser() test in the right functions,
it's now in the innermost facility to read files. If you have something
precise enough for me to work on it, please say, but I guess you'd spend
less time making the copy/paste in the code rather than in the mail.
That schedule optimisation is for you to make, though.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#15Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Robert Haas (#10)
Re: pg_execute_from_file, patch v10

Robert Haas <robertmhaas@gmail.com> writes:

Well, I think it is best when a patch has just one purpose. This
seems to be sort of an odd hodge-podge of things.

The purpose here is clean-up the existing pg_read_file() facility so
that it's easy to build pg_execute_sql_file() on top of it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#16Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Dimitri Fontaine (#13)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 18:01, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

In any case, I concur with what I gather Robert is thinking, which is
that there is no good reason to be exposing any of this at the SQL level.

That used to be done this way, you know, in versions between 0 and 6 of
the patch. Starting at version 7, the underlyiong facilities have been
splitted and exposed, because of the file encoding and server encoding
issues reported by Itagaki.

I'm confused which part of the patch is the point of the discussion.
1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string/file()

As I pointed out, 1 is reasonable as long as we restrict the usage
only to superuser. If we think it is a security hole, there are
the same issue in lo_import() and COPY FROM by superuser.

2 is a *fix* for the badly-designed pg_read_file() interface.
It should have returned bytea rather than text.

3 could simplify later EXTENSION patches, but it might not be
a large help because we can just use SPI_exec() instead of them
if we write codes with C. I think the most useful parts of the
patch is reading a whole file with encoding, i.e., 1 and 2.

--
Itagaki Takahiro

#17Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#16)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 11:48 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

I'm confused which part of the patch is the point of the discussion.
 1. Relax pg_read_file() to be able to read any files.
 2. pg_read_binary_file()
 3. pg_execute_sql_string/file()

As I pointed out, 1 is reasonable as long as we restrict the usage
only to superuser. If we think it is a security hole, there are
the same issue in lo_import() and COPY FROM by superuser.

2 is a *fix* for the badly-designed pg_read_file() interface.
It should have returned bytea rather than text.

3 could simplify later EXTENSION patches, but it might not be
a large help because we can just use SPI_exec() instead of them
if we write codes with C.  I think the most useful parts of the
patch is reading a whole file with encoding, i.e., 1 and 2.

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already. We have no similar check for COPY, and both are
superuser-only. I also see that this is useful for the extensions
work, if that code wants to internally DirectFunctionCall to
pg_read_file.

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

I don't see why we need #3.

I think #4 is useful. I am not clear whether it is needed for the
extension stuff or not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#17)
Re: pg_execute_from_file, patch v10

Robert Haas <robertmhaas@gmail.com> writes:

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already.

That function would never have been accepted into core at all without a
locked-down range of how much of the filesystem it would let you get at.
There is nothing whatsoever in the extensions proposal that justifies
dropping that restriction. If you want to put it up as a separately
proposed, separately justified patch, go ahead ... but I'll vote against
it even then. (I will also point out that on SELinux-based systems,
relaxing the restriction would be completely useless anyway.)

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed. There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

BTW, it appears to me that pg_read_file expects server encoding not
client encoding. Minor detail only, but let's be clear what it is
we're talking about.

regards, tom lane

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already.

That function would never have been accepted into core at all without a
locked-down range of how much of the filesystem it would let you get at.

I have some angst about opening it up wide, but I'm also having a hard
time seeing what problem it creates that you can't already create with
COPY FROM or lo_import().

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed.  There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

BTW, it appears to me that pg_read_file expects server encoding not
client encoding.  Minor detail only, but let's be clear what it is
we're talking about.

OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#18)
Re: pg_execute_from_file, patch v10

Tom Lane <tgl@sss.pgh.pa.us> writes:

Robert Haas <robertmhaas@gmail.com> writes:

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already.

That function would never have been accepted into core at all without a
locked-down range of how much of the filesystem it would let you get at.

Ok. Previously pg_read_file() only allows absolute file names that point
into DataDir or into Log_directory. It used not to work in the first
versions of the extension's patch, but with the current code, the check
passes on a development install here: extension.c is only giving
genfile.c absolute file names.

Please note that debian will default to have DataDir in a different
place than the sharepath:

http://packages.debian.org/sid/amd64/postgresql-contrib-9.0/filelist

PGDATA: /var/lib/postgresql/9.1/main
sharepath: /usr/share/postgresql/9.1/contrib
libdir: /usr/lib/postgresql/9.1/lib

So I'm not sure how if it will play nice with such installs, or if
there's already some genfile.c patching on debian.

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed. There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

Well, in fact, the extension's code is using either execute_sql_file()
or read_text_file_with_endoding() then @extschema@ replacement then
execute_sql_string(), all those functions called directly thanks to
#include "utils/genfile.h". No DirectFunctionCall'ing, we can easily
remove SQL callable forms.

So what we need is 2, 3 and 4 (because 4 builds on 2).

BTW, it appears to me that pg_read_file expects server encoding not
client encoding. Minor detail only, but let's be clear what it is
we're talking about.

Hence the refactoring in the patch. Ask Itagaki for details with funny
environments using some file encoding that does not exists in the server
yet ain't client_encoding and can't be. I didn't follow the use case in
details, but he was happy with the current way of doing things and not
with any previous one.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#21Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Dimitri Fontaine (#20)
Re: pg_execute_from_file, patch v10

On Wed, Dec 15, 2010 at 04:39, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

Well, in fact, the extension's code is using either execute_sql_file()
or read_text_file_with_endoding() then @extschema@ replacement then
execute_sql_string(), all those functions called directly thanks to
#include "utils/genfile.h". No DirectFunctionCall'ing, we can easily
remove SQL callable forms.

So what we need is 2, 3 and 4 (because 4 builds on 2).

No, 3 is not needed. You can use SPI_exec() directly instead of
exporting execute_sql_string().

--
Itagaki Takahiro

#22Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#19)
Re: pg_execute_from_file, patch v10

On Wed, Dec 15, 2010 at 03:42, Robert Haas <robertmhaas@gmail.com> wrote:

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed.  There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

BTW, it appears to me that pg_read_file expects server encoding not
client encoding.  Minor detail only, but let's be clear what it is
we're talking about.

EXTENSION will use #2 with convert_from() for $4 like this:

Datum sql = replace(
convert_from(pg_read_binary_file($path), $encoding),
'@extschema@', $schema);
SPI_exec(TextDatumGetCString(sql));

I think it is a more flexible solution than adding 'encoding'
parameter to pg_read_file().

--
Itagaki Takahiro

#23Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#22)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 9:25 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Wed, Dec 15, 2010 at 03:42, Robert Haas <robertmhaas@gmail.com> wrote:

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed.  There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

BTW, it appears to me that pg_read_file expects server encoding not
client encoding.  Minor detail only, but let's be clear what it is
we're talking about.

EXTENSION will use #2 with convert_from() for $4 like this:

 Datum sql = replace(
               convert_from(pg_read_binary_file($path), $encoding),
               '@extschema@', $schema);
 SPI_exec(TextDatumGetCString(sql));

I think it is a more flexible solution than adding 'encoding'
parameter to pg_read_file().

It seems like pg_read_binary_file() is good to have regardless of
whatever else we decide to do here. Should we pull that part out and
commit it separately?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#24Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#23)
Re: pg_execute_from_file, patch v10

On Wed, Dec 15, 2010 at 12:20, Robert Haas <robertmhaas@gmail.com> wrote:

It seems like pg_read_binary_file() is good to have regardless of
whatever else we decide to do here.  Should we pull that part out and
commit it separately?

OK, I'll do that, but I have some questions:
#1 Should we add 'whole' versions of read functions in Dimitri's work?
#2 Should we allow additional directories? In the discussion,
no restriction seems to be a bad idea. But EXTENSION requires
to read PGSHARE or some system directories?

#2 can be added separately from the first change,
but I'd like to add #1 at the same time if required.

Or, if we're planning not to use pg_read_file functions in the
EXTENSION patch, we don't need #2 anyway.

--
Itagaki Takahiro

#25Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#24)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 10:43 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Wed, Dec 15, 2010 at 12:20, Robert Haas <robertmhaas@gmail.com> wrote:

It seems like pg_read_binary_file() is good to have regardless of
whatever else we decide to do here.  Should we pull that part out and
commit it separately?

OK, I'll do that, but I have some questions:
 #1 Should we add 'whole' versions of read functions in Dimitri's work?
 #2 Should we allow additional directories? In the discussion,
   no restriction seems to be a bad idea. But EXTENSION requires
   to read PGSHARE or some system directories?

#2 can be added separately from the first change,
but I'd like to add #1 at the same time if required.

Or, if we're planning not to use pg_read_file functions in the
EXTENSION patch, we don't need #2 anyway.

I think it's still unclear what we want to do about #2, so let's focus
on the parts we are most certain about first.

The whole-file versions seem like a good idea - my only hesitation is,
I'm not sure why we didn't include that functionality originally. It
seems obviously useful, so does that mean that it was omitted on
purpose for some reason?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#26Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#25)
1 attachment(s)
Re: pg_execute_from_file, patch v10

On Wed, Dec 15, 2010 at 12:55, Robert Haas <robertmhaas@gmail.com> wrote:

It seems like pg_read_binary_file() is good to have regardless of
whatever else we decide to do here.  Should we pull that part out and
commit it separately?

The whole-file versions seem like a good idea - my only hesitation is,
I'm not sure why we didn't include that functionality originally.  It
seems obviously useful, so does that mean that it was omitted on
purpose for some reason?

I applied the attached patch extracted from Dimitri's work.
One difference is 'offset' argument is removed from 'whole' mode.
So, we'll have (path, offset, length) and (path) versions.

Checking with convert_and_check_filename is left as-is.

--
Itagaki Takahiro

Attachments:

pg_read_binary_file.patchapplication/octet-stream; name=pg_read_binary_file.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 21f1ddf..7c1ba9d 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*************** postgres=# SELECT * FROM pg_xlogfile_nam
*** 14449,14461 ****
        </row>
        <row>
         <entry>
!         <literal><function>pg_read_file(<parameter>filename</> <type>text</>, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</>)</function></literal>
         </entry>
         <entry><type>text</type></entry>
         <entry>Return the contents of a text file</entry>
        </row>
        <row>
         <entry>
          <literal><function>pg_stat_file(<parameter>filename</> <type>text</>)</function></literal>
         </entry>
         <entry><type>record</type></entry>
--- 14449,14468 ----
        </row>
        <row>
         <entry>
!         <literal><function>pg_read_file(<parameter>filename</> <type>text</> [, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</>])</function></literal>
         </entry>
         <entry><type>text</type></entry>
         <entry>Return the contents of a text file</entry>
        </row>
        <row>
         <entry>
+         <literal><function>pg_read_binary_file(<parameter>filename</> <type>text</> [, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</>])</function></literal>
+        </entry>
+        <entry><type>bytea</type></entry>
+        <entry>Return the contents of a file</entry>
+       </row>
+       <row>
+        <entry>
          <literal><function>pg_stat_file(<parameter>filename</> <type>text</>)</function></literal>
         </entry>
         <entry><type>record</type></entry>
*************** postgres=# SELECT * FROM pg_xlogfile_nam
*** 14482,14487 ****
--- 14489,14510 ----
      at the given <parameter>offset</>, returning at most <parameter>length</>
      bytes (less if the end of file is reached first).  If <parameter>offset</>
      is negative, it is relative to the end of the file.
+     When <parameter>offset</> and <parameter>length</> parameters are omitted,
+     it returns the whole of the file.
+     The part of a file must be a valid text in the server encoding.
+    </para>
+ 
+    <indexterm>
+     <primary>pg_read_binary_file</primary>
+    </indexterm>
+    <para>
+     <function>pg_read_binary_file</> returns part of a file as like as
+     <function>pg_read_file</>, but the result is a bytea value.
+     One of the usages is to read a file in the specified encoding combined with
+     <function>convert_from</> function:
+ <programlisting>
+ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
+ </programlisting>    
     </para>
  
     <indexterm>
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index e8a36ed..e921250 100644
*** a/src/backend/utils/adt/genfile.c
--- b/src/backend/utils/adt/genfile.c
*************** convert_and_check_filename(text *arg)
*** 80,94 ****
  
  
  /*
!  * Read a section of a file, returning it as text
   */
! Datum
! pg_read_file(PG_FUNCTION_ARGS)
  {
! 	text	   *filename_t = PG_GETARG_TEXT_P(0);
! 	int64		seek_offset = PG_GETARG_INT64(1);
! 	int64		bytes_to_read = PG_GETARG_INT64(2);
! 	char	   *buf;
  	size_t		nbytes;
  	FILE	   *file;
  	char	   *filename;
--- 80,93 ----
  
  
  /*
!  * Read a section of a file, returning it as bytea
!  *
!  * We read the whole of the file when bytes_to_read is nagative.
   */
! static bytea *
! read_binary_file(text *filename_t, int64 seek_offset, int64 bytes_to_read)
  {
! 	bytea	   *buf;
  	size_t		nbytes;
  	FILE	   *file;
  	char	   *filename;
*************** pg_read_file(PG_FUNCTION_ARGS)
*** 100,105 ****
--- 99,127 ----
  
  	filename = convert_and_check_filename(filename_t);
  
+ 	if (bytes_to_read < 0)
+ 	{
+ 		if (seek_offset < 0)
+ 			bytes_to_read = -seek_offset;
+ 		else
+ 		{
+ 			struct stat fst;
+ 
+ 			if (stat(filename, &fst) < 0)
+ 				ereport(ERROR,
+ 						(errcode_for_file_access(),
+ 						 errmsg("could not stat file \"%s\": %m", filename)));
+ 
+ 			bytes_to_read = fst.st_size - seek_offset;
+ 		}
+ 	}
+ 
+ 	/* not sure why anyone thought that int64 length was a good idea */
+ 	if (bytes_to_read > (MaxAllocSize - VARHDRSZ))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 				 errmsg("requested length too large")));
+ 
  	if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
  		ereport(ERROR,
  				(errcode_for_file_access(),
*************** pg_read_file(PG_FUNCTION_ARGS)
*** 112,129 ****
  				(errcode_for_file_access(),
  				 errmsg("could not seek in file \"%s\": %m", filename)));
  
! 	if (bytes_to_read < 0)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length cannot be negative")));
! 
! 	/* not sure why anyone thought that int64 length was a good idea */
! 	if (bytes_to_read > (MaxAllocSize - VARHDRSZ))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length too large")));
! 
! 	buf = palloc((Size) bytes_to_read + VARHDRSZ);
  
  	nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
  
--- 134,140 ----
  				(errcode_for_file_access(),
  				 errmsg("could not seek in file \"%s\": %m", filename)));
  
! 	buf = (bytea *) palloc((Size) bytes_to_read + VARHDRSZ);
  
  	nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
  
*************** pg_read_file(PG_FUNCTION_ARGS)
*** 132,146 ****
  				(errcode_for_file_access(),
  				 errmsg("could not read file \"%s\": %m", filename)));
  
- 	/* Make sure the input is valid */
- 	pg_verifymbstr(VARDATA(buf), nbytes, false);
- 
  	SET_VARSIZE(buf, nbytes + VARHDRSZ);
  
  	FreeFile(file);
  	pfree(filename);
  
! 	PG_RETURN_TEXT_P(buf);
  }
  
  /*
--- 143,228 ----
  				(errcode_for_file_access(),
  				 errmsg("could not read file \"%s\": %m", filename)));
  
  	SET_VARSIZE(buf, nbytes + VARHDRSZ);
  
  	FreeFile(file);
  	pfree(filename);
  
! 	return buf;
! }
! 
! /*
!  * In addition to read_binary_file, verify whether the contents are encoded
!  * in the database encoding.
!  */
! static text *
! read_text_file(text *filename, int64 seek_offset, int64 bytes_to_read)
! {
! 	bytea *buf = read_binary_file(filename, seek_offset, bytes_to_read);
! 
! 	/* Make sure the input is valid */
! 	pg_verifymbstr(VARDATA(buf), VARSIZE(buf) - VARHDRSZ, false);
! 
! 	/* OK, we can cast it as text safely */
! 	return (text *) buf;
! }
! 
! /*
!  * Read a section of a file, returning it as text
!  */
! Datum
! pg_read_file(PG_FUNCTION_ARGS)
! {
! 	text	   *filename_t = PG_GETARG_TEXT_P(0);
! 	int64		seek_offset = PG_GETARG_INT64(1);
! 	int64		bytes_to_read = PG_GETARG_INT64(2);
! 
! 	if (bytes_to_read < 0)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length cannot be negative")));
! 
! 	PG_RETURN_TEXT_P(read_text_file(filename_t, seek_offset, bytes_to_read));
! }
! 
! /*
!  * Read the whole of a file, returning it as text
!  */
! Datum
! pg_read_file_all(PG_FUNCTION_ARGS)
! {
! 	text	   *filename_t = PG_GETARG_TEXT_P(0);
! 
! 	PG_RETURN_TEXT_P(read_text_file(filename_t, 0, -1));
! }
! 
! /*
!  * Read a section of a file, returning it as bytea
!  */
! Datum
! pg_read_binary_file(PG_FUNCTION_ARGS)
! {
! 	text	   *filename_t = PG_GETARG_TEXT_P(0);
! 	int64		seek_offset = PG_GETARG_INT64(1);
! 	int64		bytes_to_read = PG_GETARG_INT64(2);
! 
! 	if (bytes_to_read < 0)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 				 errmsg("requested length cannot be negative")));
! 
! 	PG_RETURN_BYTEA_P(read_binary_file(filename_t, seek_offset, bytes_to_read));
! }
! 
! /*
!  * Read the whole of a file, returning it as bytea
!  */
! Datum
! pg_read_binary_file_all(PG_FUNCTION_ARGS)
! {
! 	text	   *filename_t = PG_GETARG_TEXT_P(0);
! 
! 	PG_RETURN_BYTEA_P(read_binary_file(filename_t, 0, -1));
  }
  
  /*
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index feae22e..1e6e75f 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 2623 ( pg_stat_file		P
*** 3403,3408 ****
--- 3403,3414 ----
  DESCR("return file information");
  DATA(insert OID = 2624 ( pg_read_file		PGNSP PGUID 12 1 0 0 f f f t f v 3 0 25 "25 20 20" _null_ _null_ _null_ _null_ pg_read_file _null_ _null_ _null_ ));
  DESCR("read text from a file");
+ DATA(insert OID = 3826 ( pg_read_file		PGNSP PGUID 12 1 0 0 f f f t f v 1 0 25 "25" _null_ _null_ _null_ _null_ pg_read_file_all _null_ _null_ _null_ ));
+ DESCR("read text from a file");
+ DATA(insert OID = 3827 ( pg_read_binary_file	PGNSP PGUID 12 1 0 0 f f f t f v 3 0 17 "25 20 20" _null_ _null_ _null_ _null_ pg_read_binary_file _null_ _null_ _null_ ));
+ DESCR("read bytea from a file");
+ DATA(insert OID = 3828 ( pg_read_binary_file	PGNSP PGUID 12 1 0 0 f f f t f v 1 0 17 "25" _null_ _null_ _null_ _null_ pg_read_binary_file_all _null_ _null_ _null_ ));
+ DESCR("read bytea from a file");
  DATA(insert OID = 2625 ( pg_ls_dir			PGNSP PGUID 12 1 1000 0 f f f t t v 1 0 25 "25" _null_ _null_ _null_ _null_ pg_ls_dir _null_ _null_ _null_ ));
  DESCR("list all files in a directory");
  DATA(insert OID = 2626 ( pg_sleep			PGNSP PGUID 12 1 0 0 f f f t f v 1 0 2278 "701" _null_ _null_ _null_ _null_ pg_sleep _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index a2fb749..1888e31 100644
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
*************** extern Datum pg_relation_filepath(PG_FUN
*** 442,447 ****
--- 442,450 ----
  /* genfile.c */
  extern Datum pg_stat_file(PG_FUNCTION_ARGS);
  extern Datum pg_read_file(PG_FUNCTION_ARGS);
+ extern Datum pg_read_file_all(PG_FUNCTION_ARGS);
+ extern Datum pg_read_binary_file(PG_FUNCTION_ARGS);
+ extern Datum pg_read_binary_file_all(PG_FUNCTION_ARGS);
  extern Datum pg_ls_dir(PG_FUNCTION_ARGS);
  
  /* misc.c */
#27Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Itagaki Takahiro (#26)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

I applied the attached patch extracted from Dimitri's work.

Thanks! I'm working on next extension's patch, merging now.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support