[PATCH] plpythonu datatype conversion improvements

Started by Caleb Welton over 16 years ago, 25 messages
#1 Caleb Welton
cwelton@greenplum.com

Patch for plpythonu

The primary motivation for the attached patch is to support bytea conversion that preserves embedded null bytes, which in turn makes it possible to use Python's marshal module from PL/Python.
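To illustrate why a cstring-based conversion cannot carry embedded null bytes, here is a minimal standalone sketch. It is not code from the patch: copy_as_cstring and copy_with_length are hypothetical helpers that only contrast a strlen()-driven copy with a copy whose length is supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * cstring-style copy (hypothetical helper): the length is discovered
 * with strlen(), so the copy stops at the first NUL byte in src.
 * Returns the number of bytes copied.
 */
size_t
copy_as_cstring(const char *src, char *dst, size_t cap)
{
    size_t len = strlen(src);

    assert(len <= cap);
    memcpy(dst, src, len);
    return len;
}

/*
 * Length-aware copy (hypothetical helper): the caller supplies the
 * length, so NUL bytes are treated as ordinary payload.
 * Returns the number of bytes copied.
 */
size_t
copy_with_length(const char *src, size_t len, char *dst, size_t cap)
{
    assert(len <= cap);
    memcpy(dst, src, len);
    return len;
}
```

Given the five bytes 'a', 'b', NUL, 'c', 'd', the first helper copies only the two bytes before the NUL, while the second copies all five. Output from marshal.dumps() routinely contains NUL bytes, which is why the length has to travel with the data.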

A secondary motivation is slightly better performance in the conversion routines for basic datatypes that map directly between PostgreSQL and Python.

The main design change is to base the conversion routines on Datums instead of cstrings, e.g.:
PLyBool_FromString(const char *) => PLyBool_FromBool(PLyDatumToOb *, Datum);
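The shape of that design, a per-type converter chosen once and then called with its own descriptor plus the raw value, can be sketched outside PostgreSQL roughly as follows. Everything here is a stand-in: DemoDatum, DemoValue, DemoConverter and the demo_* functions are hypothetical names for illustration, not PostgreSQL or PL/Python APIs.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t DemoDatum;    /* hypothetical stand-in for Datum */

typedef enum { DEMO_BOOL, DEMO_INT32 } DemoTypeId;
typedef enum { VAL_BOOL, VAL_LONG } ValueKind;

/* hypothetical stand-in for a PyObject: a tagged value */
typedef struct
{
    ValueKind kind;
    long      num;
} DemoValue;

struct DemoConverter;
typedef DemoValue (*DemoConvertFunc) (struct DemoConverter *, DemoDatum);

typedef struct DemoConverter
{
    DemoConvertFunc func;       /* selected once per type */
    DemoTypeId      type;
} DemoConverter;

/* Convert directly from the binary value; no string round trip. */
DemoValue
demo_from_bool(DemoConverter *conv, DemoDatum d)
{
    DemoValue v = { VAL_BOOL, d != 0 };

    (void) conv;                /* unused in this converter */
    return v;
}

DemoValue
demo_from_int32(DemoConverter *conv, DemoDatum d)
{
    DemoValue v = { VAL_LONG, (long) (int32_t) d };

    (void) conv;                /* unused in this converter */
    return v;
}

/* Pick the converter by type, in the spirit of a switch on type OID. */
void
demo_converter_init(DemoConverter *conv, DemoTypeId type)
{
    conv->type = type;
    switch (type)
    {
        case DEMO_BOOL:
            conv->func = demo_from_bool;
            break;
        case DEMO_INT32:
            conv->func = demo_from_int32;
            break;
    }
}
```

A caller would initialise a DemoConverter once and then invoke conv.func(&conv, value) for each value. Passing the descriptor lets a generic fallback converter reach per-type state, such as the type's output function, without changing the signature.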

Thanks,
Caleb

-----

Index: plpython.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/plpython.c,v
retrieving revision 1.120
diff -c -r1.120 plpython.c
*** plpython.c    3 Apr 2009 16:59:42 -0000    1.120
--- plpython.c    26 May 2009 22:58:52 -0000
***************
*** 78,84 ****
   * objects.
   */

! typedef PyObject *(*PLyDatumToObFunc) (const char *);

  typedef struct PLyDatumToOb
  {
--- 78,85 ----
   * objects.
   */

! struct PLyDatumToOb;
! typedef PyObject *(*PLyDatumToObFunc) (struct PLyDatumToOb*, Datum);

  typedef struct PLyDatumToOb
  {
***************
*** 104,111 ****
--- 105,120 ----
  /* convert PyObject to a Postgresql Datum or tuple.
   * output from Python
   */
+
+ struct PLyObToDatum;
+ struct PLyProcedure;
+ typedef Datum (*PLyObToDatumFunc) (struct PLyProcedure*,
+                                    struct PLyObToDatum*,
+                                    PyObject *, bool *isnull);
+
  typedef struct PLyObToDatum
  {
+     PLyObToDatumFunc func;
      FmgrInfo    typfunc;        /* The type's input function */
      Oid            typoid;            /* The OID of the type */
      Oid            typioparam;
***************
*** 255,270 ****
  static void PLy_input_tuple_funcs(PLyTypeInfo *, TupleDesc);

  /* conversion functions */
  static PyObject *PLyDict_FromTuple(PLyTypeInfo *, HeapTuple, TupleDesc);
! static PyObject *PLyBool_FromString(const char *);
! static PyObject *PLyFloat_FromString(const char *);
! static PyObject *PLyInt_FromString(const char *);
! static PyObject *PLyLong_FromString(const char *);
! static PyObject *PLyString_FromString(const char *);
!
! static HeapTuple PLyMapping_ToTuple(PLyTypeInfo *, PyObject *);
! static HeapTuple PLySequence_ToTuple(PLyTypeInfo *, PyObject *);
! static HeapTuple PLyObject_ToTuple(PLyTypeInfo *, PyObject *);

  /*
   * Currently active plpython function
--- 264,295 ----
  static void PLy_input_tuple_funcs(PLyTypeInfo *, TupleDesc);
  /* conversion functions */
+ static PyObject *PLyBool_FromBool(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyInt_FromInt16(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyInt_FromInt32(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyLong_FromInt64(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyString_FromText(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyString_FromDatum(PLyDatumToOb *arg, Datum d);
+
  static PyObject *PLyDict_FromTuple(PLyTypeInfo *, HeapTuple, TupleDesc);
!
! static Datum PLyObject_ToVoid(PLyProcedure *, PLyObToDatum *,
!                               PyObject *, bool *isnull);
! static Datum PLyObject_ToBool(PLyProcedure *, PLyObToDatum *,
!                               PyObject *, bool *isnull);
! static Datum PLyObject_ToBytea(PLyProcedure *, PLyObToDatum *,
!                                PyObject *, bool *isnull);
! static Datum PLyObject_ToText(PLyProcedure *, PLyObToDatum *,
!                               PyObject *, bool *isnull);
! static Datum PLyObject_ToDatum(PLyProcedure *, PLyObToDatum *,
!                                PyObject *, bool *isnull);
!
! static HeapTuple PLyMapping_ToTuple(PLyProcedure *, PyObject *);
! static HeapTuple PLySequence_ToTuple(PLyProcedure *, PyObject *);
! static HeapTuple PLyObject_ToTuple(PLyProcedure *, PyObject *);

  /*
   * Currently active plpython function
***************
*** 507,514 ****

          for (i = 0; i < natts; i++)
          {
-             char       *src;
-
              platt = PyList_GetItem(plkeys, i);
              if (!PyString_Check(platt))
                  ereport(ERROR,
--- 532,537 ----
***************
*** 533,564 ****
                  modvalues[i] = (Datum) 0;
                  modnulls[i] = 'n';
              }
!             else if (plval != Py_None)
              {
!                 plstr = PyObject_Str(plval);
!                 if (!plstr)
!                     PLy_elog(ERROR, "could not compute string representation of Python object in PL/Python function \"%s\" while modifying trigger row",
!                              proc->proname);
!                 src = PyString_AsString(plstr);
!
!                 modvalues[i] =
!                     InputFunctionCall(&proc->result.out.r.atts[atti].typfunc,
!                                       src,
!                                     proc->result.out.r.atts[atti].typioparam,
!                                       tupdesc->attrs[atti]->atttypmod);
!                 modnulls[i] = ' ';
!
!                 Py_DECREF(plstr);
!                 plstr = NULL;
!             }
!             else
!             {
!                 modvalues[i] =
!                     InputFunctionCall(&proc->result.out.r.atts[atti].typfunc,
!                                       NULL,
!                                     proc->result.out.r.atts[atti].typioparam,
!                                       tupdesc->attrs[atti]->atttypmod);
!                 modnulls[i] = 'n';
              }
              Py_DECREF(plval);
--- 556,565 ----
                  modvalues[i] = (Datum) 0;
                  modnulls[i] = 'n';
              }
!             else
              {
!                 PLyObToDatum *att = &proc->result.out.r.atts[atti];
!                 modvalues[i] = (att->func) (proc, att, plval, &modnulls[i]);
              }

              Py_DECREF(plval);
***************
*** 784,791 ****
      Datum        rv;
      PyObject   *volatile plargs = NULL;
      PyObject   *volatile plrv = NULL;
-     PyObject   *volatile plrv_so = NULL;
-     char       *plrv_sc;

      PG_TRY();
      {
--- 785,790 ----
***************
*** 862,868 ****

Py_XDECREF(plargs);
Py_XDECREF(plrv);
- Py_XDECREF(plrv_so);

PLy_function_delete_args(proc);

--- 861,866 ----
***************
*** 876,922 ****
              }
          }

! /*
! * If the function is declared to return void, the Python return value
! * must be None. For void-returning functions, we also treat a None
! * return value as a special "void datum" rather than NULL (as is the
! * case for non-void-returning functions).
! */
! if (proc->result.out.d.typoid == VOIDOID)
! {
! if (plrv != Py_None)
! ereport(ERROR,
! (errcode(ERRCODE_DATATYPE_MISMATCH),
! errmsg("PL/Python function with return type \"void\" did not return None")));
!
! fcinfo->isnull = false;
! rv = (Datum) 0;
! }
! else if (plrv == Py_None)
! {
! fcinfo->isnull = true;
! if (proc->result.is_rowtype < 1)
! rv = InputFunctionCall(&proc->result.out.d.typfunc,
! NULL,
! proc->result.out.d.typioparam,
! -1);
! else
! /* Tuple as None */
! rv = (Datum) NULL;
! }
! else if (proc->result.is_rowtype >= 1)
{
HeapTuple tuple = NULL;

! if (PySequence_Check(plrv))
/* composite type as sequence (tuple, list etc) */
! tuple = PLySequence_ToTuple(&proc->result, plrv);
else if (PyMapping_Check(plrv))
/* composite type as mapping (currently only dict) */
! tuple = PLyMapping_ToTuple(&proc->result, plrv);
else
/* returned as smth, must provide method __getattr__(name) */
! tuple = PLyObject_ToTuple(&proc->result, plrv);

              if (tuple != NULL)
              {
--- 874,895 ----
              }
          }

!         /* Convert python return value into postgres datatypes */
!         if (proc->result.is_rowtype >= 1)
          {
              HeapTuple    tuple = NULL;

!             if (plrv == Py_None)
!                 tuple = NULL;
!             else if (PySequence_Check(plrv))
                  /* composite type as sequence (tuple, list etc) */
!                 tuple = PLySequence_ToTuple(proc, plrv);
              else if (PyMapping_Check(plrv))
                  /* composite type as mapping (currently only dict) */
!                 tuple = PLyMapping_ToTuple(proc, plrv);
              else
                  /* returned as smth, must provide method __getattr__(name) */
!                 tuple = PLyObject_ToTuple(proc, plrv);

              if (tuple != NULL)
              {
***************
*** 931,952 ****
          }
          else
          {
!             fcinfo->isnull = false;
!             plrv_so = PyObject_Str(plrv);
!             if (!plrv_so)
!                 PLy_elog(ERROR, "could not create string representation of Python object in PL/Python function \"%s\" while creating return value", proc->proname);
!             plrv_sc = PyString_AsString(plrv_so);
!             rv = InputFunctionCall(&proc->result.out.d.typfunc,
!                                    plrv_sc,
!                                    proc->result.out.d.typioparam,
!                                    -1);
          }
      }
      PG_CATCH();
      {
          Py_XDECREF(plargs);
          Py_XDECREF(plrv);
-         Py_XDECREF(plrv_so);

          PG_RE_THROW();
      }
--- 904,919 ----
          }
          else
          {
!             rv = (proc->result.out.d.func) (proc,
!                                             &proc->result.out.d,
!                                             plrv,
!                                             &fcinfo->isnull);
          }
      }
      PG_CATCH();
      {
          Py_XDECREF(plargs);
          Py_XDECREF(plrv);

          PG_RE_THROW();
      }
***************
*** 954,960 ****

      Py_XDECREF(plargs);
      Py_DECREF(plrv);
-     Py_XDECREF(plrv_so);

      return rv;
  }
--- 921,926 ----
***************
*** 1037,1048 ****
                      arg = NULL;
                  else
                  {
!                     char       *ct;
!
!                     ct = OutputFunctionCall(&(proc->args[i].in.d.typfunc),
!                                             fcinfo->arg[i]);
!                     arg = (proc->args[i].in.d.func) (ct);
!                     pfree(ct);
                  }
              }
--- 1003,1010 ----
                      arg = NULL;
                  else
                  {
!                     arg = (proc->args[i].in.d.func) (&(proc->args[i].in.d),
!                                                      fcinfo->arg[i]);
                  }
              }
***************
*** 1593,1598 ****
--- 1555,1589 ----
      arg->typoid = HeapTupleGetOid(typeTup);
      arg->typioparam = getTypeIOParam(typeTup);
      arg->typbyval = typeStruct->typbyval;
+
+     /* Determine which kind of Python object we will convert to */
+     switch (arg->typoid)
+     {
+         case VOIDOID:
+             arg->func = PLyObject_ToVoid;
+             break;
+         case BOOLOID:
+             arg->func = PLyObject_ToBool;
+             break;
+         case BYTEAOID:
+             arg->func = PLyObject_ToBytea;
+             break;
+         case BPCHAROID:
+         case VARCHAROID:
+         case TEXTOID:
+             arg->func = PLyObject_ToText;
+             break;
+
+         case FLOAT4OID:
+         case FLOAT8OID:
+         case NUMERICOID:
+         case INT2OID:
+         case INT4OID:
+         case INT8OID:
+         default:
+             arg->func = PLyObject_ToDatum;
+             break;
+     }
  }

  static void
***************
*** 1619,1644 ****
      switch (typeOid)
      {
          case BOOLOID:
!             arg->func = PLyBool_FromString;
              break;
          case FLOAT4OID:
          case FLOAT8OID:
          case NUMERICOID:
!             arg->func = PLyFloat_FromString;
              break;
          case INT2OID:
          case INT4OID:
!             arg->func = PLyInt_FromString;
              break;
          case INT8OID:
!             arg->func = PLyLong_FromString;
              break;
          default:
!             arg->func = PLyString_FromString;
              break;
      }
  }

  static void
  PLy_typeinfo_init(PLyTypeInfo * arg)
  {
--- 1610,1648 ----
      switch (typeOid)
      {
          case BOOLOID:
!             arg->func = PLyBool_FromBool;
              break;
          case FLOAT4OID:
+             arg->func = PLyFloat_FromFloat4;
+             break;
          case FLOAT8OID:
+             arg->func = PLyFloat_FromFloat8;
+             break;
          case NUMERICOID:
!             arg->func = PLyFloat_FromNumeric;
              break;
          case INT2OID:
+             arg->func = PLyInt_FromInt16;
+             break;
          case INT4OID:
!             arg->func = PLyInt_FromInt32;
              break;
          case INT8OID:
!             arg->func = PLyLong_FromInt64;
!             break;
!         case BPCHAROID:
!         case VARCHAROID:
!         case TEXTOID:
!         case BYTEAOID:
!             arg->func = PLyString_FromText;
              break;
          default:
!             arg->func = PLyString_FromDatum;
              break;
      }
  }

+
  static void
  PLy_typeinfo_init(PLyTypeInfo * arg)
  {
***************
*** 1660,1716 ****
}
}

- /* assumes that a bool is always returned as a 't' or 'f' */
static PyObject *
! PLyBool_FromString(const char *src)
{
/*
* We would like to use Py_RETURN_TRUE and Py_RETURN_FALSE here for
* generating SQL from trigger functions, but those are only supported in
* Python >= 2.3, and we support older versions.
* http://docs.python.org/api/boolObjects.html
*/
! if (src[0] == 't')
return PyBool_FromLong(1);
! return PyBool_FromLong(0);
}

static PyObject *
! PLyFloat_FromString(const char *src)
{
! double v;
! char *eptr;

! errno = 0;
! v = strtod(src, &eptr);
! if (*eptr != '\0' || errno)
! return NULL;
! return PyFloat_FromDouble(v);
}

static PyObject *
! PLyInt_FromString(const char *src)
{
! long v;
! char *eptr;

! errno = 0;
! v = strtol(src, &eptr, 0);
! if (*eptr != '\0' || errno)
! return NULL;
! return PyInt_FromLong(v);
}

static PyObject *
! PLyLong_FromString(const char *src)
{
! return PyLong_FromString((char *) src, NULL, 0);
}

static PyObject *
! PLyString_FromString(const char *src)
{
! return PyString_FromString(src);
}

  static PyObject *
--- 1664,1758 ----
      }
  }

  static PyObject *
! PLyBool_FromBool(PLyDatumToOb *arg, Datum d)
  {
+     bool x = DatumGetBool(d);
+     arg = 0; /* unused */
+
      /*
       * We would like to use Py_RETURN_TRUE and Py_RETURN_FALSE here for
       * generating SQL from trigger functions, but those are only supported in
       * Python >= 2.3, and we support older versions.
       * http://docs.python.org/api/boolObjects.html
       */
!     if (x)
          return PyBool_FromLong(1);
!     else
!         return PyBool_FromLong(0);
  }

  static PyObject *
! PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d)
  {
!     arg = 0; /* unused */
!     return PyFloat_FromDouble(DatumGetFloat4(d));
! }

! static PyObject *
! PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d)
! {
!     arg = 0; /* unused */
!     return PyFloat_FromDouble(DatumGetFloat8(d));
  }

  static PyObject *
! PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d)
  {
!     /*
!      * Numeric is cast to a PyFloat:
!      * This results in a loss of precision
!      * Would it be better to cast to PyString?
!      */
!     Datum f = DirectFunctionCall1(numeric_float8, d);
!     double x = DatumGetFloat8(f);
!     arg = 0; /* unused */
!     return PyFloat_FromDouble(x);
! }

! static PyObject *
! PLyInt_FromInt16(PLyDatumToOb *arg, Datum d)
! {
!     arg = 0; /* unused */
!     return PyInt_FromLong(DatumGetInt16(d));
  }

  static PyObject *
! PLyInt_FromInt32(PLyDatumToOb *arg, Datum d)
  {
!     arg = 0; /* unused */
!     return PyInt_FromLong(DatumGetInt32(d));
  }

  static PyObject *
! PLyLong_FromInt64(PLyDatumToOb *arg, Datum d)
  {
!     arg = 0; /* unused */
!
!     /* on 32 bit platforms "long" may be too small */
!     if (sizeof(int64) > sizeof(long))
!         return PyLong_FromLongLong(DatumGetInt64(d));
!     else
!         return PyLong_FromLong(DatumGetInt64(d));
! }
!
! static PyObject *
! PLyString_FromText(PLyDatumToOb *arg, Datum d)
! {
!     text *txt = DatumGetTextP(d);
!     char *str = VARDATA(txt);
!     size_t size = VARSIZE(txt) - VARHDRSZ;
!
!     return PyString_FromStringAndSize(str, size);
! }
!
! static PyObject *
! PLyString_FromDatum(PLyDatumToOb *arg, Datum d)
! {
!     char *x = OutputFunctionCall(&arg->typfunc, d);
!     PyObject *r = PyString_FromString(x);
!     pfree(x);
!     return r;
  }

  static PyObject *
***************
*** 1730,1737 ****
      {
          for (i = 0; i < info->in.r.natts; i++)
          {
!             char       *key,
!                        *vsrc;
              Datum        vattr;
              bool        is_null;
              PyObject   *value;
--- 1772,1778 ----
      {
          for (i = 0; i < info->in.r.natts; i++)
          {
!             char       *key;
              Datum        vattr;
              bool        is_null;
              PyObject   *value;
***************
*** 1746,1759 ****
                  PyDict_SetItemString(dict, key, Py_None);
              else
              {
!                 vsrc = OutputFunctionCall(&info->in.r.atts[i].typfunc,
!                                           vattr);
!
!                 /*
!                  * no exceptions allowed
!                  */
!                 value = info->in.r.atts[i].func(vsrc);
!                 pfree(vsrc);
                  PyDict_SetItemString(dict, key, value);
                  Py_DECREF(value);
              }
--- 1787,1793 ----
                  PyDict_SetItemString(dict, key, Py_None);
              else
              {
!                 value = (info->in.r.atts[i].func) (&info->in.r.atts[i], vattr);
                  PyDict_SetItemString(dict, key, value);
                  Py_DECREF(value);
              }
***************
*** 1769,1777 ****
      return dict;
  }
  static HeapTuple
! PLyMapping_ToTuple(PLyTypeInfo * info, PyObject * mapping)
  {
      TupleDesc    desc;
      HeapTuple    tuple;
--- 1803,2017 ----
      return dict;
  }
+ static Datum
+ PLyObject_ToVoid(PLyProcedure *proc,
+                  PLyObToDatum *arg,
+                  PyObject *plrv,
+                  bool *isnull)
+ {
+     /*
+      * If the function is declared to return void, the Python return value must
+      * be None.  For void-returning functions, we also treat a None return value
+      * as a special "void datum" rather than NULL (as is the case for the
+      * non-void-returning functions).
+      */
+     if (plrv != Py_None)
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+                  errmsg("PL/Python function with return type \"void\" did not "
+                         "return None")));
+
+     *isnull = false;
+     return (Datum) 0;
+ }
+
+ static Datum
+ PLyObject_ToBool(PLyProcedure *proc,
+                  PLyObToDatum *arg,
+                  PyObject *plrv,
+                  bool *isnull)
+ {
+     bool rv;
+
+     if (plrv == Py_None)
+     {
+         *isnull = true;
+         return (Datum) 0;
+     }
+
+     rv = PyObject_IsTrue(plrv);
+     *isnull = false;
+     return BoolGetDatum(rv);
+ }
+
+
+ static Datum
+ PLyObject_ToBytea(PLyProcedure *proc,
+                   PLyObToDatum *arg,
+                   PyObject *plrv,
+                   bool *isnull)
+ {
+     PyObject   *volatile plrv_so = NULL;
+     Datum       rv;
+
+     if (plrv == Py_None)
+     {
+         *isnull = true;
+         return (Datum) 0;
+     }
+
+     plrv_so = PyObject_Str(plrv);
+     if (!plrv_so)
+     {
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+                  errmsg("could not create string representation of Python "
+                         "object in PL/Python function \"%s\" while creating "
+                         "return value", proc->proname)));
+     }
+
+     PG_TRY();
+     {
+         char *plrv_sc = PyString_AsString(plrv_so);
+         size_t len = PyString_Size(plrv_so);
+         size_t size = len + VARHDRSZ;
+         bytea *result = (bytea*) palloc(size);
+
+         SET_VARSIZE(result, size);
+         memcpy(VARDATA(result), plrv_sc, len);
+         rv = PointerGetDatum(result);
+     }
+     PG_CATCH();
+     {
+         Py_XDECREF(plrv_so);
+         PG_RE_THROW();
+     }
+     PG_END_TRY();
+
+     Py_XDECREF(plrv_so);
+
+     *isnull = false;
+     return rv;
+ }
+
+ static Datum
+ PLyObject_ToText(PLyProcedure *proc,
+                  PLyObToDatum *arg,
+                  PyObject *plrv,
+                  bool *isnull)
+ {
+     PyObject   *volatile plrv_so = NULL;
+     Datum       rv;
+
+     if (plrv == Py_None)
+     {
+         *isnull = true;
+         return (Datum) 0;
+     }
+
+     plrv_so = PyObject_Str(plrv);
+     if (!plrv_so)
+     {
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+                  errmsg("could not create string representation of Python "
+                         "object in PL/Python function \"%s\" while creating "
+                         "return value", proc->proname)));
+     }
+
+     PG_TRY();
+     {
+         char *plrv_sc = PyString_AsString(plrv_so);
+         size_t len    = PyString_Size(plrv_so);
+         size_t size   = len + VARHDRSZ;
+         text *result;
+
+         if (strlen(plrv_sc) != (size_t) len)
+         {
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("PL/Python function \"%s\" could not convert "
+                             "Python object into text: expected string without "
+                             "null bytes", proc->proname)));
+         }
+
+         result = (text *) palloc(size);
+         SET_VARSIZE(result, size);
+         memcpy(VARDATA(result), plrv_sc, len);
+         rv = PointerGetDatum(result);
+     }
+     PG_CATCH();
+     {
+         Py_XDECREF(plrv_so);
+         PG_RE_THROW();
+     }
+     PG_END_TRY();
+
+     Py_XDECREF(plrv_so);
+
+     *isnull = false;
+     return rv;
+ }
+
+ /*
+  * Generic conversion function:
+  *  - Cast PyObject to cstring and cstring into postgres type.
+  */
+ static Datum
+ PLyObject_ToDatum(PLyProcedure *proc,
+                   PLyObToDatum *arg,
+                   PyObject *plrv,
+                   bool *isnull)
+ {
+     PyObject *volatile plrv_so = NULL;
+     Datum     rv;
+
+     if (plrv == Py_None)
+     {
+         *isnull = true;
+         return (Datum) 0;
+     }
+
+     plrv_so = PyObject_Str(plrv);
+     if (!plrv_so)
+     {
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+                  errmsg("could not create string representation of Python "
+                         "object in PL/Python function \"%s\" while creating "
+                         "return value", proc->proname)));
+     }
+
+     PG_TRY();
+     {
+         char *plrv_sc = PyString_AsString(plrv_so);
+         size_t len    = PyString_Size(plrv_so);
+
+         if (strlen(plrv_sc) != (size_t) len)
+         {
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("PL/Python function \"%s\" could not convert "
+                             "Python object into cstring: expected string without "
+                             "null bytes", proc->proname)));
+         }
+         rv = InputFunctionCall(&arg->typfunc, plrv_sc, arg->typioparam, -1);
+     }
+     PG_CATCH();
+     {
+         Py_XDECREF(plrv_so);
+         PG_RE_THROW();
+     }
+     PG_END_TRY();
+
+     Py_XDECREF(plrv_so);
+
+     *isnull = false;
+     return rv;
+ }

  static HeapTuple
! PLyMapping_ToTuple(PLyProcedure *proc, PyObject *mapping)
  {
      TupleDesc    desc;
      HeapTuple    tuple;
***************
*** 1781,1840 ****

Assert(PyMapping_Check(mapping));

! desc = lookup_rowtype_tupdesc(info->out.d.typoid, -1);
! if (info->is_rowtype == 2)
! PLy_output_tuple_funcs(info, desc);
! Assert(info->is_rowtype == 1);

/* Build tuple */
values = palloc(sizeof(Datum) * desc->natts);
nulls = palloc(sizeof(bool) * desc->natts);
for (i = 0; i < desc->natts; ++i)
{
! char *key;
! PyObject *volatile value,
! *volatile so;

key = NameStr(desc->attrs[i]->attname);
! value = so = NULL;
PG_TRY();
{
value = PyMapping_GetItemString(mapping, key);
! if (value == Py_None)
{
- values[i] = (Datum) NULL;
- nulls[i] = true;
- }
- else if (value)
- {
- char *valuestr;
-
- so = PyObject_Str(value);
- if (so == NULL)
- PLy_elog(ERROR, "could not compute string representation of Python object");
- valuestr = PyString_AsString(so);
-
- values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
- ,valuestr
- ,info->out.r.atts[i].typioparam
- ,-1);
- Py_DECREF(so);
- so = NULL;
- nulls[i] = false;
- }
- else
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("key \"%s\" not found in mapping", key),
errhint("To return null in a column, "
! "add the value None to the mapping with the key named after the column.")));

              Py_XDECREF(value);
              value = NULL;
          }
          PG_CATCH();
          {
-             Py_XDECREF(so);
              Py_XDECREF(value);
              PG_RE_THROW();
          }
--- 2021,2062 ----

      Assert(PyMapping_Check(mapping));

!     desc = lookup_rowtype_tupdesc(proc->result.out.d.typoid, -1);
!     if (proc->result.is_rowtype == 2)
!         PLy_output_tuple_funcs(&proc->result, desc);
!     Assert(proc->result.is_rowtype == 1);

      /* Build tuple */
      values = palloc(sizeof(Datum) * desc->natts);
      nulls = palloc(sizeof(bool) * desc->natts);
      for (i = 0; i < desc->natts; ++i)
      {
!         char       *key;
!         PLyObToDatum *att;
!         PyObject   *volatile value;

+         att = &proc->result.out.r.atts[i];
          key = NameStr(desc->attrs[i]->attname);
!         value = NULL;
          PG_TRY();
          {
              value = PyMapping_GetItemString(mapping, key);
!             if (!value)
              {
                  ereport(ERROR,
                          (errcode(ERRCODE_UNDEFINED_COLUMN),
                           errmsg("key \"%s\" not found in mapping", key),
                           errhint("To return null in a column, "
!                                  "add the value None to the mapping with the "
!                                  "key named after the column.")));
!             }
!             values[i] = (att->func) (proc, att, value, &nulls[i]);

              Py_XDECREF(value);
              value = NULL;
          }
          PG_CATCH();
          {
              Py_XDECREF(value);
              PG_RE_THROW();
          }
***************
*** 1851,1857 ****

  static HeapTuple
! PLySequence_ToTuple(PLyTypeInfo * info, PyObject * sequence)
  {
      TupleDesc    desc;
      HeapTuple    tuple;
--- 2073,2079 ----

  static HeapTuple
! PLySequence_ToTuple(PLyProcedure *proc, PyObject *sequence)
  {
      TupleDesc    desc;
      HeapTuple    tuple;
***************
*** 1866,1922 ****
* can ignore exceeding items or assume missing ones as null but to avoid
* plpython developer's errors we are strict here
*/
! desc = lookup_rowtype_tupdesc(info->out.d.typoid, -1);
if (PySequence_Length(sequence) != desc->natts)
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("length of returned sequence did not match number of columns in row")));

! if (info->is_rowtype == 2)
! PLy_output_tuple_funcs(info, desc);
! Assert(info->is_rowtype == 1);

/* Build tuple */
values = palloc(sizeof(Datum) * desc->natts);
nulls = palloc(sizeof(bool) * desc->natts);
for (i = 0; i < desc->natts; ++i)
{
! PyObject *volatile value,
! *volatile so;

! value = so = NULL;
PG_TRY();
{
value = PySequence_GetItem(sequence, i);
Assert(value);
! if (value == Py_None)
! {
! values[i] = (Datum) NULL;
! nulls[i] = true;
! }
! else if (value)
! {
! char *valuestr;
!
! so = PyObject_Str(value);
! if (so == NULL)
! PLy_elog(ERROR, "could not compute string representation of Python object");
! valuestr = PyString_AsString(so);
! values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
! ,valuestr
! ,info->out.r.atts[i].typioparam
! ,-1);
! Py_DECREF(so);
! so = NULL;
! nulls[i] = false;
! }

              Py_XDECREF(value);
              value = NULL;
          }
          PG_CATCH();
          {
-             Py_XDECREF(so);
              Py_XDECREF(value);
              PG_RE_THROW();
          }
--- 2088,2124 ----
       * can ignore exceeding items or assume missing ones as null but to avoid
       * plpython developer's errors we are strict here
       */
!     desc = lookup_rowtype_tupdesc(proc->result.out.d.typoid, -1);
      if (PySequence_Length(sequence) != desc->natts)
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
          errmsg("length of returned sequence did not match number of columns in row")));

!     if (proc->result.is_rowtype == 2)
!         PLy_output_tuple_funcs(&proc->result, desc);
!     Assert(proc->result.is_rowtype == 1);

      /* Build tuple */
      values = palloc(sizeof(Datum) * desc->natts);
      nulls = palloc(sizeof(bool) * desc->natts);
      for (i = 0; i < desc->natts; ++i)
      {
!         PLyObToDatum *att;
!         PyObject   *volatile value;

!         att = &proc->result.out.r.atts[i];
!         value = NULL;
          PG_TRY();
          {
              value = PySequence_GetItem(sequence, i);
              Assert(value);
!             values[i] = (att->func) (proc, att, value, &nulls[i]);

              Py_XDECREF(value);
              value = NULL;
          }
          PG_CATCH();
          {
              Py_XDECREF(value);
              PG_RE_THROW();
          }
***************
*** 1933,1939 ****

  static HeapTuple
! PLyObject_ToTuple(PLyTypeInfo * info, PyObject * object)
  {
      TupleDesc    desc;
      HeapTuple    tuple;
--- 2135,2141 ----

  static HeapTuple
! PLyObject_ToTuple(PLyProcedure *proc, PyObject *object)
  {
      TupleDesc    desc;
      HeapTuple    tuple;
***************
*** 1941,1962 ****
bool *nulls;
volatile int i;

! desc = lookup_rowtype_tupdesc(info->out.d.typoid, -1);
! if (info->is_rowtype == 2)
! PLy_output_tuple_funcs(info, desc);
! Assert(info->is_rowtype == 1);

/* Build tuple */
values = palloc(sizeof(Datum) * desc->natts);
nulls = palloc(sizeof(bool) * desc->natts);
for (i = 0; i < desc->natts; ++i)
{
! char *key;
! PyObject *volatile value,
! *volatile so;

          key = NameStr(desc->attrs[i]->attname);
!         value = so = NULL;
          PG_TRY();
          {
              value = PyObject_GetAttrString(object, key);
--- 2143,2165 ----
      bool       *nulls;
      volatile int i;

!     desc = lookup_rowtype_tupdesc(proc->result.out.d.typoid, -1);
!     if (proc->result.is_rowtype == 2)
!         PLy_output_tuple_funcs(&proc->result, desc);
!     Assert(proc->result.is_rowtype == 1);

      /* Build tuple */
      values = palloc(sizeof(Datum) * desc->natts);
      nulls = palloc(sizeof(bool) * desc->natts);
      for (i = 0; i < desc->natts; ++i)
      {
!         char       *key;
!         PLyObToDatum *att;
!         PyObject   *volatile value;
+         att = &proc->result.out.r.atts[i];
          key = NameStr(desc->attrs[i]->attname);
!         value = NULL;
          PG_TRY();
          {
              value = PyObject_GetAttrString(object, key);
***************
*** 1965,2000 ****
                  values[i] = (Datum) NULL;
                  nulls[i] = true;
              }
!             else if (value)
              {
-                 char       *valuestr;
-
-                 so = PyObject_Str(value);
-                 if (so == NULL)
-                     PLy_elog(ERROR, "could not compute string representation of Python object");
-                 valuestr = PyString_AsString(so);
-                 values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
-                                               ,valuestr
-                                               ,info->out.r.atts[i].typioparam
-                                               ,-1);
-                 Py_DECREF(so);
-                 so = NULL;
-                 nulls[i] = false;
-             }
-             else
                  ereport(ERROR,
                          (errcode(ERRCODE_UNDEFINED_COLUMN),
!                          errmsg("attribute \"%s\" does not exist in Python object", key),
                           errhint("To return null in a column, "
!                                  "let the returned object have an attribute named "
!                                  "after column with value None.")));
              Py_XDECREF(value);
              value = NULL;
          }
          PG_CATCH();
          {
-             Py_XDECREF(so);
              Py_XDECREF(value);
              PG_RE_THROW();
          }
--- 2168,2190 ----
                  values[i] = (Datum) NULL;
                  nulls[i] = true;
              }
!             else if (!value)
              {
                  ereport(ERROR,
                          (errcode(ERRCODE_UNDEFINED_COLUMN),
!                          errmsg("key \"%s\" not found in object", key),
                           errhint("To return null in a column, "
!                                  "add the value None to the mapping with the "
!                                  "key named after the column.")));
!             }
!             else
!                 values[i] = (att->func) (proc, att, value, &nulls[i]);
              Py_XDECREF(value);
              value = NULL;
          }
          PG_CATCH();
          {
              Py_XDECREF(value);
              PG_RE_THROW();
          }
Index: expected/plpython_function.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/expected/plpython_function.out,v
retrieving revision 1.12
diff -c -r1.12 plpython_function.out
*** expected/plpython_function.out    3 Apr 2009 16:59:42 -0000    1.12
--- expected/plpython_function.out    26 May 2009 22:58:52 -0000
***************
*** 450,452 ****
--- 450,470 ----
  CREATE FUNCTION test_inout_params(first inout text) AS $$
  return first + '_inout';
  $$ LANGUAGE plpythonu;
+ CREATE FUNCTION test_type_conversion_bool(x bool) returns bool AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_char(x char) returns char AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int2(x int2) returns int2 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int4(x int4) returns int4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int8(x int8) returns int8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float4(x float4) returns float4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float8(x float8) returns float8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_numeric(x numeric) returns numeric AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_text(x text) returns text AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_bytea(x bytea) returns bytea AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_marshal() returns bytea AS $$
+ import marshal
+ return marshal.dumps('hello world')
+ $$ language plpythonu;
+ CREATE FUNCTION test_type_unmarshal(x bytea) returns text AS $$
+ import marshal
+ return marshal.loads(x)
+ $$ language plpythonu;
Index: expected/plpython_test.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/expected/plpython_test.out,v
retrieving revision 1.8
diff -c -r1.8 plpython_test.out
*** expected/plpython_test.out    3 Apr 2009 16:59:42 -0000    1.8
--- expected/plpython_test.out    26 May 2009 22:58:52 -0000
***************
*** 559,561 ****
--- 559,693 ----
   test_in_inout
  (1 row)
+ SELECT * FROM test_type_conversion_bool(true);
+  test_type_conversion_bool
+ ---------------------------
+  t
+ (1 row)
+
+ SELECT * FROM test_type_conversion_bool(false);
+  test_type_conversion_bool
+ ---------------------------
+  f
+ (1 row)
+
+ SELECT * FROM test_type_conversion_bool(null);
+  test_type_conversion_bool
+ ---------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_char('a');
+  test_type_conversion_char
+ ---------------------------
+  a
+ (1 row)
+
+ SELECT * FROM test_type_conversion_char(null);
+  test_type_conversion_char
+ ---------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_int2(100::int2);
+  test_type_conversion_int2
+ ---------------------------
+                        100
+ (1 row)
+
+ SELECT * FROM test_type_conversion_int2(null);
+  test_type_conversion_int2
+ ---------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_int4(100);
+  test_type_conversion_int4
+ ---------------------------
+                        100
+ (1 row)
+
+ SELECT * FROM test_type_conversion_int4(null);
+  test_type_conversion_int4
+ ---------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_int8(100);
+  test_type_conversion_int8
+ ---------------------------
+                        100
+ (1 row)
+
+ SELECT * FROM test_type_conversion_int8(null);
+  test_type_conversion_int8
+ ---------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_float4(100);
+  test_type_conversion_float4
+ -----------------------------
+                          100
+ (1 row)
+
+ SELECT * FROM test_type_conversion_float4(null);
+  test_type_conversion_float4
+ -----------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_float8(100);
+  test_type_conversion_float8
+ -----------------------------
+                          100
+ (1 row)
+
+ SELECT * FROM test_type_conversion_float8(null);
+  test_type_conversion_float8
+ -----------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_numeric(100);
+  test_type_conversion_numeric
+ ------------------------------
+                         100.0
+ (1 row)
+
+ SELECT * FROM test_type_conversion_numeric(null);
+  test_type_conversion_numeric
+ ------------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_text('hello world');
+  test_type_conversion_text
+ ---------------------------
+  hello world
+ (1 row)
+
+ SELECT * FROM test_type_conversion_text(null);
+  test_type_conversion_text
+ ---------------------------
+
+ (1 row)
+
+ SELECT * FROM test_type_conversion_bytea('hello world');
+  test_type_conversion_bytea
+ ----------------------------
+  hello world
+ (1 row)
+
+ SELECT * FROM test_type_conversion_bytea(null);
+  test_type_conversion_bytea
+ ----------------------------
+
+ (1 row)
+
+ SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
+  test_type_unmarshal
+ ---------------------
+  hello world
+ (1 row)
+
Index: sql/plpython_function.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/sql/plpython_function.sql,v
retrieving revision 1.12
diff -c -r1.12 plpython_function.sql
*** sql/plpython_function.sql    3 Apr 2009 16:59:43 -0000    1.12
--- sql/plpython_function.sql    26 May 2009 22:58:52 -0000
***************
*** 497,499 ****
--- 497,518 ----
  CREATE FUNCTION test_inout_params(first inout text) AS $$
  return first + '_inout';
  $$ LANGUAGE plpythonu;
+
+ CREATE FUNCTION test_type_conversion_bool(x bool) returns bool AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_char(x char) returns char AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int2(x int2) returns int2 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int4(x int4) returns int4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int8(x int8) returns int8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float4(x float4) returns float4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float8(x float8) returns float8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_numeric(x numeric) returns numeric AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_text(x text) returns text AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_bytea(x bytea) returns bytea AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_marshal() returns bytea AS $$
+ import marshal
+ return marshal.dumps('hello world')
+ $$ language plpythonu;
+ CREATE FUNCTION test_type_unmarshal(x bytea) returns text AS $$
+ import marshal
+ return marshal.loads(x)
+ $$ language plpythonu;
Index: sql/plpython_test.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/sql/plpython_test.sql,v
retrieving revision 1.5
diff -c -r1.5 plpython_test.sql
*** sql/plpython_test.sql    3 Apr 2009 16:59:43 -0000    1.5
--- sql/plpython_test.sql    26 May 2009 22:58:52 -0000
***************
*** 149,151 ****
--- 149,176 ----
  -- this doesn't work yet :-(
  SELECT * FROM test_in_out_params_multi('test_in');
  SELECT * FROM test_inout_params('test_in');
+
+ SELECT * FROM test_type_conversion_bool(true);
+ SELECT * FROM test_type_conversion_bool(false);
+ SELECT * FROM test_type_conversion_bool(null);
+ SELECT * FROM test_type_conversion_char('a');
+ SELECT * FROM test_type_conversion_char(null);
+ SELECT * FROM test_type_conversion_int2(100::int2);
+ SELECT * FROM test_type_conversion_int2(null);
+ SELECT * FROM test_type_conversion_int4(100);
+ SELECT * FROM test_type_conversion_int4(null);
+ SELECT * FROM test_type_conversion_int8(100);
+ SELECT * FROM test_type_conversion_int8(null);
+ SELECT * FROM test_type_conversion_float4(100);
+ SELECT * FROM test_type_conversion_float4(null);
+ SELECT * FROM test_type_conversion_float8(100);
+ SELECT * FROM test_type_conversion_float8(null);
+ SELECT * FROM test_type_conversion_numeric(100);
+ SELECT * FROM test_type_conversion_numeric(null);
+ SELECT * FROM test_type_conversion_text('hello world');
+ SELECT * FROM test_type_conversion_text(null);
+ SELECT * FROM test_type_conversion_bytea('hello world');
+ SELECT * FROM test_type_conversion_bytea(null);
+ SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
+
+
#2 Peter Eisentraut
peter_e@gmx.net
In reply to: Caleb Welton (#1)
Re: [PATCH] plpythonu datatype conversion improvements

On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote:

Patch for plpythonu

Primary motivation of the attached patch is to support handling bytea
conversion allowing for embedded nulls, which in turn allows for supporting
the marshal module.

Secondary motivation is slightly improved performance for conversion
routines of basic datatypes that have simple mappings between
postgres/python.

Primary design is to change the conversion routines from being based on
cstrings to datums, eg: PLyBool_FromString(const char *) =>
PLyBool_FromBool(PLyDatumToOb, Datum);

Makes sense; please add it to the next commit fest.

Are there any compatibility implications, that is, do any of the conversions
work differently from before (except when they were broken before, as in the
case of bytea)?

#3 Caleb Welton
cwelton@greenplum.com
In reply to: Peter Eisentraut (#2)
Re: [PATCH] plpythonu datatype conversion improvements

All data types should map to the same Python object types as they did before: int32->PyInt, int64->PyLong, numeric->PyFloat, etc.

The conversion routines are slightly different. For example, int32 is now initialized via PyInt_FromLong() instead of first converting the integer to a string and then calling PyInt_FromString. This is a little faster but shouldn't result in differences; if anything, it should be more correct.

Previously numeric->string->PyFloat_FromString, now numeric->double->PyFloat_FromDouble, which makes use of postgres numeric->double routines rather than python string->double routines, and it is conceivable that there are precision variations between the two. My own feeling on the matter is that PyFloat is the wrong mapping for numeric, but I didn't want to muddy this patch by changing that.
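
A minimal sketch of the precision concern, assuming Python 3 (the thread itself targets Python 2) and a made-up numeric value. It does not compare the two conversion routes; it only shows that a float cannot carry all the digits a numeric can, whichever route produces it:

```python
# Hypothetical numeric value, as the text Postgres would print for it.
numeric_text = "1234567890.12345678901234567890"

# A Python float (PyFloat) is an IEEE 754 double: about 15-17 significant digits.
as_float = float(numeric_text)

# repr() yields the shortest string that round-trips to the same double,
# so digits beyond double precision do not survive.
round_tripped = repr(as_float)
digits_preserved = (round_tripped == numeric_text)

print(round_tripped)
print("all digits preserved:", digits_preserved)
```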

The main compatibility issue is with Python strings that contain embedded nulls. Conversion to bytea will now work correctly and build the bytea using PyString_FromStringAndSize instead of PyString_FromString. Other datatypes will now raise an error if the PyString contains embedded nulls; previously they were silently truncated.
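
The difference between the two behaviours can be sketched in plain Python 3. This is an illustration only: the helper names `read_as_c_string` and `reject_embedded_nul` are hypothetical and do not appear in the patch, and the payload is a made-up value.

```python
def read_as_c_string(data: bytes) -> bytes:
    """Mimic a NUL-terminated (strlen-style) read: stop at the first NUL."""
    return data.split(b"\x00", 1)[0]


def reject_embedded_nul(data: bytes) -> bytes:
    """Mimic the stricter check: refuse data whose C-string length differs."""
    if len(read_as_c_string(data)) != len(data):
        raise ValueError("expected string without null bytes")
    return data


payload = b"hello\x00world"  # made-up value with one embedded NUL

print(read_as_c_string(payload))  # a length-unaware read drops the tail
print(len(payload))               # a length-aware read sees every byte
```

A length-aware copy (pointer plus size) is what bytea needs, while the length comparison is one way to turn silent truncation into an error for the other types.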

Thanks for the comments,
Caleb

On 5/27/09 3:11 AM, "Peter Eisentraut" <peter_e@gmx.net> wrote:

On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote:

Patch for plpythonu

Primary motivation of the attached patch is to support handling bytea
conversion allowing for embedded nulls, which in turn allows for supporting
the marshal module.

Secondary motivation is slightly improved performance for conversion
routines of basic datatypes that have simple mappings between
postgres/python.

Primary design is to change the conversion routines from being based on
cstrings to datums, eg: PLyBool_FromString(const char *) =>
PLyBool_FromBool(PLyDatumToOb, Datum);

Makes sense; please add it to the next commit fest.

Are there any compatibility implications, that is, do any of the conversions
work differently from before (except when they were broken before, as in the
case of bytea)?

#4 Peter Eisentraut
peter_e@gmx.net
In reply to: Caleb Welton (#3)
Re: [PATCH] plpythonu datatype conversion improvements

On Wednesday 27 May 2009 21:53:31 Caleb Welton wrote:

Previously numeric->string->PyFloat_FromString, now
numeric->double->PyFloat_FromDouble, which makes use of postgres
numeric->double routines rather than python string->double routines, and it
is conceivable that there are precision variations between the two. My own
feeling on the matter is that PyFloat is the wrong mapping for numeric, but
I didn't want to muddy this patch by changing that.

Yeah, that one had me wondering for a while as well, but as you say it is
better to address that separately.

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: [PATCH] plpythonu datatype conversion improvements

Peter Eisentraut <peter_e@gmx.net> writes:

On Wednesday 27 May 2009 21:53:31 Caleb Welton wrote:

... My own
feeling on the matter is that PyFloat is the wrong mapping for numeric, but
I didn't want to muddy this patch by changing that.

Yeah, that one had me wondering for a while as well, but as you say it is
better to address that separately.

That was making me itch as well, in my very cursory look at the patch.
Does Python have a saner mapping for it?

regards, tom lane

#6 Caleb Welton
cwelton@greenplum.com
In reply to: Tom Lane (#5)
Re: [PATCH] plpythonu datatype conversion improvements

Yes, in Python >= 2.4 there is the Decimal datatype.

However, unlike the other mappings employed by plpythonu, Decimal requires an import statement to be in scope.
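
For illustration, a short sketch of what using Decimal looks like, assuming Python 3 and made-up values. Decimal lives in the standard library's `decimal` module, which is why the import is needed:

```python
from decimal import Decimal  # not a builtin name: must be imported explicitly

numeric_text = "1234567890.12345678901234567890"  # made-up numeric value

# Constructing from a string keeps every digit, including the trailing zero.
exact = Decimal(numeric_text)

# Decimal arithmetic avoids the binary rounding seen with floats.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(str(exact) == numeric_text)
print(float_sum == 0.3, decimal_sum == Decimal("0.3"))
```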

-Caleb

On 5/27/09 2:07 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On Wednesday 27 May 2009 21:53:31 Caleb Welton wrote:

... My own
feeling on the matter is that PyFloat is the wrong mapping for numeric, but
I didn't want to muddy this patch by changing that.

Yeah, that one had me wondering for a while as well, but as you say it is
better to address that separately.

That was making me itch as well, in my very cursory look at the patch.
Does Python have a saner mapping for it?

regards, tom lane

#7 Peter Eisentraut
peter_e@gmx.net
In reply to: Caleb Welton (#1)
Re: [PATCH] plpythonu datatype conversion improvements

On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote:

Patch for plpythonu

This patch doesn't apply; I think it got mangled during email transport.
(Tabs changed to spaces, it looks like.) Could you resend the patch as a
separate attachment in a way that it doesn't get mangled?

#8 Caleb Welton
cwelton@greenplum.com
In reply to: Peter Eisentraut (#7)
1 attachment(s)
Re: [PATCH] plpythonu datatype conversion improvements

Sorry about that. Here it is again as an attachment.

-Caleb

On 7/16/09 7:16 AM, "Peter Eisentraut" <peter_e@gmx.net> wrote:

On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote:

Patch for plpythonu

This patch doesn't apply; I think it got mangled during email transport.
(Tabs changed to spaces, it looks like.) Could you resend the patch as a
separate attachment in a way that it doesn't get mangled?

Attachments:

plpython_bytea.patch (application/octet-stream)
Index: plpython.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/plpython.c,v
retrieving revision 1.120
diff -c -r1.120 plpython.c
*** plpython.c	3 Apr 2009 16:59:42 -0000	1.120
--- plpython.c	26 May 2009 22:58:52 -0000
***************
*** 78,84 ****
   * objects.
   */
  
! typedef PyObject *(*PLyDatumToObFunc) (const char *);
  
  typedef struct PLyDatumToOb
  {
--- 78,85 ----
   * objects.
   */
  
! struct PLyDatumToOb;
! typedef PyObject *(*PLyDatumToObFunc) (struct PLyDatumToOb*, Datum);
  
  typedef struct PLyDatumToOb
  {
***************
*** 104,111 ****
--- 105,120 ----
  /* convert PyObject to a Postgresql Datum or tuple.
   * output from Python
   */
+ 
+ struct PLyObToDatum;
+ struct PLyProcedure;
+ typedef Datum (*PLyObToDatumFunc) (struct PLyProcedure*, 
+ 								   struct PLyObToDatum*, 
+ 								   PyObject *, bool *isnull);
+ 
  typedef struct PLyObToDatum
  {
+ 	PLyObToDatumFunc func;
  	FmgrInfo	typfunc;		/* The type's input function */
  	Oid			typoid;			/* The OID of the type */
  	Oid			typioparam;
***************
*** 255,270 ****
  static void PLy_input_tuple_funcs(PLyTypeInfo *, TupleDesc);
  
  /* conversion functions */
  static PyObject *PLyDict_FromTuple(PLyTypeInfo *, HeapTuple, TupleDesc);
! static PyObject *PLyBool_FromString(const char *);
! static PyObject *PLyFloat_FromString(const char *);
! static PyObject *PLyInt_FromString(const char *);
! static PyObject *PLyLong_FromString(const char *);
! static PyObject *PLyString_FromString(const char *);
! 
! static HeapTuple PLyMapping_ToTuple(PLyTypeInfo *, PyObject *);
! static HeapTuple PLySequence_ToTuple(PLyTypeInfo *, PyObject *);
! static HeapTuple PLyObject_ToTuple(PLyTypeInfo *, PyObject *);
  
  /*
   * Currently active plpython function
--- 264,295 ----
  static void PLy_input_tuple_funcs(PLyTypeInfo *, TupleDesc);
  
  /* conversion functions */
+ static PyObject *PLyBool_FromBool(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyInt_FromInt16(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyInt_FromInt32(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyLong_FromInt64(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyString_FromText(PLyDatumToOb *arg, Datum d);
+ static PyObject *PLyString_FromDatum(PLyDatumToOb *arg, Datum d);
+ 
  static PyObject *PLyDict_FromTuple(PLyTypeInfo *, HeapTuple, TupleDesc);
! 
! static Datum PLyObject_ToVoid(PLyProcedure *, PLyObToDatum *, 
! 							  PyObject *, bool *isnull);
! static Datum PLyObject_ToBool(PLyProcedure *, PLyObToDatum *, 
! 							  PyObject *, bool *isnull);
! static Datum PLyObject_ToBytea(PLyProcedure *, PLyObToDatum *, 
! 							   PyObject *, bool *isnull);
! static Datum PLyObject_ToText(PLyProcedure *, PLyObToDatum *, 
! 							  PyObject *, bool *isnull);
! static Datum PLyObject_ToDatum(PLyProcedure *, PLyObToDatum *, 
! 							   PyObject *, bool *isnull);
! 
! static HeapTuple PLyMapping_ToTuple(PLyProcedure *, PyObject *);
! static HeapTuple PLySequence_ToTuple(PLyProcedure *, PyObject *);
! static HeapTuple PLyObject_ToTuple(PLyProcedure *, PyObject *);
  
  /*
   * Currently active plpython function
***************
*** 507,514 ****
  
  		for (i = 0; i < natts; i++)
  		{
- 			char	   *src;
- 
  			platt = PyList_GetItem(plkeys, i);
  			if (!PyString_Check(platt))
  				ereport(ERROR,
--- 532,537 ----
***************
*** 533,564 ****
  				modvalues[i] = (Datum) 0;
  				modnulls[i] = 'n';
  			}
! 			else if (plval != Py_None)
  			{
! 				plstr = PyObject_Str(plval);
! 				if (!plstr)
! 					PLy_elog(ERROR, "could not compute string representation of Python object in PL/Python function \"%s\" while modifying trigger row",
! 							 proc->proname);
! 				src = PyString_AsString(plstr);
! 
! 				modvalues[i] =
! 					InputFunctionCall(&proc->result.out.r.atts[atti].typfunc,
! 									  src,
! 									proc->result.out.r.atts[atti].typioparam,
! 									  tupdesc->attrs[atti]->atttypmod);
! 				modnulls[i] = ' ';
! 
! 				Py_DECREF(plstr);
! 				plstr = NULL;
! 			}
! 			else
! 			{
! 				modvalues[i] =
! 					InputFunctionCall(&proc->result.out.r.atts[atti].typfunc,
! 									  NULL,
! 									proc->result.out.r.atts[atti].typioparam,
! 									  tupdesc->attrs[atti]->atttypmod);
! 				modnulls[i] = 'n';
  			}
  
  			Py_DECREF(plval);
--- 556,565 ----
  				modvalues[i] = (Datum) 0;
  				modnulls[i] = 'n';
  			}
! 			else 
  			{
! 				PLyObToDatum *att = &proc->result.out.r.atts[atti];
! 				modvalues[i] = (att->func) (proc, att, plval, &modnulls[i]);
  			}
  
  			Py_DECREF(plval);
***************
*** 784,791 ****
  	Datum		rv;
  	PyObject   *volatile plargs = NULL;
  	PyObject   *volatile plrv = NULL;
- 	PyObject   *volatile plrv_so = NULL;
- 	char	   *plrv_sc;
  
  	PG_TRY();
  	{
--- 785,790 ----
***************
*** 862,868 ****
  
  				Py_XDECREF(plargs);
  				Py_XDECREF(plrv);
- 				Py_XDECREF(plrv_so);
  
  				PLy_function_delete_args(proc);
  
--- 861,866 ----
***************
*** 876,922 ****
  			}
  		}
  
! 		/*
! 		 * If the function is declared to return void, the Python return value
! 		 * must be None. For void-returning functions, we also treat a None
! 		 * return value as a special "void datum" rather than NULL (as is the
! 		 * case for non-void-returning functions).
! 		 */
! 		if (proc->result.out.d.typoid == VOIDOID)
! 		{
! 			if (plrv != Py_None)
! 				ereport(ERROR,
! 						(errcode(ERRCODE_DATATYPE_MISMATCH),
! 					   errmsg("PL/Python function with return type \"void\" did not return None")));
! 
! 			fcinfo->isnull = false;
! 			rv = (Datum) 0;
! 		}
! 		else if (plrv == Py_None)
! 		{
! 			fcinfo->isnull = true;
! 			if (proc->result.is_rowtype < 1)
! 				rv = InputFunctionCall(&proc->result.out.d.typfunc,
! 									   NULL,
! 									   proc->result.out.d.typioparam,
! 									   -1);
! 			else
! 				/* Tuple as None */
! 				rv = (Datum) NULL;
! 		}
! 		else if (proc->result.is_rowtype >= 1)
  		{
  			HeapTuple	tuple = NULL;
  
! 			if (PySequence_Check(plrv))
  				/* composite type as sequence (tuple, list etc) */
! 				tuple = PLySequence_ToTuple(&proc->result, plrv);
  			else if (PyMapping_Check(plrv))
  				/* composite type as mapping (currently only dict) */
! 				tuple = PLyMapping_ToTuple(&proc->result, plrv);
  			else
  				/* returned as smth, must provide method __getattr__(name) */
! 				tuple = PLyObject_ToTuple(&proc->result, plrv);
  
  			if (tuple != NULL)
  			{
--- 874,895 ----
  			}
  		}
  
! 		/* Convert python return value into postgres datatypes */
! 		if (proc->result.is_rowtype >= 1)
  		{
  			HeapTuple	tuple = NULL;
  
! 			if (plrv == Py_None)
! 				tuple = NULL;
! 			else if (PySequence_Check(plrv))
  				/* composite type as sequence (tuple, list etc) */
! 				tuple = PLySequence_ToTuple(proc, plrv);
  			else if (PyMapping_Check(plrv))
  				/* composite type as mapping (currently only dict) */
! 				tuple = PLyMapping_ToTuple(proc, plrv);
  			else
  				/* returned as smth, must provide method __getattr__(name) */
! 				tuple = PLyObject_ToTuple(proc, plrv);
  
  			if (tuple != NULL)
  			{
***************
*** 931,952 ****
  		}
  		else
  		{
! 			fcinfo->isnull = false;
! 			plrv_so = PyObject_Str(plrv);
! 			if (!plrv_so)
! 				PLy_elog(ERROR, "could not create string representation of Python object in PL/Python function \"%s\" while creating return value", proc->proname);
! 			plrv_sc = PyString_AsString(plrv_so);
! 			rv = InputFunctionCall(&proc->result.out.d.typfunc,
! 								   plrv_sc,
! 								   proc->result.out.d.typioparam,
! 								   -1);
  		}
  	}
  	PG_CATCH();
  	{
  		Py_XDECREF(plargs);
  		Py_XDECREF(plrv);
- 		Py_XDECREF(plrv_so);
  
  		PG_RE_THROW();
  	}
--- 904,919 ----
  		}
  		else
  		{
! 			rv = (proc->result.out.d.func) (proc,
! 											&proc->result.out.d, 
! 											plrv,
! 											&fcinfo->isnull);
  		}
  	}
  	PG_CATCH();
  	{
  		Py_XDECREF(plargs);
  		Py_XDECREF(plrv);
  
  		PG_RE_THROW();
  	}
***************
*** 954,960 ****
  
  	Py_XDECREF(plargs);
  	Py_DECREF(plrv);
- 	Py_XDECREF(plrv_so);
  
  	return rv;
  }
--- 921,926 ----
***************
*** 1037,1048 ****
  					arg = NULL;
  				else
  				{
! 					char	   *ct;
! 
! 					ct = OutputFunctionCall(&(proc->args[i].in.d.typfunc),
! 											fcinfo->arg[i]);
! 					arg = (proc->args[i].in.d.func) (ct);
! 					pfree(ct);
  				}
  			}
  
--- 1003,1010 ----
  					arg = NULL;
  				else
  				{
! 					arg = (proc->args[i].in.d.func) (&(proc->args[i].in.d),
! 													 fcinfo->arg[i]);
  				}
  			}
  
***************
*** 1593,1598 ****
--- 1555,1589 ----
  	arg->typoid = HeapTupleGetOid(typeTup);
  	arg->typioparam = getTypeIOParam(typeTup);
  	arg->typbyval = typeStruct->typbyval;
+ 
+ 	/* Determine which kind of Python object we will convert to */
+ 	switch (arg->typoid)
+ 	{
+ 		case VOIDOID:
+ 			arg->func = PLyObject_ToVoid;
+ 			break;
+ 		case BOOLOID:
+ 			arg->func = PLyObject_ToBool;
+ 			break;
+ 		case BYTEAOID:
+ 			arg->func = PLyObject_ToBytea;
+ 			break;
+ 		case BPCHAROID:
+ 		case VARCHAROID:
+ 		case TEXTOID:
+ 			arg->func = PLyObject_ToText;
+ 			break;
+ 
+ 		case FLOAT4OID:
+ 		case FLOAT8OID:
+ 		case NUMERICOID:
+ 		case INT2OID:
+ 		case INT4OID:
+ 		case INT8OID:
+ 		default:
+ 			arg->func = PLyObject_ToDatum;
+ 			break;
+ 	}
  }
  
  static void
***************
*** 1619,1644 ****
  	switch (typeOid)
  	{
  		case BOOLOID:
! 			arg->func = PLyBool_FromString;
  			break;
  		case FLOAT4OID:
  		case FLOAT8OID:
  		case NUMERICOID:
! 			arg->func = PLyFloat_FromString;
  			break;
  		case INT2OID:
  		case INT4OID:
! 			arg->func = PLyInt_FromString;
  			break;
  		case INT8OID:
! 			arg->func = PLyLong_FromString;
  			break;
  		default:
! 			arg->func = PLyString_FromString;
  			break;
  	}
  }
  
  static void
  PLy_typeinfo_init(PLyTypeInfo * arg)
  {
--- 1610,1648 ----
  	switch (typeOid)
  	{
  		case BOOLOID:
! 			arg->func = PLyBool_FromBool;
  			break;
  		case FLOAT4OID:
+ 			arg->func = PLyFloat_FromFloat4;
+ 			break;
  		case FLOAT8OID:
+ 			arg->func = PLyFloat_FromFloat8;
+ 			break;
  		case NUMERICOID:
! 			arg->func = PLyFloat_FromNumeric;
  			break;
  		case INT2OID:
+ 			arg->func = PLyInt_FromInt16;
+ 			break;
  		case INT4OID:
! 			arg->func = PLyInt_FromInt32;
  			break;
  		case INT8OID:
! 			arg->func = PLyLong_FromInt64;
! 			break;
! 		case BPCHAROID:
! 		case VARCHAROID:
! 		case TEXTOID:
! 		case BYTEAOID:
! 			arg->func = PLyString_FromText;
  			break;
  		default:
! 			arg->func = PLyString_FromDatum;
  			break;
  	}
  }
  
+ 
  static void
  PLy_typeinfo_init(PLyTypeInfo * arg)
  {
***************
*** 1660,1716 ****
  	}
  }
  
- /* assumes that a bool is always returned as a 't' or 'f' */
  static PyObject *
! PLyBool_FromString(const char *src)
  {
  	/*
  	 * We would like to use Py_RETURN_TRUE and Py_RETURN_FALSE here for
  	 * generating SQL from trigger functions, but those are only supported in
  	 * Python >= 2.3, and we support older versions.
  	 * http://docs.python.org/api/boolObjects.html
  	 */
! 	if (src[0] == 't')
  		return PyBool_FromLong(1);
! 	return PyBool_FromLong(0);
  }
  
  static PyObject *
! PLyFloat_FromString(const char *src)
  {
! 	double		v;
! 	char	   *eptr;
  
! 	errno = 0;
! 	v = strtod(src, &eptr);
! 	if (*eptr != '\0' || errno)
! 		return NULL;
! 	return PyFloat_FromDouble(v);
  }
  
  static PyObject *
! PLyInt_FromString(const char *src)
  {
! 	long		v;
! 	char	   *eptr;
  
! 	errno = 0;
! 	v = strtol(src, &eptr, 0);
! 	if (*eptr != '\0' || errno)
! 		return NULL;
! 	return PyInt_FromLong(v);
  }
  
  static PyObject *
! PLyLong_FromString(const char *src)
  {
! 	return PyLong_FromString((char *) src, NULL, 0);
  }
  
  static PyObject *
! PLyString_FromString(const char *src)
  {
! 	return PyString_FromString(src);
  }
  
  static PyObject *
--- 1664,1758 ----
  	}
  }
  
  static PyObject *
! PLyBool_FromBool(PLyDatumToOb *arg, Datum d)
  {
+ 	bool x = DatumGetBool(d);
+ 	arg = 0;  /* unused */
+ 
  	/*
  	 * We would like to use Py_RETURN_TRUE and Py_RETURN_FALSE here for
  	 * generating SQL from trigger functions, but those are only supported in
  	 * Python >= 2.3, and we support older versions.
  	 * http://docs.python.org/api/boolObjects.html
  	 */
! 	if (x)
  		return PyBool_FromLong(1);
! 	else
! 		return PyBool_FromLong(0);
  }
  
  static PyObject *
! PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d)
  {
! 	arg = 0;  /* unused */
! 	return PyFloat_FromDouble(DatumGetFloat4(d));
! }
  
! static PyObject *
! PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d)
! {
! 	arg = 0;  /* unused */
! 	return PyFloat_FromDouble(DatumGetFloat8(d));
  }
  
  static PyObject *
! PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d)
  {
! 	/* 
! 	 * Numeric is cast to a PyFloat: 
! 	 *   This results in a loss of precision
! 	 *   Would it be better to cast to PyString? 
! 	 */
! 	Datum  f = DirectFunctionCall1(numeric_float8, d);
! 	double x = DatumGetFloat8(f);
! 	arg = 0;  /* unused */
! 	return PyFloat_FromDouble(x);
! }
  
! static PyObject *
! PLyInt_FromInt16(PLyDatumToOb *arg, Datum d)
! {
! 	arg = 0;  /* unused */
! 	return PyInt_FromLong(DatumGetInt16(d));
  }
  
  static PyObject *
! PLyInt_FromInt32(PLyDatumToOb *arg, Datum d)
  {
! 	arg = 0;  /* unused */
! 	return PyInt_FromLong(DatumGetInt32(d));
  }
  
  static PyObject *
! PLyLong_FromInt64(PLyDatumToOb *arg, Datum d)
  {
! 	arg = 0;  /* unused */
! 
! 	/* on 32 bit platforms "long" may be too small */
! 	if (sizeof(int64) > sizeof(long))
! 		return PyLong_FromLongLong(DatumGetInt64(d));
! 	else
! 		return PyLong_FromLong(DatumGetInt64(d));
! }
! 
! static PyObject *
! PLyString_FromText(PLyDatumToOb *arg, Datum d)
! {
! 	text     *txt = DatumGetTextP(d);
! 	char     *str = VARDATA(txt);
! 	size_t    size = VARSIZE(txt) - VARHDRSZ;
! 
! 	return PyString_FromStringAndSize(str, size);
! }
! 
! static PyObject *
! PLyString_FromDatum(PLyDatumToOb *arg, Datum d)
! {
! 	char     *x = OutputFunctionCall(&arg->typfunc, d);
! 	PyObject *r = PyString_FromString(x);
! 	pfree(x);
! 	return r;
  }
  
  static PyObject *
***************
*** 1730,1737 ****
  	{
  		for (i = 0; i < info->in.r.natts; i++)
  		{
! 			char	   *key,
! 					   *vsrc;
  			Datum		vattr;
  			bool		is_null;
  			PyObject   *value;
--- 1772,1778 ----
  	{
  		for (i = 0; i < info->in.r.natts; i++)
  		{
! 			char	   *key;
  			Datum		vattr;
  			bool		is_null;
  			PyObject   *value;
***************
*** 1746,1759 ****
  				PyDict_SetItemString(dict, key, Py_None);
  			else
  			{
! 				vsrc = OutputFunctionCall(&info->in.r.atts[i].typfunc,
! 										  vattr);
! 
! 				/*
! 				 * no exceptions allowed
! 				 */
! 				value = info->in.r.atts[i].func(vsrc);
! 				pfree(vsrc);
  				PyDict_SetItemString(dict, key, value);
  				Py_DECREF(value);
  			}
--- 1787,1793 ----
  				PyDict_SetItemString(dict, key, Py_None);
  			else
  			{
! 				value = (info->in.r.atts[i].func) (&info->in.r.atts[i], vattr);
  				PyDict_SetItemString(dict, key, value);
  				Py_DECREF(value);
  			}
***************
*** 1769,1777 ****
  	return dict;
  }
  
  
  static HeapTuple
! PLyMapping_ToTuple(PLyTypeInfo * info, PyObject * mapping)
  {
  	TupleDesc	desc;
  	HeapTuple	tuple;
--- 1803,2017 ----
  	return dict;
  }
  
+ static Datum 
+ PLyObject_ToVoid(PLyProcedure *proc, 
+ 				 PLyObToDatum *arg, 
+ 				 PyObject *plrv, 
+ 				 bool *isnull)
+ {
+ 	/* 
+ 	 * If the function is declared to return void, the Python return value must
+ 	 * be None.  For void-returning functions, we also treat a None return value
+ 	 * as a special "void datum" rather than NULL (as is the case for the 
+ 	 * non-void-returning functions).
+ 	 */
+ 	if (plrv != Py_None)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 				 errmsg("PL/Python function with return type \"void\" did not "
+ 						"return None")));
+ 
+ 	*isnull = false;
+ 	return (Datum) 0;
+ }
+ 
+ static Datum 
+ PLyObject_ToBool(PLyProcedure *proc, 
+ 				 PLyObToDatum *arg, 
+ 				 PyObject *plrv, 
+ 				 bool *isnull)
+ {
+ 	bool rv; 
+ 
+ 	if (plrv == Py_None)
+ 	{
+ 		*isnull = true;
+ 		return (Datum) 0;
+ 	}
+ 
+ 	rv = PyObject_IsTrue(plrv);
+ 	*isnull = false;
+ 	return BoolGetDatum(rv);
+ }
+ 
+ 
+ static Datum 
+ PLyObject_ToBytea(PLyProcedure *proc, 
+ 				  PLyObToDatum *arg, 
+ 				  PyObject *plrv, 
+ 				  bool *isnull)
+ {
+ 	PyObject   *volatile plrv_so = NULL;
+ 	Datum       rv;
+ 
+ 	if (plrv == Py_None)
+ 	{
+ 		*isnull = true;
+ 		return (Datum) 0;
+ 	}
+ 
+ 	plrv_so = PyObject_Str(plrv);
+ 	if (!plrv_so)
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 				 errmsg("could not create string representation of Python "
+ 						"object in PL/Python function \"%s\" while creating "
+ 						"return value", proc->proname)));
+ 	}
+ 
+ 	PG_TRY();
+ 	{
+ 		char *plrv_sc = PyString_AsString(plrv_so);
+ 		size_t len = PyString_Size(plrv_so);
+ 		size_t size = len + VARHDRSZ;
+ 		bytea *result = (bytea*) palloc(size);
+ 
+ 		SET_VARSIZE(result, size);
+ 		memcpy(VARDATA(result), plrv_sc, len);
+ 		rv = PointerGetDatum(result);
+ 	}
+ 	PG_CATCH();
+ 	{
+ 		Py_XDECREF(plrv_so);
+ 		PG_RE_THROW();
+ 	}
+ 	PG_END_TRY();
+ 
+ 	Py_XDECREF(plrv_so);
+ 
+ 	*isnull = false;
+ 	return rv;
+ }
+ 
+ static Datum 
+ PLyObject_ToText(PLyProcedure *proc, 
+ 				 PLyObToDatum *arg, 
+ 				 PyObject *plrv, 
+ 				 bool *isnull)
+ {
+ 	PyObject   *volatile plrv_so = NULL;
+ 	Datum       rv;
+ 
+ 	if (plrv == Py_None)
+ 	{
+ 		*isnull = true;
+ 		return (Datum) 0;
+ 	}
+ 
+ 	plrv_so = PyObject_Str(plrv);
+ 	if (!plrv_so)
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 				 errmsg("could not create string representation of Python "
+ 						"object in PL/Python function \"%s\" while creating "
+ 						"return value", proc->proname)));
+ 	}
+ 
+ 	PG_TRY();
+ 	{
+ 		char *plrv_sc = PyString_AsString(plrv_so);
+ 		size_t len    = PyString_Size(plrv_so);
+ 		size_t size   = len + VARHDRSZ;
+ 		text *result;
+ 
+ 		if (strlen(plrv_sc) != (size_t) len)
+ 		{
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 					 errmsg("PL/Python function \"%s\" could not convert "
+ 							"Python object into text: expected string without "
+ 							"null bytes", proc->proname)));
+ 		}
+ 
+ 		result = (text *) palloc(size);
+ 		SET_VARSIZE(result, size);
+ 		memcpy(VARDATA(result), plrv_sc, len);
+ 		rv = PointerGetDatum(result);
+ 	}
+ 	PG_CATCH();
+ 	{
+ 		Py_XDECREF(plrv_so);
+ 		PG_RE_THROW();
+ 	}
+ 	PG_END_TRY();
+ 
+ 	Py_XDECREF(plrv_so);
+ 
+ 	*isnull = false;
+ 	return rv;
+ }
+ 
+ /* 
+  * Generic conversion function:
+  *  - Cast PyObject to cstring and cstring into postgres type.
+  */
+ static Datum 
+ PLyObject_ToDatum(PLyProcedure *proc, 
+ 				  PLyObToDatum *arg, 
+ 				  PyObject *plrv, 
+ 				  bool *isnull)
+ {
+ 	PyObject *volatile plrv_so = NULL;
+ 	Datum     rv;
+ 
+ 	if (plrv == Py_None)
+ 	{
+ 		*isnull = true;
+ 		return (Datum) 0;
+ 	}
+ 
+ 	plrv_so = PyObject_Str(plrv);
+ 	if (!plrv_so)
+ 	{
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 				 errmsg("could not create string representation of Python "
+ 						"object in PL/Python function \"%s\" while creating "
+ 						"return value", proc->proname)));
+ 	}
+ 
+ 	PG_TRY();
+ 	{
+ 		char *plrv_sc = PyString_AsString(plrv_so);
+ 		size_t len    = PyString_Size(plrv_so);		
+ 
+ 		if (strlen(plrv_sc) != (size_t) len)
+ 		{
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 					 errmsg("PL/Python function \"%s\" could not convert "
+ 							"Python object into cstring: expected string without "
+ 							"null bytes", proc->proname)));
+ 		}
+ 		rv = InputFunctionCall(&arg->typfunc, plrv_sc, arg->typioparam, -1);
+ 	}
+ 	PG_CATCH();
+ 	{
+ 		Py_XDECREF(plrv_so);
+ 		PG_RE_THROW();
+ 	}
+ 	PG_END_TRY();
+ 
+ 	Py_XDECREF(plrv_so);
+ 
+ 	*isnull = false;
+ 	return rv;
+ }
  
  static HeapTuple
! PLyMapping_ToTuple(PLyProcedure *proc, PyObject *mapping)
  {
  	TupleDesc	desc;
  	HeapTuple	tuple;
***************
*** 1781,1840 ****
  
  	Assert(PyMapping_Check(mapping));
  
! 	desc = lookup_rowtype_tupdesc(info->out.d.typoid, -1);
! 	if (info->is_rowtype == 2)
! 		PLy_output_tuple_funcs(info, desc);
! 	Assert(info->is_rowtype == 1);
  
  	/* Build tuple */
  	values = palloc(sizeof(Datum) * desc->natts);
  	nulls = palloc(sizeof(bool) * desc->natts);
  	for (i = 0; i < desc->natts; ++i)
  	{
! 		char	   *key;
! 		PyObject   *volatile value,
! 				   *volatile so;
  
  		key = NameStr(desc->attrs[i]->attname);
! 		value = so = NULL;
  		PG_TRY();
  		{
  			value = PyMapping_GetItemString(mapping, key);
! 			if (value == Py_None)
  			{
- 				values[i] = (Datum) NULL;
- 				nulls[i] = true;
- 			}
- 			else if (value)
- 			{
- 				char	   *valuestr;
- 
- 				so = PyObject_Str(value);
- 				if (so == NULL)
- 					PLy_elog(ERROR, "could not compute string representation of Python object");
- 				valuestr = PyString_AsString(so);
- 
- 				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
- 											  ,valuestr
- 											  ,info->out.r.atts[i].typioparam
- 											  ,-1);
- 				Py_DECREF(so);
- 				so = NULL;
- 				nulls[i] = false;
- 			}
- 			else
  				ereport(ERROR,
  						(errcode(ERRCODE_UNDEFINED_COLUMN),
  						 errmsg("key \"%s\" not found in mapping", key),
  						 errhint("To return null in a column, "
! 					  "add the value None to the mapping with the key named after the column.")));
  
  			Py_XDECREF(value);
  			value = NULL;
  		}
  		PG_CATCH();
  		{
- 			Py_XDECREF(so);
  			Py_XDECREF(value);
  			PG_RE_THROW();
  		}
--- 2021,2062 ----
  
  	Assert(PyMapping_Check(mapping));
  
! 	desc = lookup_rowtype_tupdesc(proc->result.out.d.typoid, -1);
! 	if (proc->result.is_rowtype == 2)
! 		PLy_output_tuple_funcs(&proc->result, desc);
! 	Assert(proc->result.is_rowtype == 1);
  
  	/* Build tuple */
  	values = palloc(sizeof(Datum) * desc->natts);
  	nulls = palloc(sizeof(bool) * desc->natts);
  	for (i = 0; i < desc->natts; ++i)
  	{
! 		char	     *key;
! 		PLyObToDatum *att;
! 		PyObject     *volatile value;
  
+ 		att = &proc->result.out.r.atts[i];
  		key = NameStr(desc->attrs[i]->attname);
! 		value = NULL;
  		PG_TRY();
  		{
  			value = PyMapping_GetItemString(mapping, key);
! 			if (!value)
  			{
  				ereport(ERROR,
  						(errcode(ERRCODE_UNDEFINED_COLUMN),
  						 errmsg("key \"%s\" not found in mapping", key),
  						 errhint("To return null in a column, "
! 								 "add the value None to the mapping with the "
! 								 "key named after the column.")));
! 			}
! 			values[i] = (att->func) (proc, att, value, &nulls[i]);
  
  			Py_XDECREF(value);
  			value = NULL;
  		}
  		PG_CATCH();
  		{
  			Py_XDECREF(value);
  			PG_RE_THROW();
  		}
***************
*** 1851,1857 ****
  
  
  static HeapTuple
! PLySequence_ToTuple(PLyTypeInfo * info, PyObject * sequence)
  {
  	TupleDesc	desc;
  	HeapTuple	tuple;
--- 2073,2079 ----
  
  
  static HeapTuple
! PLySequence_ToTuple(PLyProcedure *proc, PyObject *sequence)
  {
  	TupleDesc	desc;
  	HeapTuple	tuple;
***************
*** 1866,1922 ****
  	 * can ignore exceeding items or assume missing ones as null but to avoid
  	 * plpython developer's errors we are strict here
  	 */
! 	desc = lookup_rowtype_tupdesc(info->out.d.typoid, -1);
  	if (PySequence_Length(sequence) != desc->natts)
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  		errmsg("length of returned sequence did not match number of columns in row")));
  
! 	if (info->is_rowtype == 2)
! 		PLy_output_tuple_funcs(info, desc);
! 	Assert(info->is_rowtype == 1);
  
  	/* Build tuple */
  	values = palloc(sizeof(Datum) * desc->natts);
  	nulls = palloc(sizeof(bool) * desc->natts);
  	for (i = 0; i < desc->natts; ++i)
  	{
! 		PyObject   *volatile value,
! 				   *volatile so;
  
! 		value = so = NULL;
  		PG_TRY();
  		{
  			value = PySequence_GetItem(sequence, i);
  			Assert(value);
! 			if (value == Py_None)
! 			{
! 				values[i] = (Datum) NULL;
! 				nulls[i] = true;
! 			}
! 			else if (value)
! 			{
! 				char	   *valuestr;
! 
! 				so = PyObject_Str(value);
! 				if (so == NULL)
! 					PLy_elog(ERROR, "could not compute string representation of Python object");
! 				valuestr = PyString_AsString(so);
! 				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
! 											  ,valuestr
! 											  ,info->out.r.atts[i].typioparam
! 											  ,-1);
! 				Py_DECREF(so);
! 				so = NULL;
! 				nulls[i] = false;
! 			}
  
  			Py_XDECREF(value);
  			value = NULL;
  		}
  		PG_CATCH();
  		{
- 			Py_XDECREF(so);
  			Py_XDECREF(value);
  			PG_RE_THROW();
  		}
--- 2088,2124 ----
  	 * can ignore exceeding items or assume missing ones as null but to avoid
  	 * plpython developer's errors we are strict here
  	 */
! 	desc = lookup_rowtype_tupdesc(proc->result.out.d.typoid, -1);
  	if (PySequence_Length(sequence) != desc->natts)
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  		errmsg("length of returned sequence did not match number of columns in row")));
  
! 	if (proc->result.is_rowtype == 2)
! 		PLy_output_tuple_funcs(&proc->result, desc);
! 	Assert(proc->result.is_rowtype == 1);
  
  	/* Build tuple */
  	values = palloc(sizeof(Datum) * desc->natts);
  	nulls = palloc(sizeof(bool) * desc->natts);
  	for (i = 0; i < desc->natts; ++i)
  	{
! 		PLyObToDatum *att;
! 		PyObject     *volatile value;
  
! 		att = &proc->result.out.r.atts[i];
! 		value = NULL;
  		PG_TRY();
  		{
  			value = PySequence_GetItem(sequence, i);
  			Assert(value);
! 			values[i] = (att->func) (proc, att, value, &nulls[i]);
  
  			Py_XDECREF(value);
  			value = NULL;
  		}
  		PG_CATCH();
  		{
  			Py_XDECREF(value);
  			PG_RE_THROW();
  		}
***************
*** 1933,1939 ****
  
  
  static HeapTuple
! PLyObject_ToTuple(PLyTypeInfo * info, PyObject * object)
  {
  	TupleDesc	desc;
  	HeapTuple	tuple;
--- 2135,2141 ----
  
  
  static HeapTuple
! PLyObject_ToTuple(PLyProcedure *proc, PyObject *object)
  {
  	TupleDesc	desc;
  	HeapTuple	tuple;
***************
*** 1941,1962 ****
  	bool	   *nulls;
  	volatile int i;
  
! 	desc = lookup_rowtype_tupdesc(info->out.d.typoid, -1);
! 	if (info->is_rowtype == 2)
! 		PLy_output_tuple_funcs(info, desc);
! 	Assert(info->is_rowtype == 1);
  
  	/* Build tuple */
  	values = palloc(sizeof(Datum) * desc->natts);
  	nulls = palloc(sizeof(bool) * desc->natts);
  	for (i = 0; i < desc->natts; ++i)
  	{
! 		char	   *key;
! 		PyObject   *volatile value,
! 				   *volatile so;
  
  		key = NameStr(desc->attrs[i]->attname);
! 		value = so = NULL;
  		PG_TRY();
  		{
  			value = PyObject_GetAttrString(object, key);
--- 2143,2165 ----
  	bool	   *nulls;
  	volatile int i;
  
! 	desc = lookup_rowtype_tupdesc(proc->result.out.d.typoid, -1);
! 	if (proc->result.is_rowtype == 2)
! 		PLy_output_tuple_funcs(&proc->result, desc);
! 	Assert(proc->result.is_rowtype == 1);
  
  	/* Build tuple */
  	values = palloc(sizeof(Datum) * desc->natts);
  	nulls = palloc(sizeof(bool) * desc->natts);
  	for (i = 0; i < desc->natts; ++i)
  	{
! 		char	     *key;
! 		PLyObToDatum *att;
! 		PyObject     *volatile value;
  
+ 		att = &proc->result.out.r.atts[i];
  		key = NameStr(desc->attrs[i]->attname);
! 		value = NULL;
  		PG_TRY();
  		{
  			value = PyObject_GetAttrString(object, key);
***************
*** 1965,2000 ****
  				values[i] = (Datum) NULL;
  				nulls[i] = true;
  			}
! 			else if (value)
  			{
- 				char	   *valuestr;
- 
- 				so = PyObject_Str(value);
- 				if (so == NULL)
- 					PLy_elog(ERROR, "could not compute string representation of Python object");
- 				valuestr = PyString_AsString(so);
- 				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
- 											  ,valuestr
- 											  ,info->out.r.atts[i].typioparam
- 											  ,-1);
- 				Py_DECREF(so);
- 				so = NULL;
- 				nulls[i] = false;
- 			}
- 			else
  				ereport(ERROR,
  						(errcode(ERRCODE_UNDEFINED_COLUMN),
! 						 errmsg("attribute \"%s\" does not exist in Python object", key),
  						 errhint("To return null in a column, "
! 								 "let the returned object have an attribute named "
! 								 "after column with value None.")));
  
  			Py_XDECREF(value);
  			value = NULL;
  		}
  		PG_CATCH();
  		{
- 			Py_XDECREF(so);
  			Py_XDECREF(value);
  			PG_RE_THROW();
  		}
--- 2168,2190 ----
  				values[i] = (Datum) NULL;
  				nulls[i] = true;
  			}
! 			else if (!value)
  			{
  				ereport(ERROR,
  						(errcode(ERRCODE_UNDEFINED_COLUMN),
! 						 errmsg("attribute \"%s\" does not exist in Python object", key),
  						 errhint("To return null in a column, "
! 								 "let the returned object have an attribute named "
! 								 "after column with value None.")));
! 			}
! 			else
! 				values[i] = (att->func) (proc, att, value, &nulls[i]);
  
  			Py_XDECREF(value);
  			value = NULL;
  		}
  		PG_CATCH();
  		{
  			Py_XDECREF(value);
  			PG_RE_THROW();
  		}
Index: expected/plpython_function.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/expected/plpython_function.out,v
retrieving revision 1.12
diff -c -r1.12 plpython_function.out
*** expected/plpython_function.out	3 Apr 2009 16:59:42 -0000	1.12
--- expected/plpython_function.out	26 May 2009 22:58:52 -0000
***************
*** 450,452 ****
--- 450,470 ----
  CREATE FUNCTION test_inout_params(first inout text) AS $$
  return first + '_inout';
  $$ LANGUAGE plpythonu;
+ CREATE FUNCTION test_type_conversion_bool(x bool) returns bool AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_char(x char) returns char AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int2(x int2) returns int2 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int4(x int4) returns int4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int8(x int8) returns int8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float4(x float4) returns float4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float8(x float8) returns float8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_numeric(x numeric) returns numeric AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_text(x text) returns text AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_bytea(x bytea) returns bytea AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_marshal() returns bytea AS $$ 
+ import marshal
+ return marshal.dumps('hello world')
+ $$ language plpythonu;
+ CREATE FUNCTION test_type_unmarshal(x bytea) returns text AS $$
+ import marshal
+ return marshal.loads(x)
+ $$ language plpythonu;
Index: expected/plpython_test.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/expected/plpython_test.out,v
retrieving revision 1.8
diff -c -r1.8 plpython_test.out
*** expected/plpython_test.out	3 Apr 2009 16:59:42 -0000	1.8
--- expected/plpython_test.out	26 May 2009 22:58:52 -0000
***************
*** 559,561 ****
--- 559,693 ----
   test_in_inout
  (1 row)
  
+ SELECT * FROM test_type_conversion_bool(true);
+  test_type_conversion_bool 
+ ---------------------------
+  t
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_bool(false);
+  test_type_conversion_bool 
+ ---------------------------
+  f
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_bool(null);
+  test_type_conversion_bool 
+ ---------------------------
+  
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_char('a');
+  test_type_conversion_char 
+ ---------------------------
+  a
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_char(null);
+  test_type_conversion_char 
+ ---------------------------
+  
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_int2(100::int2);
+  test_type_conversion_int2 
+ ---------------------------
+                        100
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_int2(null);
+  test_type_conversion_int2 
+ ---------------------------
+                           
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_int4(100);
+  test_type_conversion_int4 
+ ---------------------------
+                        100
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_int4(null);
+  test_type_conversion_int4 
+ ---------------------------
+                           
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_int8(100);
+  test_type_conversion_int8 
+ ---------------------------
+                        100
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_int8(null);
+  test_type_conversion_int8 
+ ---------------------------
+                           
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_float4(100);
+  test_type_conversion_float4 
+ -----------------------------
+                          100
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_float4(null);
+  test_type_conversion_float4 
+ -----------------------------
+                             
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_float8(100);
+  test_type_conversion_float8 
+ -----------------------------
+                          100
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_float8(null);
+  test_type_conversion_float8 
+ -----------------------------
+                             
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_numeric(100);
+  test_type_conversion_numeric 
+ ------------------------------
+                         100.0
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_numeric(null);
+  test_type_conversion_numeric 
+ ------------------------------
+                              
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_text('hello world');
+  test_type_conversion_text 
+ ---------------------------
+  hello world
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_text(null);
+  test_type_conversion_text 
+ ---------------------------
+  
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_bytea('hello world');
+  test_type_conversion_bytea 
+ ----------------------------
+  hello world
+ (1 row)
+ 
+ SELECT * FROM test_type_conversion_bytea(null);
+  test_type_conversion_bytea 
+ ----------------------------
+  
+ (1 row)
+ 
+ SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
+  test_type_unmarshal 
+ ---------------------
+  hello world
+ (1 row)
+ 
Index: sql/plpython_function.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/sql/plpython_function.sql,v
retrieving revision 1.12
diff -c -r1.12 plpython_function.sql
*** sql/plpython_function.sql	3 Apr 2009 16:59:43 -0000	1.12
--- sql/plpython_function.sql	26 May 2009 22:58:52 -0000
***************
*** 497,499 ****
--- 497,518 ----
  CREATE FUNCTION test_inout_params(first inout text) AS $$
  return first + '_inout';
  $$ LANGUAGE plpythonu;
+ 
+ CREATE FUNCTION test_type_conversion_bool(x bool) returns bool AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_char(x char) returns char AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int2(x int2) returns int2 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int4(x int4) returns int4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_int8(x int8) returns int8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float4(x float4) returns float4 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_float8(x float8) returns float8 AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_numeric(x numeric) returns numeric AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_text(x text) returns text AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_conversion_bytea(x bytea) returns bytea AS $$ return x $$ language plpythonu;
+ CREATE FUNCTION test_type_marshal() returns bytea AS $$ 
+ import marshal
+ return marshal.dumps('hello world')
+ $$ language plpythonu;
+ CREATE FUNCTION test_type_unmarshal(x bytea) returns text AS $$
+ import marshal
+ return marshal.loads(x)
+ $$ language plpythonu;
Index: sql/plpython_test.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/pl/plpython/sql/plpython_test.sql,v
retrieving revision 1.5
diff -c -r1.5 plpython_test.sql
*** sql/plpython_test.sql	3 Apr 2009 16:59:43 -0000	1.5
--- sql/plpython_test.sql	26 May 2009 22:58:52 -0000
***************
*** 149,151 ****
--- 149,176 ----
  -- this doesn't work yet :-(
  SELECT * FROM test_in_out_params_multi('test_in');
  SELECT * FROM test_inout_params('test_in');
+ 
+ SELECT * FROM test_type_conversion_bool(true);
+ SELECT * FROM test_type_conversion_bool(false);
+ SELECT * FROM test_type_conversion_bool(null);
+ SELECT * FROM test_type_conversion_char('a');
+ SELECT * FROM test_type_conversion_char(null);
+ SELECT * FROM test_type_conversion_int2(100::int2);
+ SELECT * FROM test_type_conversion_int2(null);
+ SELECT * FROM test_type_conversion_int4(100);
+ SELECT * FROM test_type_conversion_int4(null);
+ SELECT * FROM test_type_conversion_int8(100);
+ SELECT * FROM test_type_conversion_int8(null);
+ SELECT * FROM test_type_conversion_float4(100);
+ SELECT * FROM test_type_conversion_float4(null);
+ SELECT * FROM test_type_conversion_float8(100);
+ SELECT * FROM test_type_conversion_float8(null);
+ SELECT * FROM test_type_conversion_numeric(100);
+ SELECT * FROM test_type_conversion_numeric(null);
+ SELECT * FROM test_type_conversion_text('hello world');
+ SELECT * FROM test_type_conversion_text(null);
+ SELECT * FROM test_type_conversion_bytea('hello world');
+ SELECT * FROM test_type_conversion_bytea(null);
+ SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
+ 
+ 
#9Peter Eisentraut
peter_e@gmx.net
In reply to: Caleb Welton (#1)
1 attachment(s)
Re: [PATCH] plpythonu datatype conversion improvements

On Tue, 2009-05-26 at 16:07 -0700, Caleb Welton wrote:

> Patch for plpythonu
>
> Primary motivation of the attached patch is to support handling bytea
> conversion allowing for embedded nulls, which in turn allows for
> supporting the marshal module.
>
> Secondary motivation is slightly improved performance for conversion
> routines of basic datatypes that have simple mappings between
> postgres/python.
>
> Primary design is to change the conversion routines from being based
> on cstrings to datums, eg:
> PLyBool_FromString(const char *) =>
> PLyBool_FromBool(PLyDatumToOb, Datum);
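
The embedded-null problem named in the quoted motivation can be illustrated with a minimal Python sketch. This is not code from the patch: `as_cstring` and `roundtrip` are hypothetical helpers, where `as_cstring` only mimics how a NUL-terminated C string ends at the first zero byte, and `marshal.dumps(1)` is used simply because its output contains zero bytes.

```python
import marshal


def as_cstring(data: bytes) -> bytes:
    """Hypothetical helper: keep only what a NUL-terminated C string would hold."""
    return data.split(b"\x00", 1)[0]


def roundtrip(data: bytes, length_aware: bool):
    """Carry data through a length-aware (datum-like) or cstring-like channel."""
    carried = data if length_aware else as_cstring(data)
    try:
        return marshal.loads(carried)
    except (EOFError, ValueError, TypeError):
        # The payload was cut short and can no longer be decoded.
        return None


# A serialized small integer; its 4-byte encoding includes zero bytes.
payload = marshal.dumps(1)
```

Under these assumptions, `roundtrip(payload, True)` recovers the original value, while `roundtrip(payload, False)` cannot, because everything after the first zero byte is lost. This is the situation a length-carrying bytea conversion is meant to avoid.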

I have reworked this patch a bit and extended the plpython test suite
around it. Current copy attached.

The remaining problem is that the patch loses domain checking on the
return types, because some paths no longer go through the data type's
input function. I have marked these places as FIXME, and the regression
tests also contain a failing test case for this.

What's needed here, I think, is an API that takes a datum plus type
information and checks whether the datum is valid within the domain. I
haven't found one that is exported, but maybe someone could give a tip.
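
The domain-checking gap described above can be modeled in a few lines of Python. This is only a sketch, not PostgreSQL code: `Domain`, `DomainViolation`, `via_input_function`, `via_direct_datum`, and `via_direct_datum_checked` are hypothetical names, and the two constraints merely imitate the `uint2` and `nnint` domains used in the regression tests.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class DomainViolation(Exception):
    """Stands in for a 'violates check constraint' error (hypothetical)."""


@dataclass
class Domain:
    name: str
    check: Callable[[Optional[int]], bool]

    def validate(self, value: Optional[int]) -> Optional[int]:
        if not self.check(value):
            raise DomainViolation(
                f"value for domain {self.name} violates check constraint")
        return value


def via_input_function(domain, py_value):
    """Old path as modeled here: stringify, parse, and check the domain."""
    parsed = None if py_value is None else int(str(py_value))
    return domain.validate(parsed)


def via_direct_datum(domain, py_value):
    """Fast path as modeled here: build the value directly, no domain check."""
    return None if py_value is None else int(py_value)


def via_direct_datum_checked(domain, py_value):
    """Fast path followed by an explicit check, as the requested API would allow."""
    return domain.validate(via_direct_datum(domain, py_value))


# Assumed constraints, imitating the domains in the test suite.
uint2 = Domain("uint2", lambda v: v is None or v >= 0)
nnint = Domain("nnint", lambda v: v is not None)
```

In this model, `via_direct_datum(uint2, -50)` silently returns `-50` and `via_direct_datum(nnint, None)` returns `None`, whereas the other two functions raise `DomainViolation` for the same inputs. The third function shows the shape of the fix being asked for: keep the direct conversion, then validate the resulting value against the domain.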

Attachments:

plpython-datatypes.patch (text/x-patch; charset=UTF-8)
diff --git a/src/pl/plpython/expected/plpython_record.out b/src/pl/plpython/expected/plpython_record.out
index 9e4645d..c8c4f9d 100644
--- a/src/pl/plpython/expected/plpython_record.out
+++ b/src/pl/plpython/expected/plpython_record.out
@@ -313,13 +313,15 @@ $$ LANGUAGE plpythonu;
 SELECT * FROM test_type_record_error1();
 ERROR:  key "second" not found in mapping
 HINT:  To return null in a column, add the value None to the mapping with the key named after the column.
-CONTEXT:  PL/Python function "test_type_record_error1"
+CONTEXT:  while creating return value
+PL/Python function "test_type_record_error1"
 CREATE FUNCTION test_type_record_error2() RETURNS type_record AS $$
     return [ 'first' ]
 $$ LANGUAGE plpythonu;
 SELECT * FROM test_type_record_error2();
 ERROR:  length of returned sequence did not match number of columns in row
-CONTEXT:  PL/Python function "test_type_record_error2"
+CONTEXT:  while creating return value
+PL/Python function "test_type_record_error2"
 CREATE FUNCTION test_type_record_error3() RETURNS type_record AS $$
     class type_record: pass
     type_record.first = 'first'
@@ -328,4 +330,5 @@ $$ LANGUAGE plpythonu;
 SELECT * FROM test_type_record_error3();
 ERROR:  attribute "second" does not exist in Python object
 HINT:  To return null in a column, let the returned object have an attribute named after column with value None.
-CONTEXT:  PL/Python function "test_type_record_error3"
+CONTEXT:  while creating return value
+PL/Python function "test_type_record_error3"
diff --git a/src/pl/plpython/expected/plpython_trigger.out b/src/pl/plpython/expected/plpython_trigger.out
index 7591404..dd08303 100644
--- a/src/pl/plpython/expected/plpython_trigger.out
+++ b/src/pl/plpython/expected/plpython_trigger.out
@@ -353,7 +353,8 @@ BEFORE UPDATE ON trigger_test
 FOR EACH ROW EXECUTE PROCEDURE stupid4();
 UPDATE trigger_test SET v = 'null' WHERE i = 0;
 ERROR:  TD["new"] deleted, cannot modify row
-CONTEXT:  PL/Python function "stupid4"
+CONTEXT:  while modifying trigger row
+PL/Python function "stupid4"
 DROP TRIGGER stupid_trigger4 ON trigger_test;
 -- TD not a dictionary
 CREATE FUNCTION stupid5() RETURNS trigger
@@ -366,7 +367,8 @@ BEFORE UPDATE ON trigger_test
 FOR EACH ROW EXECUTE PROCEDURE stupid5();
 UPDATE trigger_test SET v = 'null' WHERE i = 0;
 ERROR:  TD["new"] is not a dictionary
-CONTEXT:  PL/Python function "stupid5"
+CONTEXT:  while modifying trigger row
+PL/Python function "stupid5"
 DROP TRIGGER stupid_trigger5 ON trigger_test;
 -- TD not having string keys
 CREATE FUNCTION stupid6() RETURNS trigger
@@ -379,7 +381,8 @@ BEFORE UPDATE ON trigger_test
 FOR EACH ROW EXECUTE PROCEDURE stupid6();
 UPDATE trigger_test SET v = 'null' WHERE i = 0;
 ERROR:  TD["new"] dictionary key at ordinal position 0 is not a string
-CONTEXT:  PL/Python function "stupid6"
+CONTEXT:  while modifying trigger row
+PL/Python function "stupid6"
 DROP TRIGGER stupid_trigger6 ON trigger_test;
 -- TD keys not corresponding to row columns
 CREATE FUNCTION stupid7() RETURNS trigger
@@ -392,7 +395,8 @@ BEFORE UPDATE ON trigger_test
 FOR EACH ROW EXECUTE PROCEDURE stupid7();
 UPDATE trigger_test SET v = 'null' WHERE i = 0;
 ERROR:  key "a" found in TD["new"] does not exist as a column in the triggering row
-CONTEXT:  PL/Python function "stupid7"
+CONTEXT:  while modifying trigger row
+PL/Python function "stupid7"
 DROP TRIGGER stupid_trigger7 ON trigger_test;
 -- calling a trigger function directly
 SELECT stupid7();
diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out
index 476f329..a03d0cc 100644
--- a/src/pl/plpython/expected/plpython_types.out
+++ b/src/pl/plpython/expected/plpython_types.out
@@ -278,7 +278,7 @@ plpy.info(x, type(x))
 return x
 $$ LANGUAGE plpythonu;
 SELECT * FROM test_type_conversion_bytea('hello world');
-INFO:  ('\\x68656c6c6f20776f726c64', <type 'str'>)
+INFO:  ('hello world', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea"
  test_type_conversion_bytea 
 ----------------------------
@@ -308,8 +308,8 @@ $$ LANGUAGE plpythonu;
    Python as a string in bytea-encoding, which Python doesn't understand. */
 SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
    test_type_unmarshal    
---------------------------
- FAILED: bad marshal data
+---------------------
+ hello world
 (1 row)
 
 --
@@ -332,7 +332,8 @@ SELECT * FROM test_type_conversion_uint2(100::uint2, -50);
 INFO:  (100, <type 'int'>)
 CONTEXT:  PL/Python function "test_type_conversion_uint2"
 ERROR:  value for domain uint2 violates check constraint "uint2_check"
-CONTEXT:  PL/Python function "test_type_conversion_uint2"
+CONTEXT:  while creating return value
+PL/Python function "test_type_conversion_uint2"
 SELECT * FROM test_type_conversion_uint2(null, 1);
 INFO:  (None, <type 'NoneType'>)
 CONTEXT:  PL/Python function "test_type_conversion_uint2"
@@ -341,13 +342,29 @@ CONTEXT:  PL/Python function "test_type_conversion_uint2"
                           1
 (1 row)
 
+CREATE DOMAIN nnint AS int CHECK (VALUE IS NOT NULL);
+CREATE FUNCTION test_type_conversion_nnint(x nnint, y int) RETURNS nnint AS $$
+return y
+$$ LANGUAGE plpythonu;
+SELECT * FROM test_type_conversion_nnint(10, 20);
+ test_type_conversion_nnint 
+----------------------------
+                         20
+(1 row)
+
+SELECT * FROM test_type_conversion_nnint(null, 20);
+ERROR:  value for domain nnint violates check constraint "nnint_check"
+SELECT * FROM test_type_conversion_nnint(10, null);
+ERROR:  value for domain nnint violates check constraint "nnint_check"
+CONTEXT:  while creating return value
+PL/Python function "test_type_conversion_nnint"
 CREATE DOMAIN bytea10 AS bytea CHECK (octet_length(VALUE) = 10 AND VALUE IS NOT NULL);
 CREATE FUNCTION test_type_conversion_bytea10(x bytea10, y bytea) RETURNS bytea10 AS $$
 plpy.info(x, type(x))
 return y
 $$ LANGUAGE plpythonu;
 SELECT * FROM test_type_conversion_bytea10('hello wold', 'hello wold');
-INFO:  ('\\x68656c6c6f20776f6c64', <type 'str'>)
+INFO:  ('hello wold', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea10"
  test_type_conversion_bytea10 
 ------------------------------
@@ -357,14 +374,15 @@ CONTEXT:  PL/Python function "test_type_conversion_bytea10"
 SELECT * FROM test_type_conversion_bytea10('hello world', 'hello wold');
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
 SELECT * FROM test_type_conversion_bytea10('hello word', 'hello world');
-INFO:  ('\\x68656c6c6f20776f7264', <type 'str'>)
+INFO:  ('hello word', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea10"
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
 CONTEXT:  PL/Python function "test_type_conversion_bytea10"
 SELECT * FROM test_type_conversion_bytea10(null, 'hello word');
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
 SELECT * FROM test_type_conversion_bytea10('hello word', null);
-INFO:  ('\\x68656c6c6f20776f7264', <type 'str'>)
+INFO:  ('hello word', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea10"
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
-CONTEXT:  PL/Python function "test_type_conversion_bytea10"
+CONTEXT:  while creating return value
+PL/Python function "test_type_conversion_bytea10"
diff --git a/src/pl/plpython/expected/plpython_unicode.out b/src/pl/plpython/expected/plpython_unicode.out
index ce19eb9..d3b6fd1 100644
--- a/src/pl/plpython/expected/plpython_unicode.out
+++ b/src/pl/plpython/expected/plpython_unicode.out
@@ -24,13 +24,15 @@ rv = plpy.execute(plan, u"\\x80", 1)
 return rv[0]["testvalue1"]
 ' LANGUAGE plpythonu;
 SELECT unicode_return_error();
-ERROR:  PL/Python: could not create string representation of Python object, while creating return value
+ERROR:  PL/Python: could not create string representation of Python object
 DETAIL:  <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\x80' in position 0: ordinal not in range(128)
-CONTEXT:  PL/Python function "unicode_return_error"
+CONTEXT:  while creating return value
+PL/Python function "unicode_return_error"
 INSERT INTO unicode_test (testvalue) VALUES ('test');
-ERROR:  PL/Python: could not compute string representation of Python object, while modifying trigger row
+ERROR:  PL/Python: could not create string representation of Python object
 DETAIL:  <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\x80' in position 0: ordinal not in range(128)
-CONTEXT:  PL/Python function "unicode_trigger_error"
+CONTEXT:  while modifying trigger row
+PL/Python function "unicode_trigger_error"
 SELECT unicode_plan_error1();
 WARNING:  PL/Python: <class 'plpy.Error'>: unrecognized error in PLy_spi_execute_plan
 CONTEXT:  PL/Python function "unicode_plan_error1"
diff --git a/src/pl/plpython/expected/plpython_unicode_2.out b/src/pl/plpython/expected/plpython_unicode_2.out
index 9280fe7..7bb02c7 100644
--- a/src/pl/plpython/expected/plpython_unicode_2.out
+++ b/src/pl/plpython/expected/plpython_unicode_2.out
@@ -24,13 +24,15 @@ rv = plpy.execute(plan, u"\\x80", 1)
 return rv[0]["testvalue1"]
 ' LANGUAGE plpythonu;
 SELECT unicode_return_error();
-ERROR:  PL/Python: could not create string representation of Python object, while creating return value
+ERROR:  PL/Python: could not create string representation of Python object
 DETAIL:  exceptions.UnicodeError: ASCII encoding error: ordinal not in range(128)
-CONTEXT:  PL/Python function "unicode_return_error"
+CONTEXT:  while creating return value
+PL/Python function "unicode_return_error"
 INSERT INTO unicode_test (testvalue) VALUES ('test');
-ERROR:  PL/Python: could not compute string representation of Python object, while modifying trigger row
+ERROR:  PL/Python: could not create string representation of Python object
 DETAIL:  exceptions.UnicodeError: ASCII encoding error: ordinal not in range(128)
-CONTEXT:  PL/Python function "unicode_trigger_error"
+CONTEXT:  while modifying trigger row
+PL/Python function "unicode_trigger_error"
 SELECT unicode_plan_error1();
 WARNING:  PL/Python: plpy.Error: unrecognized error in PLy_spi_execute_plan
 CONTEXT:  PL/Python function "unicode_plan_error1"
diff --git a/src/pl/plpython/expected/plpython_unicode_3.out b/src/pl/plpython/expected/plpython_unicode_3.out
index f058e2b..6395871 100644
--- a/src/pl/plpython/expected/plpython_unicode_3.out
+++ b/src/pl/plpython/expected/plpython_unicode_3.out
@@ -24,13 +24,15 @@ rv = plpy.execute(plan, u"\\x80", 1)
 return rv[0]["testvalue1"]
 ' LANGUAGE plpythonu;
 SELECT unicode_return_error();
-ERROR:  PL/Python: could not create string representation of Python object, while creating return value
+ERROR:  PL/Python: could not create string representation of Python object
 DETAIL:  exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 0: ordinal not in range(128)
-CONTEXT:  PL/Python function "unicode_return_error"
+CONTEXT:  while creating return value
+PL/Python function "unicode_return_error"
 INSERT INTO unicode_test (testvalue) VALUES ('test');
-ERROR:  PL/Python: could not compute string representation of Python object, while modifying trigger row
+ERROR:  PL/Python: could not create string representation of Python object
 DETAIL:  exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 0: ordinal not in range(128)
-CONTEXT:  PL/Python function "unicode_trigger_error"
+CONTEXT:  while modifying trigger row
+PL/Python function "unicode_trigger_error"
 SELECT unicode_plan_error1();
 WARNING:  PL/Python: plpy.Error: unrecognized error in PLy_spi_execute_plan
 CONTEXT:  PL/Python function "unicode_plan_error1"
diff --git a/src/pl/plpython/expected/plpython_void.out b/src/pl/plpython/expected/plpython_void.out
index d067de0..1080d12 100644
--- a/src/pl/plpython/expected/plpython_void.out
+++ b/src/pl/plpython/expected/plpython_void.out
@@ -20,7 +20,8 @@ SELECT test_void_func1(), test_void_func1() IS NULL AS "is null";
 
 SELECT test_void_func2(); -- should fail
 ERROR:  PL/Python function with return type "void" did not return None
-CONTEXT:  PL/Python function "test_void_func2"
+CONTEXT:  while creating return value
+PL/Python function "test_void_func2"
 SELECT test_return_none(), test_return_none() IS NULL AS "is null";
  test_return_none | is null 
 ------------------+---------
diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c
index cfc2225..cab5e45 100644
--- a/src/pl/plpython/plpython.c
+++ b/src/pl/plpython/plpython.c
@@ -78,7 +78,8 @@ PG_MODULE_MAGIC;
  * objects.
  */
 
-typedef PyObject *(*PLyDatumToObFunc) (const char *);
+struct PLyDatumToOb;
+typedef PyObject *(*PLyDatumToObFunc) (struct PLyDatumToOb*, Datum);
 
 typedef struct PLyDatumToOb
 {
@@ -104,8 +105,16 @@ typedef union PLyTypeInput
 /* convert PyObject to a Postgresql Datum or tuple.
  * output from Python
  */
+
+struct PLyObToDatum;
+struct PLyTypeInfo;
+typedef Datum (*PLyObToDatumFunc) (struct PLyTypeInfo*,
+								   struct PLyObToDatum*,
+								   PyObject *);
+
 typedef struct PLyObToDatum
 {
+	PLyObToDatumFunc func;
 	FmgrInfo	typfunc;		/* The type's input function */
 	Oid			typoid;			/* The OID of the type */
 	Oid			typioparam;
@@ -131,12 +140,11 @@ typedef struct PLyTypeInfo
 {
 	PLyTypeInput in;
 	PLyTypeOutput out;
-	int			is_rowtype;
-
 	/*
-	 * is_rowtype can be: -1  not known yet (initial state) 0  scalar datatype
-	 * 1  rowtype 2  rowtype, but I/O functions not set up yet
+	 * is_rowtype can be: -1 = not known yet (initial state); 0 = scalar datatype;
+	 * 1 = rowtype; 2 = rowtype, but I/O functions not set up yet
 	 */
+	int			is_rowtype;
 } PLyTypeInfo;
 
 
@@ -263,12 +271,26 @@ static void PLy_output_tuple_funcs(PLyTypeInfo *, TupleDesc);
 static void PLy_input_tuple_funcs(PLyTypeInfo *, TupleDesc);
 
 /* conversion functions */
+static PyObject *PLyBool_FromBool(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyInt_FromInt16(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyInt_FromInt32(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyLong_FromInt64(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyString_FromText(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyString_FromDatum(PLyDatumToOb *arg, Datum d);
+
 static PyObject *PLyDict_FromTuple(PLyTypeInfo *, HeapTuple, TupleDesc);
-static PyObject *PLyBool_FromString(const char *);
-static PyObject *PLyFloat_FromString(const char *);
-static PyObject *PLyInt_FromString(const char *);
-static PyObject *PLyLong_FromString(const char *);
-static PyObject *PLyString_FromString(const char *);
+
+static Datum PLyObject_ToBool(PLyTypeInfo *, PLyObToDatum *,
+							  PyObject *);
+static Datum PLyObject_ToBytea(PLyTypeInfo *, PLyObToDatum *,
+							   PyObject *);
+static Datum PLyObject_ToText(PLyTypeInfo *, PLyObToDatum *,
+							  PyObject *);
+static Datum PLyObject_ToDatum(PLyTypeInfo *, PLyObToDatum *,
+							   PyObject *);
 
 static HeapTuple PLyMapping_ToTuple(PLyTypeInfo *, PyObject *);
 static HeapTuple PLySequence_ToTuple(PLyTypeInfo *, PyObject *);
@@ -339,6 +361,20 @@ plpython_error_callback(void *arg)
 		errcontext("PL/Python function \"%s\"", PLy_procedure_name(PLy_curr_procedure));
 }
 
+static void
+plpython_trigger_error_callback(void *arg)
+{
+	if (PLy_curr_procedure)
+		errcontext("while modifying trigger row");
+}
+
+static void
+plpython_return_error_callback(void *arg)
+{
+	if (PLy_curr_procedure)
+		errcontext("while creating return value");
+}
+
 Datum
 plpython_call_handler(PG_FUNCTION_ARGS)
 {
@@ -506,6 +542,11 @@ PLy_modify_tuple(PLyProcedure *proc, PyObject *pltd, TriggerData *tdata,
 	Datum	   *volatile modvalues;
 	char	   *volatile modnulls;
 	TupleDesc	tupdesc;
+	ErrorContextCallback plerrcontext;
+
+	plerrcontext.callback = plpython_trigger_error_callback;
+	plerrcontext.previous = error_context_stack;
+	error_context_stack = &plerrcontext;
 
 	plntup = plkeys = platt = plval = plstr = NULL;
 	modattrs = NULL;
@@ -533,8 +574,6 @@ PLy_modify_tuple(PLyProcedure *proc, PyObject *pltd, TriggerData *tdata,
 
 		for (i = 0; i < natts; i++)
 		{
-			char	   *src;
-
 			platt = PyList_GetItem(plkeys, i);
 			if (!PyString_Check(platt))
 				ereport(ERROR,
@@ -561,20 +600,9 @@ PLy_modify_tuple(PLyProcedure *proc, PyObject *pltd, TriggerData *tdata,
 			}
 			else if (plval != Py_None)
 			{
-				plstr = PyObject_Str(plval);
-				if (!plstr)
-					PLy_elog(ERROR, "could not compute string representation of Python object, while modifying trigger row");
-				src = PyString_AsString(plstr);
-
-				modvalues[i] =
-					InputFunctionCall(&proc->result.out.r.atts[atti].typfunc,
-									  src,
-									proc->result.out.r.atts[atti].typioparam,
-									  tupdesc->attrs[atti]->atttypmod);
+				PLyObToDatum *att = &proc->result.out.r.atts[atti];
+				modvalues[i] = (att->func) (&proc->result, att, plval);
 				modnulls[i] = ' ';
-
-				Py_DECREF(plstr);
-				plstr = NULL;
 			}
 			else
 			{
@@ -620,6 +648,8 @@ PLy_modify_tuple(PLyProcedure *proc, PyObject *pltd, TriggerData *tdata,
 	pfree(modvalues);
 	pfree(modnulls);
 
+	error_context_stack = plerrcontext.previous;
+
 	return rtup;
 }
 
@@ -809,8 +839,7 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 	Datum		rv;
 	PyObject   *volatile plargs = NULL;
 	PyObject   *volatile plrv = NULL;
-	PyObject   *volatile plrv_so = NULL;
-	char	   *plrv_sc;
+	ErrorContextCallback plerrcontext;
 
 	PG_TRY();
 	{
@@ -887,7 +916,6 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 
 				Py_XDECREF(plargs);
 				Py_XDECREF(plrv);
-				Py_XDECREF(plrv_so);
 
 				PLy_function_delete_args(proc);
 
@@ -901,6 +929,12 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 			}
 		}
 
+		/* Convert python return value into postgres datatypes */
+
+		plerrcontext.callback = plpython_return_error_callback;
+		plerrcontext.previous = error_context_stack;
+		error_context_stack = &plerrcontext;
+
 		/*
 		 * If the function is declared to return void, the Python return value
 		 * must be None. For void-returning functions, we also treat a None
@@ -957,21 +991,18 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 		else
 		{
 			fcinfo->isnull = false;
-			plrv_so = PyObject_Str(plrv);
-			if (!plrv_so)
-				PLy_elog(ERROR, "could not create string representation of Python object, while creating return value");
-			plrv_sc = PyString_AsString(plrv_so);
-			rv = InputFunctionCall(&proc->result.out.d.typfunc,
-								   plrv_sc,
-								   proc->result.out.d.typioparam,
-								   -1);
+			rv = (proc->result.out.d.func) (&proc->result,
+											&proc->result.out.d,
+											plrv);
+			// FIXME: call input function for domain check
 		}
+
+		error_context_stack = plerrcontext.previous;
 	}
 	PG_CATCH();
 	{
 		Py_XDECREF(plargs);
 		Py_XDECREF(plrv);
-		Py_XDECREF(plrv_so);
 
 		PG_RE_THROW();
 	}
@@ -979,7 +1010,6 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 
 	Py_XDECREF(plargs);
 	Py_DECREF(plrv);
-	Py_XDECREF(plrv_so);
 
 	return rv;
 }
@@ -1062,12 +1092,8 @@ PLy_function_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc)
 					arg = NULL;
 				else
 				{
-					char	   *ct;
-
-					ct = OutputFunctionCall(&(proc->args[i].in.d.typfunc),
-											fcinfo->arg[i]);
-					arg = (proc->args[i].in.d.func) (ct);
-					pfree(ct);
+					arg = (proc->args[i].in.d.func) (&(proc->args[i].in.d),
+													 fcinfo->arg[i]);
 				}
 			}
 
@@ -1618,6 +1644,33 @@ PLy_output_datum_func2(PLyObToDatum *arg, HeapTuple typeTup)
 	arg->typoid = HeapTupleGetOid(typeTup);
 	arg->typioparam = getTypeIOParam(typeTup);
 	arg->typbyval = typeStruct->typbyval;
+
+	/* Determine which kind of Python object we will convert to */
+	switch (getBaseType(arg->typoid))
+	{
+		case BOOLOID:
+			arg->func = PLyObject_ToBool;
+			break;
+		case BYTEAOID:
+			arg->func = PLyObject_ToBytea;
+			break;
+		case BPCHAROID:
+		case VARCHAROID:
+		case TEXTOID:
+			arg->func = PLyObject_ToText;
+			break;
+
+		case FLOAT4OID:
+		case FLOAT8OID:
+		case NUMERICOID:
+		case INT2OID:
+		case INT4OID:
+		case INT8OID:
+		case VOIDOID:
+		default:
+			arg->func = PLyObject_ToDatum;
+			break;
+	}
 }
 
 static void
@@ -1644,22 +1697,34 @@ PLy_input_datum_func2(PLyDatumToOb *arg, Oid typeOid, HeapTuple typeTup)
 	switch (getBaseType(typeOid))
 	{
 		case BOOLOID:
-			arg->func = PLyBool_FromString;
+			arg->func = PLyBool_FromBool;
 			break;
 		case FLOAT4OID:
+			arg->func = PLyFloat_FromFloat4;
+			break;
 		case FLOAT8OID:
+			arg->func = PLyFloat_FromFloat8;
+			break;
 		case NUMERICOID:
-			arg->func = PLyFloat_FromString;
+			arg->func = PLyFloat_FromNumeric;
 			break;
 		case INT2OID:
+			arg->func = PLyInt_FromInt16;
+			break;
 		case INT4OID:
-			arg->func = PLyInt_FromString;
+			arg->func = PLyInt_FromInt32;
 			break;
 		case INT8OID:
-			arg->func = PLyLong_FromString;
+			arg->func = PLyLong_FromInt64;
+			break;
+		case BPCHAROID:
+		case VARCHAROID:
+		case TEXTOID:
+		case BYTEAOID:
+			arg->func = PLyString_FromText;
 			break;
 		default:
-			arg->func = PLyString_FromString;
+			arg->func = PLyString_FromDatum;
 			break;
 	}
 }
@@ -1685,9 +1750,8 @@ PLy_typeinfo_dealloc(PLyTypeInfo *arg)
 	}
 }
 
-/* assumes that a bool is always returned as a 't' or 'f' */
 static PyObject *
-PLyBool_FromString(const char *src)
+PLyBool_FromBool(PLyDatumToOb *arg, Datum d)
 {
 	/*
 	 * We would like to use Py_RETURN_TRUE and Py_RETURN_FALSE here for
@@ -1695,47 +1759,75 @@ PLyBool_FromString(const char *src)
 	 * Python >= 2.3, and we support older versions.
 	 * http://docs.python.org/api/boolObjects.html
 	 */
-	if (src[0] == 't')
+	if (DatumGetBool(d))
 		return PyBool_FromLong(1);
 	return PyBool_FromLong(0);
 }
 
 static PyObject *
-PLyFloat_FromString(const char *src)
+PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d)
 {
-	double		v;
-	char	   *eptr;
+	return PyFloat_FromDouble(DatumGetFloat4(d));
+}
 
-	errno = 0;
-	v = strtod(src, &eptr);
-	if (*eptr != '\0' || errno)
-		return NULL;
-	return PyFloat_FromDouble(v);
+static PyObject *
+PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d)
+{
+	return PyFloat_FromDouble(DatumGetFloat8(d));
 }
 
 static PyObject *
-PLyInt_FromString(const char *src)
+PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d)
 {
-	long		v;
-	char	   *eptr;
+	/*
+	 * Numeric is cast to a PyFloat:
+	 *   This results in a loss of precision
+	 *   Would it be better to cast to PyString?
+	 */
+	Datum  f = DirectFunctionCall1(numeric_float8, d);
+	double x = DatumGetFloat8(f);
+	return PyFloat_FromDouble(x);
+}
 
-	errno = 0;
-	v = strtol(src, &eptr, 0);
-	if (*eptr != '\0' || errno)
-		return NULL;
-	return PyInt_FromLong(v);
+static PyObject *
+PLyInt_FromInt16(PLyDatumToOb *arg, Datum d)
+{
+	return PyInt_FromLong(DatumGetInt16(d));
+}
+
+static PyObject *
+PLyInt_FromInt32(PLyDatumToOb *arg, Datum d)
+{
+	return PyInt_FromLong(DatumGetInt32(d));
+}
+
+static PyObject *
+PLyLong_FromInt64(PLyDatumToOb *arg, Datum d)
+{
+	/* on 32 bit platforms "long" may be too small */
+	if (sizeof(int64) > sizeof(long))
+		return PyLong_FromLongLong(DatumGetInt64(d));
+	else
+		return PyLong_FromLong(DatumGetInt64(d));
 }
 
 static PyObject *
-PLyLong_FromString(const char *src)
+PLyString_FromText(PLyDatumToOb *arg, Datum d)
 {
-	return PyLong_FromString((char *) src, NULL, 0);
+	text     *txt = DatumGetTextP(d);
+	char     *str = VARDATA(txt);
+	size_t    size = VARSIZE(txt) - VARHDRSZ;
+
+	return PyString_FromStringAndSize(str, size);
 }
 
 static PyObject *
-PLyString_FromString(const char *src)
+PLyString_FromDatum(PLyDatumToOb *arg, Datum d)
 {
-	return PyString_FromString(src);
+	char     *x = OutputFunctionCall(&arg->typfunc, d);
+	PyObject *r = PyString_FromString(x);
+	pfree(x);
+	return r;
 }
 
 static PyObject *
@@ -1755,8 +1847,7 @@ PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc)
 	{
 		for (i = 0; i < info->in.r.natts; i++)
 		{
-			char	   *key,
-					   *vsrc;
+			char	   *key;
 			Datum		vattr;
 			bool		is_null;
 			PyObject   *value;
@@ -1771,14 +1862,7 @@ PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc)
 				PyDict_SetItemString(dict, key, Py_None);
 			else
 			{
-				vsrc = OutputFunctionCall(&info->in.r.atts[i].typfunc,
-										  vattr);
-
-				/*
-				 * no exceptions allowed
-				 */
-				value = info->in.r.atts[i].func(vsrc);
-				pfree(vsrc);
+				value = (info->in.r.atts[i].func) (&info->in.r.atts[i], vattr);
 				PyDict_SetItemString(dict, key, value);
 				Py_DECREF(value);
 			}
@@ -1794,6 +1878,147 @@ PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc)
 	return dict;
 }
 
+static Datum
+PLyObject_ToBool(PLyTypeInfo *info,
+				 PLyObToDatum *arg,
+				 PyObject *plrv)
+{
+	Assert(plrv != Py_None);
+	return BoolGetDatum(PyObject_IsTrue(plrv));
+	// FIXME: domain check
+}
+
+
+static Datum
+PLyObject_ToBytea(PLyTypeInfo *info,
+				  PLyObToDatum *arg,
+				  PyObject *plrv)
+{
+	PyObject   *volatile plrv_so = NULL;
+	Datum       rv;
+
+	Assert(plrv != Py_None);
+
+	plrv_so = PyObject_Str(plrv);
+	if (!plrv_so)
+		PLy_elog(ERROR, "could not create string representation of Python object");
+
+	PG_TRY();
+	{
+		char *plrv_sc = PyString_AsString(plrv_so);
+		size_t len = PyString_Size(plrv_so);
+		size_t size = len + VARHDRSZ;
+		bytea *result = (bytea*) palloc(size);
+
+		SET_VARSIZE(result, size);
+		memcpy(VARDATA(result), plrv_sc, len);
+		rv = PointerGetDatum(result);
+	}
+	PG_CATCH();
+	{
+		Py_XDECREF(plrv_so);
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+
+	Py_XDECREF(plrv_so);
+
+	return rv;
+	// FIXME: domain check
+}
+
+static Datum
+PLyObject_ToText(PLyTypeInfo *info,
+				 PLyObToDatum *arg,
+				 PyObject *plrv)
+{
+	PyObject   *volatile plrv_so = NULL;
+	Datum       rv;
+
+	Assert(plrv != Py_None);
+
+	plrv_so = PyObject_Str(plrv);
+	if (!plrv_so)
+		PLy_elog(ERROR, "could not create string representation of Python object");
+
+	PG_TRY();
+	{
+		char *plrv_sc = PyString_AsString(plrv_so);
+		size_t len    = PyString_Size(plrv_so);
+		size_t size   = len + VARHDRSZ;
+		text *result;
+
+		if (strlen(plrv_sc) != (size_t) len)
+		{
+			ereport(ERROR,
+					(errcode(ERRCODE_DATATYPE_MISMATCH),
+					 errmsg("could not convert Python object into text: expected string without null bytes")));
+		}
+
+		result = (text *) palloc(size);
+		SET_VARSIZE(result, size);
+		memcpy(VARDATA(result), plrv_sc, len);
+		rv = PointerGetDatum(result);
+	}
+	PG_CATCH();
+	{
+		Py_XDECREF(plrv_so);
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+
+	Py_XDECREF(plrv_so);
+
+	return rv;
+	// FIXME: domain check
+}
+
+/*
+ * Generic conversion function:
+ *  - Cast PyObject to cstring and cstring into postgres type.
+ */
+static Datum
+PLyObject_ToDatum(PLyTypeInfo *info,
+				  PLyObToDatum *arg,
+				  PyObject *plrv)
+{
+	PyObject *volatile plrv_so = NULL;
+	Datum     rv;
+
+	Assert(plrv != Py_None);
+
+	plrv_so = PyObject_Str(plrv);
+	if (!plrv_so)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_DATATYPE_MISMATCH),
+				 errmsg("could not create string representation of Python object")));
+	}
+
+	PG_TRY();
+	{
+		char *plrv_sc = PyString_AsString(plrv_so);
+		size_t len    = PyString_Size(plrv_so);
+
+		if (strlen(plrv_sc) != (size_t) len)
+		{
+			ereport(ERROR,
+					(errcode(ERRCODE_DATATYPE_MISMATCH),
+					 errmsg("could not convert Python object into cstring: expected string without null bytes")));
+		}
+		rv = InputFunctionCall(&arg->typfunc, plrv_sc, arg->typioparam, -1);
+	}
+	PG_CATCH();
+	{
+		Py_XDECREF(plrv_so);
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+
+	Py_XDECREF(plrv_so);
+
+	return rv;
+}
 
 static HeapTuple
 PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
@@ -1817,11 +2042,12 @@ PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
 	for (i = 0; i < desc->natts; ++i)
 	{
 		char	   *key;
-		PyObject   *volatile value,
-				   *volatile so;
+		PyObject   *volatile value;
+		PLyObToDatum *att;
 
 		key = NameStr(desc->attrs[i]->attname);
-		value = so = NULL;
+		value = NULL;
+		att = &info->out.r.atts[i];
 		PG_TRY();
 		{
 			value = PyMapping_GetItemString(mapping, key);
@@ -1832,19 +2058,7 @@ PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
 			}
 			else if (value)
 			{
-				char	   *valuestr;
-
-				so = PyObject_Str(value);
-				if (so == NULL)
-					PLy_elog(ERROR, "could not compute string representation of Python object");
-				valuestr = PyString_AsString(so);
-
-				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
-											  ,valuestr
-											  ,info->out.r.atts[i].typioparam
-											  ,-1);
-				Py_DECREF(so);
-				so = NULL;
+				values[i] = (att->func) (info, att, value);
 				nulls[i] = false;
 			}
 			else
@@ -1859,7 +2073,6 @@ PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
 		}
 		PG_CATCH();
 		{
-			Py_XDECREF(so);
 			Py_XDECREF(value);
 			PG_RE_THROW();
 		}
@@ -1906,10 +2119,11 @@ PLySequence_ToTuple(PLyTypeInfo *info, PyObject *sequence)
 	nulls = palloc(sizeof(bool) * desc->natts);
 	for (i = 0; i < desc->natts; ++i)
 	{
-		PyObject   *volatile value,
-				   *volatile so;
+		PyObject   *volatile value;
+		PLyObToDatum *att;
 
-		value = so = NULL;
+		value = NULL;
+		att = &info->out.r.atts[i];
 		PG_TRY();
 		{
 			value = PySequence_GetItem(sequence, i);
@@ -1921,18 +2135,7 @@ PLySequence_ToTuple(PLyTypeInfo *info, PyObject *sequence)
 			}
 			else if (value)
 			{
-				char	   *valuestr;
-
-				so = PyObject_Str(value);
-				if (so == NULL)
-					PLy_elog(ERROR, "could not compute string representation of Python object");
-				valuestr = PyString_AsString(so);
-				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
-											  ,valuestr
-											  ,info->out.r.atts[i].typioparam
-											  ,-1);
-				Py_DECREF(so);
-				so = NULL;
+				values[i] = (att->func) (info, att, value);
 				nulls[i] = false;
 			}
 
@@ -1941,7 +2144,6 @@ PLySequence_ToTuple(PLyTypeInfo *info, PyObject *sequence)
 		}
 		PG_CATCH();
 		{
-			Py_XDECREF(so);
 			Py_XDECREF(value);
 			PG_RE_THROW();
 		}
@@ -1977,11 +2179,12 @@ PLyObject_ToTuple(PLyTypeInfo *info, PyObject *object)
 	for (i = 0; i < desc->natts; ++i)
 	{
 		char	   *key;
-		PyObject   *volatile value,
-				   *volatile so;
+		PyObject   *volatile value;
+		PLyObToDatum *att;
 
 		key = NameStr(desc->attrs[i]->attname);
-		value = so = NULL;
+		value = NULL;
+		att = &info->out.r.atts[i];
 		PG_TRY();
 		{
 			value = PyObject_GetAttrString(object, key);
@@ -1992,18 +2195,7 @@ PLyObject_ToTuple(PLyTypeInfo *info, PyObject *object)
 			}
 			else if (value)
 			{
-				char	   *valuestr;
-
-				so = PyObject_Str(value);
-				if (so == NULL)
-					PLy_elog(ERROR, "could not compute string representation of Python object");
-				valuestr = PyString_AsString(so);
-				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
-											  ,valuestr
-											  ,info->out.r.atts[i].typioparam
-											  ,-1);
-				Py_DECREF(so);
-				so = NULL;
+				values[i] = (att->func) (info, att, value);
 				nulls[i] = false;
 			}
 			else
@@ -2019,7 +2211,6 @@ PLyObject_ToTuple(PLyTypeInfo *info, PyObject *object)
 		}
 		PG_CATCH();
 		{
-			Py_XDECREF(so);
 			Py_XDECREF(value);
 			PG_RE_THROW();
 		}
diff --git a/src/pl/plpython/sql/plpython_types.sql b/src/pl/plpython/sql/plpython_types.sql
index 79fbbb9..49c15c2 100644
--- a/src/pl/plpython/sql/plpython_types.sql
+++ b/src/pl/plpython/sql/plpython_types.sql
@@ -142,6 +142,17 @@ SELECT * FROM test_type_conversion_uint2(100::uint2, -50);
 SELECT * FROM test_type_conversion_uint2(null, 1);
 
 
+CREATE DOMAIN nnint AS int CHECK (VALUE IS NOT NULL);
+
+CREATE FUNCTION test_type_conversion_nnint(x nnint, y int) RETURNS nnint AS $$
+return y
+$$ LANGUAGE plpythonu;
+
+SELECT * FROM test_type_conversion_nnint(10, 20);
+SELECT * FROM test_type_conversion_nnint(null, 20);
+SELECT * FROM test_type_conversion_nnint(10, null);
+
+
 CREATE DOMAIN bytea10 AS bytea CHECK (octet_length(VALUE) = 10 AND VALUE IS NOT NULL);
 
 CREATE FUNCTION test_type_conversion_bytea10(x bytea10, y bytea) RETURNS bytea10 AS $$
In reply to: Peter Eisentraut (#9)
Re: [PATCH] plpythonu datatype conversion improvements

Primary motivation of the attached patch is to support handling bytea
conversion allowing for embedded nulls, which in turn allows for
supporting the marshal module.

Secondary motivation is slightly improved performance for conversion
routines of basic datatypes that have simple mappings between
postgres/python.

Primary design is to change the conversion routines from being based
on cstrings to datums, eg:
PLyBool_FromString(const char *) =>
PLyBool_FromBool(PLyDatumToOb, Datum);

I have reworked this patch a bit and extended the plpython test suite
around it. Current copy attached.

The remaining problem is that the patch loses domain checking on the
return types, because some paths no longer go through the data type's
input function. I have marked these places as FIXME, and the regression
tests also contain a failing test case for this.

What's needed here, I think, is an API that takes a datum plus type
information and checks whether the datum is valid within the domain. I
haven't found one that is exported, but maybe someone could give a tip.

I see an intersection between the work I'm currently doing on COPY BINARY
and this.

Basically if you have an INT, you aren't going to make lots of checks.

However, for a TEXT, postgres needs to reject it if it has a NUL byte in it
(which doesn't bother Python at all), or if it has chars which are not
valid in the current encoding, etc.
Many other types like TIMESTAMP have checks which are absolutely necessary
for correctness...

What's needed here, I think, is an API that takes a datum plus type
information and checks whether the datum is valid within the domain. I
haven't found one that is exported, but maybe someone could give a tip.

Problems:

- If the data you're trying to put in the Datum doesn't fit (for example
an out-of-range value, a varchar that is too small, etc.), and you want a
datum-type-specific function to check your datum and reject it, how are
you going to build the datum? Perhaps you can't, since your value doesn't
fit. It's a chicken-and-egg problem: the check function that you expect
to reject your invalid datum will not know it's invalid, since you've
trimmed it at the edges to make it fit in the required Datum type...

- You are going to build a datum that is perhaps valid, and perhaps not,
and send this to a function... Having known-invalid datums moving around
might not be such a good idea...

Why not use the COPY BINARY format to communicate between Python and pg?

-> you write code to serialize python objects to binary form
-> you call the recv function to get a postgres datum
-> recv function throws an error if there is any problem

-> as a bonus, you release your python object <-> postgres binary code as
a separate library so people can use it to output data readable by COPY
BINARY and parse COPY BINARY dumps.
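
As a rough illustration of the serialize-then-recv idea above, here is a minimal sketch in Python. It assumes the int4 binary format is 4 bytes in network byte order and that a COPY BINARY field is a 32-bit length word followed by the data, with -1 marking NULL. The function names are hypothetical and not part of any patch in this thread.

```python
import struct
from typing import Optional

INT4_MIN = -2**31
INT4_MAX = 2**31 - 1


def int4_to_binary(value: int) -> bytes:
    """Encode a Python int as 4 bytes in network byte order."""
    # The range check has to happen here: once the value is packed into
    # 4 bytes, nothing downstream can tell that it was ever out of range.
    if not INT4_MIN <= value <= INT4_MAX:
        raise OverflowError("value %d does not fit in int4" % value)
    return struct.pack("!i", value)


def copy_binary_field(payload: Optional[bytes]) -> bytes:
    """Frame one field: int32 length word, then the data bytes.

    A NULL field is a length word of -1 with no data bytes.
    """
    if payload is None:
        return struct.pack("!i", -1)
    return struct.pack("!i", len(payload)) + payload


# Hypothetical usage: frame the integer 20 as one field.
field = copy_binary_field(int4_to_binary(20))
```

The range check sits in the serializer on purpose: it is the only place where the original Python value is still visible, which is the chicken-and-egg point made above.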

#11James Pye
lists@jwp.name
In reply to: Peter Eisentraut (#9)
Re: [PATCH] plpythonu datatype conversion improvements

On Aug 15, 2009, at 4:44 PM, Peter Eisentraut wrote:

What's needed here, I think, is an API that takes a datum plus type
information and checks whether the datum is valid within the domain.

/agree =)

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#9)
Re: [PATCH] plpythonu datatype conversion improvements

Peter Eisentraut <peter_e@gmx.net> writes:

The remaining problem is that the patch loses domain checking on the
return types, because some paths no longer go through the data type's
input function. I have marked these places as FIXME, and the regression
tests also contain a failing test case for this.

For the record, I think this entire patch is a bad idea. PLs should not
be so much in bed with the internal representation of datatypes. To
take just one example, this *will* break when/if we change text to carry
some internal locale indicator. There has been absolutely zero evidence
presented to justify that there's a need to break abstraction to gain
performance in this area.

What's needed here, I think, is an API that takes a datum plus type
information and checks whether the datum is valid within the domain. I
haven't found one that is exported, but maybe someone could give a tip.

There isn't one, but maybe you could expose domain_state_setup and
domain_check_input, or some simple wrapper around them.

regards, tom lane
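
To make the missing step concrete, here is a toy model in Python of a direct conversion followed by a separate domain check. It is only a sketch: the class and function names are hypothetical, and the predicate mirrors the bytea10 domain from the regression test (octet_length(VALUE) = 10 AND VALUE IS NOT NULL). It does not model domain_state_setup or domain_check_input themselves.

```python
from typing import Callable, Optional


class DomainViolation(Exception):
    """Raised when a value fails a domain's CHECK constraint."""


class Domain:
    """Toy model of a domain: a name plus a CHECK predicate."""

    def __init__(self, name: str, check: Callable[[Optional[bytes]], bool]):
        self.name = name
        self.check = check

    def validate(self, value: Optional[bytes]) -> Optional[bytes]:
        if not self.check(value):
            raise DomainViolation(
                'value for domain %s violates check constraint "%s_check"'
                % (self.name, self.name)
            )
        return value


# Mirrors the bytea10 domain used in the regression test.
bytea10 = Domain("bytea10", lambda v: v is not None and len(v) == 10)


def convert_directly(py_value: bytes) -> bytes:
    """Stand-in for a direct conversion that skips the input function."""
    return bytes(py_value)


def convert_with_domain_check(py_value: bytes, domain: Domain) -> bytes:
    """Direct conversion followed by the separate validation step."""
    return domain.validate(convert_directly(py_value))
```

In this model, convert_directly accepts a 5-byte value without complaint, while convert_with_domain_check rejects it. That is the gap the FIXME markers in the patch point at.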

#13Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#12)
Re: [PATCH] plpythonu datatype conversion improvements

Tom Lane wrote:

For the record, I think this entire patch is a bad idea. PLs should not
be so much in bed with the internal representation of datatypes.

I thought there was some suggestion in the past that we should move some
in that direction. The discussion context was Theo Schlossnagle's
complaint about the overhead of passing bytea to and from PLPerl,
although that might be ameliorated by the hex gadget. The other major
case that would benefit would be passing Array values as the PL's native
array type, and Composite values as the PL's associative array type.
That would save PL users a lot of highly error-prone coding
deconstructing the text representation of such objects.

cheers

andrew
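
For a sense of what deconstructing the text representation involves, here is a deliberately limited Python sketch that parses a flat integer array literal such as '{1,2,NULL}'. The function name is hypothetical, and the sketch assumes no nesting, no quoted elements and the default comma delimiter. Those omitted cases are where hand-written parsers tend to go wrong.

```python
from typing import List, Optional


def parse_int_array(text: str) -> List[Optional[int]]:
    """Parse a flat array literal of integers such as '{1,2,NULL}'.

    Limited on purpose: no nesting, no quoting, no dimension decoration.
    """
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("not an array literal: %r" % text)
    body = text[1:-1]
    if body == "":
        return []
    result: List[Optional[int]] = []
    for item in body.split(","):
        item = item.strip()
        # An unquoted NULL element maps to Python None.
        result.append(None if item.upper() == "NULL" else int(item))
    return result
```

A nested literal such as '{{1,2},{3,4}}' makes this sketch raise ValueError rather than return a wrong answer, which is about the best a naive parser can do.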

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#13)
Re: [PATCH] plpythonu datatype conversion improvements

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

For the record, I think this entire patch is a bad idea. PLs should not
be so much in bed with the internal representation of datatypes.

I thought there was some suggestion in the past that we should move some
in that direction.

There's been some discussion about functional improvements like
translating arrays to arrays. I don't know what we'd have to do
to manage that, but possibly some API extensions to the array code
would make it feasible without violating abstractions. The present
patch, however, doesn't appear to have any reason to live other than an
undocumented amount of performance improvement. My feeling about that
is if you're concerned about micro-performance, why are you coding in
python to begin with? It isn't the best choice out there.

regards, tom lane

#15Alvaro Herrera
alvherre@commandprompt.com
In reply to: Peter Eisentraut (#9)
Re: [PATCH] plpythonu datatype conversion improvements

Peter Eisentraut wrote:

I have reworked this patch a bit and extended the plpython test suite
around it. Current copy attached.

I think the errcontext bits should be committed separately to get them
out of the way (and to ensure that they get in, regardless of objections
to other parts of the patch).

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#12)
Re: [PATCH] plpythonu datatype conversion improvements

On Mon, 2009-08-17 at 10:42 -0400, Tom Lane wrote:

For the record, I think this entire patch is a bad idea. PLs should not
be so much in bed with the internal representation of datatypes. To
take just one example, this *will* break when/if we change text to carry
some internal locale indicator. There has been absolutely zero evidence
presented to justify that there's a need to break abstraction to gain
performance in this area.

The motivation for this patch has nothing to do with performance. The
point is to pass data types into and out of PL/Python sensibly. In
particular, passing bytea into and out of PL/Python is currently
completely broken, in the sense that what you get in Python is not a
byte string that you can process sensibly.
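
The effect can be modelled outside the server. The Python sketch below is an illustration only, not PL/Python code: it treats a marshal payload the way NUL-terminated C string handling would, cutting it at the first zero byte. The helper name c_string_view is hypothetical.

```python
import marshal


def c_string_view(data: bytes) -> bytes:
    """Return what NUL-terminated C string handling would see of data."""
    return data.split(b"\x00", 1)[0]


original = [1, 2, 3]
payload = marshal.dumps(original)     # binary data, contains zero bytes
truncated = c_string_view(payload)    # cut at the first zero byte

# With an explicit length, every byte survives and the round trip works.
restored = marshal.loads(payload)
```

The truncated form is shorter than the payload and can no longer be unmarshalled, which is why a length-carrying conversion is needed for bytea.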

We could argue that peeking inside the internal representation of data
types might be inappropriate. In which case the solution would be to
run the data through the data type output function and have PL/Python
parse that back in. That would just be a localized change in the patch,
however. (It might be less than ideal for passing float types,
perhaps.) We do, however, expose a data type's binary format through the
binary protocol, so perhaps we should be using the send/recv functions
instead of input/output. Which would require hardcoding the bytea
binary format, at least. Either of these solutions would probably solve
the domains problem, though.

Note also that we have historically broken the bytea text format twice
as often as the bytea binary format. ;-)
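
As a sketch of what parsing the output function's result back in could look like for bytea, the Python function below decodes the "escape" text format. It assumes that format writes a backslash as two backslashes, non-printable bytes as a backslash followed by three octal digits, and everything else as printable ASCII. The function name is hypothetical, and the sketch does no validation of malformed input.

```python
def parse_bytea_escape(text: str) -> bytes:
    """Decode bytea 'escape' output text into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))     # printable ASCII, taken literally
            i += 1
        elif text.startswith("\\\\", i):
            out.append(0x5C)             # doubled backslash is one backslash
            i += 2
        else:
            out.append(int(text[i + 1:i + 4], 8))  # \ooo is one octal byte
            i += 4
    return bytes(out)


# Hypothetical usage: the text form of the five bytes a, b, NUL, c, d.
decoded = parse_bytea_escape("ab\\000cd")
```

A decoder like this is tied to the text format, so it would need to change whenever that format does, which is the trade-off against the send/recv route mentioned above.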

#17Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#16)
Re: [PATCH] plpythonu datatype conversion improvements

2009/8/18 Peter Eisentraut <peter_e@gmx.net>:

On Mon, 2009-08-17 at 10:42 -0400, Tom Lane wrote:

For the record, I think this entire patch is a bad idea.  PLs should not
be so much in bed with the internal representation of datatypes.  To
take just one example, this *will* break when/if we change text to carry
some internal locale indicator.  There has been absolutely zero evidence
presented to justify that there's a need to break abstraction to gain
performance in this area.

I think that communication based on the text format is bad. Maybe we
should use binary communication based on the send and recv functions?
It should be better, and maybe more stable, than direct transfer, but
maybe a little bit slower.

The motivation for this patch has nothing to do with performance.  The
point is to pass data types into and out of PL/Python sensibly.  In
particular, passing bytea into and out of PL/Python is currently
completely broken, in the sense that what you get in Python is not a
byte string that you can process sensibly.

I think bytea should be very well optimized. You can expect that larger
blocks of data will be moved there.

regards
Pavel

We could argue that peeking inside the internal representation of data
types might be inappropriate.  In which case the solution would be to
run the data through the data type output function and have PL/Python
parse that back in.  That would just be a localized change in the patch,
however.  (It might be less than ideal for passing float types,
perhaps.)  We do, however, expose a data type's binary format through the
binary protocol, so perhaps we should be using the send/recv functions
instead of input/output.  Which would require hardcoding the bytea
binary format, at least.  Either of these solutions would probably solve
the domains problem, though.

Note also that we have historically broken the bytea text format twice
as often as the bytea binary format. ;-)

#18Caleb Welton
cwelton@greenplum.com
In reply to: Tom Lane (#14)
Re: [PATCH] plpythonu datatype conversion improvements

As documented in the patch, the primary motivation was support of the BYTEA datatype, which when cast through cstring was truncating Python strings with embedded nulls; performance was only a secondary consideration.

Regards,
Caleb

(Sorry for my slow entry on this thread, I'm on vacation right now.)
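
A minimal Python 3 sketch of the truncation problem described above. It models the cstring path as "everything from the first NUL byte on is lost" and shows that marshal data does not survive that trip. The helper names are hypothetical, and the model is an assumption about the effect of a C string copy, not the actual plpython.c code.

```python
import marshal


def through_cstring(data: bytes) -> bytes:
    """Model a C string copy: everything from the first NUL byte on is lost."""
    return data.split(b"\x00", 1)[0]


def survives(data: bytes) -> bool:
    """Return True if marshal data still loads to the same object afterwards."""
    try:
        return marshal.loads(through_cstring(data)) == marshal.loads(data)
    except (EOFError, ValueError, TypeError):
        return False


payload = marshal.dumps([1, 2, 3])   # the serialized form contains NUL bytes
print(b"\x00" in payload)
print(survives(payload))
```

A length-aware conversion (pointer plus size, as bytea stores it) avoids the problem because it never relies on a terminating NUL.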

On 8/17/09 8:12 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

For the record, I think this entire patch is a bad idea. PLs should not
be so much in bed with the internal representation of datatypes.

I thought there was some suggestion in the past that we should move some
in that direction.

There's been some discussion about functional improvements like
translating arrays to arrays. I don't know what we'd have to do
to manage that, but possibly some API extensions to the array code
would make it feasible without violating abstractions. The present
patch, however, doesn't appear to have any reason to live other than an
undocumented amount of performance improvement. My feeling about that
is if you're concerned about micro-performance, why are you coding in
python to begin with? It isn't the best choice out there.

regards, tom lane

#19Greg Stark
gsstark@mit.edu
In reply to: Caleb Welton (#18)
Re: [PATCH] plpythonu datatype conversion improvements

On Sat, Aug 22, 2009 at 11:45 AM, Caleb Welton<cwelton@greenplum.com> wrote:

As documented in the patch, the primary motivation was support of BYTEA
datatype, which when cast through cstring was truncating python strings with
embedded nulls,
performance was only a secondary consideration.

The alternative to attaching to the internal representation would be
to marshal and unmarshal the text representation where nuls are
escaped as \000.

However, I dispute that this is "micro-performance" that we're talking
about. On any given small datum it may be a small incremental amount
of time but it's not incremental time that matters, it's aggregate. If
you're processing 1TB of data and you have to marshal and unmarshal
all 1TB it doesn't matter that you're doing it in 100 byte chunks. And
in any case there are plenty of people throwing around multi-megabyte
bytea blobs and having to marshal and unmarshal them every time they
go from the database into a PL or back would be a noticeable delay and
risk of out-of-memory errors.

If we want PLs to not be overly in bed with Postgres data types then
the way to do it is to have data types provide abstract methods for
accessing their internals. At least for bytea and text that would be
fairly straightforward. For numeric I don't see that it would really
buy much since it wouldn't really let us completely change
representations.

--
greg
http://mit.edu/~gsstark/resume.pdf
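
The aggregate-cost argument above can be put into numbers with a small Python 3 sketch. It assumes decimal units (1 TB = 10**12 bytes) and the hex text format, which spends two characters per byte plus the `\x` prefix; the function names are hypothetical and the figures are plain arithmetic, not measurements.

```python
def conversions_needed(total_bytes: int, chunk_bytes: int) -> int:
    """Number of datums that must be converted to move total_bytes."""
    return -(-total_bytes // chunk_bytes)   # ceiling division


def hex_text_size(raw_bytes: int) -> int:
    """Characters in the hex text form: the "\\x" prefix plus 2 per byte."""
    return 2 + 2 * raw_bytes


print(conversions_needed(10**12, 100))      # datums in 1 TB of 100-byte chunks
print(hex_text_size(8 * 1024 * 1024))       # text size of an 8 MiB bytea value
```

The text copy exists alongside the original value while it is being converted, so a large bytea value temporarily needs roughly three times its own size under this model.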

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#19)
Re: [PATCH] plpythonu datatype conversion improvements

Greg Stark <gsstark@mit.edu> writes:

On Sat, Aug 22, 2009 at 11:45 AM, Caleb Welton<cwelton@greenplum.com> wrote:

As documented in the patch, the primary motivation was support of BYTEA
datatype, which when cast through cstring was truncating python strings with
embedded nulls,
performance was only a secondary consideration.

The alternative to attaching to the internal representation would be
to marshal and unmarshal the text representation where nuls are
escaped as \000.

I don't actually have a problem with depending on the internal
representation of bytea. What I'm unhappy about is that (despite
Caleb's assertions that this is only about bytea) the patch proceeds
to make plpython intimate with the internal representation of a bunch
of *other* datatypes, some of which we have good reason to think may
change in future. If it were only touching bytea I would not have
complained.

regards, tom lane

#21Caleb Welton
cwelton@greenplum.com
In reply to: Tom Lane (#20)
Re: [PATCH] plpythonu datatype conversion improvements

I didn't say that it _only_ affects bytea, I said that was the _primary motivation_ for it.

Converting from postgres=>python this change affects boolean, float4, float8, numeric, int16, int32, int64, text, and bytea. The code to handle this goes through DatumGetXXX for the native C type for the datatype, with the exception of the Varlena types (special case) and Numeric which calls numeric_float8() to convert the numeric to a native C double precision float. As mentioned in the original post I do not think that this is appropriate for numeric, and I would prefer a better mapping, but this was a pre-existing issue and is not a change in behavior for the patch. Since this is a separate issue I opted not to change it to keep the patch concise.

Converting from python=>postgres this change affects void, bool, bytea, and text.

The reasons for this asymmetry are that there is no 1:1 mapping of Postgres datatypes to Python datatypes, and that I wanted to keep the patch concise.

All other datatypes (including arrays unfortunately) go through the same text input functions that they did before.

Of the above I would expect the only type that we would have good reason to expect to change would be numeric, and this patch _doesn't_ rely on its internal representation: it calls numeric_float8().

I think it would be good to have mappings for other datatypes, depending on internal representation or not, but thought that was beyond the scope of the patch.

Regards,
Caleb
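
The numeric concern mentioned above (conversion through a double-precision float) can be illustrated with a minimal Python 3 sketch. `Decimal` stands in for a Postgres numeric value here, and `via_float8` is a hypothetical helper that models the numeric-to-float8 step; it is not code from the patch.

```python
from decimal import Decimal


def via_float8(value: Decimal) -> Decimal:
    """Model numeric -> float8 -> Python float, then show the exact result."""
    return Decimal(float(value))


exact = Decimal("12345678901234567890.123456789")
print(via_float8(exact) == exact)                      # too many digits for a double
print(via_float8(Decimal("0.5")) == Decimal("0.5"))    # exactly representable
```

Values with more significant digits than a double can hold come back changed, which is why a string or decimal mapping was suggested as a possibly better target for numeric.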

On 8/22/09 7:03 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Greg Stark <gsstark@mit.edu> writes:

On Sat, Aug 22, 2009 at 11:45 AM, Caleb Welton<cwelton@greenplum.com> wrote:

As documented in the patch, the primary motivation was support of BYTEA
datatype, which when cast through cstring was truncating python strings with
embedded nulls,
performance was only a secondary consideration.

The alternative to attaching to the internal representation would be
to marshal and unmarshal the text representation where nuls are
escaped as \000.

I don't actually have a problem with depending on the internal
representation of bytea. What I'm unhappy about is that (despite
Caleb's assertions that this is only about bytea) the patch proceeds
to make plpython intimate with the internal representation of a bunch
of *other* datatypes, some of which we have good reason to think may
change in future. If it were only touching bytea I would not have
complained.

regards, tom lane

#22Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#15)
Re: [PATCH] plpythonu datatype conversion improvements

On Mon, 2009-08-17 at 11:55 -0400, Alvaro Herrera wrote:

Peter Eisentraut wrote:

I have reworked this patch a bit and extended the plpython test suite
around it. Current copy attached.

I think the errcontext bits should be committed separately to get them
out of the way (and to ensure that they get in, regardless of objections
to other parts of the patch).

Done that now.

#23Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#9)
1 attachment(s)
Re: [PATCH] plpythonu datatype conversion improvements

On Sun, 2009-08-16 at 02:44 +0300, Peter Eisentraut wrote:

The remaining problem is that the patch loses domain checking on the
return types, because some paths no longer go through the data type's
input function. I have marked these places as FIXME, and the regression
tests also contain a failing test case for this.

What's needed here, I think, is an API that takes a datum plus type
information and checks whether the datum is valid within the domain. I
haven't found one that is exported, but maybe someone could give a tip.

Got that fixed now. Updated patch is attached. I will sleep on it,
but I think it's good to go.
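
A minimal Python 3 model of the idea behind the fix, assuming a check that takes a value plus domain information and an optional cache that plays the role of the `extra` pointer. The table of constraints is a hypothetical stand-in for the catalog lookup, using the `booltrue` and `nnint` domains from the regression tests; none of these Python names exist in PostgreSQL.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the catalog: domain name -> CHECK predicate.
DOMAIN_CHECKS: Dict[str, Callable[[object], bool]] = {
    "booltrue": lambda v: v is True or v is None,
    "nnint": lambda v: v is not None,
}


def domain_check(value, domain: str, cache: Optional[dict] = None) -> None:
    """Raise ValueError if value violates the domain's constraint.

    When the caller supplies a cache, the lookup is done once and reused;
    without one, the setup is repeated on every call.
    """
    check = cache.get(domain) if cache is not None else None
    if check is None:
        check = DOMAIN_CHECKS[domain]           # the "setup" step
        if cache is not None:
            cache[domain] = check
    if not check(value):
        raise ValueError(
            f'value for domain {domain} violates check constraint "{domain}_check"')


def to_bool_datum(obj, domain: Optional[str] = None,
                  cache: Optional[dict] = None) -> bool:
    """Direct conversion that bypasses any text input function."""
    result = bool(obj)
    if domain is not None:
        domain_check(result, domain, cache)     # without this, the check is lost
    return result
```

The point of the model is only that a conversion path which skips the type's input function has to invoke the domain check itself.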

Attachments:

plpython-datatypes.patch (text/x-patch; charset=UTF-8)
diff --git a/src/backend/utils/adt/domains.c b/src/backend/utils/adt/domains.c
index ffd5c7a..bda200b 100644
--- a/src/backend/utils/adt/domains.c
+++ b/src/backend/utils/adt/domains.c
@@ -302,3 +302,40 @@ domain_recv(PG_FUNCTION_ARGS)
 	else
 		PG_RETURN_DATUM(value);
 }
+
+/*
+ * domain_check - check that a datum satisfies the constraints of a
+ * domain.  extra and mcxt can be passed if they are available from,
+ * say, a FmgrInfo structure, or they can be NULL, in which case the
+ * setup is repeated for each call.
+ */
+void
+domain_check(Datum value, bool isnull, Oid domainType, void **extra, MemoryContext mcxt)
+{
+	DomainIOData *my_extra = NULL;
+
+	if (mcxt == NULL)
+		mcxt = CurrentMemoryContext;
+
+	/*
+	 * We arrange to look up the needed info just once per series of calls,
+	 * assuming the domain type doesn't change underneath us.
+	 */
+	if (extra)
+		my_extra = (DomainIOData *) *extra;
+	if (my_extra == NULL)
+	{
+		my_extra = (DomainIOData *) MemoryContextAlloc(mcxt,
+													   sizeof(DomainIOData));
+		domain_state_setup(my_extra, domainType, true, mcxt);
+		if (extra)
+			*extra = (void *) my_extra;
+	}
+	else if (my_extra->domain_type != domainType)
+		domain_state_setup(my_extra, domainType, true, mcxt);
+
+	/*
+	 * Do the necessary checks to ensure it's a valid domain value.
+	 */
+	domain_check_input(value, isnull, my_extra);
+}
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 9b0a2b7..df37b16 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -137,6 +137,7 @@ extern Datum char_text(PG_FUNCTION_ARGS);
 /* domains.c */
 extern Datum domain_in(PG_FUNCTION_ARGS);
 extern Datum domain_recv(PG_FUNCTION_ARGS);
+extern void domain_check(Datum value, bool isnull, Oid domainType, void **extra, MemoryContext mcxt);
 
 /* encode.c */
 extern Datum binary_encode(PG_FUNCTION_ARGS);
diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out
index 19b3c9e..2a08834 100644
--- a/src/pl/plpython/expected/plpython_types.out
+++ b/src/pl/plpython/expected/plpython_types.out
@@ -32,6 +32,74 @@ CONTEXT:  PL/Python function "test_type_conversion_bool"
  
 (1 row)
 
+-- test various other ways to expression Booleans in Python
+CREATE FUNCTION test_type_conversion_bool_other(n int) RETURNS bool AS $$
+# numbers
+if n == 0:
+   ret = 0
+elif n == 1:
+   ret = 5
+# strings
+elif n == 2:
+   ret = ''
+elif n == 3:
+   ret = 'fa' # true in Python, false in PostgreSQL
+# containers
+elif n == 4:
+   ret = []
+elif n == 5:
+   ret = [0]
+plpy.info(ret, not not ret)
+return ret
+$$ LANGUAGE plpythonu;
+SELECT * FROM test_type_conversion_bool_other(0);
+INFO:  (0, False)
+CONTEXT:  PL/Python function "test_type_conversion_bool_other"
+ test_type_conversion_bool_other 
+---------------------------------
+ f
+(1 row)
+
+SELECT * FROM test_type_conversion_bool_other(1);
+INFO:  (5, True)
+CONTEXT:  PL/Python function "test_type_conversion_bool_other"
+ test_type_conversion_bool_other 
+---------------------------------
+ t
+(1 row)
+
+SELECT * FROM test_type_conversion_bool_other(2);
+INFO:  ('', False)
+CONTEXT:  PL/Python function "test_type_conversion_bool_other"
+ test_type_conversion_bool_other 
+---------------------------------
+ f
+(1 row)
+
+SELECT * FROM test_type_conversion_bool_other(3);
+INFO:  ('fa', True)
+CONTEXT:  PL/Python function "test_type_conversion_bool_other"
+ test_type_conversion_bool_other 
+---------------------------------
+ t
+(1 row)
+
+SELECT * FROM test_type_conversion_bool_other(4);
+INFO:  ([], False)
+CONTEXT:  PL/Python function "test_type_conversion_bool_other"
+ test_type_conversion_bool_other 
+---------------------------------
+ f
+(1 row)
+
+SELECT * FROM test_type_conversion_bool_other(5);
+INFO:  ([0], True)
+CONTEXT:  PL/Python function "test_type_conversion_bool_other"
+ test_type_conversion_bool_other 
+---------------------------------
+ t
+(1 row)
+
 CREATE FUNCTION test_type_conversion_char(x char) RETURNS char AS $$
 plpy.info(x, type(x))
 return x
@@ -278,13 +346,21 @@ plpy.info(x, type(x))
 return x
 $$ LANGUAGE plpythonu;
 SELECT * FROM test_type_conversion_bytea('hello world');
-INFO:  ('\\x68656c6c6f20776f726c64', <type 'str'>)
+INFO:  ('hello world', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea"
  test_type_conversion_bytea 
 ----------------------------
  \x68656c6c6f20776f726c64
 (1 row)
 
+SELECT * FROM test_type_conversion_bytea(E'null\\000byte');
+INFO:  ('null\x00byte', <type 'str'>)
+CONTEXT:  PL/Python function "test_type_conversion_bytea"
+ test_type_conversion_bytea 
+----------------------------
+ \x6e756c6c0062797465
+(1 row)
+
 SELECT * FROM test_type_conversion_bytea(null);
 INFO:  (None, <type 'NoneType'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea"
@@ -304,17 +380,31 @@ try:
 except ValueError, e:
     return 'FAILED: ' + str(e)
 $$ LANGUAGE plpythonu;
-/* This will currently fail because the bytea datum is presented to
-   Python as a string in bytea-encoding, which Python doesn't understand. */
 SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
-   test_type_unmarshal    
---------------------------
- FAILED: bad marshal data
+ test_type_unmarshal 
+---------------------
+ hello world
 (1 row)
 
 --
 -- Domains
 --
+CREATE DOMAIN booltrue AS bool CHECK (VALUE IS TRUE OR VALUE IS NULL);
+CREATE FUNCTION test_type_conversion_booltrue(x booltrue, y bool) RETURNS booltrue AS $$
+return y
+$$ LANGUAGE plpythonu;
+SELECT * FROM test_type_conversion_booltrue(true, true);
+ test_type_conversion_booltrue 
+-------------------------------
+ t
+(1 row)
+
+SELECT * FROM test_type_conversion_booltrue(false, true);
+ERROR:  value for domain booltrue violates check constraint "booltrue_check"
+SELECT * FROM test_type_conversion_booltrue(true, false);
+ERROR:  value for domain booltrue violates check constraint "booltrue_check"
+CONTEXT:  while creating return value
+PL/Python function "test_type_conversion_booltrue"
 CREATE DOMAIN uint2 AS int2 CHECK (VALUE >= 0);
 CREATE FUNCTION test_type_conversion_uint2(x uint2, y int) RETURNS uint2 AS $$
 plpy.info(x, type(x))
@@ -342,13 +432,29 @@ CONTEXT:  PL/Python function "test_type_conversion_uint2"
                           1
 (1 row)
 
+CREATE DOMAIN nnint AS int CHECK (VALUE IS NOT NULL);
+CREATE FUNCTION test_type_conversion_nnint(x nnint, y int) RETURNS nnint AS $$
+return y
+$$ LANGUAGE plpythonu;
+SELECT * FROM test_type_conversion_nnint(10, 20);
+ test_type_conversion_nnint 
+----------------------------
+                         20
+(1 row)
+
+SELECT * FROM test_type_conversion_nnint(null, 20);
+ERROR:  value for domain nnint violates check constraint "nnint_check"
+SELECT * FROM test_type_conversion_nnint(10, null);
+ERROR:  value for domain nnint violates check constraint "nnint_check"
+CONTEXT:  while creating return value
+PL/Python function "test_type_conversion_nnint"
 CREATE DOMAIN bytea10 AS bytea CHECK (octet_length(VALUE) = 10 AND VALUE IS NOT NULL);
 CREATE FUNCTION test_type_conversion_bytea10(x bytea10, y bytea) RETURNS bytea10 AS $$
 plpy.info(x, type(x))
 return y
 $$ LANGUAGE plpythonu;
 SELECT * FROM test_type_conversion_bytea10('hello wold', 'hello wold');
-INFO:  ('\\x68656c6c6f20776f6c64', <type 'str'>)
+INFO:  ('hello wold', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea10"
  test_type_conversion_bytea10 
 ------------------------------
@@ -358,7 +464,7 @@ CONTEXT:  PL/Python function "test_type_conversion_bytea10"
 SELECT * FROM test_type_conversion_bytea10('hello world', 'hello wold');
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
 SELECT * FROM test_type_conversion_bytea10('hello word', 'hello world');
-INFO:  ('\\x68656c6c6f20776f7264', <type 'str'>)
+INFO:  ('hello word', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea10"
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
 CONTEXT:  while creating return value
@@ -366,7 +472,7 @@ PL/Python function "test_type_conversion_bytea10"
 SELECT * FROM test_type_conversion_bytea10(null, 'hello word');
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
 SELECT * FROM test_type_conversion_bytea10('hello word', null);
-INFO:  ('\\x68656c6c6f20776f7264', <type 'str'>)
+INFO:  ('hello word', <type 'str'>)
 CONTEXT:  PL/Python function "test_type_conversion_bytea10"
 ERROR:  value for domain bytea10 violates check constraint "bytea10_check"
 CONTEXT:  while creating return value
diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c
index d9eee66..f072d75 100644
--- a/src/pl/plpython/plpython.c
+++ b/src/pl/plpython/plpython.c
@@ -78,7 +78,8 @@ PG_MODULE_MAGIC;
  * objects.
  */
 
-typedef PyObject *(*PLyDatumToObFunc) (const char *);
+struct PLyDatumToOb;
+typedef PyObject *(*PLyDatumToObFunc) (struct PLyDatumToOb*, Datum);
 
 typedef struct PLyDatumToOb
 {
@@ -104,8 +105,16 @@ typedef union PLyTypeInput
 /* convert PyObject to a Postgresql Datum or tuple.
  * output from Python
  */
+
+struct PLyObToDatum;
+struct PLyTypeInfo;
+typedef Datum (*PLyObToDatumFunc) (struct PLyTypeInfo*,
+								   struct PLyObToDatum*,
+								   PyObject *);
+
 typedef struct PLyObToDatum
 {
+	PLyObToDatumFunc func;
 	FmgrInfo	typfunc;		/* The type's input function */
 	Oid			typoid;			/* The OID of the type */
 	Oid			typioparam;
@@ -131,12 +140,11 @@ typedef struct PLyTypeInfo
 {
 	PLyTypeInput in;
 	PLyTypeOutput out;
-	int			is_rowtype;
-
 	/*
-	 * is_rowtype can be: -1  not known yet (initial state) 0  scalar datatype
-	 * 1  rowtype 2  rowtype, but I/O functions not set up yet
+	 * is_rowtype can be: -1 = not known yet (initial state); 0 = scalar datatype;
+	 * 1 = rowtype; 2 = rowtype, but I/O functions not set up yet
 	 */
+	int			is_rowtype;
 } PLyTypeInfo;
 
 
@@ -263,12 +271,24 @@ static void PLy_output_tuple_funcs(PLyTypeInfo *, TupleDesc);
 static void PLy_input_tuple_funcs(PLyTypeInfo *, TupleDesc);
 
 /* conversion functions */
+static PyObject *PLyBool_FromBool(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyInt_FromInt16(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyInt_FromInt32(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyLong_FromInt64(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyString_FromBytea(PLyDatumToOb *arg, Datum d);
+static PyObject *PLyString_FromDatum(PLyDatumToOb *arg, Datum d);
+
 static PyObject *PLyDict_FromTuple(PLyTypeInfo *, HeapTuple, TupleDesc);
-static PyObject *PLyBool_FromString(const char *);
-static PyObject *PLyFloat_FromString(const char *);
-static PyObject *PLyInt_FromString(const char *);
-static PyObject *PLyLong_FromString(const char *);
-static PyObject *PLyString_FromString(const char *);
+
+static Datum PLyObject_ToBool(PLyTypeInfo *, PLyObToDatum *,
+							  PyObject *);
+static Datum PLyObject_ToBytea(PLyTypeInfo *, PLyObToDatum *,
+							   PyObject *);
+static Datum PLyObject_ToDatum(PLyTypeInfo *, PLyObToDatum *,
+							   PyObject *);
 
 static HeapTuple PLyMapping_ToTuple(PLyTypeInfo *, PyObject *);
 static HeapTuple PLySequence_ToTuple(PLyTypeInfo *, PyObject *);
@@ -552,8 +572,6 @@ PLy_modify_tuple(PLyProcedure *proc, PyObject *pltd, TriggerData *tdata,
 
 		for (i = 0; i < natts; i++)
 		{
-			char	   *src;
-
 			platt = PyList_GetItem(plkeys, i);
 			if (!PyString_Check(platt))
 				ereport(ERROR,
@@ -580,20 +598,9 @@ PLy_modify_tuple(PLyProcedure *proc, PyObject *pltd, TriggerData *tdata,
 			}
 			else if (plval != Py_None)
 			{
-				plstr = PyObject_Str(plval);
-				if (!plstr)
-					PLy_elog(ERROR, "could not create string representation of Python object");
-				src = PyString_AsString(plstr);
-
-				modvalues[i] =
-					InputFunctionCall(&proc->result.out.r.atts[atti].typfunc,
-									  src,
-									proc->result.out.r.atts[atti].typioparam,
-									  tupdesc->attrs[atti]->atttypmod);
+				PLyObToDatum *att = &proc->result.out.r.atts[atti];
+				modvalues[i] = (att->func) (&proc->result, att, plval);
 				modnulls[i] = ' ';
-
-				Py_DECREF(plstr);
-				plstr = NULL;
 			}
 			else
 			{
@@ -830,8 +837,6 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 	Datum		rv;
 	PyObject   *volatile plargs = NULL;
 	PyObject   *volatile plrv = NULL;
-	PyObject   *volatile plrv_so = NULL;
-	char	   *plrv_sc;
 	ErrorContextCallback plerrcontext;
 
 	PG_TRY();
@@ -909,7 +914,6 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 
 				Py_XDECREF(plargs);
 				Py_XDECREF(plrv);
-				Py_XDECREF(plrv_so);
 
 				PLy_function_delete_args(proc);
 
@@ -927,6 +931,8 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 		plerrcontext.previous = error_context_stack;
 		error_context_stack = &plerrcontext;
 
+		/* Convert python return value into postgres datatypes */
+
 		/*
 		 * If the function is declared to return void, the Python return value
 		 * must be None. For void-returning functions, we also treat a None
@@ -983,21 +989,17 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 		else
 		{
 			fcinfo->isnull = false;
-			plrv_so = PyObject_Str(plrv);
-			if (!plrv_so)
-				PLy_elog(ERROR, "could not create string representation of Python object");
-			plrv_sc = PyString_AsString(plrv_so);
-			rv = InputFunctionCall(&proc->result.out.d.typfunc,
-								   plrv_sc,
-								   proc->result.out.d.typioparam,
-								   -1);
+			rv = (proc->result.out.d.func) (&proc->result,
+											&proc->result.out.d,
+											plrv);
 		}
+
+		error_context_stack = plerrcontext.previous;
 	}
 	PG_CATCH();
 	{
 		Py_XDECREF(plargs);
 		Py_XDECREF(plrv);
-		Py_XDECREF(plrv_so);
 
 		PG_RE_THROW();
 	}
@@ -1007,7 +1009,6 @@ PLy_function_handler(FunctionCallInfo fcinfo, PLyProcedure *proc)
 
 	Py_XDECREF(plargs);
 	Py_DECREF(plrv);
-	Py_XDECREF(plrv_so);
 
 	return rv;
 }
@@ -1090,12 +1091,8 @@ PLy_function_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc)
 					arg = NULL;
 				else
 				{
-					char	   *ct;
-
-					ct = OutputFunctionCall(&(proc->args[i].in.d.typfunc),
-											fcinfo->arg[i]);
-					arg = (proc->args[i].in.d.func) (ct);
-					pfree(ct);
+					arg = (proc->args[i].in.d.func) (&(proc->args[i].in.d),
+													 fcinfo->arg[i]);
 				}
 			}
 
@@ -1646,6 +1643,24 @@ PLy_output_datum_func2(PLyObToDatum *arg, HeapTuple typeTup)
 	arg->typoid = HeapTupleGetOid(typeTup);
 	arg->typioparam = getTypeIOParam(typeTup);
 	arg->typbyval = typeStruct->typbyval;
+
+	/*
+	 * Select a conversion function to convert Python objects to
+	 * PostgreSQL datums.  Most data types can go through the generic
+	 * function.
+	 */
+	switch (getBaseType(arg->typoid))
+	{
+		case BOOLOID:
+			arg->func = PLyObject_ToBool;
+			break;
+		case BYTEAOID:
+			arg->func = PLyObject_ToBytea;
+			break;
+		default:
+			arg->func = PLyObject_ToDatum;
+			break;
+	}
 }
 
 static void
@@ -1672,22 +1687,31 @@ PLy_input_datum_func2(PLyDatumToOb *arg, Oid typeOid, HeapTuple typeTup)
 	switch (getBaseType(typeOid))
 	{
 		case BOOLOID:
-			arg->func = PLyBool_FromString;
+			arg->func = PLyBool_FromBool;
 			break;
 		case FLOAT4OID:
+			arg->func = PLyFloat_FromFloat4;
+			break;
 		case FLOAT8OID:
+			arg->func = PLyFloat_FromFloat8;
+			break;
 		case NUMERICOID:
-			arg->func = PLyFloat_FromString;
+			arg->func = PLyFloat_FromNumeric;
 			break;
 		case INT2OID:
+			arg->func = PLyInt_FromInt16;
+			break;
 		case INT4OID:
-			arg->func = PLyInt_FromString;
+			arg->func = PLyInt_FromInt32;
 			break;
 		case INT8OID:
-			arg->func = PLyLong_FromString;
+			arg->func = PLyLong_FromInt64;
+			break;
+		case BYTEAOID:
+			arg->func = PLyString_FromBytea;
 			break;
 		default:
-			arg->func = PLyString_FromString;
+			arg->func = PLyString_FromDatum;
 			break;
 	}
 }
@@ -1713,9 +1737,8 @@ PLy_typeinfo_dealloc(PLyTypeInfo *arg)
 	}
 }
 
-/* assumes that a bool is always returned as a 't' or 'f' */
 static PyObject *
-PLyBool_FromString(const char *src)
+PLyBool_FromBool(PLyDatumToOb *arg, Datum d)
 {
 	/*
 	 * We would like to use Py_RETURN_TRUE and Py_RETURN_FALSE here for
@@ -1723,47 +1746,75 @@ PLyBool_FromString(const char *src)
 	 * Python >= 2.3, and we support older versions.
 	 * http://docs.python.org/api/boolObjects.html
 	 */
-	if (src[0] == 't')
+	if (DatumGetBool(d))
 		return PyBool_FromLong(1);
 	return PyBool_FromLong(0);
 }
 
 static PyObject *
-PLyFloat_FromString(const char *src)
+PLyFloat_FromFloat4(PLyDatumToOb *arg, Datum d)
 {
-	double		v;
-	char	   *eptr;
+	return PyFloat_FromDouble(DatumGetFloat4(d));
+}
 
-	errno = 0;
-	v = strtod(src, &eptr);
-	if (*eptr != '\0' || errno)
-		return NULL;
-	return PyFloat_FromDouble(v);
+static PyObject *
+PLyFloat_FromFloat8(PLyDatumToOb *arg, Datum d)
+{
+	return PyFloat_FromDouble(DatumGetFloat8(d));
 }
 
 static PyObject *
-PLyInt_FromString(const char *src)
+PLyFloat_FromNumeric(PLyDatumToOb *arg, Datum d)
 {
-	long		v;
-	char	   *eptr;
+	/*
+	 * Numeric is cast to a PyFloat:
+	 *   This results in a loss of precision
+	 *   Would it be better to cast to PyString?
+	 */
+	Datum  f = DirectFunctionCall1(numeric_float8, d);
+	double x = DatumGetFloat8(f);
+	return PyFloat_FromDouble(x);
+}
 
-	errno = 0;
-	v = strtol(src, &eptr, 0);
-	if (*eptr != '\0' || errno)
-		return NULL;
-	return PyInt_FromLong(v);
+static PyObject *
+PLyInt_FromInt16(PLyDatumToOb *arg, Datum d)
+{
+	return PyInt_FromLong(DatumGetInt16(d));
+}
+
+static PyObject *
+PLyInt_FromInt32(PLyDatumToOb *arg, Datum d)
+{
+	return PyInt_FromLong(DatumGetInt32(d));
 }
 
 static PyObject *
-PLyLong_FromString(const char *src)
+PLyLong_FromInt64(PLyDatumToOb *arg, Datum d)
 {
-	return PyLong_FromString((char *) src, NULL, 0);
+	/* on 32 bit platforms "long" may be too small */
+	if (sizeof(int64) > sizeof(long))
+		return PyLong_FromLongLong(DatumGetInt64(d));
+	else
+		return PyLong_FromLong(DatumGetInt64(d));
 }
 
 static PyObject *
-PLyString_FromString(const char *src)
+PLyString_FromBytea(PLyDatumToOb *arg, Datum d)
 {
-	return PyString_FromString(src);
+	text     *txt = DatumGetByteaP(d);
+	char     *str = VARDATA(txt);
+	size_t    size = VARSIZE(txt) - VARHDRSZ;
+
+	return PyString_FromStringAndSize(str, size);
+}
+
+static PyObject *
+PLyString_FromDatum(PLyDatumToOb *arg, Datum d)
+{
+	char     *x = OutputFunctionCall(&arg->typfunc, d);
+	PyObject *r = PyString_FromString(x);
+	pfree(x);
+	return r;
 }
 
 static PyObject *
@@ -1783,8 +1834,7 @@ PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc)
 	{
 		for (i = 0; i < info->in.r.natts; i++)
 		{
-			char	   *key,
-					   *vsrc;
+			char	   *key;
 			Datum		vattr;
 			bool		is_null;
 			PyObject   *value;
@@ -1799,14 +1849,7 @@ PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc)
 				PyDict_SetItemString(dict, key, Py_None);
 			else
 			{
-				vsrc = OutputFunctionCall(&info->in.r.atts[i].typfunc,
-										  vattr);
-
-				/*
-				 * no exceptions allowed
-				 */
-				value = info->in.r.atts[i].func(vsrc);
-				pfree(vsrc);
+				value = (info->in.r.atts[i].func) (&info->in.r.atts[i], vattr);
 				PyDict_SetItemString(dict, key, value);
 				Py_DECREF(value);
 			}
@@ -1822,6 +1865,116 @@ PLyDict_FromTuple(PLyTypeInfo *info, HeapTuple tuple, TupleDesc desc)
 	return dict;
 }
 
+/*
+ * Convert a Python object to a PostgreSQL bool datum.  This can't go
+ * through the generic conversion function, because Python attaches a
+ * Boolean value to everything, more things than the PostgreSQL bool
+ * type can parse.
+ */
+static Datum
+PLyObject_ToBool(PLyTypeInfo *info,
+				 PLyObToDatum *arg,
+				 PyObject *plrv)
+{
+	Datum		rv;
+
+	Assert(plrv != Py_None);
+	rv = BoolGetDatum(PyObject_IsTrue(plrv));
+
+	if (get_typtype(arg->typoid) == TYPTYPE_DOMAIN)
+		domain_check(rv, false, arg->typoid, &arg->typfunc.fn_extra, arg->typfunc.fn_mcxt);
+
+	return rv;
+}
+
+/*
+ * Convert a Python object to a PostgreSQL bytea datum.  This doesn't
+ * go through the generic conversion function to circumvent problems
+ * with embedded nulls.  And it's faster this way.
+ */
+static Datum
+PLyObject_ToBytea(PLyTypeInfo *info,
+				  PLyObToDatum *arg,
+				  PyObject *plrv)
+{
+	PyObject   *volatile plrv_so = NULL;
+	Datum       rv;
+
+	Assert(plrv != Py_None);
+
+	plrv_so = PyObject_Str(plrv);
+	if (!plrv_so)
+		PLy_elog(ERROR, "could not create string representation of Python object");
+
+	PG_TRY();
+	{
+		char *plrv_sc = PyString_AsString(plrv_so);
+		size_t len = PyString_Size(plrv_so);
+		size_t size = len + VARHDRSZ;
+		bytea *result = (bytea*) palloc(size);
+
+		SET_VARSIZE(result, size);
+		memcpy(VARDATA(result), plrv_sc, len);
+		rv = PointerGetDatum(result);
+	}
+	PG_CATCH();
+	{
+		Py_XDECREF(plrv_so);
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+
+	Py_XDECREF(plrv_so);
+
+	if (get_typtype(arg->typoid) == TYPTYPE_DOMAIN)
+		domain_check(rv, false, arg->typoid, &arg->typfunc.fn_extra, arg->typfunc.fn_mcxt);
+
+	return rv;
+}
+
+/*
+ * Generic conversion function: Convert PyObject to cstring and
+ * cstring into PostgreSQL type.
+ */
+static Datum
+PLyObject_ToDatum(PLyTypeInfo *info,
+				  PLyObToDatum *arg,
+				  PyObject *plrv)
+{
+	PyObject *volatile plrv_so = NULL;
+	Datum     rv;
+
+	Assert(plrv != Py_None);
+
+	plrv_so = PyObject_Str(plrv);
+	if (!plrv_so)
+		PLy_elog(ERROR, "could not create string representation of Python object");
+
+	PG_TRY();
+	{
+		char *plrv_sc = PyString_AsString(plrv_so);
+		size_t plen = PyString_Size(plrv_so);
+		size_t slen = strlen(plrv_sc);
+
+		if (slen < plen)
+			ereport(ERROR,
+					(errcode(ERRCODE_DATATYPE_MISMATCH),
+					 errmsg("could not convert Python object into cstring: Python string representation appears to contain null bytes")));
+		else if (slen > plen)
+			elog(ERROR, "could not convert Python object into cstring: Python string longer than reported length");
+		rv = InputFunctionCall(&arg->typfunc, plrv_sc, arg->typioparam, -1);
+	}
+	PG_CATCH();
+	{
+		Py_XDECREF(plrv_so);
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+
+	Py_XDECREF(plrv_so);
+
+	return rv;
+}
 
 static HeapTuple
 PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
@@ -1845,11 +1998,12 @@ PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
 	for (i = 0; i < desc->natts; ++i)
 	{
 		char	   *key;
-		PyObject   *volatile value,
-				   *volatile so;
+		PyObject   *volatile value;
+		PLyObToDatum *att;
 
 		key = NameStr(desc->attrs[i]->attname);
-		value = so = NULL;
+		value = NULL;
+		att = &info->out.r.atts[i];
 		PG_TRY();
 		{
 			value = PyMapping_GetItemString(mapping, key);
@@ -1860,19 +2014,7 @@ PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
 			}
 			else if (value)
 			{
-				char	   *valuestr;
-
-				so = PyObject_Str(value);
-				if (so == NULL)
-					PLy_elog(ERROR, "could not compute string representation of Python object");
-				valuestr = PyString_AsString(so);
-
-				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
-											  ,valuestr
-											  ,info->out.r.atts[i].typioparam
-											  ,-1);
-				Py_DECREF(so);
-				so = NULL;
+				values[i] = (att->func) (info, att, value);
 				nulls[i] = false;
 			}
 			else
@@ -1887,7 +2029,6 @@ PLyMapping_ToTuple(PLyTypeInfo *info, PyObject *mapping)
 		}
 		PG_CATCH();
 		{
-			Py_XDECREF(so);
 			Py_XDECREF(value);
 			PG_RE_THROW();
 		}
@@ -1934,10 +2075,11 @@ PLySequence_ToTuple(PLyTypeInfo *info, PyObject *sequence)
 	nulls = palloc(sizeof(bool) * desc->natts);
 	for (i = 0; i < desc->natts; ++i)
 	{
-		PyObject   *volatile value,
-				   *volatile so;
+		PyObject   *volatile value;
+		PLyObToDatum *att;
 
-		value = so = NULL;
+		value = NULL;
+		att = &info->out.r.atts[i];
 		PG_TRY();
 		{
 			value = PySequence_GetItem(sequence, i);
@@ -1949,18 +2091,7 @@ PLySequence_ToTuple(PLyTypeInfo *info, PyObject *sequence)
 			}
 			else if (value)
 			{
-				char	   *valuestr;
-
-				so = PyObject_Str(value);
-				if (so == NULL)
-					PLy_elog(ERROR, "could not compute string representation of Python object");
-				valuestr = PyString_AsString(so);
-				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
-											  ,valuestr
-											  ,info->out.r.atts[i].typioparam
-											  ,-1);
-				Py_DECREF(so);
-				so = NULL;
+				values[i] = (att->func) (info, att, value);
 				nulls[i] = false;
 			}
 
@@ -1969,7 +2100,6 @@ PLySequence_ToTuple(PLyTypeInfo *info, PyObject *sequence)
 		}
 		PG_CATCH();
 		{
-			Py_XDECREF(so);
 			Py_XDECREF(value);
 			PG_RE_THROW();
 		}
@@ -2005,11 +2135,12 @@ PLyObject_ToTuple(PLyTypeInfo *info, PyObject *object)
 	for (i = 0; i < desc->natts; ++i)
 	{
 		char	   *key;
-		PyObject   *volatile value,
-				   *volatile so;
+		PyObject   *volatile value;
+		PLyObToDatum *att;
 
 		key = NameStr(desc->attrs[i]->attname);
-		value = so = NULL;
+		value = NULL;
+		att = &info->out.r.atts[i];
 		PG_TRY();
 		{
 			value = PyObject_GetAttrString(object, key);
@@ -2020,18 +2151,7 @@ PLyObject_ToTuple(PLyTypeInfo *info, PyObject *object)
 			}
 			else if (value)
 			{
-				char	   *valuestr;
-
-				so = PyObject_Str(value);
-				if (so == NULL)
-					PLy_elog(ERROR, "could not compute string representation of Python object");
-				valuestr = PyString_AsString(so);
-				values[i] = InputFunctionCall(&info->out.r.atts[i].typfunc
-											  ,valuestr
-											  ,info->out.r.atts[i].typioparam
-											  ,-1);
-				Py_DECREF(so);
-				so = NULL;
+				values[i] = (att->func) (info, att, value);
 				nulls[i] = false;
 			}
 			else
@@ -2047,7 +2167,6 @@ PLyObject_ToTuple(PLyTypeInfo *info, PyObject *object)
 		}
 		PG_CATCH();
 		{
-			Py_XDECREF(so);
 			Py_XDECREF(value);
 			PG_RE_THROW();
 		}
diff --git a/src/pl/plpython/sql/plpython_types.sql b/src/pl/plpython/sql/plpython_types.sql
index 79fbbb9..a68e7a8 100644
--- a/src/pl/plpython/sql/plpython_types.sql
+++ b/src/pl/plpython/sql/plpython_types.sql
@@ -16,6 +16,35 @@ SELECT * FROM test_type_conversion_bool(false);
 SELECT * FROM test_type_conversion_bool(null);
 
 
+-- test various other ways to express Booleans in Python
+CREATE FUNCTION test_type_conversion_bool_other(n int) RETURNS bool AS $$
+# numbers
+if n == 0:
+   ret = 0
+elif n == 1:
+   ret = 5
+# strings
+elif n == 2:
+   ret = ''
+elif n == 3:
+   ret = 'fa' # true in Python, false in PostgreSQL
+# containers
+elif n == 4:
+   ret = []
+elif n == 5:
+   ret = [0]
+plpy.info(ret, not not ret)
+return ret
+$$ LANGUAGE plpythonu;
+
+SELECT * FROM test_type_conversion_bool_other(0);
+SELECT * FROM test_type_conversion_bool_other(1);
+SELECT * FROM test_type_conversion_bool_other(2);
+SELECT * FROM test_type_conversion_bool_other(3);
+SELECT * FROM test_type_conversion_bool_other(4);
+SELECT * FROM test_type_conversion_bool_other(5);
+
+
 CREATE FUNCTION test_type_conversion_char(x char) RETURNS char AS $$
 plpy.info(x, type(x))
 return x
@@ -105,6 +134,7 @@ return x
 $$ LANGUAGE plpythonu;
 
 SELECT * FROM test_type_conversion_bytea('hello world');
+SELECT * FROM test_type_conversion_bytea(E'null\\000byte');
 SELECT * FROM test_type_conversion_bytea(null);
 
 
@@ -121,8 +151,6 @@ except ValueError, e:
     return 'FAILED: ' + str(e)
 $$ LANGUAGE plpythonu;
 
-/* This will currently fail because the bytea datum is presented to
-   Python as a string in bytea-encoding, which Python doesn't understand. */
 SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
 
 
@@ -130,6 +158,17 @@ SELECT test_type_unmarshal(x) FROM test_type_marshal() x;
 -- Domains
 --
 
+CREATE DOMAIN booltrue AS bool CHECK (VALUE IS TRUE OR VALUE IS NULL);
+
+CREATE FUNCTION test_type_conversion_booltrue(x booltrue, y bool) RETURNS booltrue AS $$
+return y
+$$ LANGUAGE plpythonu;
+
+SELECT * FROM test_type_conversion_booltrue(true, true);
+SELECT * FROM test_type_conversion_booltrue(false, true);
+SELECT * FROM test_type_conversion_booltrue(true, false);
+
+
 CREATE DOMAIN uint2 AS int2 CHECK (VALUE >= 0);
 
 CREATE FUNCTION test_type_conversion_uint2(x uint2, y int) RETURNS uint2 AS $$
@@ -142,6 +181,17 @@ SELECT * FROM test_type_conversion_uint2(100::uint2, -50);
 SELECT * FROM test_type_conversion_uint2(null, 1);
 
 
+CREATE DOMAIN nnint AS int CHECK (VALUE IS NOT NULL);
+
+CREATE FUNCTION test_type_conversion_nnint(x nnint, y int) RETURNS nnint AS $$
+return y
+$$ LANGUAGE plpythonu;
+
+SELECT * FROM test_type_conversion_nnint(10, 20);
+SELECT * FROM test_type_conversion_nnint(null, 20);
+SELECT * FROM test_type_conversion_nnint(10, null);
+
+
 CREATE DOMAIN bytea10 AS bytea CHECK (octet_length(VALUE) = 10 AND VALUE IS NOT NULL);
 
 CREATE FUNCTION test_type_conversion_bytea10(x bytea10, y bytea) RETURNS bytea10 AS $$
#24Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#23)
Re: [PATCH] plpythonu datatype conversion improvements

On mån, 2009-08-31 at 23:41 +0300, Peter Eisentraut wrote:

On sön, 2009-08-16 at 02:44 +0300, Peter Eisentraut wrote:

The remaining problem is that the patch loses domain checking on the
return types, because some paths no longer go through the data type's
input function. I have marked these places as FIXME, and the regression
tests also contain a failing test case for this.

What's needed here, I think, is an API that takes a datum plus type
information and checks whether the datum is valid within the domain. I
haven't found one that is exported, but maybe someone could give a tip.

Got that fixed now. Updated patch is attached. I will sleep over it,
but I think it's good to go.

committed
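
The domain-checking problem described above can be modelled in a few lines of Python. This is only a toy sketch, not the C code from the patch: `Domain`, `input_function`, `fast_int_conversion`, and `violates` are hypothetical names invented for illustration, and the constraint mirrors the `uint2` domain (`CHECK (VALUE >= 0)`) from the regression tests.

```python
class Domain:
    """Toy model of a domain: a name plus a CHECK predicate (hypothetical)."""

    def __init__(self, name, check):
        self.name = name
        self.check = check

    def domain_check(self, value):
        # Raise if the value violates the domain constraint.
        if not self.check(value):
            raise ValueError(
                "value for domain %s violates check constraint" % self.name)
        return value


# Mirrors: CREATE DOMAIN uint2 AS int2 CHECK (VALUE >= 0);
uint2 = Domain("uint2", lambda v: v >= 0)


def input_function(domain, text):
    # Old path: go through a string; in this model the input function
    # applies the domain constraint as part of parsing.
    return domain.domain_check(int(text))


def fast_int_conversion(domain, obj, check_domain):
    # New path: convert directly, skipping the string round trip.
    # Without an explicit check, invalid values slip through.
    value = int(obj)
    return domain.domain_check(value) if check_domain else value


def violates(func, *args):
    # Report whether calling func(*args) trips the domain constraint.
    try:
        func(*args)
    except ValueError:
        return True
    return False


old_path_rejects = violates(input_function, uint2, str(-50))
unchecked_fast_path_rejects = violates(fast_int_conversion, uint2, -50, False)
checked_fast_path_rejects = violates(fast_int_conversion, uint2, -50, True)
```

In this model only the unchecked fast path accepts -50. The `PLyObject_ToBytea` function in the patch handles the same concern by calling `domain_check` after it has built the datum.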

#25Hannu Krosing
hannu@2ndQuadrant.com
In reply to: Caleb Welton (#6)
Re: [PATCH] plpythonu datatype conversion improvements

On Wed, 2009-05-27 at 14:25 -0700, Caleb Welton wrote:

Yes, in Python >= 2.4 there is the Decimal datatype.

However, unlike the other mappings employed by plpythonu, Decimal
requires an import statement to be in scope.

adding it as an already-imported module should not be hard

I think that moving to saner mappings should at least be discussed

and even if it is not in scope for the user-defined function body there
is nothing that prevents one from using it for conversion.

The Decimal _type_ need not be in scope in order to use Decimal
_instances_
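
A minimal Python sketch of that point, assuming Python 3 rather than the Python 2 that plpythonu targeted at the time. `USER_BODY` and `run_user_function` are hypothetical stand-ins for a function body executed in a namespace that never imported `Decimal`; this is not how plpython.c actually runs functions.

```python
from decimal import Decimal

# Hypothetical stand-in for a user-defined function body; it contains no
# import statement and never mentions the name Decimal.
USER_BODY = """
def double_it(x):
    return x + x
"""


def run_user_function(arg):
    # Execute the body in a fresh namespace that does not contain Decimal.
    namespace = {}
    exec(USER_BODY, namespace)
    assert "Decimal" not in namespace
    return namespace["double_it"](arg)


# The caller, playing the role of the conversion layer, builds the instance.
result = run_user_function(Decimal("1.10"))
print(type(result).__name__, result)  # prints: Decimal 2.20
```

Arithmetic on the instance works because the operators are looked up on the object itself, not through a name in the function's scope. The user would need an import only to construct new `Decimal` values by name.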

maybe this should/could be controlled by a GUC.

btw, can we currently use functions when setting GUC parameters?

if we can, then we could define some Python environment initializing
function and then do

ALTER USER xxx SET pyinit = initialise_python_for_xxx()

-Caleb

On 5/27/09 2:07 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On Wednesday 27 May 2009 21:53:31 Caleb Welton wrote:

... My own feeling on the matter is that PyFloat is the wrong mapping
for numeric, but I didn't want to muddy this patch by changing that.

Yeah, that one had me wondering for a while as well, but as you say it
is better to address that separately.

That was making me itch as well, in my very cursory look at the patch.
Does Python have a saner mapping for it?

regards, tom lane
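
To illustrate why a float mapping for numeric raises concern, here is a small Python 3 sketch comparing binary floats with `decimal.Decimal`. It shows general Python behaviour only and makes no claim about what plpythonu does; the variable names are illustrative.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly, so sums drift.
float_sum = 0.1 + 0.2
float_matches = (float_sum == 0.3)

# Decimal keeps exact base-10 digits.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_matches = (decimal_sum == Decimal("0.3"))

# A literal with more digits than a double can hold survives a round
# trip through Decimal but not through float.
text = "12345678901234567890.123456789"
decimal_round_trips = (str(Decimal(text)) == text)
float_round_trips = (repr(float(text)) == text)

print(float_matches, decimal_matches)          # prints: False True
print(float_round_trips, decimal_round_trips)  # prints: False True
```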

--
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training