Mention column name in error messages

Started by Franck Verrotalmost 11 years ago27 messageshackers
Jump to latest
#1Franck Verrot
franck@verrot.fr

Hi all,

As far as I know, there is currently no way to find which column is triggering
an error on an INSERT or ALTER COLUMN statement. Example:

# create table foo(bar varchar(4), baz varchar(2));
CREATE TABLE
# insert into foo values ('foo!', 'ok');
INSERT 0 1
# insert into foo values ('foo2!', 'ok');
ERROR: value too long for type character varying(4)
# insert into foo values ('foo!', 'ok2');
ERROR: value too long for type character varying(2)

I browsed the list and found a conversation from last year
(/messages/by-id/3214.1390155040@sss.pgh.pa.us) that
discussed adding the actual value in the output.

The behavior I am proposing differs in the sense we will be able to see in
the "ERROR: xxxx" what column triggered the error:

# create table foo(bar varchar(4), baz varchar(2));
CREATE TABLE
# insert into foo values ('foo!', 'ok');
INSERT 0 1
# insert into foo values ('foo2!', 'ok');
ERROR: value too long for bar of type character varying(4)
# insert into foo values ('foo!', 'ok2');
ERROR: value too long for baz of type character varying(2)

In that same conversation, Tom Lane said:

[...] People have speculated about ways to
name the target column in the error report, which would probably be
far more useful; but it's not real clear how to do that in our system
structure.

Given my very restricted knowledge of PG's codebase I am not sure whether my
modifications are legitimate or not, so please don't hesitate to comment on
it and pointing where things are subpar to PG's codebase. In any case, it's
to be considered as WIP for the moment.

Thanks in advance,
Franck

--
Franck Verrot
https://github.com/franckverrot
https://twitter.com/franckverrot

Attachments:

verbose-validation.patchapplication/octet-stream; name=verbose-validation.patchDownload+106-57
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Franck Verrot (#1)
Re: Mention column name in error messages

Franck Verrot <franck@verrot.fr> writes:

As far as I know, there is currently no way to find which column is triggering
an error on an INSERT or ALTER COLUMN statement. Example:

Indeed ...

Given my very restricted knowledge of PG's codebase I am not sure whether my
modifications are legitimate or not, so please don't hesitate to comment on
it and pointing where things are subpar to PG's codebase. In any case, it's
to be considered as WIP for the moment.

I think the way you're going at this is fundamentally wrong. It amounts
to an (undocumented) change to the API for datatype input functions, and
the only way it can work is to change *every datatype's* input function
to know about it. That's not practical. More, it doesn't cover errors
that are thrown from someplace other than a datatype input function.
Evaluation errors in domain check constraint expressions are an example
that would be very hard to manage this way. And the approach definitely
doesn't scale nicely when you start thinking about other contexts wherein
a datatype input operation or coercion might fail.

What seems more likely to lead to a usable patch is to arrange for the
extra information you want to be emitted as error "context", via an error
context callback that gets installed at the right times. We already have
that set up for datatype-specific errors detected by COPY IN, for example.
If you feed garbage data to COPY you get something like

ERROR: invalid input syntax for integer: "foo"
CONTEXT: COPY int8_tbl, line 2, column q2: "foo"

with no need for int8in to be directly aware of the context. You should
try adapting that methodology for the cases you're worried about.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Franck Verrot
franck@verrot.fr
In reply to: Tom Lane (#2)
Re: Mention column name in error messages

On Wed, Jul 1, 2015 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What seems more likely to lead to a usable patch is to arrange for the
extra information you want to be emitted as error "context", via an error
context callback that gets installed at the right times. ...
...
with no need for int8in to be directly aware of the context. You should
try adapting that methodology for the cases you're worried about.

Hi Tom (and others),

Sorry it took so long for me to follow up on this, hopefully I found a
couple
a hours today to try writing another patch.

In any case, thanks for reviewing my first attempt and taking time to write
such a detailed critique... I've learned a lot!

I am now using the error context callback stack. The current column name
and column type are passed to the callback packed inside a new structure
of type "TransformExprState".
Those information are then passed to `errhint` and will be presented to the
user later on (in case of coercion failure).

Please find the WIP patch attached.
(I've pushed the patch on my GH fork[1]https://github.com/franckverrot/postgres/commit/73dd2cd096c91cee1b501d5f94ba81037de30fd1 too).

Thanks again,
Franck

[1]: https://github.com/franckverrot/postgres/commit/73dd2cd096c91cee1b501d5f94ba81037de30fd1
https://github.com/franckverrot/postgres/commit/73dd2cd096c91cee1b501d5f94ba81037de30fd1

Attachments:

0001-Report-column-for-which-type-coercion-fails.patchapplication/octet-stream; name=0001-Report-column-for-which-type-coercion-fails.patchDownload+30-1
#4Robert Haas
robertmhaas@gmail.com
In reply to: Franck Verrot (#3)
Re: Mention column name in error messages

On Sun, Aug 9, 2015 at 11:44 AM, Franck Verrot <franck@verrot.fr> wrote:

On Wed, Jul 1, 2015 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What seems more likely to lead to a usable patch is to arrange for the
extra information you want to be emitted as error "context", via an error
context callback that gets installed at the right times. ...
...
with no need for int8in to be directly aware of the context. You should
try adapting that methodology for the cases you're worried about.

Hi Tom (and others),

Sorry it took so long for me to follow up on this, hopefully I found a
couple
a hours today to try writing another patch.

In any case, thanks for reviewing my first attempt and taking time to write
such a detailed critique... I've learned a lot!

I am now using the error context callback stack. The current column name
and column type are passed to the callback packed inside a new structure
of type "TransformExprState".
Those information are then passed to `errhint` and will be presented to the
user later on (in case of coercion failure).

Please find the WIP patch attached.
(I've pushed the patch on my GH fork[1] too).

To make sure this doesn't get forgotten about, you may want to add it here:

https://commitfest.postgresql.org/action/commitfest_view/open

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Jeff Janes
jeff.janes@gmail.com
In reply to: Franck Verrot (#3)
Re: Mention column name in error messages

On Sun, Aug 9, 2015 at 8:44 AM, Franck Verrot <franck@verrot.fr> wrote:

On Wed, Jul 1, 2015 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What seems more likely to lead to a usable patch is to arrange for the
extra information you want to be emitted as error "context", via an error
context callback that gets installed at the right times. ...
...
with no need for int8in to be directly aware of the context. You should
try adapting that methodology for the cases you're worried about.

Hi Tom (and others),

Sorry it took so long for me to follow up on this, hopefully I found a
couple
a hours today to try writing another patch.

In any case, thanks for reviewing my first attempt and taking time to write
such a detailed critique... I've learned a lot!

I am now using the error context callback stack. The current column name
and column type are passed to the callback packed inside a new structure
of type "TransformExprState".
Those information are then passed to `errhint` and will be presented to the
user later on (in case of coercion failure).

Please find the WIP patch attached.
(I've pushed the patch on my GH fork[1] too).

Hi Franck,

I like the idea of having the column name.

I took this for a test drive, and had some comments.on the user visible
parts.

The hints you add end in a new line, which then gives two new lines once
they are emitted. This is contrary to how other HINTs are formatted.

Other HINTs are complete sentences (start with a capital letter, end with a
period).

But I think these belong as CONTEXT or as DETAIL, not as HINT. The
messages are giving me details about where (which column) the error
occured, they aren't giving suggestions to me about what to do about it.

Cheers,

Jeff

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#5)
Re: Mention column name in error messages

Jeff Janes <jeff.janes@gmail.com> writes:

The hints you add end in a new line, which then gives two new lines once
they are emitted. This is contrary to how other HINTs are formatted.
Other HINTs are complete sentences (start with a capital letter, end with a
period).
But I think these belong as CONTEXT or as DETAIL, not as HINT. The
messages are giving me details about where (which column) the error
occured, they aren't giving suggestions to me about what to do about it.

We have specific style guidelines about this:
http://www.postgresql.org/docs/devel/static/error-message-reporting.html
http://www.postgresql.org/docs/devel/static/error-style-guide.html

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Franck Verrot
franck@verrot.fr
In reply to: Jeff Janes (#5)
Re: Mention column name in error messages

On Wed, Aug 19, 2015 at 11:31 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

I took this for a test drive, and had some comments.on the user visible
parts.
[...]
But I think these belong as CONTEXT or as DETAIL, not as HINT. The
messages are giving me details about where (which column) the error
occurred, they aren't giving suggestions to me about what to do about it.

Hi Jeff,

Somehow my email never went through. So thank you for the test drive, I've
tried to update my patch (that you will find attached) to reflect what you
said (DETAIL instead of HINT).

Thanks Tom for pointing me at the docs, I clearly wasn't respecting the
guidelines.

Cheers,
Franck

Attachments:

0001-Report-column-for-which-type-coercion-fails.patchapplication/octet-stream; name=0001-Report-column-for-which-type-coercion-fails.patchDownload+30-1
#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Franck Verrot (#7)
Re: Mention column name in error messages

Franck Verrot wrote:

On Wed, Aug 19, 2015 at 11:31 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

I took this for a test drive, and had some comments.on the user visible
parts.
[...]
But I think these belong as CONTEXT or as DETAIL, not as HINT. The
messages are giving me details about where (which column) the error
occurred, they aren't giving suggestions to me about what to do about it.

Hi Jeff,

Somehow my email never went through. So thank you for the test drive, I've
tried to update my patch (that you will find attached) to reflect what you
said (DETAIL instead of HINT).

I think you need errcontext(), not errdetail(). The most important
difference is that you can have multiple CONTEXT lines but only one
DETAIL; I think if you attach a second errdetail(), the first one is
overwritten.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Franck Verrot
franck@verrot.fr
In reply to: Alvaro Herrera (#8)
Re: Mention column name in error messages

On Tue, Sep 15, 2015 at 7:12 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

I think you need errcontext(), not errdetail(). The most important
difference is that you can have multiple CONTEXT lines but only one
DETAIL; I think if you attach a second errdetail(), the first one is
overwritten.

Indeed, the first errdetail() will be overwritten. Here's another try.

Thanks again for looking at my patches!

--
Franck Verrot

Attachments:

0001-Report-column-for-which-type-coercion-fails.patchapplication/octet-stream; name=0001-Report-column-for-which-type-coercion-fails.patchDownload+30-1
#10Andres Freund
andres@anarazel.de
In reply to: Franck Verrot (#9)
Re: Mention column name in error messages

On 2015-09-15 12:00:25 +0200, Franck Verrot wrote:

diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index 1b3fcd6..78f82cd 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -389,6 +389,17 @@ markTargetListOrigin(ParseState *pstate, TargetEntry *tle,
* omits the column name list.  So we should usually prefer to use
* exprLocation(expr) for errors that can happen in a default INSERT.
*/
+
+void
+TransformExprCallback(void *arg)
+{
+	TransformExprState *state = (TransformExprState *) arg;
+
+	errcontext("coercion failed for column \"%s\" of type %s",
+			state->column_name,
+			format_type_be(state->expected_type));
+}

Why is this not a static function?

Expr *
transformAssignedExpr(ParseState *pstate,
Expr *expr,
@@ -405,6 +416,14 @@ transformAssignedExpr(ParseState *pstate,
Oid attrcollation; /* collation of target column */
ParseExprKind sv_expr_kind;

+ ErrorContextCallback errcallback;
+ TransformExprState testate;

Why the newline between the declarations? Why none afterwards?

diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h
index a86b79d..5e12c12 100644
--- a/src/include/parser/parse_target.h
+++ b/src/include/parser/parse_target.h
@@ -42,4 +42,11 @@ extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var,
extern char *FigureColname(Node *node);
extern char *FigureIndexColname(Node *node);
+/* Support for TransformExprCallback */
+typedef struct TransformExprState
+{
+	const char *column_name;
+	Oid expected_type;
+} TransformExprState;

I see no need for this to be a struct defined in the header. Given that
TransformExprCallback isn't public, and the whole thing is specific to
TransformExprState...

My major complaint though, is that this doesn't contain any tests...

Could you update the patch to address these issues?

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#10)
Re: Mention column name in error messages

On Sun, Oct 4, 2015 at 12:23 AM, Andres Freund <andres@anarazel.de> wrote:

On 2015-09-15 12:00:25 +0200, Franck Verrot wrote:

diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index 1b3fcd6..78f82cd 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -389,6 +389,17 @@ markTargetListOrigin(ParseState *pstate, TargetEntry *tle,
* omits the column name list.  So we should usually prefer to use
* exprLocation(expr) for errors that can happen in a default INSERT.
*/
+
+void
+TransformExprCallback(void *arg)
+{
+     TransformExprState *state = (TransformExprState *) arg;
+
+     errcontext("coercion failed for column \"%s\" of type %s",
+                     state->column_name,
+                     format_type_be(state->expected_type));
+}

Why is this not a static function?

Expr *
transformAssignedExpr(ParseState *pstate,
Expr *expr,
@@ -405,6 +416,14 @@ transformAssignedExpr(ParseState *pstate,
Oid attrcollation; /* collation of target column */
ParseExprKind sv_expr_kind;

+ ErrorContextCallback errcallback;
+ TransformExprState testate;

Why the newline between the declarations? Why none afterwards?

diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h
index a86b79d..5e12c12 100644
--- a/src/include/parser/parse_target.h
+++ b/src/include/parser/parse_target.h
@@ -42,4 +42,11 @@ extern TupleDesc expandRecordVariable(ParseState *pstate, Var *var,
extern char *FigureColname(Node *node);
extern char *FigureIndexColname(Node *node);
+/* Support for TransformExprCallback */
+typedef struct TransformExprState
+{
+     const char *column_name;
+     Oid expected_type;
+} TransformExprState;

I see no need for this to be a struct defined in the header. Given that
TransformExprCallback isn't public, and the whole thing is specific to
TransformExprState...

My major complaint though, is that this doesn't contain any tests...

Could you update the patch to address these issues?

Patch is marked as returned with feedback, it has been two months
since the last review, and comments have not been addressed.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Franck Verrot
franck@verrot.fr
In reply to: Michael Paquier (#11)
Re: Mention column name in error messages

Thanks Andres for the review.

Michael, please find attached a revised patch addressing, amongst some
other changes, the testing issue (`make check` passes now) and the
visibility of the ` TransformExprState` struct.

Thanks,
Franck

On Tue, Dec 22, 2015 at 1:49 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Sun, Oct 4, 2015 at 12:23 AM, Andres Freund <andres@anarazel.de> wrote:

On 2015-09-15 12:00:25 +0200, Franck Verrot wrote:

diff --git a/src/backend/parser/parse_target.c

b/src/backend/parser/parse_target.c

index 1b3fcd6..78f82cd 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -389,6 +389,17 @@ markTargetListOrigin(ParseState *pstate,

TargetEntry *tle,

* omits the column name list.  So we should usually prefer to use
* exprLocation(expr) for errors that can happen in a default INSERT.
*/
+
+void
+TransformExprCallback(void *arg)
+{
+     TransformExprState *state = (TransformExprState *) arg;
+
+     errcontext("coercion failed for column \"%s\" of type %s",
+                     state->column_name,
+                     format_type_be(state->expected_type));
+}

Why is this not a static function?

Expr *
transformAssignedExpr(ParseState *pstate,
Expr *expr,
@@ -405,6 +416,14 @@ transformAssignedExpr(ParseState *pstate,
Oid attrcollation; /* collation of target

column */

ParseExprKind sv_expr_kind;

+ ErrorContextCallback errcallback;
+ TransformExprState testate;

Why the newline between the declarations? Why none afterwards?

diff --git a/src/include/parser/parse_target.h

b/src/include/parser/parse_target.h

index a86b79d..5e12c12 100644
--- a/src/include/parser/parse_target.h
+++ b/src/include/parser/parse_target.h
@@ -42,4 +42,11 @@ extern TupleDesc expandRecordVariable(ParseState

*pstate, Var *var,

extern char *FigureColname(Node *node);
extern char *FigureIndexColname(Node *node);

+/* Support for TransformExprCallback */
+typedef struct TransformExprState
+{
+     const char *column_name;
+     Oid expected_type;
+} TransformExprState;

I see no need for this to be a struct defined in the header. Given that
TransformExprCallback isn't public, and the whole thing is specific to
TransformExprState...

My major complaint though, is that this doesn't contain any tests...

Could you update the patch to address these issues?

Patch is marked as returned with feedback, it has been two months
since the last review, and comments have not been addressed.
--
Michael

--
Franck Verrot

Attachments:

0001-Report-column-for-which-type-coercion-fails.patchapplication/octet-stream; name=0001-Report-column-for-which-type-coercion-fails.patchDownload+166-2
#13Michael Paquier
michael@paquier.xyz
In reply to: Franck Verrot (#12)
Re: Mention column name in error messages

On Thu, Oct 6, 2016 at 2:58 PM, Franck Verrot <franck@verrot.fr> wrote:

Michael, please find attached a revised patch addressing, amongst some other
changes, the testing issue (`make check` passes now) and the visibility of
the ` TransformExprState` struct.

+   /* Set up callback to identify error line number. */
+   errcallback.callback = TransformExprCallback;
Er, no. You want to know at least column name here, not a line number
*** /Users/mpaquier/git/postgres/src/test/regress/expected/xml_2.out
 Mon Oct 17 11:32:26 2016
--- /Users/mpaquier/git/postgres/src/test/regress/results/xml.out
 Mon Oct 17 15:58:42 2016
***************
*** 9,14 ****
--- 9,15 ----
  LINE 1: INSERT INTO xmltest VALUES (3, '<wrong');
                                         ^
  DETAIL:  line 1: Couldn't find end of Start Tag wrong line 1
+ CONTEXT:  coercion failed for column "data" of type xml
  SELECT * FROM xmltest;
   id |        data
  ----+--------------------
make check is still broken here. You did not update the expected
output used when build is done with the --with-libxml switch. It would
be good to check other cases as well.

On top of that, I have found that a couple of other regression tests
are broken in contrib/, citext being one.

The context message is specifying only the column type and name. I
think that it would be useful to provide as well the relation name.
Imagine for example the case of a CTE using an INSERT query...
Providing the query type would be also useful I think. You can look at
state->p_is_insert for that. In short the context message could just
have this shape:
CONTEXT { INSERT | UPDATE } relname, column "col", type coltype.

Andres wrote:

+/* Support for TransformExprCallback */
+typedef struct TransformExprState
+{
+     const char *column_name;
+     Oid expected_type;
+} TransformExprState;
I see no need for this to be a struct defined in the header. Given that
TransformExprCallback isn't public, and the whole thing is specific to
TransformExprState...

That's a matter of taste, really. Personally I find cleaner to declare
that with the other static declarations at the top of the fil, and
keep the comments of transformAssignedExpr clean of everything.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#13)
Re: Mention column name in error messages

On Mon, Oct 17, 2016 at 3:18 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Andres wrote:

+/* Support for TransformExprCallback */
+typedef struct TransformExprState
+{
+     const char *column_name;
+     Oid expected_type;
+} TransformExprState;
I see no need for this to be a struct defined in the header. Given that
TransformExprCallback isn't public, and the whole thing is specific to
TransformExprState...

That's a matter of taste, really. Personally I find cleaner to declare
that with the other static declarations at the top of the fil, and
keep the comments of transformAssignedExpr clean of everything.

It's pretty standard practice for PostgreSQL to keep declarations
private to particular files whenever they are used only in that file.
And I'd argue that it's good practice in general.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#14)
Re: Mention column name in error messages

On Wed, Oct 19, 2016 at 3:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Oct 17, 2016 at 3:18 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Andres wrote:

+/* Support for TransformExprCallback */
+typedef struct TransformExprState
+{
+     const char *column_name;
+     Oid expected_type;
+} TransformExprState;
I see no need for this to be a struct defined in the header. Given that
TransformExprCallback isn't public, and the whole thing is specific to
TransformExprState...

That's a matter of taste, really. Personally I find cleaner to declare
that with the other static declarations at the top of the fil, and
keep the comments of transformAssignedExpr clean of everything.

It's pretty standard practice for PostgreSQL to keep declarations
private to particular files whenever they are used only in that file.
And I'd argue that it's good practice in general.

Yes, definitely. My comment was referring to the fact that the
declaration of this structure was not at the top of this .c file as
the last version of the patch is doing (would be better to declare
them at the beginning of this .c file for clarity actually). It seems
like I did not get Andres' review point, sorry for that.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Franck Verrot (#12)
Re: Mention column name in error messages

On Mon, Oct 17, 2016 at 12:48 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

*** /Users/mpaquier/git/postgres/src/test/regress/expected/xml_2.out
Mon Oct 17 11:32:26 2016
--- /Users/mpaquier/git/postgres/src/test/regress/results/xml.out
Mon Oct 17 15:58:42 2016
***************
*** 9,14 ****
--- 9,15 ----
LINE 1: INSERT INTO xmltest VALUES (3, '<wrong');
^
DETAIL:  line 1: Couldn't find end of Start Tag wrong line 1
+ CONTEXT:  coercion failed for column "data" of type xml
SELECT * FROM xmltest;
id |        data
----+--------------------
make check is still broken here. You did not update the expected
output used when build is done with the --with-libxml switch. It would
be good to check other cases as well.

On top of that, I have found that a couple of other regression tests
are broken in contrib/, citext being one.

I've also tested with the patch. As Michael already pointed out, you
should update the expected output for citext and xml regression tests.

+   /* Set up callback to identify error line number. */
+   errcallback.callback = TransformExprCallback;
Er, no. You want to know at least column name here, not a line number

Please change the comment accordingly.

The context message is specifying only the column type and name. I
think that it would be useful to provide as well the relation name.
Imagine for example the case of a CTE using an INSERT query...
Providing the query type would be also useful I think. You can look at
state->p_is_insert for that. In short the context message could just
have this shape:
CONTEXT { INSERT | UPDATE } relname, column "col", type coltype.

+1 for providing relation name in the context message.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#16)
Re: Mention column name in error messages

On Wed, Oct 26, 2016 at 3:15 PM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:

On Mon, Oct 17, 2016 at 12:48 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

*** /Users/mpaquier/git/postgres/src/test/regress/expected/xml_2.out
Mon Oct 17 11:32:26 2016
--- /Users/mpaquier/git/postgres/src/test/regress/results/xml.out
Mon Oct 17 15:58:42 2016
***************
*** 9,14 ****
--- 9,15 ----
LINE 1: INSERT INTO xmltest VALUES (3, '<wrong');
^
DETAIL:  line 1: Couldn't find end of Start Tag wrong line 1
+ CONTEXT:  coercion failed for column "data" of type xml
SELECT * FROM xmltest;
id |        data
----+--------------------
make check is still broken here. You did not update the expected
output used when build is done with the --with-libxml switch. It would
be good to check other cases as well.

On top of that, I have found that a couple of other regression tests
are broken in contrib/, citext being one.

I've also tested with the patch. As Michael already pointed out, you
should update the expected output for citext and xml regression tests.

+   /* Set up callback to identify error line number. */
+   errcallback.callback = TransformExprCallback;
Er, no. You want to know at least column name here, not a line number

Please change the comment accordingly.

The context message is specifying only the column type and name. I
think that it would be useful to provide as well the relation name.
Imagine for example the case of a CTE using an INSERT query...
Providing the query type would be also useful I think. You can look at
state->p_is_insert for that. In short the context message could just
have this shape:
CONTEXT { INSERT | UPDATE } relname, column "col", type coltype.

+1 for providing relation name in the context message.

Okay, so I have reworked the patch a bit and finished with the
attached, adapting the context message to give more information. I
have noticed as well a bug in the patch: the context callback was set
before checking if the column used for transformation is checked on
being a system column or not, the problem being that the callback
could be called without the fields set.

I have updated the regression tests that I found, the main portion of
the patch being dedicated to that and being sure that all the
alternate outputs are correctly refreshed. In this case int8, float4,
float8, xml and contrib/citext are the ones impacted by the change
with alternate outputs.

I am passing that down to a committer for review. The patch looks
large, but at 95% it involves diffs in the regression tests,
alternative outputs taking a large role in the bloat.
--
Michael

Attachments:

transform_errcontext.patchapplication/x-download; name=transform_errcontext.patchDownload+359-0
#18Franck Verrot
franck@verrot.fr
In reply to: Michael Paquier (#17)
Re: Mention column name in error messages

On Sun, Oct 30, 2016 at 12:04 AM, Michael Paquier <michael.paquier@gmail.com

wrote:

Okay, so I have reworked the patch a bit and finished with the
attached, adapting the context message to give more information. I
have noticed as well a bug in the patch: the context callback was set
before checking if the column used for transformation is checked on
being a system column or not, the problem being that the callback
could be called without the fields set.

Interesting. I wasn't sure it was set at the right time indeed.

I have updated the regression tests that I found, the main portion of

the patch being dedicated to that and being sure that all the
alternate outputs are correctly refreshed. In this case int8, float4,
float8, xml and contrib/citext are the ones impacted by the change
with alternate outputs.

Thanks! I couldn't find time to update it last week, so thanks for chiming
in.

I am passing that down to a committer for review. The patch looks
large, but at 95% it involves diffs in the regression tests,
alternative outputs taking a large role in the bloat.

Great, thanks Michael!

--
Franck

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#17)
Re: Mention column name in error messages

Michael Paquier <michael.paquier@gmail.com> writes:

I am passing that down to a committer for review. The patch looks
large, but at 95% it involves diffs in the regression tests,
alternative outputs taking a large role in the bloat.

This is kind of cute, but it doesn't seem to cover very much territory,
because it only catches errors that are found in the parse stage.
For instance, it fails to cover Franck's original example:

# create table foo(bar varchar(4), baz varchar(2));
CREATE TABLE
# insert into foo values ('foo2!', 'ok');
ERROR: value too long for type character varying(4)

That's still all you get, because the parser sets that up as a varchar
literal with a runtime length coercion, and the failure doesn't occur
till later. In this particular example it happens during the planner's
constant-folding stage, but with a non-constant expression it would
happen in the executor.

Maybe it'd be all right to commit this anyway, but I'm afraid the most
common reaction would be "why's it give me this info some of the time,
but not when I really need it?" I'm inclined to think that an acceptable
patch will need to provide context for the plan-time and run-time cases
too, and I'm not very sure where would be a sane place to plug in for
those cases.

Less important comments:

* I don't really see the point of including the column type name in the
error message. We don't do that in the COPY context message this is
based on. If we were going to print it, I should think we'd need the
typmod as well --- eg, the difference between varchar(4) and varchar(4000)
could be pretty relevant.

* The coding order here is subtly wrong:

+	/* Set up callback to fetch more details regarding the error */
+	errcallback.callback = TransformExprCallback;
+	errcallback.arg = (void *) &te_state;
+	errcallback.previous = error_context_stack;
+	error_context_stack = &errcallback;
+	te_state.relation_name = RelationGetRelationName(rd);
+	te_state.column_name = colname;
+	te_state.expected_type = attrtype;
+	te_state.is_insert = pstate->p_is_insert;

The callback is "live" the instant you assign to error_context_stack;
any data it depends on had better be valid before that. In this case
you're effectively depending on the assumption that
RelationGetRelationName can't throw an error. While it probably can't
today, better style would be to set up te_state first, eg

+	/* Set up callback to fetch more details regarding the error */
+	te_state.relation_name = RelationGetRelationName(rd);
+	te_state.column_name = colname;
+	te_state.expected_type = attrtype;
+	te_state.is_insert = pstate->p_is_insert;
+	errcallback.callback = TransformExprCallback;
+	errcallback.arg = (void *) &te_state;
+	errcallback.previous = error_context_stack;
+	error_context_stack = &errcallback;

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#19)
Re: Mention column name in error messages

On Fri, Nov 4, 2016 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

I am passing that down to a committer for review. The patch looks
large, but at 95% it involves diffs in the regression tests,
alternative outputs taking a large role in the bloat.

This is kind of cute, but it doesn't seem to cover very much territory,
because it only catches errors that are found in the parse stage.
For instance, it fails to cover Franck's original example:
​[...]

Maybe it'd be all right to commit this anyway, but I'm afraid the most
common reaction would be "why's it give me this info some of the time,
but not when I really need it?" I'm inclined to think that an acceptable
patch will need to provide context for the plan-time and run-time cases
too, and I'm not very sure where would be a sane place to plug in for
those cases.

​Agreed.

David J.

#21Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#19)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#19)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#22)
#24Franck Verrot
franck@verrot.fr
In reply to: Tom Lane (#22)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Franck Verrot (#24)
#26Franck Verrot
franck@verrot.fr
In reply to: Tom Lane (#25)
#27Michael Paquier
michael@paquier.xyz
In reply to: Franck Verrot (#26)