More on elog and error codes
I've looked at the elog calls in the source, about 1700 in total (only
elog(ERROR)). If we mapped these to the SQL error codes then we'd have
about two dozen calls with an assigned code and the rest being "other".
The way I estimate it (I didn't really look at *each* call, of course) is
that about 2/3 of the calls are internal panic calls ("cache lookup of %s
failed"), 1/6 are SQL-level problems, and the rest are operating system,
storage problems, "not implemented", misconfigurations, etc.
A problem that makes this quite hard to manage is that many errors can be
reported from several places, e.g., the parser, the executor, the access
method. Some of these messages are probably not readily reproducible
because they are caught elsewhere.
Consequently, the most pragmatic approach to assigning error codes
might be to just pick some numbers and give them out gradually. A
hierarchical subsystem+code might be useful, beyond that it really depends
on what we expect from error codes in the first place. Does anyone have
good experiences from other products?
Essentially, I envision making up a new function, say "elogc", which has
elogc(<level>, [<subsys>,?] <code>, message...)
where the code is some macro, the expansion of which is to be determined.
A call to "elogc" would also require a formalized message wording, adding
the error code to the documentation, which also requires having a fairly
good idea how the error can happen and how to handle it. This could
perhaps even be automated to some extent.
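A minimal sketch of what such an "elogc" could look like in C, assuming a plain integer code and a "LEVEL code: message" output layout. The names elogc_format, PGLVL_* and PGERR_INTERNAL, and the numeric values, are placeholders for illustration only; nothing here is settled.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical severity levels and error codes (placeholders). */
enum { PGLVL_NOTICE = 1, PGLVL_ERROR = 2 };
#define PGERR_INTERNAL 0    /* generic catch-all class */
#define PGERR_TYPE     1854 /* illustrative value only */

/*
 * Format "<LEVEL> <code>: <message>" into buf.
 * Returns the snprintf-style length, or a negative value on error.
 * A real backend would also abort the current transaction on ERROR.
 */
static int
elogc_format(char *buf, size_t buflen, int level, int code,
             const char *fmt, ...)
{
    va_list ap;
    int     n;
    int     m;

    n = snprintf(buf, buflen, "%s %04d: ",
                 level == PGLVL_ERROR ? "ERROR" : "NOTICE", code);
    if (n < 0 || (size_t) n >= buflen)
        return n;               /* prefix did not fit */

    va_start(ap, fmt);
    m = vsnprintf(buf + n, buflen - (size_t) n, fmt, ap);
    va_end(ap);
    return m < 0 ? m : n + m;
}
```

Keeping the code as a separate argument leaves the message wording free to differ from one call site to the next.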
All the calls that are not converted yet will be assigned to the generic
"internal error" class; most of them will stay this way.
As for translations, I don't think we have to worry about this right now.
Assuming that we would use gettext or something similar, we can tell it
that all calls to elog (or "elogc" or whatever) contain translatable
strings, so we don't have to uglify it with gettext(...) or _(...) calls
or what else.
So we need some good error numbering scheme. Any ideas?
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
At 23:56 19/03/01 +0100, Peter Eisentraut wrote:
Essentially, I envision making up a new function, say "elogc", which has
elogc(<level>, [<subsys>,?] <code>, message...)
where the code is some macro, the expansion of which is to be determined.
A call to "elogc" would also require a formalized message wording, adding
the error code to the documentation, which also requires having a fairly
good idea how the error can happen and how to handle it. This could
perhaps even be automated to some extent.
All the calls that are not converted yet will be assigned to the generic
"internal error" class; most of them will stay this way.
...
So we need some good error numbering scheme. Any ideas?
FWIW, the VMS scheme has error numbers broken down to include system,
subsystem, error number & severity. These are maintained in an error
message source file. eg. the file system's 'file not found' error message
is something like:
FACILITY RMS (the file system)
...
SEVERITY WARNING
...
FILNFND "File %AS not found"
...
It's been a while since I used VMS message files regularly, but this is at
least representative. It has the drawback that severity is often tied to the
message, not the circumstance, but this is only rarely a problem.
In code, the messages are used as external symbols (probably in our case
representing pointers to C format strings). While making extensive use of
such mnemonics, I never really needed to have full-text messages. Once a set
of standards is in place for message abbreviations, most people can
read the message codes. This would mean that:
elogc(<level>, [<subsys>,?] <code>, message...)
becomes:
elogc(<code> [, parameter...])
eg.
"cache lookup of %s failed"
might be replaced by:
elog(CACHELOOKUPFAIL, cacheItemThatFailed);
and
"internal error: %s"
becomes
elog(INTERNAL, "could not find the VeryImportantThing");
Unlike VMS, it's probably a good idea to separate the severity from the
error code, since a CACHELOOKUPFAIL in one place may be less significant
than another (eg. severity=debug).
I also think it's important that we get the source file and line number
somewhere in the message, and if we have these, we may not need the subsystem.
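One way to get the source file and line number into a report, sketched under the assumption that a macro wraps the call so that __FILE__ and __LINE__ expand at the call site. PgErrLocation, PG_ERR_HERE and pg_err_with_location are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical record of where an error was raised. */
typedef struct PgErrLocation
{
    const char *file;
    int         line;
} PgErrLocation;

static PgErrLocation
pg_err_location(const char *file, int line)
{
    PgErrLocation loc;

    loc.file = file;
    loc.line = line;
    return loc;
}

/* Expands where it is used, so it names the caller's file and line. */
#define PG_ERR_HERE() pg_err_location(__FILE__, __LINE__)

/* Append " (file:line)" to a message; returns the snprintf-style length. */
static int
pg_err_with_location(char *buf, size_t buflen, const char *msg,
                     PgErrLocation loc)
{
    return snprintf(buf, buflen, "%s (%s:%d)", msg, loc.file, loc.line);
}
```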
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
I also think it's important that we get the source file and line number
somewhere in the message, and if we have these, we may not need the
subsystem.
I agree that the subsystem concept is not necessary, except possibly as
a means of avoiding collisions in the error-symbol namespace, and for
that it would only be a naming convention (PGERR_subsys_IDENTIFIER).
We probably do not need it considering that we have far fewer than 1000
distinct error identifiers to assign, judging from Peter's survey.
We do need severity to be distinct from the error code ("internal
errors" are surely not all the same severity, even if we don't bother
to assign formal error codes to each one).
BTW, the symbols used in the source code do need to have a common prefix
(PGERR_CACHELOOKUPFAIL not CACHELOOKUPFAIL) to avoid namespace pollution
problems. We blew this before with "DEBUG" and friends, let's learn
from that mistake.
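A small sketch of the two points above: severity kept apart from the error identifier, and every symbol carrying a common prefix. The enum members and the pg_describe helper are invented for illustration.

```c
#include <stdio.h>

/* Severity is its own type, independent of the error identifier. */
typedef enum PgSeverity
{
    PGSEV_DEBUG,
    PGSEV_NOTICE,
    PGSEV_ERROR
} PgSeverity;

/* Error identifiers all carry the PGERR_ prefix, so a bare name such as
 * DEBUG or CACHELOOKUPFAIL never leaks into other code's namespace. */
typedef enum PgErrorId
{
    PGERR_INTERNAL,
    PGERR_CACHELOOKUPFAIL
} PgErrorId;

static const char *
pg_severity_name(PgSeverity sev)
{
    switch (sev)
    {
        case PGSEV_DEBUG:  return "DEBUG";
        case PGSEV_NOTICE: return "NOTICE";
        case PGSEV_ERROR:  return "ERROR";
    }
    return "UNKNOWN";
}

/* The same error id can be reported at any severity. */
static int
pg_describe(char *buf, size_t buflen, PgSeverity sev, PgErrorId id)
{
    return snprintf(buf, buflen, "%s: error id %d",
                    pg_severity_name(sev), (int) id);
}
```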
regards, tom lane
So we need some good error numbering scheme. Any ideas?
SQL9x specifies some error codes, with no particular numbering scheme
other than negative numbers indicate a problem afaicr.
Shouldn't we map to those where possible?
- Thomas
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
So we need some good error numbering scheme. Any ideas?
SQL9x specifies some error codes, with no particular numbering scheme
other than negative numbers indicate a problem afaicr.
Shouldn't we map to those where possible?
Good point, but I guess most of the errors produced are pgsql
specific. If I remember right Sybase had several different SQL types of error
mapped to one of the standard error codes.
Also the JDBC API provides methods to look at the database dependent error
code and standard error code. I've found both useful when working with
Sybase.
cheers,
Gunnar
So we need some good error numbering scheme. Any ideas?
SQL9x specifies some error codes, with no particular numbering scheme
other than negative numbers indicate a problem afaicr.
Shouldn't we map to those where possible?
Yes, it defines at least a few dozen char(5) error codes. These are hierarchical,
grouped into Warnings and Errors, and have room for implementation specific
message codes.
Imho there is no room for inventing something new here, or only in addition.
Andreas
Philip Warner writes:
elog(CACHELOOKUPFAIL, cacheItemThatFailed);
The disadvantage of this approach, which I tried to explain in a previous
message, is that we might want to have different wordings for different
occurrences of the same class of error.
Additionally, the whole idea behind having error *codes* is that the
client program can easily distinguish errors that it can handle specially.
Thus the codes should be numeric or some other short, fixed scheme. In
the backend they could be replaced by macros.
Example:
#define PGERR_TYPE 1854
/* somewhere... */
elogc(ERROR, PGERR_TYPE, "type %s cannot be created because it already exists", ...)
/* elsewhere... */
elogc(ERROR, PGERR_TYPE, "type %s used as argument %d of function %s doesn't exist", ...)
In fact, this is my proposal. The "1854" can be argued, but I like the
rest.
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Zeugswetter Andreas SB writes:
SQL9x specifies some error codes, with no particular numbering scheme
other than negative numbers indicate a problem afaicr.
Shouldn't we map to those where possible?
Yes, it defines at least a few dozen char(5) error codes. These are hierarchical,
grouped into Warnings and Errors, and have room for implementation specific
message codes.
Let's use those then to start with.
Anyone got a good idea for a client API to this? I think we could just
prefix the actual message with the error code, at least as a start.
Since they're all fixed width the client could take them apart easily. I
recall other RDBMS' (Oracle?) also having an error code before each
message.
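On the client side, taking such a message apart could be as simple as the following sketch. It assumes a layout of five code characters, one space, then the text; that layout and the name pg_split_error are assumptions, not an agreed protocol.

```c
#include <string.h>

#define PG_CODE_LEN 5   /* fixed-width, SQLSTATE-style code */

/*
 * Split "CCCCC message text" into its code and its message.
 * Returns 0 on success, -1 if the input has no code prefix.
 * On success, *msg points into raw (no copy is made).
 */
static int
pg_split_error(const char *raw, char code[PG_CODE_LEN + 1],
               const char **msg)
{
    if (strlen(raw) < (size_t) (PG_CODE_LEN + 1) || raw[PG_CODE_LEN] != ' ')
        return -1;

    memcpy(code, raw, PG_CODE_LEN);
    code[PG_CODE_LEN] = '\0';
    *msg = raw + PG_CODE_LEN + 1;
    return 0;
}
```

A client that does not know about codes would still display the whole string, which is what makes this workable as a first step.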
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
#define PGERR_TYPE 1854
#define PGSQLSTATE_TYPE "S0021" // char(5) SQLSTATE
The standard calls this error variable SQLSTATE (look it up in the ESQL
standard). The first 2 chars are the class, the next 3 are the subclass:
"00000" is e.g. Success
"02000" is Data not found
"U0xxx" is a user-defined routine error, where xxx is user defined
/* somewhere... */
elogc(ERROR, PGERR_TYPE, "type %s cannot be created because it already exists", ...)
PGELOG(ERROR, PGSQLSTATE_TYPE, ("type %s cannot be created because it already exists", ...))
Put the varargs into parentheses to avoid the need for "..." (variadic) macros; see Tom's proposal.
I also agree, that we can group different text messages into the same SQLSTATE,
if it seems appropriate for the client to handle them alike.
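Because the first two characters are the class, a client can treat related errors alike by comparing only that prefix. A rough sketch, with hypothetical helper names:

```c
#include <string.h>

#define SQLSTATE_LEN 5

/* A SQLSTATE is exactly five characters: 2 for class, 3 for subclass. */
static int
sqlstate_is_wellformed(const char *state)
{
    return state != NULL && strlen(state) == SQLSTATE_LEN;
}

/* True if both values are well formed and share the same class. */
static int
sqlstate_same_class(const char *a, const char *b)
{
    return sqlstate_is_wellformed(a) && sqlstate_is_wellformed(b)
        && strncmp(a, b, 2) == 0;
}

/* "00000" means success. */
static int
sqlstate_is_success(const char *state)
{
    return sqlstate_is_wellformed(state) && strcmp(state, "00000") == 0;
}
```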
Andreas
Coming from an IBM Mainframe background, I'm used to ALL OS/Product
messages having a message number, and a fat messages and codes book.
I hope we can do that eventually.
(maybe a database of the error numbers and codes?)
LER
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
PGELOG(ERROR, PGSQLSTATE_TYPE, ("type %s cannot be created because it already exists", ...))
Put the varargs into parentheses to avoid the need for "..." (variadic) macros; see Tom's proposal.
I'd be inclined to make it
PGELOG((ERROR, PGSQLSTATE_TYPE, "type %s cannot be created because it already exists", ...))
The extra parens are ugly and annoying in any case, but they seem
slightly less so if you just double the parens associated with the
PGELOG call. Takes less thought than adding a paren somewhere in the
middle of the call. IMHO anyway...
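For reference, the doubled-parentheses form works because the macro takes a single parameter, the whole parenthesized argument list, and pastes it after a function name. A sketch, where pg_elog_impl and the pg_last_message buffer are made up for the demonstration:

```c
#include <stdarg.h>
#include <stdio.h>

static char pg_last_message[256];   /* hypothetical sink for the demo */

/* An ordinary variadic function; no variadic macro is involved. */
static void
pg_elog_impl(int level, const char *sqlstate, const char *fmt, ...)
{
    va_list ap;
    int     n;

    n = snprintf(pg_last_message, sizeof pg_last_message,
                 "%d %s ", level, sqlstate);
    if (n < 0 || (size_t) n >= sizeof pg_last_message)
        return;

    va_start(ap, fmt);
    vsnprintf(pg_last_message + n, sizeof pg_last_message - (size_t) n,
              fmt, ap);
    va_end(ap);
}

/* One macro parameter: "args" is the complete "(a, b, c, ...)" list. */
#define PGELOG(args) pg_elog_impl args
```

A call then reads PGELOG((2, "S0021", "type %s already exists", name)); the preprocessor turns it into pg_elog_impl(2, "S0021", ...).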
regards, tom lane
On Tue, 20 Mar 2001 10:56, you wrote:
I've looked at the elog calls in the source, about 1700 in total (only
[ ... ]
So we need some good error numbering scheme. Any ideas?
Just that it might be a good idea to incorporate the version / release
details in some way so that when somebody on the list is squeaking about
an error message it is obvious to the helper that the advice needed is to
upgrade from the Cretaceous Period version to a modern release, and have
another go.
--
Sincerely etc.,
NAME Christopher Sawtell
CELL PHONE 021 257 4451
ICQ UIN 45863470
EMAIL csawtell @ xtra . co . nz
CNOTES ftp://ftp.funet.fi/pub/languages/C/tutorials/sawtell_C.tar.gz
-->> Please refrain from using HTML or WORD attachments in e-mails to me <<--
So we need some good error numbering scheme. Any ideas?
I'm a newbie, but have been following dev and have a few comments
and these are thoughts not criticisms:
1) I've seen a huge mixture of "how to implement" to support some
desired feature without first knowing "all" of the features that
are desired. Examination over all of the mailings reveals some
but not all of possible features you may want to include.
2) Define what you want to have without worrying about how to do it.
3) Design something that can implement all of the features.
4) Reconsider design if there are performance issues.
e.g.
Features desired
* system
* subsystem
* function
* file, line, etc
* severity
* user-ability-to-recover
* standards conformance - e.g. SQL std
* default msg statement
* locale msg statement lookup mech, os dep or indep (careful here)
* success/warning/failure
* semantic taxonomy
* syntactic taxonomy
* forced to user, available to api, logging or not, tracing
* concept of level
* reports filtering on some attribute
* interoperation with existing system reports e.g. syslog, event log,...
* system environment snapshot option
(e.g. resource low/empty may/should trigger a log of conn cnt,
sys resource counts, load, etc)
* non-mnemonic internal numbers (mnemonic only to obey stds and then
only as a function call, not by implementation)
* ease of use (i.e. pgsql-dev-hacker use)
* ease of use (i.e. api development use)
* ease of use (i.e. rolling into an existing system, e.g. during
transition both may need to be in use.)
* ease of use (i.e. looking through existing errors to find one
that may "correctly" fit the situation, instead of
creating yet-another-error-message.)
* ease of use (i.e. maybe having each "sub-system" having its own
"error domain" but using the same error mechanism)
* distinction btwn error report, debug report, tracing report, etc
* separate the concepts of
- report creation
- report delivery
- report reception
- report interpretation
* what do others do, others as in os, db, middleware, etc
along with their strong and weak points
... what else do you want... and let's flesh out the meaning of
each of these. Then we can go on to a design...
Sorry if this sounds like a lecture.
With regards to mnemonic things - ugh - this is a database.
I've worked with a LARGE electronics company that had
10 and 12 digit mnemonic part numbers. The mnemonic-ness
begins to break down. (So you have a part number of an eprom,
what is the part number when it is blown - still an eprom?
how about including the version of the sw on the eprom? is it
now an assembly? oops, that tended to mean multiple parts attached
together, hmm, still looks like an eprom?) They have gone through
a huge transition, as has the industry, to move away from mnemonic
numbers to simply an id number. You look up the id number in a
database :-) to find out what it is.
So why not drop the mnemonic concept and apply a function to a
blackbox dataitem to determine its attribute? But again first
determine what attributes you want, which are mandatory, optional,
system supplied (e.g. __LINE__ etc), is it for erroring, tracing,
debugging, some combo; then the appropriate dataitem can be
designed and functions defined. Functions (macros) for both the
report creation, report distribution, report reception, and
report interpretation. Some other email pointed out that
there are different people doing different things. Each of these
people-groups should identify what they need with regards to
error, debug, tracing reports. Each may have some nuances that
are not needed elsewhere, but the reporting system should be able
to support them all.
Ok, so I've got my flame suit on... but I am really trying to give
an outsider's bird's-eye view of what I've been reading, which
hopefully may be helpful.
Best regards,
.. Otto
Otto Hirr
OLAB Inc.
otto.hirr@olabinc.com
503 / 617-6595
On Wed, Mar 21, 2001 at 09:41:44AM +1200, Christopher Sawtell wrote:
On Tue, 20 Mar 2001 10:56, you wrote:
Just that it might be a good idea to incorporate the version / release
details in some way so that when somebody on the list is squeaking about
an error message it is obvious to the helper that the advice needed is to
upgrade from the Cretaceous Period version to a modern release, and have
ROFL - parsed this as Cretinous period on the first pass.
Ross
At 17:35 20/03/01 +0100, Peter Eisentraut wrote:
Philip Warner writes:
elog(CACHELOOKUPFAIL, cacheItemThatFailed);
The disadvantage of this approach, which I tried to explain in a previous
message, is that we might want to have different wordings for different
occurrences of the same class of error.
Additionally, the whole idea behind having error *codes* is that the
client program can easily distinguish errors that it can handle specially.
Thus the codes should be numeric or some other short, fixed scheme. In
the backend they could be replaced by macros.
This seems to be just an argument for constructing the value of
PGERR_CACHELOOKUPFAIL carefully (which is what the VMS message source files
did). The point is that when they are used by a developer, they are simple.
#define PGERR_TYPE 1854
/* somewhere... */
elogc(ERROR, PGERR_TYPE, "type %s cannot be created because it already exists", ...)
/* elsewhere... */
elogc(ERROR, PGERR_TYPE, "type %s used as argument %d of function %s doesn't exist", ...)
I can appreciate that there may be cases where the same message is reused,
but that is where parameter substitution comes in.
In the specific example above, returning the same error code is not going
to help the client. What if they want to handle "type %s used as argument
%d of function %s doesn't exist" by creating the type, and silently ignore
"type %s cannot be created because it already exists"?
How do you handle "type %s can not be used as a function return type"? Is
this PGERR_FUNC or PGERR_TYPE?
If the motivation behind this is to allow easy translation to SQL error
codes, then I suggest we have an error definition file with explicit
translation:
Code              SQL    Text
PGERR_TYPALREXI   02xxx  "type %s cannot be created because it already exists"
PGERR_FUNCNOTYPE  02xxx  "type %s used as argument %d of function %s doesn't exist"
and if we want a generic 'type does not exist', then:
PGERR_NOSUCHTYPE  02xxx  "type %s does not exist - %s"
where the second %s might contain 'it can't be used as a function argument'.
Then we just have
elogc(ERROR, PGERR_TYPALREXI, ...)
/* elsewhere... */
elogc(ERROR, PGERR_FUNCNOTYPE, ...)
Creating central message files/objects has the added advantage of a much
simpler locale support - they're just resource files, and they're NOT
embedded throughout the code.
Finally, if you do want to have some kind of error classification beyond
the SQL code, it could be encoded in the error message file.
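As one possible shape for such a definition file, the table can be written once as a C "X-macro" and expanded both into an enum of symbols and into a lookup array. This is only a sketch: the macro and type names are invented, and "02xxx" is the placeholder from the table above, not a real SQL code.

```c
#include <stddef.h>

/* One row per error: symbol, SQL code, format text. */
#define PG_ERROR_TABLE(X) \
    X(PGERR_TYPALREXI,  "02xxx", \
      "type %s cannot be created because it already exists") \
    X(PGERR_FUNCNOTYPE, "02xxx", \
      "type %s used as argument %d of function %s doesn't exist") \
    X(PGERR_NOSUCHTYPE, "02xxx", \
      "type %s does not exist - %s")

/* Expansion 1: the symbols developers type. */
#define X_ENUM(sym, sql, text) sym,
typedef enum PgErrorId { PG_ERROR_TABLE(X_ENUM) PGERR_COUNT } PgErrorId;
#undef X_ENUM

typedef struct PgErrorInfo
{
    const char *name;       /* symbol as a string */
    const char *sqlcode;    /* SQL error code */
    const char *text;       /* printf-style message text */
} PgErrorInfo;

/* Expansion 2: the lookup table, in the same order as the enum. */
#define X_INFO(sym, sql, text) { #sym, sql, text },
static const PgErrorInfo pg_error_info[] = { PG_ERROR_TABLE(X_INFO) };
#undef X_INFO

/* Returns NULL for ids outside the table. */
static const PgErrorInfo *
pg_error_lookup(PgErrorId id)
{
    if ((int) id < 0 || (int) id >= (int) PGERR_COUNT)
        return NULL;
    return &pg_error_info[id];
}
```

Any extra classification would become another column in PG_ERROR_TABLE.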
Philip Warner
At 09:41 21/03/01 +1200, Christopher Sawtell wrote:
Just that it might be a good idea to incorporate the version / release
details in some way so that when somebody on the list is squeaking about
an error message it is obvious to the helper that the advice needed is to
upgrade from the Cretaceous Period version to a modern release, and have
another go.
This is better handled by the bug *reporting* system; the users can easily
get the current version number from PG and send it with their reports. We
don't really want all the error codes changing between releases.
Philip Warner
At 09:43 21/03/01 +1100, Philip Warner wrote:
Code              SQL    Text
PGERR_TYPALREXI   02xxx  "type %s cannot be created because it already exists"
PGERR_FUNCNOTYPE  02xxx  "type %s used as argument %d of function %s doesn't exist"
Peter,
Just to clarify, since in a previous email you seemed to believe that I
wanted 'PGERR_TYPALREXI' to resolve to a string: I have no such desire. A
meaningful number is fine, but we should never have to type it. One
possibility is that it is the address of an error-info function (built by
'compiling' the message file). Another possibility is that it could be a
prefix to several external symbols, PGERR_TYPALREXI_msg,
PGERR_TYPALREXI_code, PGERR_TYPALREXI_num, PGERR_TYPALREXI_sqlcode etc,
which are again built by compiling the message file. We can then encode
whatever we like into the message, have flexible text, and ease of use for
developers.
Hope this clarifies things...
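The "prefix to several external symbols" variant might look roughly like this, with token pasting hiding the suffixes from the developer. The generated definitions and the accessor macros below are hypothetical, standing in for what a message-file compiler could emit.

```c
/* What a message-file compiler might generate for one entry. */
static const char *const PGERR_TYPALREXI_msg =
    "type %s cannot be created because it already exists";
static const char *const PGERR_TYPALREXI_sqlcode = "02xxx"; /* placeholder */
static const int PGERR_TYPALREXI_num = 1;                   /* placeholder */

/* Accessors: the developer only ever types the bare prefix. */
#define PGERR_MSG(sym)     (sym##_msg)
#define PGERR_SQLCODE(sym) (sym##_sqlcode)
#define PGERR_NUM(sym)     (sym##_num)
```

So PGERR_NUM(PGERR_TYPALREXI) yields the number without anyone typing it.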
Philip Warner
Creating central message files/objects has the added advantage of a much
simpler locale support - they're just resource files, and they're NOT
embedded throughout the code.
Finally, if you do want to have some kind of error classification beyond
the SQL code, it could be encoded in the error message file.
We could also (automatically) build a DBMS reference table *from* this
message file (or files), which would allow lookup of messages from codes
for applications which are not "message-aware".
Not a requirement, and it does not meet all needs (e.g. you would have
to be connected to get the messages in that case) but it would be
helpful for some use cases...
- Thomas
At 03:28 21/03/01 +0000, Thomas Lockhart wrote:
Creating central message files/objects has the added advantage of a much
simpler locale support - they're just resource files, and they're NOT
embedded throughout the code.
Finally, if you do want to have some kind of error classification beyond
the SQL code, it could be encoded in the error message file.
We could also (automatically) build a DBMS reference table *from* this
message file (or files), which would allow lookup of messages from codes
for applications which are not "message-aware".
Not a requirement, and it does not meet all needs (e.g. you would have
to be connected to get the messages in that case) but it would be
helpful for some use cases...
If we extended the message definitions to have (optional) description &
user-resolution sections, then we have the possibility of asking psql to
explain the last error, and (broadly) how to fix it. Of course, in the
first pass, these would all be empty.
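A sketch of how optional description and resolution sections could be carried and printed, with empty sections falling back to a stub. The struct layout, the pg_explain name and the "(none yet)" text are assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical message definition with optional help sections. */
typedef struct PgErrorHelp
{
    const char *text;           /* the message itself */
    const char *description;    /* may be NULL */
    const char *resolution;     /* may be NULL */
} PgErrorHelp;

/* Render an explanation; missing sections print as "(none yet)". */
static int
pg_explain(char *buf, size_t buflen, const PgErrorHelp *e)
{
    return snprintf(buf, buflen,
                    "%s\nDescription: %s\nResolution: %s",
                    e->text,
                    e->description ? e->description : "(none yet)",
                    e->resolution ? e->resolution : "(none yet)");
}
```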
Philip Warner
Philip Warner writes:
If the motivation behind this is to allow easy translation to SQL error
codes, then I suggest we have an error definition file with explicit
translation:
Code              SQL    Text
PGERR_TYPALREXI   02xxx  "type %s cannot be created because it already exists"
PGERR_FUNCNOTYPE  02xxx  "type %s used as argument %d of function %s doesn't exist"
and if we want a generic 'type does not exist', then:
PGERR_NOSUCHTYPE  02xxx  "type %s does not exist - %s"
where the second %s might contain 'it can't be used as a function argument'.
Then we just have
elogc(ERROR, PGERR_TYPALREXI, ...)
/* elsewhere... */
elogc(ERROR, PGERR_FUNCNOTYPE, ...)
This is going to be a disaster for the coder. Every time you look at an
elog you won't know what it does. Is the first arg a %s or a %d? What's
the first %s, what's the second? How can this be checked against bugs? (I
know GCC can be pretty helpful here, but does it catch all problems?)
Conversely, when you look at the error message you don't know from what
contexts it's called. The error messages will degrade rapidly in quality
because changing one will become a major project.
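On the GCC question: the checking comes from the format function attribute, which compares the arguments against the format string only when that string is a literal visible at the call site. A sketch, assuming a hypothetical elogc_to_buf that writes into a buffer:

```c
#include <stdarg.h>
#include <stdio.h>

/* GCC-style compilers can check printf-like calls; others ignore this. */
#if defined(__GNUC__)
#define PG_PRINTF_LIKE(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PG_PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* The format string is parameter 4; the variable arguments start at 5. */
static int elogc_to_buf(char *buf, size_t buflen, int code,
                        const char *fmt, ...) PG_PRINTF_LIKE(4, 5);

static int
elogc_to_buf(char *buf, size_t buflen, int code, const char *fmt, ...)
{
    va_list ap;
    int     n;
    int     m;

    n = snprintf(buf, buflen, "%d: ", code);
    if (n < 0 || (size_t) n >= buflen)
        return n;

    va_start(ap, fmt);
    m = vsnprintf(buf + n, buflen - (size_t) n, fmt, ap);
    va_end(ap);
    return m < 0 ? m : n + m;
}
```

With the message text in the call, passing a string where %d is expected draws a -Wformat warning. If the text is fetched from a table at run time, the compiler has nothing to check against.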
Creating central message files/objects has the added advantage of a much
simpler locale support - they're just resource files, and they're NOT
embedded throughout the code.
Actually, the fact that the messages are in the code, where they're used,
and not in a catalog file is a reason why gettext is so popular and
catgets gets laughed at.
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/