message string fixes

Started by Alvaro Herreraalmost 18 years ago12 messages
#1Alvaro Herrera
alvherre@commandprompt.com
1 attachment(s)

Hi,

While translating the backend's message catalog I found several things
that should probably be improved.

For example, in regis.c there are several strings talking about "regis
pattern". I had never heard of regis patterns. Turns out they are a
fast regex subset, used AFAICT only by the ispell code. Searching the
web I don't find any other reference to "regises" (regisen? reges?), so
I think we should avoid using the term. How about just changing the
messages to just say "regular expression" instead?

Additionally, I would like to apply the attached patch. Are there
objections?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachments:

string-fix.patchtext/x-diff; charset=us-asciiDownload
? cscope.files
? cscope.out
? msg
? src/include/access/.clog.h.swp
? src/tools/entab/entab
? src/tools/entab/entab.fix.diff
Index: src/backend/catalog/dependency.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/catalog/dependency.c,v
retrieving revision 1.69
diff -c -p -r1.69 dependency.c
*** src/backend/catalog/dependency.c	1 Jan 2008 19:45:48 -0000	1.69
--- src/backend/catalog/dependency.c	20 Jan 2008 00:10:54 -0000
*************** getObjectDescription(const ObjectAddress
*** 2002,2007 ****
--- 2002,2008 ----
  				SysScanDesc amscan;
  				HeapTuple	tup;
  				Form_pg_amop amopForm;
+ 				StringInfoData opfam;
  
  				amopDesc = heap_open(AccessMethodOperatorRelationId,
  									 AccessShareLock);
*************** getObjectDescription(const ObjectAddress
*** 2022,2031 ****
  
  				amopForm = (Form_pg_amop) GETSTRUCT(tup);
  
! 				appendStringInfo(&buffer, _("operator %d %s of "),
  								 amopForm->amopstrategy,
! 								 format_operator(amopForm->amopopr));
! 				getOpFamilyDescription(&buffer, amopForm->amopfamily);
  
  				systable_endscan(amscan);
  				heap_close(amopDesc, AccessShareLock);
--- 2023,2040 ----
  
  				amopForm = (Form_pg_amop) GETSTRUCT(tup);
  
! 				initStringInfo(&opfam);
! 				getOpFamilyDescription(&opfam, amopForm->amopfamily);
! 				/*
! 				 * translator: %d is the operator strategy (a number), the
! 				 * first %s is the textual form of the operator, and the second
! 				 * %s is the description of the operator family.
! 				 */
! 				appendStringInfo(&buffer, _("operator %d %s of %s"),
  								 amopForm->amopstrategy,
! 								 format_operator(amopForm->amopopr),
! 								 opfam.data);
! 				pfree(opfam.data);
  
  				systable_endscan(amscan);
  				heap_close(amopDesc, AccessShareLock);
*************** getObjectDescription(const ObjectAddress
*** 2039,2044 ****
--- 2048,2054 ----
  				SysScanDesc amscan;
  				HeapTuple	tup;
  				Form_pg_amproc amprocForm;
+ 				StringInfoData opfam;
  
  				amprocDesc = heap_open(AccessMethodProcedureRelationId,
  									   AccessShareLock);
*************** getObjectDescription(const ObjectAddress
*** 2059,2068 ****
  
  				amprocForm = (Form_pg_amproc) GETSTRUCT(tup);
  
! 				appendStringInfo(&buffer, _("function %d %s of "),
  								 amprocForm->amprocnum,
! 								 format_procedure(amprocForm->amproc));
! 				getOpFamilyDescription(&buffer, amprocForm->amprocfamily);
  
  				systable_endscan(amscan);
  				heap_close(amprocDesc, AccessShareLock);
--- 2069,2086 ----
  
  				amprocForm = (Form_pg_amproc) GETSTRUCT(tup);
  
! 				initStringInfo(&opfam);
! 				getOpFamilyDescription(&opfam, amprocForm->amprocfamily);
! 				/*
! 				 * translator: %d is the function number, the first %s is the
! 				 * textual form of the function with arguments, and the second
! 				 * %s is the description of the operator family.
! 				 */
! 				appendStringInfo(&buffer, _("function %d %s of %s"),
  								 amprocForm->amprocnum,
! 								 format_procedure(amprocForm->amproc),
! 								 opfam.data);
! 				pfree(opfam.data);
  
  				systable_endscan(amscan);
  				heap_close(amprocDesc, AccessShareLock);
Index: src/backend/catalog/pg_enum.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/catalog/pg_enum.c,v
retrieving revision 1.4
diff -c -p -r1.4 pg_enum.c
*** src/backend/catalog/pg_enum.c	1 Jan 2008 19:45:48 -0000	1.4
--- src/backend/catalog/pg_enum.c	20 Jan 2008 00:10:54 -0000
*************** EnumValuesCreate(Oid enumTypeOid, List *
*** 87,94 ****
  		if (strlen(lab) > (NAMEDATALEN - 1))
  			ereport(ERROR,
  					(errcode(ERRCODE_INVALID_NAME),
! 			errmsg("invalid enum label \"%s\", must be %d characters or less",
! 				   lab,
  				   NAMEDATALEN - 1)));
  
  
--- 87,94 ----
  		if (strlen(lab) > (NAMEDATALEN - 1))
  			ereport(ERROR,
  					(errcode(ERRCODE_INVALID_NAME),
! 			errmsg("invalid enum label \"%s\"", lab),
! 			errdetail("Labels must be %d characters or less.",
  				   NAMEDATALEN - 1)));
  
  
#2Alvaro Herrera
alvherre@commandprompt.com
In reply to: Alvaro Herrera (#1)
1 attachment(s)
Re: [pgtranslation-translators] message string fixes

Further fix attached. I think "of character type" suggests that the
column must be of type char, which is not the case -- varchar and text
work fine too AFAICS.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachments:

string-fix-2.patchtext/x-diff; charset=us-asciiDownload
Index: src/backend/utils/adt/tsvector_op.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/utils/adt/tsvector_op.c,v
retrieving revision 1.12
diff -c -p -r1.12 tsvector_op.c
*** src/backend/utils/adt/tsvector_op.c	1 Jan 2008 19:45:53 -0000	1.12
--- src/backend/utils/adt/tsvector_op.c	20 Jan 2008 01:09:30 -0000
*************** tsvector_update_trigger(PG_FUNCTION_ARGS
*** 1374,1380 ****
  		if (!istexttype(SPI_gettypeid(rel->rd_att, numattr)))
  			ereport(ERROR,
  					(errcode(ERRCODE_DATATYPE_MISMATCH),
! 					 errmsg("column \"%s\" is not of character type",
  							trigger->tgargs[i])));
  
  		datum = SPI_getbinval(rettuple, rel->rd_att, numattr, &isnull);
--- 1374,1380 ----
  		if (!istexttype(SPI_gettypeid(rel->rd_att, numattr)))
  			ereport(ERROR,
  					(errcode(ERRCODE_DATATYPE_MISMATCH),
! 					 errmsg("column \"%s\" is not of a character type",
  							trigger->tgargs[i])));
  
  		datum = SPI_getbinval(rettuple, rel->rd_att, numattr, &isnull);
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: message string fixes

Alvaro Herrera <alvherre@commandprompt.com> writes:

For example, in regis.c there are several strings talking about "regis
pattern". I had never heard of regis patterns. Turns out they are a
fast regex subset, used AFAICT only by the ispell code. Searching the
web I don't find any other reference to "regises" (regisen? reges?), so
I think we should avoid using the term. How about just changing the
messages to just say "regular expression" instead?

Then people would think that full regular expressions were meant.

It seems to me that a proper solution is to say something like "fast
regular expressions" and then document what that means in the ispell
dictionary documentation.

Additionally, I would like to apply the attached patch. Are there
objections?

Let's see, the first change is just so you can fool with the word
ordering, right? Seems OK to me, but are the other translators
going to complain about changing strings at this late date?

regards, tom lane

#4Teodor Sigaev
teodor@sigaev.ru
In reply to: Alvaro Herrera (#1)
Re: message string fixes

For example, in regis.c there are several strings talking about "regis
pattern". I had never heard of regis patterns. Turns out they are a
fast regex subset, used AFAICT only by the ispell code. Searching the
web I don't find any other reference to "regises" (regisen? reges?), so
I think we should avoid using the term. How about just changing the
messages to just say "regular expression" instead?

It's just a combination of "regular expression for ispell". It implements
subset of regex. It much faster that any other implementation, and uses subset
widely used in ispell.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#4)
Re: message string fixes

Teodor Sigaev <teodor@sigaev.ru> writes:

web I don't find any other reference to "regises" (regisen? reges?), so
I think we should avoid using the term. How about just changing the
messages to just say "regular expression" instead?

It's just a combination of "regular expression for ispell".

Maybe the right phrase to use is "ispell regular expression". In any
case we need to document what the limitations are compared to "regular"
regular expressions (ahem). Do you know offhand what the rules are?

regards, tom lane

#6Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#5)
Re: message string fixes

Maybe the right phrase to use is "ispell regular expression". In any
case we need to document what the limitations are compared to "regular"
regular expressions (ahem). Do you know offhand what the rules are?

There is a fallback to regex if expression isn't supported by regis (see call
of RS_isRegis() in spell.c).

Regis supports only matches as is, range of characters ( [abc] ), negotiation of
characters range ( [^abc] ) and can match begin or end of string. AFAIK, ispell
allows full regex but in practice I never seen something unsupported by regis.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#6)
Re: message string fixes

Teodor Sigaev <teodor@sigaev.ru> writes:

There is a fallback to regex if expression isn't supported by regis (see call
of RS_isRegis() in spell.c).

Oh. So in that case, the messages Alvaro is worried about

ereport(ERROR,
(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
errmsg("invalid regis pattern: \"%s\"",
str)));

aren't user-facing errors at all, and should be demoted to elog's,
correct?

elog(ERROR, "invalid regis pattern: \"%s\"", str);

regards, tom lane

#8Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#7)
Re: message string fixes

aren't user-facing errors at all, and should be demoted to elog's,
correct?

elog(ERROR, "invalid regis pattern: \"%s\"", str);

Hmm. If regis detects an error in expression then it will be an error for regex
library too. At least, it was supposed to be.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#9Alvaro Herrera
alvherre@commandprompt.com
In reply to: Teodor Sigaev (#8)
Re: message string fixes

Teodor Sigaev wrote:

aren't user-facing errors at all, and should be demoted to elog's,
correct?

elog(ERROR, "invalid regis pattern: \"%s\"", str);

Hmm. If regis detects an error in expression then it will be an error for
regex library too. At least, it was supposed to be.

And those that are not, probably are not what the user intends anyway,
with the pattern language being so narrow.

If all invalid regis patterns are indeed invalid regex patterns, then
just changing "regis" for "regex" should be enough.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#9)
Re: message string fixes

Alvaro Herrera <alvherre@commandprompt.com> writes:

Teodor Sigaev wrote:

Hmm. If regis detects an error in expression then it will be an error for
regex library too. At least, it was supposed to be.

And those that are not, probably are not what the user intends anyway,
with the pattern language being so narrow.

It looks to me like RS_isRegis() needs to be tightened up a bit anyway:
it will accept "^foo" which is valid regex but not valid regis, leading
to an error being thrown which is not what we want.

If we tighten it to exactly match what RS_compile() will take ... say
by using the same state-machine logic ... then indeed the ereports
are internal and can be demoted to elog's. If we make them elogs then
ISTM they ought to keep saying regis, just so we know where to look
if they ever do fail ;-)

regards, tom lane

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#10)
Re: message string fixes

I wrote:

It looks to me like RS_isRegis() needs to be tightened up a bit anyway:
it will accept "^foo" which is valid regex but not valid regis, leading
to an error being thrown which is not what we want.

I experimented with this and verified that the error could be reached
with a hacked-up affix file.

If we tighten it to exactly match what RS_compile() will take ... say
by using the same state-machine logic ... then indeed the ereports
are internal and can be demoted to elog's. If we make them elogs then
ISTM they ought to keep saying regis, just so we know where to look
if they ever do fail ;-)

Patch committed along these lines.

regards, tom lane

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: message string fixes

Alvaro Herrera <alvherre@commandprompt.com> writes:

Additionally, I would like to apply the attached patch. Are there
objections?

So far I think you only applied one half of that?

regards, tom lane