db encoding

Started by Andrew Dunstanover 22 years ago16 messages
#1Andrew Dunstan
andrew@dunslane.net

pg_encoding appears to have 2 personalities, (name=>number and vice
versa) depending on whther or not its parameter begins with a digit
(which is in itself fragile - what if you gave it "3foo"?).

However, from an initdb POV I am assuming that we are only interested in
the name=>number conversion, even though initdb.sh does no apparent
checking on the parameter it is passing to pg_encoding. Please tell me
if this is incorrect.

thanks

andrew

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#1)
Re: db encoding

Andrew Dunstan <andrew@dunslane.net> writes:

However, from an initdb POV I am assuming that we are only interested in
the name=>number conversion, even though initdb.sh does no apparent
checking on the parameter it is passing to pg_encoding. Please tell me
if this is incorrect.

That's correct. I believe we intended to eliminate pg_encoding as a
separate program altogether, given a C version of initdb, since the C
code could perfectly well call pg_char_to_encoding and
pg_valid_server_encoding for itself.

regards, tom lane

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#2)
Re: db encoding

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

However, from an initdb POV I am assuming that we are only interested in
the name=>number conversion, even though initdb.sh does no apparent
checking on the parameter it is passing to pg_encoding. Please tell me
if this is incorrect.

That's correct. I believe we intended to eliminate pg_encoding as a
separate program altogether, given a C version of initdb, since the C
code could perfectly well call pg_char_to_encoding and
pg_valid_server_encoding for itself.

Yes, but when I asked that question at least one voice piped up (Debian
package maintainer, I think) to say that these were still needed as
standalone programs. However, I have already replaced the calls I
previously had to these from the C version (pg_id a few days ago,
pg_encoding a few minutes ago ;-) ) There will be a new C version on my
web site tonight, including inline calls to
pg_char_to_encoding()/pg_valid_server_encoding and signal handling
(these are actually the only 2 things we need libpq for).

cheers

andrew

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#3)
Re: db encoding

Andrew Dunstan writes:

Yes, but when I asked that question at least one voice piped up (Debian
package maintainer, I think) to say that these were still needed as
standalone programs. However, I have already replaced the calls I
previously had to these from the C version (pg_id a few days ago,
pg_encoding a few minutes ago ;-) )

There is no reason to keep pg_id, because the only reason it exists is
that the standard 'id' program does not behave sanely on some platforms.
pg_id is in fact a near-copy of a subset of an existing 'id'
implementation.

About pg_encoding. There is currently no way to tell whether an encoding
exists. Normally you would put this kind of thing into a system table,
but doing that is a bit tricky with the encodings. I would like to see
pg_encoding go, so let's hear what information people need and give them a
direct way to access it.

--
Peter Eisentraut peter_e@gmx.net

#5Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#4)
Re: db encoding

Peter Eisentraut wrote:

About pg_encoding. There is currently no way to tell whether an encoding
exists. Normally you would put this kind of thing into a system table,
but doing that is a bit tricky with the encodings. I would like to see
pg_encoding go, so let's hear what information people need and give them a
direct way to access it.

You need the encoding ID before you have a system to have tables in - while you are running the bootstrap code - unless things change.

That isn't to say that putting the available encodings into a table isn't a good idea, though.

I'm not sure I see anywhere in the docs a ref to what encodings are available, or how to find that out - if I haven't missed it that's something to be remedied.

cheers

andrew

#6Oliver Elphick
olly@lfix.co.uk
In reply to: Peter Eisentraut (#4)
Re: db encoding

On Mon, 2003-10-06 at 19:30, Peter Eisentraut wrote:

About pg_encoding. There is currently no way to tell whether an encoding
exists. Normally you would put this kind of thing into a system table,
but doing that is a bit tricky with the encodings. I would like to see
pg_encoding go, so let's hear what information people need and give them a
direct way to access it.

I currently use pg_encoding in Debian's automatic upgrade script to
extract the existing default encoding from pg_database, thus:

$ psql -q -t -d template1 -c "select encoding from pg_database where
datname = 'template1'"
0

and then I use it to translate that number into an encoding name that
can be fed to initdb.

However, on looking at this, I can see that I don't need it, since I can
just as well do

$ psql -l | grep template1 | awk '{print $5}'
SQL_ASCII

so as to achieve the same result with only a single command.

Therefore, you don't need to keep pg_encoding for my (the Debian
package's) sake.

--
Oliver Elphick Oliver.Elphick@lfix.co.uk
Isle of Wight, UK http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C
========================================
"Blessed is the man that walketh not in the counsel of
the ungodly, nor standeth in the way of sinners, nor
sitteth in the seat of the scornful. But his delight
is in the law of the LORD; and in his law doth he
meditate day and night." Psalms 1:1,2

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Oliver Elphick (#6)
Re: db encoding

Oliver Elphick wrote:

On Mon, 2003-10-06 at 19:30, Peter Eisentraut wrote:

About pg_encoding. There is currently no way to tell whether an encoding
exists. Normally you would put this kind of thing into a system table,
but doing that is a bit tricky with the encodings. I would like to see
pg_encoding go, so let's hear what information people need and give them a
direct way to access it.

I currently use pg_encoding in Debian's automatic upgrade script to
extract the existing default encoding from pg_database, thus:

$ psql -q -t -d template1 -c "select encoding from pg_database where
datname = 'template1'"
0

and then I use it to translate that number into an encoding name that
can be fed to initdb.

However, on looking at this, I can see that I don't need it, since I can
just as well do

$ psql -l | grep template1 | awk '{print $5}'
SQL_ASCII

so as to achieve the same result with only a single command.

Therefore, you don't need to keep pg_encoding for my (the Debian
package's) sake.

or save yourself a grep with this :-)

psql -l | awk '/template1/ { print $5 }'

Anyway, it looks like maybe we can get rid of pg_id and pg_encoding
after all.

cheers

andrew

(previous fan of the useless use of cat awards).

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oliver Elphick (#6)
Re: db encoding

Oliver Elphick <olly@lfix.co.uk> writes:

I currently use pg_encoding in Debian's automatic upgrade script to
extract the existing default encoding from pg_database, thus:
$ psql -q -t -d template1 -c "select encoding from pg_database where
datname = 'template1'"
0
and then I use it to translate that number into an encoding name that
can be fed to initdb.

But you can do that with pg_encoding_to_char:

regression=# select pg_encoding_to_char(encoding) from pg_database where datname = 'template1';
pg_encoding_to_char
---------------------
SQL_ASCII
(1 row)

AFAICS the standalone pg_encoding program is only useful if you need to
do encoding-number conversions while the postmaster is not available.

regards, tom lane

#9Oliver Elphick
olly@lfix.co.uk
In reply to: Tom Lane (#8)
Re: db encoding

On Mon, 2003-10-06 at 21:31, Tom Lane wrote:

Oliver Elphick <olly@lfix.co.uk> writes:

I currently use pg_encoding in Debian's automatic upgrade script to
extract the existing default encoding from pg_database, thus:
$ psql -q -t -d template1 -c "select encoding from pg_database where
datname = 'template1'"
0
and then I use it to translate that number into an encoding name that
can be fed to initdb.

But you can do that with pg_encoding_to_char:

So I see -- now, but I had missed its very existence, I'm afraid.

--
Oliver Elphick Oliver.Elphick@lfix.co.uk
Isle of Wight, UK http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C
========================================
"Blessed is the man that walketh not in the counsel of
the ungodly, nor standeth in the way of sinners, nor
sitteth in the seat of the scornful. But his delight
is in the law of the LORD; and in his law doth he
meditate day and night." Psalms 1:1,2

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#1)
new initdb.c available

----- Original Message -----
From: "Andrew Dunstan" <andrew@dunslane.net>

Yes, but when I asked that question at least one voice piped up (Debian
package maintainer, I think) to say that these were still needed as
standalone programs. However, I have already replaced the calls I
previously had to these from the C version (pg_id a few days ago,
pg_encoding a few minutes ago ;-) ) There will be a new C version on my
web site tonight, including inline calls to
pg_char_to_encoding()/pg_valid_server_encoding and signal handling
(these are actually the only 2 things we need libpq for).

New version has passed basic Windows tests, and is available (with new
Makefile too) at http://www.dunslane.net/~andrew/Initdb_In_C.html

constructive comments (very) welcome.

cheers

andrew

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#10)
Re: new initdb.c available

Andrew Dunstan writes:

New version has passed basic Windows tests, and is available (with new
Makefile too) at http://www.dunslane.net/~andrew/Initdb_In_C.html

constructive comments (very) welcome.

That looks very nice. Just some nitpicking comments. (Most of these
comments should be extrapolated to similar source code fragments.)

#include <getopt_long.h>

Must be "getopt_long.h" because it might be our own replacement file.

#include <sys/types.h>

Already done in c.h.

/* who we are */
#define PG_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"

Should be "initdb (PostgreSQL) ...".

#define WRITEMODE "wb"

Use #define PG_BINARY_W "wb", if you are writing "binary" files, which
isn't quite clear.

/*
* macros for running pipes to postgres
* note lack of trailing ';' must be placed there when macros are used.
* this keeps emacs indentation happy
*/

#define PG_CMD_DECL char cmd[MAXPGPATH]; char ** line ; FILE * pg

Use

#define MACRO do { code; here; } while(0)

That's the standard mechanism.

#endif

Write "#endif /* WIN32 */", or whatever the case may be.

#define malloc(x) xmalloc( (x) )

You are not allowed to redefine or otherwise fiddle with standard library
functions. Just write xmalloc when you mean xmalloc.

if (strcmp(file->d_name,".") && strcmp(file->d_name,".."))

Please compare the result of strcmp() with 0. Code like this makes my
brain hurt. Do

#define streq(a, b) (strcmp((a), (b))==0)

if you must.

snprintf(filepath,MAXPGPATH,"%s/%s",path,*filename);

Spaces after the commas. Spaces after semicolons. Spaces before and
after binary operators. More spaces everywhere.

static char *
pg_getlocale(char * arg)

This is redundant. In C you can just use, for example,

locale = setlocale(LC_CTYPE, NULL);

This is actually one of the nice things we'll get out of having this in C.

sizeof(char)

... is always 1.

newtext = replace_token(usage_text,"$CMDNAME",progname);

for (i=0; newtext[i]; i++)
fputs(newtext[i],stdout);

Uh, why not use printf directly?

if (show_version)
{
printf("%s (PostgreSQL) %s\n",argv[0],PG_VERSION);
exit(0);
}

For the --version output, the program name is always hardcoded, to allow
identification in case the program was renamed.

if (!mkdatadir(NULL))
{
exit_nicely();
}
else
check_ok();

Bizarre use of braces.

--
Peter Eisentraut peter_e@gmx.net

#12Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#11)
Re: new initdb.c available

Thanks. I will attend to most of this. A couple of points:

. using "wb" for writing out on Windows is so that we don't get Windows'
gratuitous addition of carriage returns. I will document that.
. I can't use do { stuff; } while(0) for a macro that does declarations
- the declarations would be local to the do block.

Doesn't pg_indent do the spacing for us when code is merged? I guess I
can get BSD indent - my Linux box only has GNU indent. If indent won't
do spacing I'll go through and do it by hand.

Expect a new version in a few days - with the removal of the
initdb-scripts.h as well as these changes.

cheers

andrew

Peter Eisentraut wrote:

Show quoted text

Andrew Dunstan writes:

New version has passed basic Windows tests, and is available (with new
Makefile too) at http://www.dunslane.net/~andrew/Initdb_In_C.html

constructive comments (very) welcome.

That looks very nice. Just some nitpicking comments. (Most of these
comments should be extrapolated to similar source code fragments.)

#include <getopt_long.h>

Must be "getopt_long.h" because it might be our own replacement file.

#include <sys/types.h>

Already done in c.h.

/* who we are */
#define PG_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"

Should be "initdb (PostgreSQL) ...".

#define WRITEMODE "wb"

Use #define PG_BINARY_W "wb", if you are writing "binary" files, which
isn't quite clear.

/*
* macros for running pipes to postgres
* note lack of trailing ';' must be placed there when macros are used.
* this keeps emacs indentation happy
*/

#define PG_CMD_DECL char cmd[MAXPGPATH]; char ** line ; FILE * pg

Use

#define MACRO do { code; here; } while(0)

That's the standard mechanism.

#endif

Write "#endif /* WIN32 */", or whatever the case may be.

#define malloc(x) xmalloc( (x) )

You are not allowed to redefine or otherwise fiddle with standard library
functions. Just write xmalloc when you mean xmalloc.

if (strcmp(file->d_name,".") && strcmp(file->d_name,".."))

Please compare the result of strcmp() with 0. Code like this makes my
brain hurt. Do

#define streq(a, b) (strcmp((a), (b))==0)

if you must.

snprintf(filepath,MAXPGPATH,"%s/%s",path,*filename);

Spaces after the commas. Spaces after semicolons. Spaces before and
after binary operators. More spaces everywhere.

static char *
pg_getlocale(char * arg)

This is redundant. In C you can just use, for example,

locale = setlocale(LC_CTYPE, NULL);

This is actually one of the nice things we'll get out of having this in C.

sizeof(char)

... is always 1.

newtext = replace_token(usage_text,"$CMDNAME",progname);

for (i=0; newtext[i]; i++)
fputs(newtext[i],stdout);

Uh, why not use printf directly?

if (show_version)
{
printf("%s (PostgreSQL) %s\n",argv[0],PG_VERSION);
exit(0);
}

For the --version output, the program name is always hardcoded, to allow
identification in case the program was renamed.

if (!mkdatadir(NULL))
{
exit_nicely();
}
else
check_ok();

Bizarre use of braces.

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#12)
Re: new initdb.c available

Andrew Dunstan <andrew@dunslane.net> writes:

Doesn't pg_indent do the spacing for us when code is merged?

For the most part it will. You can ask Bruce to run the code through it
for you if you don't have BSD indent handy.

regards, tom lane

#14Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Andrew Dunstan (#12)
Re: new initdb.c available

Andrew Dunstan wrote:

Thanks. I will attend to most of this. A couple of points:

. using "wb" for writing out on Windows is so that we don't get Windows'
gratuitous addition of carriage returns. I will document that.
. I can't use do { stuff; } while(0) for a macro that does declarations
- the declarations would be local to the do block.

Doesn't pg_indent do the spacing for us when code is merged? I guess I
can get BSD indent - my Linux box only has GNU indent. If indent won't
do spacing I'll go through and do it by hand.

Patched BSD indent is our our ftp server. It is mentioned in the
pgindent README in current CVS.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#15Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Bruce Momjian (#14)
Re: new initdb.c available

. using "wb" for writing out on Windows is so that we don't
get Windows' gratuitous addition of carriage returns. I will document that.

Please use the #define PG_BINARY_W from c.h which is defined
with the correct letters for all platforms ("wb" on Windows).

That is how Peter's comment was meant.

Andreas

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Zeugswetter Andreas SB SD (#15)
Re: new initdb.c available

Zeugswetter Andreas SB SD wrote:

. using "wb" for writing out on Windows is so that we don't
get Windows' gratuitous addition of carriage returns. I will document that.

Please use the #define PG_BINARY_W from c.h which is defined
with the correct letters for all platforms ("wb" on Windows).

That is how Peter's comment was meant.

Ahh. Ok. thanks.

cheers

andrew