Thread: db encoding

db encoding

From

Andrew Dunstan

Date:

06 October 2003, 14:36:58

pg_encoding appears to have 2 personalities, (name=>number and vice 
versa) depending on whther or not its parameter begins with a digit 
(which is in itself fragile - what if you gave it "3foo"?).

However, from an initdb POV I am assuming that we are only interested in 
the name=>number conversion, even though initdb.sh does no apparent 
checking on the parameter it is passing to pg_encoding. Please tell me 
if this is incorrect.

thanks

andrew

Re: db encoding

From

Tom Lane

Date:

06 October 2003, 14:48:45

Andrew Dunstan <andrew@dunslane.net> writes:
> However, from an initdb POV I am assuming that we are only interested in 
> the name=>number conversion, even though initdb.sh does no apparent 
> checking on the parameter it is passing to pg_encoding. Please tell me 
> if this is incorrect.

That's correct.  I believe we intended to eliminate pg_encoding as a
separate program altogether, given a C version of initdb, since the C
code could perfectly well call pg_char_to_encoding and
pg_valid_server_encoding for itself.
        regards, tom lane

Re: db encoding

From

Andrew Dunstan

Date:

06 October 2003, 15:02:05

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>However, from an initdb POV I am assuming that we are only interested in 
>>the name=>number conversion, even though initdb.sh does no apparent 
>>checking on the parameter it is passing to pg_encoding. Please tell me 
>>if this is incorrect.
>>    
>>
>
>That's correct.  I believe we intended to eliminate pg_encoding as a
>separate program altogether, given a C version of initdb, since the C
>code could perfectly well call pg_char_to_encoding and
>pg_valid_server_encoding for itself.
>
>  
>
Yes, but when I asked that question at least one voice piped up (Debian 
package maintainer, I think) to say that these were still needed as 
standalone programs. However, I have already replaced the calls I 
previously had to these from the C version (pg_id a few days ago, 
pg_encoding a few minutes ago ;-) ) There will be a new C version on my 
web site tonight, including inline calls to 
pg_char_to_encoding()/pg_valid_server_encoding and signal handling 
(these are actually the only 2 things we need libpq for).

cheers

andrew

Re: db encoding

From

Peter Eisentraut

Date:

06 October 2003, 15:31:37

Andrew Dunstan writes:

> Yes, but when I asked that question at least one voice piped up (Debian
> package maintainer, I think) to say that these were still needed as
> standalone programs. However, I have already replaced the calls I
> previously had to these from the C version (pg_id a few days ago,
> pg_encoding a few minutes ago ;-) )

There is no reason to keep pg_id, because the only reason it exists is
that the standard 'id' program does not behave sanely on some platforms.
pg_id is in fact a near-copy of a subset of an existing 'id'
implementation.

About pg_encoding.  There is currently no way to tell whether an encoding
exists.  Normally you would put this kind of thing into a system table,
but doing that is a bit tricky with the encodings.  I would like to see
pg_encoding go, so let's hear what information people need and give them a
direct way to access it.

-- 
Peter Eisentraut   peter_e@gmx.net

Re: db encoding

From

Andrew Dunstan

Date:

06 October 2003, 16:06:59

Peter Eisentraut wrote:

>About pg_encoding.  There is currently no way to tell whether an encoding
>exists.  Normally you would put this kind of thing into a system table,
>but doing that is a bit tricky with the encodings.  I would like to see
>pg_encoding go, so let's hear what information people need and give them a
>direct way to access it.
>
You need the encoding ID before you have a system to have tables in - while you are running the bootstrap code - unless
thingschange.

That isn't to say that putting the available encodings into a table isn't a good idea, though.

I'm not sure I see anywhere in the docs a ref to what encodings are available, or how to find that out - if I haven't
missedit that's something to be remedied.

cheers

andrew

Re: db encoding

From

Oliver Elphick

Date:

06 October 2003, 16:43:22

On Mon, 2003-10-06 at 19:30, Peter Eisentraut wrote:

> About pg_encoding.  There is currently no way to tell whether an encoding
> exists.  Normally you would put this kind of thing into a system table,
> but doing that is a bit tricky with the encodings.  I would like to see
> pg_encoding go, so let's hear what information people need and give them a
> direct way to access it.

I currently use pg_encoding in Debian's automatic upgrade script to
extract the existing default encoding from pg_database, thus:
$ psql  -q -t -d template1 -c "select encoding from pg_database where
datname = 'template1'"       0

and then I use it to translate that number into an encoding name that
can be fed to initdb.

However, on looking at this, I can see that I don't need it, since I can
just as well do

$ psql -l | grep template1 | awk '{print $5}'
SQL_ASCII

so as to achieve the same result with only a single command.

Therefore, you don't need to keep pg_encoding for my (the Debian
package's) sake.

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
========================================   "Blessed is the man that walketh not in the counsel of      the ungodly, nor
standethin the way of sinners, nor      sitteth in the seat of the scornful. But his delight      is in the law of the
LORD;and in his law doth he      meditate day and night."         Psalms 1:1,2

Re: db encoding

From

Andrew Dunstan

Date:

06 October 2003, 16:52:59

Oliver Elphick wrote:

>On Mon, 2003-10-06 at 19:30, Peter Eisentraut wrote:
>
>  
>
>>About pg_encoding.  There is currently no way to tell whether an encoding
>>exists.  Normally you would put this kind of thing into a system table,
>>but doing that is a bit tricky with the encodings.  I would like to see
>>pg_encoding go, so let's hear what information people need and give them a
>>direct way to access it.
>>    
>>
>
>I currently use pg_encoding in Debian's automatic upgrade script to
>extract the existing default encoding from pg_database, thus:
> 
>$ psql  -q -t -d template1 -c "select encoding from pg_database where
>datname = 'template1'"
>        0
>
>and then I use it to translate that number into an encoding name that
>can be fed to initdb.
>
>However, on looking at this, I can see that I don't need it, since I can
>just as well do
>
>$ psql -l | grep template1 | awk '{print $5}'
>SQL_ASCII
>
>so as to achieve the same result with only a single command.
>
>Therefore, you don't need to keep pg_encoding for my (the Debian
>package's) sake.
>
>  
>
or save yourself a grep with this :-)
 psql -l | awk '/template1/ { print $5 }'

Anyway, it looks like maybe we can get rid of pg_id and pg_encoding 
after all.

cheers

andrew

(previous fan of the useless use of cat awards).

Re: db encoding

From

Tom Lane

Date:

06 October 2003, 17:32:24

Oliver Elphick <olly@lfix.co.uk> writes:
> I currently use pg_encoding in Debian's automatic upgrade script to
> extract the existing default encoding from pg_database, thus:
> $ psql  -q -t -d template1 -c "select encoding from pg_database where
> datname = 'template1'"
>         0
> and then I use it to translate that number into an encoding name that
> can be fed to initdb.

But you can do that with pg_encoding_to_char:

regression=# select pg_encoding_to_char(encoding) from pg_database where datname = 'template1';pg_encoding_to_char
---------------------SQL_ASCII
(1 row)

AFAICS the standalone pg_encoding program is only useful if you need to
do encoding-number conversions while the postmaster is not available.
        regards, tom lane

Re: db encoding

From

Oliver Elphick

Date:

06 October 2003, 18:06:08

On Mon, 2003-10-06 at 21:31, Tom Lane wrote:
> Oliver Elphick <olly@lfix.co.uk> writes:
> > I currently use pg_encoding in Debian's automatic upgrade script to
> > extract the existing default encoding from pg_database, thus:
> > $ psql  -q -t -d template1 -c "select encoding from pg_database where
> > datname = 'template1'"
> >         0
> > and then I use it to translate that number into an encoding name that
> > can be fed to initdb.
> 
> But you can do that with pg_encoding_to_char:

So I see -- now, but I had missed its very existence, I'm afraid.

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
========================================   "Blessed is the man that walketh not in the counsel of      the ungodly, nor
standethin the way of sinners, nor      sitteth in the seat of the scornful. But his delight      is in the law of the
LORD;and in his law doth he      meditate day and night."         Psalms 1:1,2

new initdb.c available

From

"Andrew Dunstan"

Date:

06 October 2003, 19:32:53

----- Original Message -----
From: "Andrew Dunstan" <andrew@dunslane.net>
> Yes, but when I asked that question at least one voice piped up (Debian
> package maintainer, I think) to say that these were still needed as
> standalone programs. However, I have already replaced the calls I
> previously had to these from the C version (pg_id a few days ago,
> pg_encoding a few minutes ago ;-) ) There will be a new C version on my
> web site tonight, including inline calls to
> pg_char_to_encoding()/pg_valid_server_encoding and signal handling
> (these are actually the only 2 things we need libpq for).
>

New version has passed basic Windows tests, and is available (with new
Makefile too) at http://www.dunslane.net/~andrew/Initdb_In_C.html

constructive comments (very) welcome.

cheers

andrew

Re: new initdb.c available

From

Peter Eisentraut

Date:

06 October 2003, 20:19:01

Andrew Dunstan writes:

> New version has passed basic Windows tests, and is available (with new
> Makefile too) at http://www.dunslane.net/~andrew/Initdb_In_C.html
>
> constructive comments (very) welcome.

That looks very nice.  Just some nitpicking comments.  (Most of these
comments should be extrapolated to similar source code fragments.)

> #include <getopt_long.h>

Must be "getopt_long.h" because it might be our own replacement file.

> #include <sys/types.h>

Already done in c.h.

> /* who we are */
> #define PG_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"

Should be "initdb (PostgreSQL) ...".

> #define WRITEMODE "wb"

Use #define PG_BINARY_W "wb", if you are writing "binary" files, which
isn't quite clear.

> /*
>  * macros for running pipes to postgres
>  * note lack of trailing ';' must be placed there when macros are used.
>  * this keeps emacs indentation happy
>  */
>
> #define PG_CMD_DECL     char cmd[MAXPGPATH]; char ** line ; FILE * pg

Use

#define MACRO do { code; here; } while(0)

That's the standard mechanism.

> #endif

Write "#endif /* WIN32 */", or whatever the case may be.

> #define malloc(x) xmalloc( (x) )

You are not allowed to redefine or otherwise fiddle with standard library
functions.  Just write xmalloc when you mean xmalloc.

> if (strcmp(file->d_name,".") && strcmp(file->d_name,".."))

Please compare the result of strcmp() with 0.  Code like this makes my
brain hurt.  Do

#define streq(a, b) (strcmp((a), (b))==0)

if you must.

> snprintf(filepath,MAXPGPATH,"%s/%s",path,*filename);

Spaces after the commas.  Spaces after semicolons.  Spaces before and
after binary operators.  More spaces everywhere.

> static char *
> pg_getlocale(char * arg)

This is redundant.  In C you can just use, for example,

locale = setlocale(LC_CTYPE, NULL);

This is actually one of the nice things we'll get out of having this in C.

> sizeof(char)

... is always 1.

> newtext = replace_token(usage_text,"$CMDNAME",progname);
>
> for (i=0; newtext[i]; i++)
>     fputs(newtext[i],stdout);

Uh, why not use printf directly?

> if (show_version)
> {
>     printf("%s (PostgreSQL) %s\n",argv[0],PG_VERSION);
>     exit(0);
> }

For the --version output, the program name is always hardcoded, to allow
identification in case the program was renamed.

>        if (!mkdatadir(NULL))
>        {
>            exit_nicely();
>        }
>        else
>            check_ok();

Bizarre use of braces.

-- 
Peter Eisentraut   peter_e@gmx.net

Re: new initdb.c available

From

Andrew Dunstan

Date:

06 October 2003, 21:04:57

Thanks. I will attend to most of this. A couple of points:

. using "wb" for writing out on Windows is so that we don't get Windows' 
gratuitous addition of carriage returns. I will document that.
. I can't use do { stuff; } while(0) for a macro that does declarations 
- the declarations would be local to the do block.

Doesn't pg_indent do the spacing for us when code is merged? I guess I 
can get BSD indent - my Linux box only has GNU indent. If indent won't 
do spacing I'll go through and do it by hand.

Expect a new version in a few days - with the removal of the 
initdb-scripts.h as well as these changes.

cheers

andrew




Peter Eisentraut wrote:

>Andrew Dunstan writes:
>
>  
>
>>New version has passed basic Windows tests, and is available (with new
>>Makefile too) at http://www.dunslane.net/~andrew/Initdb_In_C.html
>>
>>constructive comments (very) welcome.
>>    
>>
>
>That looks very nice.  Just some nitpicking comments.  (Most of these
>comments should be extrapolated to similar source code fragments.)
>
>
>  
>
>>#include <getopt_long.h>
>>    
>>
>
>Must be "getopt_long.h" because it might be our own replacement file.
>
>  
>
>>#include <sys/types.h>
>>    
>>
>
>Already done in c.h.
>
>  
>
>>/* who we are */
>>#define PG_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"
>>    
>>
>
>Should be "initdb (PostgreSQL) ...".
>
>  
>
>>#define WRITEMODE "wb"
>>    
>>
>
>Use #define PG_BINARY_W "wb", if you are writing "binary" files, which
>isn't quite clear.
>
>  
>
>>/*
>> * macros for running pipes to postgres
>> * note lack of trailing ';' must be placed there when macros are used.
>> * this keeps emacs indentation happy
>> */
>>
>>#define PG_CMD_DECL     char cmd[MAXPGPATH]; char ** line ; FILE * pg
>>    
>>
>
>Use
>
>#define MACRO do { code; here; } while(0)
>
>That's the standard mechanism.
>
>  
>
>>#endif
>>    
>>
>
>Write "#endif /* WIN32 */", or whatever the case may be.
>
>  
>
>>#define malloc(x) xmalloc( (x) )
>>    
>>
>
>You are not allowed to redefine or otherwise fiddle with standard library
>functions.  Just write xmalloc when you mean xmalloc.
>
>  
>
>>if (strcmp(file->d_name,".") && strcmp(file->d_name,".."))
>>    
>>
>
>Please compare the result of strcmp() with 0.  Code like this makes my
>brain hurt.  Do
>
>#define streq(a, b) (strcmp((a), (b))==0)
>
>if you must.
>
>  
>
>>snprintf(filepath,MAXPGPATH,"%s/%s",path,*filename);
>>    
>>
>
>Spaces after the commas.  Spaces after semicolons.  Spaces before and
>after binary operators.  More spaces everywhere.
>
>  
>
>>static char *
>>pg_getlocale(char * arg)
>>    
>>
>
>This is redundant.  In C you can just use, for example,
>
>locale = setlocale(LC_CTYPE, NULL);
>
>This is actually one of the nice things we'll get out of having this in C.
>
>  
>
>>sizeof(char)
>>    
>>
>
>... is always 1.
>
>  
>
>>newtext = replace_token(usage_text,"$CMDNAME",progname);
>>
>>for (i=0; newtext[i]; i++)
>>    fputs(newtext[i],stdout);
>>    
>>
>
>Uh, why not use printf directly?
>
>  
>
>>if (show_version)
>>{
>>    printf("%s (PostgreSQL) %s\n",argv[0],PG_VERSION);
>>    exit(0);
>>}
>>    
>>
>
>For the --version output, the program name is always hardcoded, to allow
>identification in case the program was renamed.
>
>  
>
>>       if (!mkdatadir(NULL))
>>       {
>>           exit_nicely();
>>       }
>>       else
>>           check_ok();
>>    
>>
>
>Bizarre use of braces.
>
>  
>

Re: new initdb.c available

From

Tom Lane

Date:

06 October 2003, 21:12:09

Andrew Dunstan <andrew@dunslane.net> writes:
> Doesn't pg_indent do the spacing for us when code is merged?

For the most part it will.  You can ask Bruce to run the code through it
for you if you don't have BSD indent handy.
        regards, tom lane

Re: new initdb.c available

From

Bruce Momjian

Date:

06 October 2003, 22:43:38

Andrew Dunstan wrote:
> 
> Thanks. I will attend to most of this. A couple of points:
> 
> . using "wb" for writing out on Windows is so that we don't get Windows' 
> gratuitous addition of carriage returns. I will document that.
> . I can't use do { stuff; } while(0) for a macro that does declarations 
> - the declarations would be local to the do block.
> 
> Doesn't pg_indent do the spacing for us when code is merged? I guess I 
> can get BSD indent - my Linux box only has GNU indent. If indent won't 
> do spacing I'll go through and do it by hand.

Patched BSD indent is our our ftp server.  It is mentioned in the
pgindent README in current CVS.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073