UTF-8 safe ascii() function

Started by Jean-Michel POURE, over 23 years ago (12 messages)

#1 Jean-Michel POURE
jm.poure@freesurf.fr

Dear all,

I would like to transform UTF-8 strings into Java-Unicode. Example :
- Latin1 : 'é'
- UTF-8 : 'é'
- Java Unicode = '\u00233'

Basically, a Unicode compatible ascii() function would be fine.
ascii('é') should return 233.

1) Has anyone written a UTF-8 safe wrapper for the ascii() function? If so,
would you be so kind as to publish it on the list?

2) Are there plans to add a UTF-8 safe ascii() function to PostgreSQL?

Best regards,
Jean-Michel POURE

#2 Patrice Hédé
phede-ml@islande.org
In reply to: Jean-Michel POURE (#1)
3 attachment(s)
Re: [HACKERS] UTF-8 safe ascii() function

Hi Jean-Michel,

Jean-Michel POURE <jm.poure@freesurf.fr> wrote:

> Dear all,
>
> I would like to transform UTF-8 strings into Java-Unicode. Example :
> - Latin1 : 'é'
> - UTF-8 : 'é'
> - Java Unicode = '\u00233'
>
> Basically, a Unicode compatible ascii() function would be fine.
> ascii('é') should return 233.
>
> 1) Has anyone written an ascii UTF-8 safe wrapper to ascii() function?
> If yes, would you be so kind to publish this function on the list.

OK, I just gave it a try, see the attachment.

The function takes the first character of a TEXT element and
returns its UCS2 value. I only did some basic tests (i.e. I have not
tried 3- or 4-byte UTF-8 chars). The function follows the
Unicode 3.2 spec.

SELECT utf8toucs2('a'), utf8toucs2('é');
 utf8toucs2 | utf8toucs2
------------+------------
         97 |        233
(1 row)

The function returns -1 on error.
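
For readers without the attachment, here is a minimal sketch of the kind of decoding described above. It is not Patrice's utf8toucs2.c: the name utf8_first_codepoint and its signature are hypothetical, and it returns the Unicode code point (identical to the UCS-2 value for characters up to U+FFFF). Like the function in the thread, it returns -1 on error.

```c
#include <stddef.h>

/* Decode the first UTF-8 sequence in s (len bytes available).
 * Returns the code point, or -1 on malformed or truncated input.
 * Hypothetical helper for illustration, not the attached utf8toucs2.c. */
long utf8_first_codepoint(const unsigned char *s, size_t len)
{
    long cp, min;
    size_t need, i;

    if (len == 0)
        return -1;
    if (s[0] < 0x80)
        return s[0];                      /* 1 byte: 0xxxxxxx */

    if ((s[0] & 0xE0) == 0xC0) {          /* 2 bytes: 110xxxxx */
        cp = s[0] & 0x1F; need = 1; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 3 bytes: 1110xxxx */
        cp = s[0] & 0x0F; need = 2; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 4 bytes: 11110xxx */
        cp = s[0] & 0x07; need = 3; min = 0x10000;
    } else {
        return -1;                        /* stray continuation or invalid lead byte */
    }

    if (len <= need)
        return -1;                        /* sequence is truncated */

    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;                    /* continuation bytes must be 10xxxxxx */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, UTF-16 surrogates and values past U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;
    return cp;
}
```

With this sketch, utf8_first_codepoint((const unsigned char *) "\xC3\xA9", 2) evaluates to 233, the value requested for 'é' in the original question.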

> 2) Are there plans to add an ascii() UTF-8 safe function to
> PostgreSQL?

I don't think the function I did is useful as such. It would be better
to make a function that converts the whole string or something.

By the way, what is the encoding for Java Unicode ? is it always "\u"
followed by 5 hex digits (in which case your example is wrong) ? Then,
it shouldn't be too difficult to make the relevant function, though I'm
wondering if the Java programme would convert an incoming '\' 'u' '0'
'0' '2' '3' '3' to the corresponding UCS2/UTF16 character ?

Maybe we should have some similar input (and output ?) functionality in
psql, but then I would much prefer the Perl way, which is
\x{hex_digits}, which is unambiguous.

Regards,

Patrice

--
Patrice Hédé
email: patrice hede(at)islande org
www : http://www.islande.org/

Attachments:

utf8toucs2.c (text/x-csrc)
utf8toucs2.sql (text/x-sql)
Makefile (text/x-makefile)

#3 Jean-Michel POURE
jm.poure@freesurf.fr
In reply to: Patrice Hédé (#2)
Re: [HACKERS] UTF-8 safe ascii() function

Dear Patrice,

Thank you very much. This will save the lives of Java users.

> I don't think the function I did is useful as such. It would be better
> to make a function that converts the whole string or something.

Yes, this would save the lives of some JavaScript users. Java Unicode notation
is the only Unicode notation understood by JavaScript.

> By the way, what is the encoding for Java Unicode ? is it always "\u"
> followed by 5 hex digits (in which case your example is wrong) ? Then,
> it shouldn't be too difficult to make the relevant function, though I'm
> wondering if the Java programme would convert an incoming '\' 'u' '0'
> '0' '2' '3' '3' to the corresponding UCS2/UTF16 character ?

Java Unicode notation is not case sensitive ('\u' = '\U') and is followed by
a hexadecimal value.
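
On the question of the escape format: in Java source, a Unicode escape is \u followed by exactly four hexadecimal digits, so 'é' (decimal 233, hex E9) is written \u00E9, and characters above U+FFFF are written as a surrogate pair of two escapes. The following is a rough sketch of a whole-string conversion along those lines. The names decode_one and utf8_to_java_escapes are hypothetical, ASCII is copied through unchanged (including any literal backslash), and this is only one possible approach, not code from the thread.

```c
#include <stdio.h>
#include <string.h>

/* Decode one UTF-8 sequence at s; store the code point in *cp and return
 * the number of bytes consumed, or 0 on malformed input. */
static size_t decode_one(const unsigned char *s, size_t len, unsigned long *cp)
{
    size_t need, i;
    unsigned long min;

    if (len == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0)      { *cp = s[0] & 0x1F; need = 1; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; need = 2; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; need = 3; min = 0x10000; }
    else return 0;
    if (len <= need) return 0;
    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF))
        return 0;
    return need + 1;
}

/* Copy the NUL-terminated UTF-8 string in to out, replacing every non-ASCII
 * character with a \uXXXX escape (a surrogate pair above U+FFFF).
 * Returns 0 on success, -1 on malformed input or if out is too small. */
int utf8_to_java_escapes(const char *in, char *out, size_t outsz)
{
    const unsigned char *p = (const unsigned char *) in;
    size_t left = strlen(in), used = 0;

    if (outsz == 0) return -1;
    out[0] = '\0';

    while (left > 0) {
        unsigned long cp;
        size_t n = decode_one(p, left, &cp);
        int w;

        if (n == 0) return -1;
        if (cp < 0x80) {
            w = snprintf(out + used, outsz - used, "%c", (int) cp);
        } else if (cp < 0x10000) {
            w = snprintf(out + used, outsz - used, "\\u%04lX", cp);
        } else {
            unsigned long v = cp - 0x10000;   /* 20 bits split into two halves */
            w = snprintf(out + used, outsz - used, "\\u%04lX\\u%04lX",
                         0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF));
        }
        if (w < 0 || (size_t) w >= outsz - used) return -1;
        used += (size_t) w;
        p += n;
        left -= n;
    }
    return 0;
}
```

For example, the UTF-8 bytes of "café" ("caf\xC3\xA9") come out as caf\u00E9.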

> Maybe we should have some similar input (and output ?) functionality in
> psql, but then I would much prefer the Perl way, which is
> \x{hex_digits}, which is unambiguous.

This would be perfect. We should also handle the HTML Unicode notation:
&#{dec_digits} and &#x{hex_digits}, as it is unambiguous.
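
As a small illustration of the HTML notation mentioned here: a numeric character reference is written &#233; in decimal or &#xE9; in hexadecimal, including the terminating semicolon. The sketch below formats a code point either way; the name html_char_ref and its parameters are hypothetical and only meant to show the two forms.

```c
#include <stdio.h>

/* Format code point cp as an HTML numeric character reference.
 * use_hex != 0 produces the &#x...; form, otherwise the decimal &#...; form.
 * Returns 0 on success, -1 if cp is a surrogate or beyond U+10FFFF,
 * or if out is too small. Hypothetical helper for illustration. */
int html_char_ref(unsigned long cp, int use_hex, char *out, size_t outsz)
{
    int w;

    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;
    if (use_hex)
        w = snprintf(out, outsz, "&#x%lX;", cp);
    else
        w = snprintf(out, outsz, "&#%lu;", cp);
    return (w < 0 || (size_t) w >= outsz) ? -1 : 0;
}
```

Because the reference is delimited by the semicolon, it can carry any number of digits, which is what makes it unambiguous in the same way as the \x{hex_digits} form.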

Cheers,
Jean-Michel

#4 Jean-Michel POURE
jm.poure@freesurf.fr
In reply to: Patrice Hédé (#2)
Re: [HACKERS] UTF-8 safe ascii() function

On Sunday 19 May 2002 11:44, Patrice Hédé wrote:

> The function is taking the first character of a TEXT element, and
> returns its UCS2 value. I just did some basic test (i.e. I have not
> tried with 3 or 4 bytes UTF-8 chars). The function is following the
> Unicode 3.2 spec.

Hi Patrice,

I tried a Japanese character :
SELECT utf8toucs2 ('支'::text) which returns -1

Do you know why it does not return the UCS-2 value?

Cheers,
Jean-Michel POURE

#5 Patrice Hédé
phede-ml@islande.org
In reply to: Jean-Michel POURE (#4)
1 attachment(s)
Re: [HACKERS] UTF-8 safe ascii() function

Jean-Michel POURE <jm.poure@freesurf.fr> wrote:

> I tried a Japanese character :
> SELECT utf8toucs2 ('支'::text) which returns -1
>
> Do you know why it does not return the UCS-2 value?

Oops, my mistake. I forgot to update a test after a copy-paste. Here is
a new version which should be correct this time ! :)

Patrice

--
Patrice Hédé
email: patrice hede at islande org
www : http://www.islande.org/

Attachments:

utf8toucs2.c (text/x-csrc)

#6 Jean-Michel POURE
jm.poure@freesurf.fr
In reply to: Patrice Hédé (#5)
Re: [HACKERS] UTF-8 safe ascii() function

On Sunday 19 May 2002 21:14, Patrice Hédé wrote:

> Oops, my mistake. I forgot to update a test after a copy-paste. Here is
> a new version which should be correct this time ! :)

Thanks Patrice, merci Patrice !

#7 Gareth Kirwan
gbjk@thermeoneurope.com
In reply to: Jean-Michel POURE (#1)
Interval to number

Postgres 7.2
I have an interval computed as max(occurance) - min(occurance) where
bla.
I now want to multiply this by a rate to create a charge...

If I use to_char(interval, 'SSSS'), I get a conversion to seconds, but
it works on seconds since midnight, so it only covers a one-day period.

Are there any better ways of converting an interval to an integer?

Thanks

Gareth

#8 Brian McCane
bmccane@mccons.net
In reply to: Gareth Kirwan (#7)
Re: Interval to number

EXTRACT is your friend :)

SELECT EXTRACT(EPOCH FROM max(occurrance) - min(occurrance))::integer ;

- brian

k=# SELECT EXTRACT(EPOCH FROM now() - '2001-01-01') ;
   date_part
----------------
 43583467.94995
(1 row)

On Mon, 20 May 2002, Gareth Kirwan wrote:

> Postgres 7.2
> I have an interval selected from a max(occurance) - min(occurance) where
> bla.
> I now want to multiply this by a rate - to create a charge...
>
> If I use to_char( interval, 'SSSS');
> I will get a seconds conversion - but that works on seconds since midnight -
> hence
> with a one day period.
>
> Are there any better ways of converting a timestamp to an integer?
>
> Thanks
>
> Gareth


Wm. Brian McCane | Life is full of doors that won't open
Search http://recall.maxbaud.net/ | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"

#9 Gareth Kirwan
gbjk@thermeoneurope.com
In reply to: Brian McCane (#8)
Re: Interval to number

Oh :(

I'd given up waiting for a response.
Thanks though Brian ... I currently have the triggered function:

CREATE FUNCTION logSession () RETURNS opaque AS '
DECLARE
    client_rate numeric(10,2);
    period interval;
    to_charge numeric(10,2);
BEGIN
    SELECT INTO client_rate rate FROM clients c WHERE c.id = OLD.client;
    SELECT INTO period max(time) - min(time) FROM convs
        WHERE session_id = OLD.id;
    SELECT INTO to_charge (to_number(to_char(period, ''SSSS''), ''99999D99'')
        / 60 * client_rate);

    INSERT INTO previous_sessions SELECT * FROM current_sessions c
        WHERE c.id = OLD.id;
    INSERT INTO logged_convs SELECT * FROM convs c WHERE c.session_id = OLD.id;

    INSERT INTO session_logs (session_id, time, length, charge, paid)
        VALUES (OLD.id, OLD.time, period, to_charge, ''false'');
    RETURN OLD;
END;'
LANGUAGE 'plpgsql';

So I'll try to build it into that.

-----Original Message-----
From: Brian McCane [mailto:bmccane@mccons.net]
Sent: 20 May 2002 17:35
To: Gareth Kirwan
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Interval to number

> EXTRACT is your friend :)
>
> SELECT EXTRACT(EPOCH FROM max(occurrance) - min(occurrance))::integer ;
>
> - brian
>
> k=# SELECT EXTRACT(EPOCH FROM now() - '2001-01-01') ;
>    date_part
> ----------------
>  43583467.94995
> (1 row)
>
> On Mon, 20 May 2002, Gareth Kirwan wrote:
>
>> Postgres 7.2
>> I have an interval selected from a max(occurance) - min(occurance) where
>> bla.
>> I now want to multiply this by a rate - to create a charge...
>>
>> If I use to_char( interval, 'SSSS');
>> I will get a seconds conversion - but that works on seconds since
>> midnight - hence
>> with a one day period.
>>
>> Are there any better ways of converting a timestamp to an integer?
>>
>> Thanks
>>
>> Gareth


#10 lee johnson
lee@imyourhandiman.com
In reply to: Jean-Michel POURE (#1)
no pg_hba.conf

hi..

Red Hat 7.3. I can't seem to get pgaccess to load my database, which by
the way has been created and has a user. postmaster is running
fine with -i -D /usr/local/pgsql/data.

When I try to load my database into pgaccess it says: no pg_hba.conf
entry for host 127.0.0.1, user lee, database handiman.

thx anyone
lee
-====

#11 Joel Burton
joel@joelburton.com
In reply to: lee johnson (#10)
Re: no pg_hba.conf

-----Original Message-----
From: pgsql-interfaces-owner@postgresql.org
[mailto:pgsql-interfaces-owner@postgresql.org]On Behalf Of lee johnson
Sent: Wednesday, May 22, 2002 10:20 AM
To: pgsql-interfaces@postgresql.org
Subject: [INTERFACES] no pg_hba.conf

> hi..
>
> redhat 7.3 ..can't seem to get pgaccess to want to load my database
> which btw is created and for which a user is also..postmaster is running
> fine with -i -D /usr/local/pgsql/data..
>
> when I try to load my database into pgaccess it says: no pg_hba.conf
> entry for host 127.0.0.1.user lee..database handiman..

So... is that true? Have you looked in pg_hba.conf? What did you add to
that?

-J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#12 lee
lee@imyourhandiman.com
In reply to: Joel Burton (#11)
Re: no pg_hba.conf

> So... is that true? Have you looked in pg_hba.conf? What did you add to
> that?

Okay, it's working now. I had previously upgraded to RH 7.3 but backed
out because I temporarily needed to install Windows; after a fresh
install of 7.3 all is fine now.

thx for efforts mucho
lee
-=