convert function

Started by Jan Sunavecover 18 years ago7 messagesgeneral
Jump to latest
#1Jan Sunavec
jan.sunavec@gmail.com

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('j�n', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('j�n', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jan Sunavec (#1)
Re: convert function

Hello

It's look like SQL_ASCII support diacritic chars now. First you have
to encode from bytea to text

postgres=# SELECT encode(convert('ján', 'UNICODE', 'SQL_ASCII'),'escape');
encode
--------
ján
(1 row)

you wont
postgres=# SELECT to_ascii(encode(convert_to('ján',
'latin2'),'escape'),'latin2');
to_ascii
----------
jan
(1 row)

Regards
Pavel Stehule

convert do conversion from text to bytea type. For diacritic
elimination use to_ascii function:

postgres=# select to_ascii(convert('Příliš žlutý kůň' using
utf8_to_iso_8859_2),'latin2');
to_ascii
------------------
Prilis zluty kun
(1 row)

Show quoted text

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

#3Jan Sunavec
jan.sunavec@gmail.com
In reply to: Pavel Stehule (#2)
Re: convert function

Thanks a lot

Lots like nice a easy solution.. I am not sure if this is fast solution..
Many convertions you know.. :-(
Thanks a lot anyway.

John

On Wed, 12 Dec 2007 17:13:01 +0100, Pavel Stehule
<pavel.stehule@gmail.com> wrote:

Show quoted text

Hello

It's look like SQL_ASCII support diacritic chars now. First you have
to encode from bytea to text

postgres=# SELECT encode(convert('j�n', 'UNICODE',
'SQL_ASCII'),'escape');
encode
--------
j�n
(1 row)

you wont
postgres=# SELECT to_ascii(encode(convert_to('j�n',
'latin2'),'escape'),'latin2');
to_ascii
----------
jan
(1 row)

Regards
Pavel Stehule

convert do conversion from text to bytea type. For diacritic
elimination use to_ascii function:

postgres=# select to_ascii(convert('P��li� �lut� k��' using
utf8_to_iso_8859_2),'latin2');
to_ascii
------------------
Prilis zluty kun
(1 row)

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('j�n', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('j�n', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jan Sunavec (#3)
Re: convert function

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Thanks a lot

Lots like nice a easy solution.. I am not sure if this is fast solution..
Many convertions you know.. :-(
Thanks a lot anyway.

If you do this often, use functional index.

Pavel

Show quoted text

John

On Wed, 12 Dec 2007 17:13:01 +0100, Pavel Stehule
<pavel.stehule@gmail.com> wrote:

Hello

It's look like SQL_ASCII support diacritic chars now. First you have
to encode from bytea to text

postgres=# SELECT encode(convert('ján', 'UNICODE',
'SQL_ASCII'),'escape');
encode
--------
ján
(1 row)

you wont
postgres=# SELECT to_ascii(encode(convert_to('ján',
'latin2'),'escape'),'latin2');
to_ascii
----------
jan
(1 row)

Regards
Pavel Stehule

convert do conversion from text to bytea type. For diacritic
elimination use to_ascii function:

postgres=# select to_ascii(convert('Příliš žlutý kůň' using
utf8_to_iso_8859_2),'latin2');
to_ascii
------------------
Prilis zluty kun
(1 row)

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

#5Jan Sunavec
jan.sunavec@gmail.com
In reply to: Pavel Stehule (#4)
tsearch2 headline options

Hi all

I have following problem when I use this

select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more
more more more more more', to_tsquery('asd'), '');

I got this

"<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b>
<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> more
more more"

So result is shorted than original text. I tryed set MinWords and
MaxWords. But it doesn't help me. So question is how can I get original
text?

Best regards

John

#6Oleg Bartunov
oleg@sai.msu.su
In reply to: Jan Sunavec (#5)
Re: tsearch2 headline options

On Wed, 2 Jan 2008, Jan Sunavec wrote:

Hi all

I have following problem when I use this

select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more
more more more more more', to_tsquery('asd'), '');

I got this

"<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b>
<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> more more more"

So result is shorted than original text. I tryed set MinWords and MaxWords.
But it doesn't help me. So question is how can I get original text?

try 'HighlightAll=TRUE

arxiv=# select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more more more more more more', to_tsquery('asd'), 'HighlightAll=TRUE');
headline
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> more more more more more more more

Best regards

John

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#7Jan Sunavec
jan.sunavec@gmail.com
In reply to: Oleg Bartunov (#6)
Re: tsearch2 headline options

Thanks a lot. It helps.

On Fri, 04 Jan 2008 16:54:32 +0100, Oleg Bartunov <oleg@sai.msu.su> wrote:

Show quoted text

On Wed, 2 Jan 2008, Jan Sunavec wrote:

Hi all

I have following problem when I use this

select headline('asd asd asd asd asd asd asd asd asd asd asd asd more
more more more more more more', to_tsquery('asd'), '');

I got this

"<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b>
<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> more
more more"

So result is shorted than original text. I tryed set MinWords and
MaxWords. But it doesn't help me. So question is how can I get original
text?

try 'HighlightAll=TRUE

arxiv=# select headline('asd asd asd asd asd asd asd asd asd asd asd asd
more more more more more more more', to_tsquery('asd'),
'HighlightAll=TRUE');
headline
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b>
<b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> <b>asd</b> more
more more more more more more

Best regards

John

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83