convert function

Started by Jan Sunavecover 18 years ago7 messagesgeneral

jan.sunavec@gmail.com

over 18 years ago

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('jďż˝n', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('jďż˝n', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

Pavel Stehule

pavel.stehule@gmail.com

over 18 years ago

In reply to: Jan Sunavec (#1)

Re: convert function

Hello

It's look like SQL_ASCII support diacritic chars now. First you have
to encode from bytea to text

postgres=# SELECT encode(convert('ján', 'UNICODE', 'SQL_ASCII'),'escape');
encode
--------
ján
(1 row)

you wont
postgres=# SELECT to_ascii(encode(convert_to('ján',
'latin2'),'escape'),'latin2');
to_ascii
----------
jan
(1 row)

Regards
Pavel Stehule

convert do conversion from text to bytea type. For diacritic
elimination use to_ascii function:

postgres=# select to_ascii(convert('Příliš žlutý kůň' using
utf8_to_iso_8859_2),'latin2');
to_ascii
------------------
Prilis zluty kun
(1 row)

Show quoted text

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Jan Sunavec

jan.sunavec@gmail.com

over 18 years ago

In reply to: Pavel Stehule (#2)

Re: convert function

Thanks a lot

Lots like nice a easy solution.. I am not sure if this is fast solution..
Many convertions you know.. :-(
Thanks a lot anyway.

John

On Wed, 12 Dec 2007 17:13:01 +0100, Pavel Stehule
<pavel.stehule@gmail.com> wrote:

Show quoted text

Hello

It's look like SQL_ASCII support diacritic chars now. First you have
to encode from bytea to text

postgres=# SELECT encode(convert('jďż˝n', 'UNICODE',
'SQL_ASCII'),'escape');
encode
--------
jďż˝n
(1 row)

you wont
postgres=# SELECT to_ascii(encode(convert_to('jďż˝n',
'latin2'),'escape'),'latin2');
to_ascii
----------
jan
(1 row)

Regards
Pavel Stehule

convert do conversion from text to bytea type. For diacritic
elimination use to_ascii function:

postgres=# select to_ascii(convert('Pďż˝ďż˝liďż˝ ďż˝lutďż˝ kďż˝ďż˝' using
utf8_to_iso_8859_2),'latin2');
to_ascii
------------------
Prilis zluty kun
(1 row)

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('jďż˝n', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('jďż˝n', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Pavel Stehule

pavel.stehule@gmail.com

over 18 years ago

In reply to: Jan Sunavec (#3)

Re: convert function

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Thanks a lot

Lots like nice a easy solution.. I am not sure if this is fast solution..
Many convertions you know.. :-(
Thanks a lot anyway.

If you do this often, use functional index.

Pavel

Show quoted text

John

On Wed, 12 Dec 2007 17:13:01 +0100, Pavel Stehule
<pavel.stehule@gmail.com> wrote:

Hello

It's look like SQL_ASCII support diacritic chars now. First you have
to encode from bytea to text

postgres=# SELECT encode(convert('ján', 'UNICODE',
'SQL_ASCII'),'escape');
encode
--------
ján
(1 row)

you wont
postgres=# SELECT to_ascii(encode(convert_to('ján',
'latin2'),'escape'),'latin2');
to_ascii
----------
jan
(1 row)

Regards
Pavel Stehule

convert do conversion from text to bytea type. For diacritic
elimination use to_ascii function:

postgres=# select to_ascii(convert('Příliš žlutý kůň' using
utf8_to_iso_8859_2),'latin2');
to_ascii
------------------
Prilis zluty kun
(1 row)

On 12/12/2007, Jan Sunavec <jan.sunavec@gmail.com> wrote:

Hi all

I have problem with "convert" function. Previous behaviour was
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
=======================================
jan

In postgresql 8.3 is quite new behaviour.
SELECT convert('ján', 'UNICODE', 'SQL_ASCII');
======================================
"j\241n"

This, drives me crazy. I mean, this is not useable for non english
country. I don't need convert to \241 characters. I understand that
someone need this behavour. But there should be possibility switch to
"normal" behaviour.

John

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Jan Sunavec

jan.sunavec@gmail.com

over 18 years ago

In reply to: Pavel Stehule (#4)

tsearch2 headline options

Hi all

I have following problem when I use this

select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more
more more more more more', to_tsquery('asd'), '');

I got this

"asd asd asd asd asd asd
asd asd asd asd asd asd more
more more"

So result is shorted than original text. I tryed set MinWords and
MaxWords. But it doesn't help me. So question is how can I get original
text?

Best regards

John

Oleg Bartunov

oleg@sai.msu.su

over 18 years ago

In reply to: Jan Sunavec (#5)

Re: tsearch2 headline options

On Wed, 2 Jan 2008, Jan Sunavec wrote:

Hi all

I have following problem when I use this

select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more
more more more more more', to_tsquery('asd'), '');

I got this

"asd asd asd asd asd asd asd
asd asd asd asd asd more more more"

So result is shorted than original text. I tryed set MinWords and MaxWords.
But it doesn't help me. So question is how can I get original text?

try 'HighlightAll=TRUE

arxiv=# select headline('asd asd asd asd asd asd asd asd asd asd asd asd more more more more more more more', to_tsquery('asd'), 'HighlightAll=TRUE');
headline
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
asd asd asd asd asd asd asd asd asd asd asd asd more more more more more more more

Best regards

John

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Jan Sunavec

jan.sunavec@gmail.com

over 18 years ago

In reply to: Oleg Bartunov (#6)

Re: tsearch2 headline options

Thanks a lot. It helps.

On Fri, 04 Jan 2008 16:54:32 +0100, Oleg Bartunov <oleg@sai.msu.su> wrote:

Show quoted text

On Wed, 2 Jan 2008, Jan Sunavec wrote:

Hi all

I have following problem when I use this

select headline('asd asd asd asd asd asd asd asd asd asd asd asd more
more more more more more more', to_tsquery('asd'), '');

I got this

"asd asd asd asd asd asd
asd asd asd asd asd asd more
more more"

So result is shorted than original text. I tryed set MinWords and
MaxWords. But it doesn't help me. So question is how can I get original
text?

try 'HighlightAll=TRUE

arxiv=# select headline('asd asd asd asd asd asd asd asd asd asd asd asd
more more more more more more more', to_tsquery('asd'),
'HighlightAll=TRUE');
headline
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
asd asd asd asd asd asd
asd asd asd asd asd asd more
more more more more more more

Best regards

John

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83