Query inside RTF

Started by Leonardo M. Raméover 16 years ago9 messagesgeneral
Jump to latest
#1Leonardo M. Ramé
l.rame@griensu.com

Hi, in my database I store RTF files inside a Bytea field. Now a
customer is requesting a search inside RTF fields and I'm trying to
implement it by issuing this query:

select
*
from my_table
where
cast(rtf_field as varchar) like '%condition%'

This works ok when the condition doesn't include accented chars. RTF
escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.

To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
doesn't.

Any hint on this?

--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Leonardo M. Ramé (#1)
Re: Query inside RTF

2009/10/8 Leonardo M. <l.rame@griensu.com>:

Hi, in my database I store RTF files inside a Bytea field. Now a
customer is requesting a search inside RTF fields and I'm trying to
implement it by issuing this query:

select
 *
from my_table
where
 cast(rtf_field as varchar) like '%condition%'

This works ok when the condition doesn't include accented chars. RTF
escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.

To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
doesn't.

Any hint on this?

first, try dollar quoting:
cast(rtf_field as varchar) like $merlin$%condition%$merlin$

:-)

merlin

#3Leonardo M. Ramé
l.rame@griensu.com
In reply to: Merlin Moncure (#2)
Re: Query inside RTF

On jue, 2009-10-08 at 11:28 -0400, Merlin Moncure wrote:

2009/10/8 Leonardo M. <l.rame@griensu.com>:

Hi, in my database I store RTF files inside a Bytea field. Now a
customer is requesting a search inside RTF fields and I'm trying to
implement it by issuing this query:

select
*
from my_table
where
cast(rtf_field as varchar) like '%condition%'

This works ok when the condition doesn't include accented chars. RTF
escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.

To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
doesn't.

Any hint on this?

first, try dollar quoting:
cast(rtf_field as varchar) like $merlin$%condition%$merlin$

:-)

merlin

Thanks Merlin, I found the solution, it was related to
"standard_conforming_strings". My query is this now:

set standard_conforming_strings = 1;

SELECT
idturno,
Upper(cast(InfRes as Varchar))
as InfRes
from turno
where
cast(InfRes as Varchar) like '%diagn\\\\''f3stica%';

--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979

#4Sam Mason
sam@samason.me.uk
In reply to: Leonardo M. Ramé (#1)
Re: Query inside RTF

On Thu, Oct 08, 2009 at 10:58:05AM -0300, Leonardo M. Rammm wrote:

cast(rtf_field as varchar) like '%condition%'

I don't think that cast isn't doing what you expect at all. As an
example, try doing something like:

select cast(bytea e'hi\nbye' as varchar);

encode() may be what you want instead. How do RTF files handle
encoding? You may have better luck with using a specialized program to
pull the text out of the RTF file and then using the text search stuff
in PG. LIKE conditions can't be indexed well and hence the above is
going to be slow for any reasonable number of documents.

--
Sam http://samason.me.uk/

#5Emanuel Calvo Franco
postgres.arg@gmail.com
In reply to: Leonardo M. Ramé (#1)
Re: Query inside RTF

select
 *
from my_table
where
 cast(rtf_field as varchar) like '%condition%'

This works ok when the condition doesn't include accented chars. RTF
escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.

To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
doesn't.

I prefer use

select * from table where i_bytea::text ~~ $$%\\row%$$;

Dollar quoting is more clean to put strings inside than ' '. ~~ is
like operator.
And :: operand is more clean to the syntax of the query.
Don't think so? :)

--
Emanuel Calvo Franco
DBA at: www.siu.edu.ar
www.emanuelcalvofranco.com.ar

#6Leonardo M. Ramé
l.rame@griensu.com
In reply to: Emanuel Calvo Franco (#5)
Re: Query inside RTF

On jue, 2009-10-08 at 17:01 -0300, Emanuel Calvo Franco wrote:

select
*
from my_table
where
cast(rtf_field as varchar) like '%condition%'

This works ok when the condition doesn't include accented chars. RTF
escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.

To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
doesn't.

I prefer use

select * from table where i_bytea::text ~~ $$%\\row%$$;

Dollar quoting is more clean to put strings inside than ' '. ~~ is
like operator.
And :: operand is more clean to the syntax of the query.
Don't think so? :)

It looks nice!, the only problem is I'm still forced to set
standard_conforming_strings = 1;

--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979

#7Emanuel Calvo Franco
postgres.arg@gmail.com
In reply to: Leonardo M. Ramé (#6)
Re: Query inside RTF

2009/10/8 Leonardo M. <l.rame@griensu.com>:

On jue, 2009-10-08 at 17:01 -0300, Emanuel Calvo Franco wrote:

select
 *
from my_table
where
 cast(rtf_field as varchar) like '%condition%'

This works ok when the condition doesn't include accented chars. RTF
escapes accented characters as "\'f1" for í, "\'f3" for ó, and so on.

To escape \ and ', I'd read it shuld be used \\ and '', so I thaugth
that a like '%diagn\\\\''f3stica%' should get "diagnóstica", but it
doesn't.

I prefer use

select * from table where i_bytea::text ~~ $$%\\row%$$;

Dollar quoting is more clean to put strings inside than ' '. ~~ is
like operator.
And :: operand is more clean to the syntax of the query.
Don't think so? :)

It looks nice!, the only problem is I'm still forced to set
standard_conforming_strings = 1;

part=# set standard_conforming_strings = 0;
SET
part=# select * from p where i::text ~~ $$%\\\\f3%$$;
i
--------
\\\\f3
\\\\f3
(2 rows)

part=# select * from p where i::text ~~ $$%\\\\"f3%$$;
i
---------
\\\\"f3
(1 row)

part=# select * from p where i::text ~~ $$%\\\\''f3%$$;
i
----------
\\\\''f3
(1 row)

I don't have any problem, wich enconding you are using?
(select getdatabaseencoding(); )

--
Emanuel Calvo Franco
DBA at: www.siu.edu.ar
www.emanuelcalvofranco.com.ar

#8Leonardo M. Ramé
l.rame@griensu.com
In reply to: Emanuel Calvo Franco (#7)
Re: Query inside RTF

On jue, 2009-10-08 at 17:51 -0300, Emanuel Calvo Franco wrote:

part=# select * from p where i::text ~~ $$%\\\\''f3%$$;
i
----------
\\\\''f3
(1 row)

I don't have any problem, wich enconding you are using?
(select getdatabaseencoding(); )

Thanks, now this works:

set standard_conforming_strings = 0;

select
idturno,
infres::text
from turno
where
infres::text ~~ $$%diagn\\\\'f3stico%$$;

This database has WIN1252 encoding.

--
Leonardo M. Ramé
Griensu S.A. - Medical IT Córdoba
Tel.: 0351-4247979

#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Leonardo M. Ramé (#8)
Re: Query inside RTF

Leonardo M. Ram� escribi�:

select
idturno,
infres::text
from turno
where
infres::text ~~ $$%diagn\\\\'f3stico%$$;

Just make sure you don't have $$ in the search pattern ...

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support