Re: [PATCHES] Changes in /contrib/fulltextindex

Started by Christopher Kings-Lynneover 23 years ago15 messages
#1Christopher Kings-Lynne
chriskl@familyhealth.com.au

Hi Florian,

The most recent patches were submitted by me, so I guess you

could call me

the defacto "maintainer".

Okay - glad someone answered me :)

Actually, I replied to you 5 minutes after you posted, but I think my emails
were being stalled somewhere...

I will - please give me a few days for an up to date documentation
concerning the changed and new features.

And yes - I really appreciate your offer for code review!

To generate the diff, do this:

cd contrib/fulltextindex
cvs diff -c > ftidiff.txt

Then email -hackers the ftidiff.txt.

The changes made include:

+ Changed the split up behaviour from checking via isalpha to
using a list of delimiters as isalpha is a pain used with
data containing german umlauts, etc. ATM this list contains:

" ,;.:-_#/*+~^�!?\"\\�$%&()[]{}=<>|0123456789\n\r\t@�"

Good idea. Is there a locale-aware version of isalpha anywhere?

If there is - I couldn't find it. I did find a lot of frustated
posts about
isalpha and locale-awareness although.

Yeah, I can't find anything in the man pages either. Maybe we can ask the
list. People?

List: what should we do about the backward compatibility problem?

I think the only reasonable way for keeping backward compatibiliy might be
to leave the old fti function alone and introduce a new one with
the changes
(e.g. ftia). Even another fti parameter which activates the new features
breaks the compatibility concerning the call. Activiation via DEFINE is
another option, but this requires messing around with the source code
(although very little) on the user side. Maybe a ./configure option is a
good way (but this is beyond my C and friends skills).

I think that creating a new function, called ftia or ftix or something is
the best solution. I think I can handle doing that...

Chris

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#1)
Re: [PATCHES] Changes in /contrib/fulltextindex

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

Good idea. Is there a locale-aware version of isalpha anywhere?

If there is - I couldn't find it. I did find a lot of frustated
posts about isalpha and locale-awareness although.

Yeah, I can't find anything in the man pages either. Maybe we can ask the
list. People?

Huh? isalpha() *is* locale-aware according to the ANSI C spec.
For instance, the attached test program finds 52 alpha characters
in C locale and 114 in fr_FR locale under HPUX.

I am not at all sure that this aspect of Florian's change is a good
idea, as it appears to eliminate locale-awareness in favor of a hard
coded delimiter list.

regards, tom lane

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <locale.h>

int main(int argc, char **argv)
{
int i;

setlocale(LC_ALL, "");

for (i = 0; i < 256; i++)
if (isalpha(i))
printf("%d %c\n", i, i);

return 0;
}

#3Florian Helmberger
f.helmberger@uptime.at
In reply to: Tom Lane (#2)
Re: [PATCHES] Changes in /contrib/fulltextindex

Hi.

Huh? isalpha() *is* locale-aware according to the ANSI C spec.
For instance, the attached test program finds 52 alpha characters
in C locale and 114 in fr_FR locale under HPUX.

I am not at all sure that this aspect of Florian's change is a good
idea, as it appears to eliminate locale-awareness in favor of a hard
coded delimiter list.

Just tried your example - you're right of course! I will remove the hard
coded delimited list and replace it with the proper calls as shown in the
code you've sent.

Florian

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Florian Helmberger (#3)
Re: [PATCHES] Changes in /contrib/fulltextindex

"Florian Helmberger" <f.helmberger@uptime.at> writes:

Just tried your example - you're right of course! I will remove the hard
coded delimited list and replace it with the proper calls as shown in the
code you've sent.

Well, that was a quick hack not clean code. Coding rules for stuff
inside the backend are
- don't do setlocale; it's already been done.
- explicitly cast the argument of any ctype.h macro to
(unsigned char).

Without the latter you have portability problems depending on whether
chars are signed or unsigned.

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#1)
Re: [PATCHES] Changes in /contrib/fulltextindex

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

I think that creating a new function, called ftia or ftix or something is
the best solution. I think I can handle doing that...

Why change the name? If it's got a different argument list then you
can just overload the same old name.

regards, tom lane

#6Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Florian Helmberger (#3)
Re: [PATCHES] Changes in /contrib/fulltextindex

I am not at all sure that this aspect of Florian's change is a good
idea, as it appears to eliminate locale-awareness in favor of a hard
coded delimiter list.

Just tried your example - you're right of course! I will remove the hard
coded delimited list and replace it with the proper calls as shown in the
code you've sent.

OK Florian, submit a diff with your changes and I'll give them a run.

I forgot that we could just overload functions with different parameter
lists! That sounds like a good idea.

Chris

#7Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Florian Helmberger (#3)
Re: [PATCHES] Changes in /contrib/fulltextindex

Florian, I haven't seen this patch yet. Did you send it in?

---------------------------------------------------------------------------

Florian Helmberger wrote:

Hi.

Huh? isalpha() *is* locale-aware according to the ANSI C spec.
For instance, the attached test program finds 52 alpha characters
in C locale and 114 in fr_FR locale under HPUX.

I am not at all sure that this aspect of Florian's change is a good
idea, as it appears to eliminate locale-awareness in favor of a hard
coded delimiter list.

Just tried your example - you're right of course! I will remove the hard
coded delimited list and replace it with the proper calls as shown in the
code you've sent.

Florian

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#8Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Bruce Momjian (#7)
Re: [PATCHES] Changes in /contrib/fulltextindex

Yeah, I've got it Bruce - I still haven't had time to look into it and I
really don't know what to do about the backward compatibility issue. How do
I set up 2 identically named C functions with different parameter lists?

Chris

Show quoted text

-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Friday, 12 July 2002 6:02 AM
To: Florian Helmberger
Cc: Tom Lane; Christopher Kings-Lynne; Hackers
Subject: Re: [HACKERS] [PATCHES] Changes in /contrib/fulltextindex

Florian, I haven't seen this patch yet. Did you send it in?

------------------------------------------------------------------
---------

Florian Helmberger wrote:

Hi.

Huh? isalpha() *is* locale-aware according to the ANSI C spec.
For instance, the attached test program finds 52 alpha characters
in C locale and 114 in fr_FR locale under HPUX.

I am not at all sure that this aspect of Florian's change is a good
idea, as it appears to eliminate locale-awareness in favor of a hard
coded delimiter list.

Just tried your example - you're right of course! I will remove the hard
coded delimited list and replace it with the proper calls as

shown in the

code you've sent.

Florian

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#9Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Christopher Kings-Lynne (#8)
Re: [PATCHES] Changes in /contrib/fulltextindex

Christopher Kings-Lynne wrote:

Yeah, I've got it Bruce - I still haven't had time to look into it and I
really don't know what to do about the backward compatibility issue. How do
I set up 2 identically named C functions with different parameter lists?

Oh, that is easy. When you CREATE FUNCTION, you just specify the
different params. However, if you are calling it _from_ C, then it is
impossible. Just break backward compatibility, I think was Tom's
suggestion, and I agree.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Bruce Momjian (#9)
Re: [PATCHES] Changes in /contrib/fulltextindex

Christopher Kings-Lynne wrote:

Yeah, I've got it Bruce - I still haven't had time to look into it and I
really don't know what to do about the backward compatibility

issue. How do

I set up 2 identically named C functions with different parameter lists?

Oh, that is easy. When you CREATE FUNCTION, you just specify the
different params. However, if you are calling it _from_ C, then it is
impossible. Just break backward compatibility, I think was Tom's
suggestion, and I agree.

I mean, can I code up 2 functions called "fti" and put them both in the
fti.c and then have them both in the fti.so? Then when CREATE FUNCTION is
run it will link to the correct function in the fti.so depending on the
parameter list?

It's easy for you guys to say "break backward", but you aren't using it ;)

Chris

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Christopher Kings-Lynne (#10)
Re: [PATCHES] Changes in /contrib/fulltextindex

Christopher Kings-Lynne wrote:

Christopher Kings-Lynne wrote:

Yeah, I've got it Bruce - I still haven't had time to look into it and I
really don't know what to do about the backward compatibility

issue. How do

I set up 2 identically named C functions with different parameter lists?

Oh, that is easy. When you CREATE FUNCTION, you just specify the
different params. However, if you are calling it _from_ C, then it is
impossible. Just break backward compatibility, I think was Tom's
suggestion, and I agree.

I mean, can I code up 2 functions called "fti" and put them both in the
fti.c and then have them both in the fti.so? Then when CREATE FUNCTION is
run it will link to the correct function in the fti.so depending on the
parameter list?

Call them different C names, but name them the same in CREATE FUNCTION
funcname. Just use a different symbol name here:

CREATE [ OR REPLACE ] FUNCTION name ( [ argtype [, ...] ] )
^^^^ same here
RETURNS rettype
AS 'obj_file', 'link_symbol'
^^^^^^^^^^^^^ different here
LANGUAGE langname
[ WITH ( attribute [, ...] ) ]

Does that help?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Bruce Momjian (#11)
Re: [PATCHES] Changes in /contrib/fulltextindex

Call them different C names, but name them the same in CREATE FUNCTION
funcname. Just use a different symbol name here:

CREATE [ OR REPLACE ] FUNCTION name ( [ argtype [, ...] ] )
^^^^ same here
RETURNS rettype
AS 'obj_file', 'link_symbol'
^^^^^^^^^^^^^ different here
LANGUAGE langname
[ WITH ( attribute [, ...] ) ]

Does that help?

Yes, I get it now - I should be able to set it up quite nicely.

Chris

#13Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Christopher Kings-Lynne (#12)
Re: [PATCHES] Changes in /contrib/fulltextindex

Christopher Kings-Lynne wrote:

Call them different C names, but name them the same in CREATE FUNCTION
funcname. Just use a different symbol name here:

CREATE [ OR REPLACE ] FUNCTION name ( [ argtype [, ...] ] )
^^^^ same here
RETURNS rettype
AS 'obj_file', 'link_symbol'
^^^^^^^^^^^^^ different here
LANGUAGE langname
[ WITH ( attribute [, ...] ) ]

Does that help?

Yes, I get it now - I should be able to set it up quite nicely.

Yea, this function overloading is a nifty feature. No wonder C++ has
it.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#14Florian Helmberger
f.helmberger@uptime.at
In reply to: Bruce Momjian (#7)
Re: [PATCHES] Changes in /contrib/fulltextindex

Hi.

Florian, I haven't seen this patch yet. Did you send it in?

Yes, I sent it to Christopher for reviewing, as allready mentioned by
himself :)
I still had not the time to update the docs though, hope to get this done
next week.

Florian

#15Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Florian Helmberger (#14)
Re: [PATCHES] Changes in /contrib/fulltextindex

Florian Helmberger wrote:

Hi.

Florian, I haven't seen this patch yet. Did you send it in?

Yes, I sent it to Christopher for reviewing, as allready mentioned by
himself :)
I still had not the time to update the docs though, hope to get this done
next week.

Yes, I had an email exchange with Christopher last night and he is
working on the backward compatibility issues with overloaded function
parameters.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026