tsearch2 regression test failures

Started by Magnus Hagander, almost 19 years ago, 8 messages
#1 Magnus Hagander
magnus@hagander.net
1 attachment(s)

tsearch2 regression tests are also failing on win32/msvc, with attached
diffs.

Any pointers on where to start? ;)

//Magnus

Attachments:

regression.diffs (text/plain; name=regression.diffs)
*** ./expected/tsearch2.out	2006-09-10 19:36:52.000000000 +0200
--- ./results/tsearch2.out	2007-03-24 15:03:01.593750000 +0100
***************
*** 799,806 ****
  /usr/local/fff /awdf/dwqe/4325 rewt/ewr wefjn /wqe-324/ewr gist.h gist.h.c gist.c. readline 4.2 4.2. 4.2, readline-4.2 readline-4.2. 234 
  <i <b> wow  < jqw <> qwerty');
                                                                                                                                                                                                                                                                                                                                                                                                                               to_tsvector                                                                                                                                                                                                                                                                                                                                                                                                                              
! ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  'ad':17 'dw':19 'jf':39 '234':63 '345':1 '4.2':54,55,56,59,62 '455':31 'jqw':66 'qwe':2,18,27,28,35 'wer':36 'wow':65 'asdf':37 'ewr1':43 'qwer':38 'sdjk':40 '5.005':32 'efd.r':3 'ewri2':44 'hjwer':42 'qwqwe':29 'wefjn':48 'gist.c':52 'gist.h':50 'qwerti':67 '234.435':30 'qwe-wer':34 'readlin':53,58,61 'www.com':4 '+4.0e-10':26 'gist.h.c':51 'rewt/ewr':47 '/?ad=qwe&dw':7,10,14,22 '/wqe-324/ewr':49 'aew.werc.ewr':6 'readline-4.2':57,60 '1aew.werc.ewr':9 '2aew.werc.ewr':11 '3aew.werc.ewr':13 '4aew.werc.ewr':15 '/usr/local/fff':45 '/awdf/dwqe/4325':46 'teodor@stack.net':33 '/?ad=qwe&dw=%20%32':25 '5aew.werc.ewr:8100':16 '6aew.werc.ewr:8100':21 '7aew.werc.ewr:8100':24 'aew.werc.ewr/?ad=qwe&dw':5 '1aew.werc.ewr/?ad=qwe&dw':8 '3aew.werc.ewr/?ad=qwe&dw':12 '6aew.werc.ewr:8100/?ad=qwe&dw':20 '7aew.werc.ewr:8100/?ad=qwe&dw=%20%32':23
  (1 row)
  
  SELECT length(to_tsvector('default', '345 qw'));
--- 799,806 ----
  /usr/local/fff /awdf/dwqe/4325 rewt/ewr wefjn /wqe-324/ewr gist.h gist.h.c gist.c. readline 4.2 4.2. 4.2, readline-4.2 readline-4.2. 234 
  <i <b> wow  < jqw <> qwerty');
                                                                                                                                                                                                                                                                                                                                                                                                                                       to_tsvector                                                                                                                                                                                                                                                                                                                                                                                                                                     
! -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  'i':64 'ad':17 'dw':19 'jf':39 'we':41 '234':63 '345':1 '4.2':54,55,56,59,62 '455':31 'jqw':66 'qwe':2,18,27,28,35 'wer':36 'wow':65 'asdf':37 'ewr1':43 'qwer':38 'sdjk':40 '5.005':32 'efd.r':3 'ewri2':44 'hjwer':42 'qwqwe':29 'wefjn':48 'gist.c':52 'gist.h':50 'qwerti':67 '234.435':30 'qwe-wer':34 'readlin':53,58,61 'www.com':4 '+4.0e-10':26 'gist.h.c':51 'rewt/ewr':47 '/?ad=qwe&dw':7,10,14,22 '/wqe-324/ewr':49 'aew.werc.ewr':6 'readline-4.2':57,60 '1aew.werc.ewr':9 '2aew.werc.ewr':11 '3aew.werc.ewr':13 '4aew.werc.ewr':15 '/usr/local/fff':45 '/awdf/dwqe/4325':46 'teodor@stack.net':33 '/?ad=qwe&dw=%20%32':25 '5aew.werc.ewr:8100':16 '6aew.werc.ewr:8100':21 '7aew.werc.ewr:8100':24 'aew.werc.ewr/?ad=qwe&dw':5 '1aew.werc.ewr/?ad=qwe&dw':8 '3aew.werc.ewr/?ad=qwe&dw':12 '6aew.werc.ewr:8100/?ad=qwe&dw':20 '7aew.werc.ewr:8100/?ad=qwe&dw=%20%32':23
  (1 row)
  
  SELECT length(to_tsvector('default', '345 qw'));
***************
*** 814,820 ****
  <i <b> wow  < jqw <> qwerty'));
   length 
  --------
!      51
  (1 row)
  
  select to_tsquery('default', 'qwe & sKies '); 
--- 814,820 ----
  <i <b> wow  < jqw <> qwerty'));
   length 
  --------
!      53
  (1 row)
  
  select to_tsquery('default', 'qwe & sKies '); 
***************
*** 831,868 ****
  
  select to_tsquery('default', '''the wether'':dc & ''           sKies '':BC ');
         to_tsquery       
! ------------------------
!  'wether':CD & 'sky':BC
  (1 row)
  
  select to_tsquery('default', 'asd&(and|fghj)');
     to_tsquery   
! ----------------
!  'asd' & 'fghj'
  (1 row)
  
  select to_tsquery('default', '(asd&and)|fghj');
     to_tsquery   
! ----------------
!  'asd' | 'fghj'
  (1 row)
  
  select to_tsquery('default', '(asd&!and)|fghj');
     to_tsquery   
! ----------------
!  'asd' | 'fghj'
  (1 row)
  
  select to_tsquery('default', '(the|and&(i&1))&fghj');
    to_tsquery  
! --------------
!  '1' & 'fghj'
  (1 row)
  
  select plainto_tsquery('default', 'the and z 1))& fghj');
    plainto_tsquery   
! --------------------
!  'z' & '1' & 'fghj'
  (1 row)
  
  select plainto_tsquery('default', 'foo bar') && plainto_tsquery('default', 'asd');
--- 831,868 ----
  
  select to_tsquery('default', '''the wether'':dc & ''           sKies '':BC ');
              to_tsquery             
! -----------------------------------
!  'the':CD & 'wether':CD & 'sky':BC
  (1 row)
  
  select to_tsquery('default', 'asd&(and|fghj)');
           to_tsquery         
! ----------------------------
!  'asd' & ( 'and' | 'fghj' )
  (1 row)
  
  select to_tsquery('default', '(asd&and)|fghj');
         to_tsquery       
! ------------------------
!  'asd' & 'and' | 'fghj'
  (1 row)
  
  select to_tsquery('default', '(asd&!and)|fghj');
         to_tsquery        
! -------------------------
!  'asd' & !'and' | 'fghj'
  (1 row)
  
  select to_tsquery('default', '(the|and&(i&1))&fghj');
                 to_tsquery               
! ----------------------------------------
!  ( 'the' | 'and' & 'i' & '1' ) & 'fghj'
  (1 row)
  
  select plainto_tsquery('default', 'the and z 1))& fghj');
            plainto_tsquery           
! ------------------------------------
!  'the' & 'and' & 'z' & '1' & 'fghj'
  (1 row)
  
  select plainto_tsquery('default', 'foo bar') && plainto_tsquery('default', 'asd');
***************
*** 2227,2234 ****
   9h        |    1 |      1
   9r        |    1 |      1
   9w        |    1 |      1
   qwerti    |    1 |      1
! (1146 rows)
  
  insert into test_tsvector values ('1', 'a:1a,2,3b b:5a,6a,7c,8');
  insert into test_tsvector values ('1', 'a:1a,2,3c b:5a,6b,7c,8b');
--- 2227,2236 ----
   9h        |    1 |      1
   9r        |    1 |      1
   9w        |    1 |      1
+  over      |    1 |      1
   qwerti    |    1 |      1
!  the       |    1 |      1
! (1148 rows)
  
  insert into test_tsvector values ('1', 'a:1a,2,3b b:5a,6a,7c,8');
  insert into test_tsvector values ('1', 'a:1a,2,3c b:5a,6b,7c,8b');
***************
*** 2262,2270 ****
   bar       |    1 |      2
   345       |    1 |      1
   b         |    1 |      1
   qq        |    1 |      1
   qwerti    |    1 |      1
! (8 rows)
  
  select * from stat('select a from test_tsvector','ad') order by ndoc desc, nentry desc, word;
     word    | ndoc | nentry 
--- 2264,2274 ----
   bar       |    1 |      2
   345       |    1 |      1
   b         |    1 |      1
+  over      |    1 |      1
   qq        |    1 |      1
   qwerti    |    1 |      1
!  the       |    1 |      1
! (10 rows)
  
  select * from stat('select a from test_tsvector','ad') order by ndoc desc, nentry desc, word;
     word    | ndoc | nentry 
***************
*** 2275,2283 ****
   foo       |    1 |      3
   bar       |    1 |      2
   345       |    1 |      1
   qq        |    1 |      1
   qwerti    |    1 |      1
! (8 rows)
  
  select reset_tsearch();
  NOTICE:  TSearch cache cleaned
--- 2279,2289 ----
   foo       |    1 |      3
   bar       |    1 |      2
   345       |    1 |      1
+  over      |    1 |      1
   qq        |    1 |      1
   qwerti    |    1 |      1
!  the       |    1 |      1
! (10 rows)
  
  select reset_tsearch();
  NOTICE:  TSearch cache cleaned
***************
*** 2344,2351 ****
  Upon a woman s face. E.  J.  Pratt  (1882 1964)
  '), to_tsquery('sea&thousand&years'));
                                                                                               get_covers                                                                                             
! ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  eros took {1 sea thousand year }1 {2 thousand year trace granit featur cliff crag scarp base took sea }2 hour one night hour storm place sculptur granit seam upon woman face e j pratt 1882 1964 
  (1 row)
  
  select get_covers(to_tsvector('Erosion It took the sea a thousand years,
--- 2350,2357 ----
  Upon a woman s face. E.  J.  Pratt  (1882 1964)
  '), to_tsquery('sea&thousand&years'));
                                                                                                                                   get_covers                                                                                                                                  
! -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  eros it took the {1 sea a thousand year }1 a {2 thousand year to trace the granit featur of this cliff in crag and scarp and base it took the sea }2 an hour one night an hour of storm to place the sculptur of these granit seam upon a woman s face e j pratt 1882 1964 
  (1 row)
  
  select get_covers(to_tsvector('Erosion It took the sea a thousand years,
***************
*** 2358,2365 ****
  Upon a woman s face. E.  J.  Pratt  (1882 1964)
  '), to_tsquery('granite&sea'));
                                                                                                  get_covers                                                                                                
! ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  eros took {1 sea thousand year thousand year trace {2 granit }1 featur cliff crag scarp base took {3 sea }2 hour one night hour storm place sculptur granit }3 seam upon woman face e j pratt 1882 1964 
  (1 row)
  
  select get_covers(to_tsvector('Erosion It took the sea a thousand years,
--- 2364,2371 ----
  Upon a woman s face. E.  J.  Pratt  (1882 1964)
  '), to_tsquery('granite&sea'));
                                                                                                                                      get_covers                                                                                                                                     
! -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  eros it took the {1 sea a thousand year a thousand year to trace the {2 granit }1 featur of this cliff in crag and scarp and base it took the {3 sea }2 an hour one night an hour of storm to place the sculptur of these granit }3 seam upon a woman s face e j pratt 1882 1964 
  (1 row)
  
  select get_covers(to_tsvector('Erosion It took the sea a thousand years,
***************
*** 2372,2379 ****
  Upon a woman s face. E.  J.  Pratt  (1882 1964)
  '), to_tsquery('sea'));
                                                                                               get_covers                                                                                             
! ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  eros took {1 sea }1 thousand year thousand year trace granit featur cliff crag scarp base took {2 sea }2 hour one night hour storm place sculptur granit seam upon woman face e j pratt 1882 1964 
  (1 row)
  
  select headline('Erosion It took the sea a thousand years,
--- 2378,2385 ----
  Upon a woman s face. E.  J.  Pratt  (1882 1964)
  '), to_tsquery('sea'));
                                                                                                                                   get_covers                                                                                                                                  
! -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
!  eros it took the {1 sea }1 a thousand year a thousand year to trace the granit featur of this cliff in crag and scarp and base it took the {2 sea }2 an hour one night an hour of storm to place the sculptur of these granit seam upon a woman s face e j pratt 1882 1964 
  (1 row)
  
  select headline('Erosion It took the sea a thousand years,
***************
*** 2461,2467 ****
  ---------+----------+-------------+------------+-----------+--------------
   default | lword    | Latin word  | Tsearch    | {en_stem} | 'tsearch'
   default | lword    | Latin word  | module     | {en_stem} | 'modul'
!  default | lword    | Latin word  | for        | {en_stem} | 
   default | lword    | Latin word  | PostgreSQL | {en_stem} | 'postgresql'
   default | version  | VERSION     | 7.3.3      | {simple}  | '7.3.3'
  (5 rows)
--- 2467,2473 ----
  ---------+----------+-------------+------------+-----------+--------------
   default | lword    | Latin word  | Tsearch    | {en_stem} | 'tsearch'
   default | lword    | Latin word  | module     | {en_stem} | 'modul'
!  default | lword    | Latin word  | for        | {en_stem} | 'for'
   default | lword    | Latin word  | PostgreSQL | {en_stem} | 'postgresql'
   default | version  | VERSION     | 7.3.3      | {simple}  | '7.3.3'
  (5 rows)
***************
*** 2481,2487 ****
   f        | 
   f        | 
   f        | '345':1 'qwerti':2 'copyright':3
!  f        | 'qq':7 'bar':2,8 'foo':1,3,6 'copyright':9
   f        | 'a':1A,2,3B 'b':5A,6A,7C,8
   f        | 'a':1A,2,3C 'b':5A,6B,7C,8B
   f        | '7w' 'ch' 'd7' 'eo' 'gw' 'i4' 'lq' 'o6' 'qt' 'y0'
--- 2487,2493 ----
   f        | 
   f        | 
   f        | '345':1 'qwerti':2 'copyright':3
!  f        | 'qq':7 'bar':2,8 'foo':1,3,6 'the':4 'over':5 'copyright':9
   f        | 'a':1A,2,3B 'b':5A,6A,7C,8
   f        | 'a':1A,2,3C 'b':5A,6B,7C,8B
   f        | '7w' 'ch' 'd7' 'eo' 'gw' 'i4' 'lq' 'o6' 'qt' 'y0'

======================================================================

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#1)
Re: tsearch2 regression test failures

Magnus Hagander <magnus@hagander.net> writes:

tsearch2 regression tests are also failing on win32/msvc, with attached
diffs.
Any pointers on where to start? ;)

FWIW, it looks like it failed to reject stopwords. Is it possible you
ran it in an environment that would make it pick the Russian stopword
list?

regards, tom lane

#3 Teodor Sigaev
teodor@sigaev.ru
In reply to: Tom Lane (#2)
1 attachment(s)
Re: tsearch2 regression test failures

FWIW, it looks like it failed to reject stopwords. Is it possible you

Right.

I suppose the problem is with '\r\n'... Try attached patch.
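A minimal sketch of why a '\r\n' line ending would defeat the stopword check, assuming the lookup is an exact byte comparison. The names chop_last_byte and is_stopword and the tiny word list are hypothetical, not the tsearch2 code; only the one-byte chop mirrors the line removed by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the pre-patch behaviour: drop exactly one trailing byte.
 * The length guard is an addition for this sketch. */
void chop_last_byte(char *buf)
{
    size_t len;

    assert(buf != NULL);
    len = strlen(buf);
    if (len > 0)
        buf[len - 1] = '\0';
}

/* Hypothetical stand-in for the stopword lookup: exact comparison
 * against a small illustrative list. */
bool is_stopword(const char *word)
{
    static const char *const stopwords[] = {"the", "and", "for", "over"};
    size_t i;

    assert(word != NULL);
    for (i = 0; i < sizeof(stopwords) / sizeof(stopwords[0]); i++)
        if (strcmp(word, stopwords[i]) == 0)
            return true;
    return false;
}
```

With an LF-terminated line, "the\n" becomes "the" and matches. With a CRLF-terminated line, "the\r\n" becomes "the\r", which matches nothing, so the word is never rejected as a stopword.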
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

ttttext/plain; name=tttDownload
*** ./contrib/tsearch2/stopword.c.orig	Mon Mar 26 14:25:16 2007
--- ./contrib/tsearch2/stopword.c	Mon Mar 26 14:28:25 2007
***************
*** 47,53 ****
  
  		while (fgets(buf, sizeof(buf), hin))
  		{
! 			buf[strlen(buf) - 1] = '\0';
  			pg_verifymbstr(buf, strlen(buf), false);
  			if (*buf == '\0')
  				continue;
--- 47,57 ----
  
  		while (fgets(buf, sizeof(buf), hin))
  		{
! 			pbuf = buf;
! 			while( !isspace( *pbuf ) )
! 				pbuf++;
! 			*pbuf = '\0';
! 
  			pg_verifymbstr(buf, strlen(buf), false);
  			if (*buf == '\0')
  				continue;
#4 Magnus Hagander
magnus@hagander.net
In reply to: Teodor Sigaev (#3)
Re: tsearch2 regression test failures

On Mon, Mar 26, 2007 at 02:32:26PM +0400, Teodor Sigaev wrote:

FWIW, it looks like it failed to reject stopwords. Is it possible you

Right.

I suppose the problem is with '\r\n'... Try attached patch.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

*** ./contrib/tsearch2/stopword.c.orig	Mon Mar 26 14:25:16 2007
--- ./contrib/tsearch2/stopword.c	Mon Mar 26 14:28:25 2007
***************
*** 47,53 ****
  		while (fgets(buf, sizeof(buf), hin))
  		{
! 			buf[strlen(buf) - 1] = '\0';
  			pg_verifymbstr(buf, strlen(buf), false);
  			if (*buf == '\0')
  				continue;
--- 47,57 ----
  
  		while (fgets(buf, sizeof(buf), hin))
  		{
! 			pbuf = buf;
! 			while( !isspace( *pbuf ) )
! 				pbuf++;
! 			*pbuf = '\0';
! 
  			pg_verifymbstr(buf, strlen(buf), false);
  			if (*buf == '\0')
  				continue;

Yup, that solved the problem, thanks.

Wouldn't it be more efficient to walk the string backwards until
!isspace instead? Not sure that it matters at all, but then you'll
normally never step over more than two bytes...
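A rough sketch of the backwards walk suggested here, assuming a NUL-terminated line as returned by fgets. The function name trim_trailing_space is hypothetical, and the cast to unsigned char is an addition for safety with isspace, not something taken from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Trim trailing whitespace by walking backwards from the end of the
 * line; for a normal line this steps over at most "\r\n". */
void trim_trailing_space(char *buf)
{
    size_t len;

    assert(buf != NULL);
    len = strlen(buf);
    while (len > 0 && isspace((unsigned char) buf[len - 1]))
        len--;
    buf[len] = '\0';
}
```

Note that this is not strictly equivalent to the forward scan in the patch: the forward scan cuts the line at the first whitespace character, while this version keeps any whitespace embedded inside the line.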

//Magnus

#5 Teodor Sigaev
teodor@sigaev.ru
In reply to: Magnus Hagander (#4)
Re: tsearch2 regression test failures

Yup, that solved the problem, thanks.

I'll commit an extended patch - there is one more place with the same bug.

Wouldn't it be more efficient to walk the string backwards until
!isspace instead? Not sure that it matters at all, but then you'll
normally never step over more than two bytes...

It doesn't matter significantly - the file is read once per backend lifetime.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#6 Magnus Hagander
magnus@hagander.net
In reply to: Teodor Sigaev (#5)
Re: tsearch2 regression test failures

Yup, that solved the problem, thanks.

I'll commit an extended patch - there is one more place with the same bug.

Ok, thanks.

Wouldn't it be more efficient to walk the string backwards until
!isspace instead? Not sure that it matters at all, but then you'll
normally never step over more than two bytes...

It doesn't matter significantly - the file is read once per backend lifetime.

ok.

/Magnus

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Teodor Sigaev (#3)
Re: tsearch2 regression test failures

Teodor Sigaev <teodor@sigaev.ru> writes:

! 			pbuf = buf;
! 			while( !isspace( *pbuf ) )
! 				pbuf++;
! 			*pbuf = '\0';

Surely the loop needs to look like

	while (*pbuf && !isspace(*pbuf))
		pbuf++;
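
The reason the extra test matters: isspace('\0') is false, so a line with no whitespace at all (for example a last line without a trailing newline) would let the unguarded loop run past the terminator. A self-contained sketch of the guarded loop follows; the wrapper name cut_at_first_space is hypothetical and the unsigned char cast is an addition, while pbuf follows the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Cut the line at the first whitespace character, stopping at the
 * terminating NUL if the line contains no whitespace. */
void cut_at_first_space(char *buf)
{
    char *pbuf = buf;

    assert(buf != NULL);
    while (*pbuf && !isspace((unsigned char) *pbuf))
        pbuf++;
    *pbuf = '\0';
}
```

For "the\r\n" the loop stops at '\r'; for "the" with no line ending it stops at the terminator and the final assignment is a harmless rewrite of the existing NUL.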

regards, tom lane

#8 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#7)
Re: tsearch2 regression test failures

Tom Lane wrote:

Teodor Sigaev <teodor@sigaev.ru> writes:

! 			pbuf = buf;
! 			while( !isspace( *pbuf ) )
! 				pbuf++;
! 			*pbuf = '\0';

Surely the loop needs to look like

	while (*pbuf && !isspace(*pbuf))
		pbuf++;

Yes.

But in any case, I am having difficulty in understanding why we are
seeing a CR at all - the file should be opened in text mode, which
should translate CR-LF in the file to a simple LF in the buffer. So
regardless of the odd behavior of CVSNT, which presumably caused this
mess, it's rather strange.
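
One way to check whether a given fopen mode actually strips the CR is a small probe like the one below. This is only a sketch: the function name cr_survives and the scratch-file approach are hypothetical, and it makes no claim about how the backend opens the stopword file. CRLF translation in text mode is implementation-defined; POSIX systems do not translate at all, while the Windows C runtime does so for streams that are really in text mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Write "the\r\n" in binary mode to a scratch file, read the line back
 * with the given fopen mode, and report whether the CR reached the
 * buffer. The scratch file is removed afterwards. */
bool cr_survives(const char *path, const char *read_mode)
{
    char buf[64];
    FILE *f;
    bool got_line;

    assert(path != NULL && read_mode != NULL);

    f = fopen(path, "wb");
    if (f == NULL)
        return false;
    fputs("the\r\n", f);
    fclose(f);

    f = fopen(path, read_mode);
    if (f == NULL)
        return false;
    got_line = fgets(buf, sizeof(buf), f) != NULL;
    fclose(f);
    remove(path);

    return got_line && strchr(buf, '\r') != NULL;
}
```

Calling it with "rb" should always report that the CR survives; calling it with "r" shows what the local runtime does in its default mode, which is the behavior in question here.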

Can someone please explain?

cheers

andrew