v7.4.1 text_position() patch

Started by Korea PostgreSQL Users' Groupalmost 22 years ago9 messages
#1Korea PostgreSQL Users' Group
pgsql-kr@postgresql.or.kr

In src/backend/utils/adt/varlena.c,
766 line must be exits in block of 'else if (elm >1)' too.

Because, strpos() function make a wrong result in multibyte string.
line 796
------------
ps1 = p1 = (pg_wchar *) palloc((len1 + 1) * sizeof(pg_wchar));
(void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t1), p1, len1);
len1 = pg_wchar_strlen(p1);
ps2 = p2 = (pg_wchar *) palloc((len2 + 1) * sizeof(pg_wchar));
(void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t2), p2, len2);
len2 = pg_wchar_strlen(p2);

/*** recalculate px ****/
px = (len1 - len2);

for (p = 0; p <= px; p++)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Korea PostgreSQL Users' Group (#1)
Re: v7.4.1 text_position() patch

"Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes:

In src/backend/utils/adt/varlena.c,
766 line must be exits in block of 'else if (elm >1)' too.
Because, strpos() function make a wrong result in multibyte string.

Hm. I don't think it can actually fail, because the wchar strings are
zero-terminated. But it does look like there's a missed speedup here.
(Tatsuo, do you agree?)

Thanks for the report!

regards, tom lane

#3Korea PostgreSQL Users' Group
pgsql-kr@postgresql.or.kr
In reply to: Korea PostgreSQL Users' Group (#1)
Re: v7.4.1 text_position() patch

strpos() function ( internal text_postion()) had a bug in unicode database.

dsn=> select id,subject, strpos(subject, ' ') from bd_22 where id = 3927;
id | subject | strpos
------+-------------+--------
3927 | 안녕하세요~ | 0
(1 row)

Time: 1.619 ms
dsn=> select id,subject, strpos(subject, ' ') from bd_22 where id between 3925 and 3927;
id | subject | strpos
------+-----------------------------------------+--------
3925 | 대구에 DB 스터디 관련한...곳..없는가요? | 4
3927 | 안녕하세요~ | 11
(2 rows)

Time: 2.490 ms
----
Sorry, above text is korean language.
strpos returns wrong result.

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr>
Cc: <pgsql-patches@postgresql.org>; "Tatsuo Ishii" <t-ishii@sra.co.jp>
Sent: Saturday, January 31, 2004 12:31 AM
Subject: Re: [PATCHES] v7.4.1 text_position() patch

Show quoted text

Hm. I don't think it can actually fail, because the wchar strings are
zero-terminated. But it does look like there's a missed speedup here.
(Tatsuo, do you agree?)

Thanks for the report!

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Korea PostgreSQL Users' Group (#3)
Re: v7.4.1 text_position() patch

"Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes:

Hm. I don't think it can actually fail, because the wchar strings are
zero-terminated.

[ yes it can ]

You're right. I was confused at first because I couldn't reproduce the
problem, but then I realized it's because I'm running with
CLOBBER_FREED_MEMORY enabled, so the junk beyond the end of the string
won't match the other string.

Will commit the patch. Thanks.

regards, tom lane

#5Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tom Lane (#4)
Re: [PATCHES] v7.4.1 text_position() patch

"Korea PostgreSQL Users' Group" <pgsql-kr@postgresql.or.kr> writes:

Hm. I don't think it can actually fail, because the wchar strings are
zero-terminated.

[ yes it can ]

You're right. I was confused at first because I couldn't reproduce the
problem, but then I realized it's because I'm running with
CLOBBER_FREED_MEMORY enabled, so the junk beyond the end of the string
won't match the other string.

Will commit the patch. Thanks.

regards, tom lane

It's surprising that nobody noticed the bug until now. It seems it has
been there since 7.3 days. I would like to make a back patch for
7.3-stable if nobody objects.

BTW, I'm interested in Korea PostgreSQL Users' Group since I myself am
a bord member of Japan PostgreSQL Users' Group. Please let me know if
both users group could make some collaboration, such as having a
seminar in Korea or in Japan.
--
Tatsuo Ishii

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#5)
Re: [PATCHES] v7.4.1 text_position() patch

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

It's surprising that nobody noticed the bug until now. It seems it has
been there since 7.3 days. I would like to make a back patch for
7.3-stable if nobody objects.

No objection here. Note that I applied a minimal patch to the 7.4
branch, but a more extensive one with some cosmetic changes in HEAD.
You probably want to copy the 7.4 change to 7.3.

regards, tom lane

#7Joe Conway
mail@joeconway.com
In reply to: Tatsuo Ishii (#5)
1 attachment(s)
Re: [PATCHES] v7.4.1 text_position() patch

Tatsuo Ishii wrote:

It's surprising that nobody noticed the bug until now. It seems it has
been there since 7.3 days. I would like to make a back patch for
7.3-stable if nobody objects.

It's my bug :( -- sorry about that. Here's a 7.3 patch per Tom's nearby
advice. I'll apply if you'd like.

Joe

Attachments:

text_position-bug.7.3.patchtext/plain; name=text_position-bug.7.3.patchDownload
Index: src/backend/utils/adt/varlena.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/adt/varlena.c,v
retrieving revision 1.92.2.2
diff -c -r1.92.2.2 varlena.c
*** src/backend/utils/adt/varlena.c	30 Nov 2003 20:52:37 -0000	1.92.2.2
--- src/backend/utils/adt/varlena.c	31 Jan 2004 16:50:37 -0000
***************
*** 665,673 ****
  	len1 = (VARSIZE(t1) - VARHDRSZ);
  	len2 = (VARSIZE(t2) - VARHDRSZ);
  
- 	/* no use in searching str past point where search_str will fit */
- 	px = (len1 - len2);
- 
  	if (eml == 1)				/* simple case - single byte encoding */
  	{
  		char	   *p1,
--- 665,670 ----
***************
*** 676,681 ****
--- 673,681 ----
  		p1 = VARDATA(t1);
  		p2 = VARDATA(t2);
  
+ 		/* no use in searching str past point where search_str will fit */
+ 		px = (len1 - len2);
+ 
  		for (p = 0; p <= px; p++)
  		{
  			if ((*p2 == *p1) && (strncmp(p1, p2, len2) == 0))
***************
*** 702,707 ****
--- 702,710 ----
  		ps2 = p2 = (pg_wchar *) palloc((len2 + 1) * sizeof(pg_wchar));
  		(void) pg_mb2wchar_with_len((unsigned char *) VARDATA(t2), p2, len2);
  		len2 = pg_wchar_strlen(p2);
+ 
+ 		/* no use in searching str past point where search_str will fit */
+ 		px = (len1 - len2);
  
  		for (p = 0; p <= px; p++)
  		{
#8Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Joe Conway (#7)
Re: [PATCHES] v7.4.1 text_position() patch

Tatsuo Ishii wrote:

It's surprising that nobody noticed the bug until now. It seems it has
been there since 7.3 days. I would like to make a back patch for
7.3-stable if nobody objects.

It's my bug :( -- sorry about that. Here's a 7.3 patch per Tom's nearby
advice. I'll apply if you'd like.

Joe

Thanks. Please apply it.
--
Tatsuo Ishii

#9Joe Conway
mail@joeconway.com
In reply to: Tatsuo Ishii (#8)
Re: [HACKERS] v7.4.1 text_position() patch

Tatsuo Ishii wrote:

Thanks. Please apply it.

Applied to REL7_3_STABLE.

Thanks,

Joe