regexp_split_to_array hangs backend

Started by Pavel Stehule, over 18 years ago, 2 messages, hackers
#1 Pavel Stehule
pavel.stehule@gmail.com

Hello,

I found a small bug:

regexp_split_to_array('123456','1');
regexp_split_to_array('123456','6');
regexp_split_to_array('123456','.');

These parameters hang the backend.

The following patch corrects it.

Regards
Pavel Stehule

./regexp.c
*** ./regexp.c.orig     2007-08-10 14:17:15.000000000 +0200
--- ./regexp.c  2007-08-10 14:19:36.000000000 +0200
***************
*** 1048,1053 ****
--- 1048,1056 ----
                {
                        int length = splitctx->match.rm_so - startpos + 1;
+                       /* set the offset to the end of this match for next time */
+                       splitctx->offset = pmatch->rm_eo;
+
                        /*
                         * If we are trying to match at the beginning of the string and
                         * we got a zero-length match, or if we just matched where we
***************
*** 1063,1070 ****

                                Int32GetDatum(startpos),
                                Int32GetDatum(length));

-                       /* set the offset to the end of this match for next time */
-                       splitctx->offset = pmatch->rm_eo;

                        return result;
                }
--- 1066,1071 ----
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#1)
Re: regexp_split_to_array hangs backend

"Pavel Stehule" <pavel.stehule@gmail.com> writes:

> I found a small bug:
>
> regexp_split_to_array('123456','1');
> regexp_split_to_array('123456','6');
> regexp_split_to_array('123456','.');
>
> These parameters hang the backend.

This code's got more problems than that :-(

The one that's bothering me right now is that regexp_match() and
regexp_split() cache a compiled regex on first entry to the function,
and then blithely assume it will still be there on repeated calls.

I think probably the best thing to do is do all the matching on the
first call, and have the saved state include an array of character
positions of matches; then repeat calls to the SRF just iterate through
the array.
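
A minimal sketch of that idea in plain C, using POSIX <regex.h> rather than the backend's regex code. SplitState, split_init, split_next, split_free and count_pieces are hypothetical names for illustration, not anything in regexp.c, and for simplicity the sketch ignores zero-length matches altogether. The point is only that the compiled regex is used and freed inside the first call, and later calls touch nothing but the saved positions:

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical saved state: every match position, collected up front. */
typedef struct {
    const char *subject;
    regmatch_t *matches;   /* absolute offsets into subject */
    size_t      nmatches;
    size_t      next;      /* index of the next match to consume */
    size_t      start;     /* where the next piece begins */
    int         done;
} SplitState;

/* "First call": run the regex over the whole string, then free it. */
static int split_init(SplitState *st, const char *pattern, const char *subject)
{
    regex_t re;
    size_t  len = strlen(subject), off = 0, cap = 8;

    memset(st, 0, sizeof(*st));
    st->subject = subject;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    st->matches = malloc(cap * sizeof(regmatch_t));
    if (st->matches == NULL) { regfree(&re); return -1; }

    while (off <= len) {
        regmatch_t m;
        size_t     so, eo;

        if (regexec(&re, subject + off, 1, &m, off > 0 ? REG_NOTBOL : 0) != 0)
            break;
        so = off + (size_t) m.rm_so;
        eo = off + (size_t) m.rm_eo;
        if (so == eo) {          /* zero-length match: skip it, but always advance */
            off = eo + 1;
            continue;
        }
        if (st->nmatches == cap) {
            regmatch_t *grown = realloc(st->matches, 2 * cap * sizeof(regmatch_t));
            if (grown == NULL) { regfree(&re); free(st->matches); st->matches = NULL; return -1; }
            st->matches = grown;
            cap *= 2;
        }
        st->matches[st->nmatches].rm_so = (regoff_t) so;
        st->matches[st->nmatches].rm_eo = (regoff_t) eo;
        st->nmatches++;
        off = eo;                /* the offset moves past the match on every iteration */
    }
    regfree(&re);                /* no compiled regex survives the first call */
    return 0;
}

/* "Repeat call": hand back the next piece using only the saved positions. */
static int split_next(SplitState *st, const char **piece, size_t *piece_len)
{
    assert(st != NULL && piece != NULL && piece_len != NULL);
    if (st->done)
        return 0;
    *piece = st->subject + st->start;
    if (st->next < st->nmatches) {
        *piece_len = (size_t) st->matches[st->next].rm_so - st->start;
        st->start = (size_t) st->matches[st->next].rm_eo;
        st->next++;
    } else {                     /* trailing piece after the last match */
        *piece_len = strlen(st->subject) - st->start;
        st->done = 1;
    }
    return 1;
}

static void split_free(SplitState *st)
{
    free(st->matches);
    st->matches = NULL;
}

/* Convenience driver: number of pieces produced, or -1 on error. */
static int count_pieces(const char *pattern, const char *subject)
{
    SplitState  st;
    const char *piece;
    size_t      piece_len;
    int         n = 0;

    if (split_init(&st, pattern, subject) != 0)
        return -1;
    while (split_next(&st, &piece, &piece_len))
        n++;
    split_free(&st);
    return n;
}
```

With this shape, the three cases from the report terminate: count_pieces("1", "123456") and count_pieces("6", "123456") each yield two pieces, one of them empty, and count_pieces(".", "123456") yields seven empty pieces.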

It seems a bit short of comments too. Working on it now.

regards, tom lane