patch for space around the FragmentDelimiter
FragmentDelimiter is an argument for ts_headline function to separates
different headline fragments. The default delimiter is " ... ".
Currently if someone specifies the delimiter as an option to the
function, no extra space is added around the delimiter. However, it does
not look good without space around the delimter.
Since the option parsing function removes any space around the given
value, it is not possible to add any desired space. The attached patch
adds space when a FragmentDelimiter is specified.
QUERY:
SELECT ts_headline('english', '
Day after day, day after day,
We stuck, nor breath nor motion,
As idle as a painted Ship
Upon a painted Ocean.
Water, water, every where
And all the boards did shrink;
Water, water, every where,
Nor any drop to drink.
S. T. Coleridge (1772-1834)
', to_tsquery('english', 'Coleridge & stuck'),
'MaxFragments=2,FragmentDelimiter=***');
OLD RESULT
ts_headline
--------------------------------------------
after day, day after day,
We <b>stuck</b>, nor breath nor motion,
As idle as a painted Ship
Upon a painted Ocean.
Water, water, every where
And all the boards did shrink;
Water, water, every where***drop to drink.
S. T. <b>Coleridge</b>
(1 row)
NEW RESULT after the patch
ts_headline
----------------------------------------------
after day, day after day,
We <b>stuck</b>, nor breath nor motion,
As idle as a painted Ship
Upon a painted Ocean.
Water, water, every where
And all the boards did shrink;
Water, water, every where *** drop to drink.
S. T. <b>Coleridge</b>
Attachments:
fragment_delimiter.patchtext/x-patch; charset=UTF-8; name=fragment_delimiter.patchDownload
Index: src/backend/tsearch/wparser_def.c
===================================================================
RCS file: /home/sushant/devel/pgrep/pgsql/src/backend/tsearch/wparser_def.c,v
retrieving revision 1.20
diff -c -r1.20 wparser_def.c
*** src/backend/tsearch/wparser_def.c 15 Jan 2009 16:33:59 -0000 1.20
--- src/backend/tsearch/wparser_def.c 2 Mar 2009 06:00:02 -0000
***************
*** 2082,2087 ****
--- 2082,2088 ----
int shortword = 3;
int max_fragments = 0;
int highlight = 0;
+ int len;
ListCell *l;
/* config */
***************
*** 2105,2111 ****
else if (pg_strcasecmp(defel->defname, "StopSel") == 0)
prs->stopsel = pstrdup(val);
else if (pg_strcasecmp(defel->defname, "FragmentDelimiter") == 0)
! prs->fragdelim = pstrdup(val);
else if (pg_strcasecmp(defel->defname, "HighlightAll") == 0)
highlight = (pg_strcasecmp(val, "1") == 0 ||
pg_strcasecmp(val, "on") == 0 ||
--- 2106,2116 ----
else if (pg_strcasecmp(defel->defname, "StopSel") == 0)
prs->stopsel = pstrdup(val);
else if (pg_strcasecmp(defel->defname, "FragmentDelimiter") == 0)
! {
! len = strlen(val) + 2 + 1;/* 2 for spaces and 1 for end of string */
! prs->fragdelim = palloc(len * sizeof(char));
! snprintf(prs->fragdelim, len, " %s ", val);
! }
else if (pg_strcasecmp(defel->defname, "HighlightAll") == 0)
highlight = (pg_strcasecmp(val, "1") == 0 ||
pg_strcasecmp(val, "on") == 0 ||
Index: src/test/regress/expected/tsearch.out
===================================================================
RCS file: /home/sushant/devel/pgrep/pgsql/src/test/regress/expected/tsearch.out,v
retrieving revision 1.15
diff -c -r1.15 tsearch.out
*** src/test/regress/expected/tsearch.out 17 Oct 2008 18:05:19 -0000 1.15
--- src/test/regress/expected/tsearch.out 2 Mar 2009 02:02:38 -0000
***************
*** 624,630 ****
<body>
<b>Sea</b> view wow <u><b>foo</b> bar</u> <i>qq</i>
<a href="http://www.google.com/foo.bar.html" target="_blank">YES </a>
! ff-bg
<script>
document.write(15);
</script>
--- 624,630 ----
<body>
<b>Sea</b> view wow <u><b>foo</b> bar</u> <i>qq</i>
<a href="http://www.google.com/foo.bar.html" target="_blank">YES </a>
! ff-bg
<script>
document.write(15);
</script>
***************
*** 712,726 ****
Nor any drop to drink.
S. T. Coleridge (1772-1834)
', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2,FragmentDelimiter=***');
! ts_headline
! --------------------------------------------
after day, day after day,
We <b>stuck</b>, nor breath nor motion,
As idle as a painted Ship
Upon a painted Ocean.
Water, water, every where
And all the boards did shrink;
! Water, water, every where***drop to drink.
S. T. <b>Coleridge</b>
(1 row)
--- 712,726 ----
Nor any drop to drink.
S. T. Coleridge (1772-1834)
', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2,FragmentDelimiter=***');
! ts_headline
! ----------------------------------------------
after day, day after day,
We <b>stuck</b>, nor breath nor motion,
As idle as a painted Ship
Upon a painted Ocean.
Water, water, every where
And all the boards did shrink;
! Water, water, every where *** drop to drink.
S. T. <b>Coleridge</b>
(1 row)
Sushant Sinha <sushant354@gmail.com> writes:
FragmentDelimiter is an argument for ts_headline function to separates
different headline fragments. The default delimiter is " ... ".
Currently if someone specifies the delimiter as an option to the
function, no extra space is added around the delimiter. However, it does
not look good without space around the delimter.
Maybe not to you, for the particular delimiter you happen to be working
with, but it doesn't follow that spaces are always appropriate.
Since the option parsing function removes any space around the given
value, it is not possible to add any desired space. The attached patch
adds space when a FragmentDelimiter is specified.
I think this is a pretty bad idea. Better would be to document how to
get spaces into the delimiter, ie, use double quotes:
... FragmentDelimiter = " ... " ...
Hmm, actually, it looks to me that the documentation already shows this,
in the example of the default values.
regards, tom lane
yeah you are right. I did not know that you can pass space using double
quotes.
-Sushant.
Show quoted text
On Sun, 2009-03-01 at 20:49 -0500, Tom Lane wrote:
Sushant Sinha <sushant354@gmail.com> writes:
FragmentDelimiter is an argument for ts_headline function to separates
different headline fragments. The default delimiter is " ... ".
Currently if someone specifies the delimiter as an option to the
function, no extra space is added around the delimiter. However, it does
not look good without space around the delimter.Maybe not to you, for the particular delimiter you happen to be working
with, but it doesn't follow that spaces are always appropriate.Since the option parsing function removes any space around the given
value, it is not possible to add any desired space. The attached patch
adds space when a FragmentDelimiter is specified.I think this is a pretty bad idea. Better would be to document how to
get spaces into the delimiter, ie, use double quotes:... FragmentDelimiter = " ... " ...
Hmm, actually, it looks to me that the documentation already shows this,
in the example of the default values.regards, tom lane