tsvector extraction patch

Started by Hans-Jürgen Schönig, almost 17 years ago, 7 messages, hackers
#1 Hans-Jürgen Schönig
postgres@cybertec.at

hello,

this patch has not made it through yesterday, so i am trying to send it
again.
i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text
processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure
this is a good patch'));
lex   | rank
--------+------
good   |    8
patch  |    9
pretti |    3
sure   |    4
(4 rows)

many thanks,

hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de

#2 Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Hans-Jürgen Schönig (#1)
Re: tsvector extraction patch

Hans-Juergen Schoenig -- PostgreSQL wrote:

hello,

this patch has not made it through yesterday, so i am trying to send
it again.
i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text
processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty
sure this is a good patch'));
lex   | rank
--------+------
good   |    8
patch  |    9
pretti |    3
sure   |    4
(4 rows)

many thanks,

hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de

Attachments:

tsvcontent-0.1-ctxdiff.diff (text/x-patch; name=tsvcontent-0.1-ctxdiff.diff), +208 -0

#3 Peter Eisentraut
peter_e@gmx.net
In reply to: Hans-Jürgen Schönig (#1)
Re: tsvector extraction patch

On Friday 03 July 2009 10:49:41 Hans-Juergen Schoenig -- PostgreSQL wrote:

hello,

this patch has not made it through yesterday, so i am trying to send it
again.
i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text
processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure
this is a good patch'));
lex   | rank
--------+------
good   |    8
patch  |    9
pretti |    3
sure   |    4
(4 rows)

Sounds useful. But in the interest of orthogonality (or whatever), how about
instead you write a cast from tsvector to text[], and then you can use
unnest() to convert that to a table, e.g.,

SELECT * FROM unnest(CAST(to_tsvector('...') AS text[]));
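
As a rough illustration of the two result shapes being compared (this is not PostgreSQL code, and not how the patch is implemented), here is a small Python model. It assumes the input is the text form that to_tsvector() prints, e.g. 'good':8 'patch':9, and the helper names tsv_rows and tsv_lexemes are made up for this sketch. A plain array of lexemes, which is one way to read the cast suggestion, would carry no positions, while the table form keeps them.

```python
import re

# One entry of a tsvector's text form, e.g. 'pretti':3 or 'cat':1A,5
# (a quote inside a lexeme is written as two quotes).
_ENTRY = re.compile(r"'((?:[^']|'')*)'(?::([0-9A-D,]+))?")


def tsv_rows(tsv_text):
    """Model of the table form: one (lexeme, position) row per position.

    Hypothetical helper; lexemes without positions produce no rows.
    """
    rows = []
    for lexeme, poslist in _ENTRY.findall(tsv_text):
        lexeme = lexeme.replace("''", "'")
        for item in (poslist.split(",") if poslist else []):
            # Drop a trailing weight letter, if any, and keep the position.
            rows.append((lexeme, int(item.rstrip("ABCD"))))
    return rows


def tsv_lexemes(tsv_text):
    """Model of a lexeme-only array: positions are not kept."""
    return [lexeme.replace("''", "'") for lexeme, _ in _ENTRY.findall(tsv_text)]


if __name__ == "__main__":
    text = "'good':8 'patch':9 'pretti':3 'sure':4"
    print(tsv_rows(text))
    print(tsv_lexemes(text))
```

With the tsvector from the example above, tsv_rows() gives the same four lexeme/position pairs as the tsvcontent() output, and tsv_lexemes() gives only the four lexemes.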

#4 Mike Rylander
mrylander@gmail.com
In reply to: Hans-Jürgen Schönig (#1)
Re: tsvector extraction patch

On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig --
PostgreSQL<postgres@cybertec.at> wrote:

hello,

this patch has not made it through yesterday, so i am trying to send it
again.
i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text
processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure
this is a good patch'));
lex   | rank
--------+------
good   |    8
patch  |    9
pretti |    3
sure   |    4
(4 rows)

This looks very useful! I wonder if providing a "weight" column would
be relatively simple? I think this would present problems with the
cast-to-text[] idea that Peter suggests, though.

--
Mike Rylander
| VP, Research and Design
| Equinox Software, Inc. / The Evergreen Experts
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker@esilibrary.com
| web: http://www.esilibrary.com

#5 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Mike Rylander (#4)
Re: tsvector extraction patch

Mike Rylander wrote:

On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig --
PostgreSQL<postgres@cybertec.at> wrote:

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure
this is a good patch'));
lex   | rank
--------+------
good   |    8
patch  |    9
pretti |    3
sure   |    4
(4 rows)

This looks very useful! I wonder if providing a "weight" column would
be relatively simple? I think this would present problems with the
cast-to-text[] idea that Peter suggests, though.

Where would the weight come from?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6 Mike Rylander
mrylander@gmail.com
In reply to: Hans-Jürgen Schönig (#1)
Fwd: tsvector extraction patch

Sorry, forgot to reply-all.

---------- Forwarded message ----------
From: Mike Rylander <mrylander@gmail.com>
Date: Wed, Jul 8, 2009 at 4:17 PM
Subject: Re: [HACKERS] tsvector extraction patch
To: Alvaro Herrera <alvherre@commandprompt.com>

On Wed, Jul 8, 2009 at 3:38 PM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:

Mike Rylander wrote:

On Fri, Jul 3, 2009 at 3:49 AM, Hans-Juergen Schoenig --
PostgreSQL<postgres@cybertec.at> wrote:

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure
this is a good patch'));
lex   | rank
--------+------
good   |    8
patch  |    9
pretti |    3
sure   |    4
(4 rows)

This looks very useful!  I wonder if providing a "weight" column would
be relatively simple?  I think this would present problems with the
cast-to-text[] idea that Peter suggests, though.

Where would the weight come from?

From a tsvector column that has weights set via setweight().
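
To sketch what such a column could look like: in the text form of a tsvector, a position that was given a weight via setweight() is printed with a trailing letter (A, B or C), and the default weight D is not printed. The Python below is only a model of that text form, under the assumption that the input looks like 'good':8A 'sure':4B; the name tsv_rows_with_weight is hypothetical and nothing here is taken from the patch.

```python
import re

# One entry with positions, e.g. 'good':8A or 'cat':1A,5
_ENTRY = re.compile(r"'((?:[^']|'')*)':([0-9A-D,]+)")


def tsv_rows_with_weight(tsv_text):
    """Return (lexeme, position, weight) rows from a tsvector's text form.

    Hypothetical helper: positions without a letter get the default 'D'.
    """
    rows = []
    for lexeme, poslist in _ENTRY.findall(tsv_text):
        lexeme = lexeme.replace("''", "'")
        for item in poslist.split(","):
            if not item:
                continue  # ignore empty pieces from malformed input
            weight = item[-1] if item[-1] in "ABCD" else "D"
            rows.append((lexeme, int(item.rstrip("ABCD")), weight))
    return rows


if __name__ == "__main__":
    for row in tsv_rows_with_weight("'good':8A 'patch':9A 'pretti':3 'sure':4B"):
        print(row)
```

Each row then carries lexeme, position and weight, which is the extra information a lexeme-only array would not have room for.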

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker@esilibrary.com
 | web:  http://www.esilibrary.com

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Hans-Jürgen Schönig (#2)
Re: tsvector extraction patch

On Fri, Jul 3, 2009 at 3:01 AM, Hans-Juergen Schoenig -- PostgreSQL
<postgres@cybertec.at> wrote:

Hans-Juergen Schoenig -- PostgreSQL wrote:

hello,

this patch has not made it through yesterday, so i am trying to send it
again.
i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text
processing and comparison.

test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure
this is a good patch'));
lex   | rank
--------+------
good   |    8
patch  |    9
pretti |    3
sure   |    4
(4 rows)

 many thanks,

    hans

Hmm, looks like we never did anything about this. Hans-Juergen, you
should probably update this and add it to the open CommitFest if you
want it to be considered for 8.5.

https://commitfest.postgresql.org/action/commitfest_view/open

...Robert