avoid recasting text to tsvector when calculating selectivity

Started by Jan Urbański over 17 years ago, 4 messages
#1 Jan Urbański
j.urbanski@students.mimuw.edu.pl

I'm about to write an oprrest function for the @@ operator. Currently @@
handles multiple cases, like tsvector @@ tsquery, text @@ tsquery,
tsquery @@ tsvector etc. The text @@ text case is for instance handled
by calling to_tsvector and plainto_tsquery on the input arguments.

For a @@ restriction function, I need to have a tsquery and a tsvector,
so in the text @@ text situation I'd end up calling plainto_tsquery
during planning, which would consequently get called again during
execution. Also, I'd need a not-so-elegant if-elsif-elsif sequence at
the beginning of the function. Is this OK/unavoidable/easily avoided?

Cheers,
Jan
--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin
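
Note: the if-elsif-elsif sequence mentioned in the message above might look roughly like the following C sketch. It is only an illustration under stated assumptions: `ArgType`, `PrepSteps` and `prep_for_flavor` are hypothetical names, not PostgreSQL APIs, and the sketch only records which preparation steps a single shared estimator would need for each flavor of @@ named in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type tags for the two operands of @@ (assumption). */
typedef enum { ARG_TEXT, ARG_TSVECTOR, ARG_TSQUERY } ArgType;

/* What one shared estimator would have to do before it can estimate. */
typedef struct
{
    bool swap_args;        /* query is on the left, document on the right */
    bool convert_document; /* text document would need to_tsvector */
    bool convert_query;    /* text query would need plainto_tsquery */
} PrepSteps;

/* Decide the preparation steps for one flavor of @@. */
PrepSteps
prep_for_flavor(ArgType left, ArgType right)
{
    PrepSteps steps = { false, false, false };

    if (left == ARG_TSVECTOR && right == ARG_TSQUERY)
    {
        /* canonical form: nothing to prepare */
    }
    else if (left == ARG_TSQUERY && right == ARG_TSVECTOR)
        steps.swap_args = true;
    else if (left == ARG_TEXT && right == ARG_TSQUERY)
        steps.convert_document = true;
    else if (left == ARG_TEXT && right == ARG_TEXT)
    {
        steps.convert_document = true;
        steps.convert_query = true;
    }
    else
        assert(!"flavor not covered by this sketch");

    return steps;
}
```

For example, `prep_for_flavor(ARG_TEXT, ARG_TEXT)` reports that both conversions are needed, which is the case the message worries about.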

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Urbański (#1)
Re: avoid recasting text to tsvector when calculating selectivity

Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:

> I'm about to write an oprrest function for the @@ operator. Currently @@
> handles multiple cases, like tsvector @@ tsquery, text @@ tsquery,
> tsquery @@ tsvector etc. The text @@ text case is for instance handled
> by calling to_tsvector and plainto_tsquery on the input arguments.
>
> For a @@ restriction function, I need to have a tsquery and a tsvector,
> so in the text @@ text situation I'd end up calling plainto_tsquery
> during planning, which would consequently get called again during
> execution. Also, I'd need a not-so-elegant if-elsif-elsif sequence at
> the beginning of the function. Is this OK/unavoidable/easily avoided?

I'm not following your point here. Sure, there are multiple flavors of
@@, but why shouldn't they each have their own oprrest function?

regards, tom lane

#3 Jan Urbański
j.urbanski@students.mimuw.edu.pl
In reply to: Tom Lane (#2)
Re: avoid recasting text to tsvector when calculating selectivity

Tom Lane wrote:

> Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
>
>> I'm about to write an oprrest function for the @@ operator. Currently @@
>> handles multiple cases, like tsvector @@ tsquery, text @@ tsquery,
>> tsquery @@ tsvector etc. The text @@ text case is for instance handled
>> by calling to_tsvector and plainto_tsquery on the input arguments.
>>
>> For a @@ restriction function, I need to have a tsquery and a tsvector,
>> so in the text @@ text situation I'd end up calling plainto_tsquery
>> during planning, which would consequently get called again during
>> execution. Also, I'd need a not-so-elegant if-elsif-elsif sequence at
>> the beginning of the function. Is this OK/unavoidable/easily avoided?
>
> I'm not following your point here. Sure, there are multiple flavors of
> @@, but why shouldn't they each have their own oprrest function?

Because they'll all boil down to the same function. Suppose I have an
oprrest function for tsvector @@ tsquery. An oprrest for text @@ text
would just be:

    tv = DatumGetTSVector(DirectFunctionCall1(to_tsvector,
                                              PG_GETARG_DATUM(0)));
    tq = DatumGetTSQuery(DirectFunctionCall1(plainto_tsquery,
                                             PG_GETARG_DATUM(1)));
    res = DirectFunctionCall2(my_oprrest, TSVectorGetDatum(tv),
                              TSQueryGetDatum(tq));
    ...

I thought I might avoid having to call to_tsvector and plainto_tsquery,
because the arguments need to be transformed to tsvector and tsquery
anyway during execution.

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin
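
Note: the idea in the message above, one estimator per flavor of @@ with each one delegating to a shared core, can be sketched in plain C without any PostgreSQL headers. Everything here is an assumption made for illustration: `LexemeStat`, `ToyVectorStats`, `ToyQuery` and the `sel_*` functions are hypothetical, the frequency table is made up, and `toy_plainto_query` is only a stand-in for plainto_tsquery that does no parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are PostgreSQL types. */
typedef struct { const char *lexeme; double frequency; } LexemeStat;
typedef struct { const LexemeStat *stats; size_t nstats; } ToyVectorStats;
typedef struct { const char *lexeme; } ToyQuery;

/* Stand-in for plainto_tsquery: wraps a single word, no parsing. */
ToyQuery
toy_plainto_query(const char *text)
{
    ToyQuery q = { text };
    return q;
}

/* Shared core, written once for the canonical tsvector @@ tsquery order. */
double
core_selectivity(ToyVectorStats vec, ToyQuery query)
{
    size_t i;

    assert(query.lexeme != NULL);
    for (i = 0; i < vec.nstats; i++)
        if (strcmp(vec.stats[i].lexeme, query.lexeme) == 0)
            return vec.stats[i].frequency;
    return 0.0;   /* made-up default for lexemes not in the table */
}

/* tsquery @@ tsvector: only the argument order differs. */
double
sel_query_vector(ToyQuery query, ToyVectorStats vec)
{
    return core_selectivity(vec, query);
}

/*
 * text @@ text: convert the query text, then delegate. The document side
 * is represented only by its statistics in this toy.
 */
double
sel_text_text(ToyVectorStats doc_stats, const char *query_text)
{
    return core_selectivity(doc_stats, toy_plainto_query(query_text));
}
```

Each wrapper stays a few lines long, so separate per-flavor functions do not have to mean duplicated estimation logic.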

#4 Jan Urbański
j.urbanski@students.mimuw.edu.pl
In reply to: Jan Urbański (#3)
Re: avoid recasting text to tsvector when calculating selectivity

Jan Urbański wrote:

> Tom Lane wrote:
>
>> Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
>>
>>> I'm about to write an oprrest function for the @@ operator. Currently
>>> @@ handles multiple cases, like tsvector @@ tsquery, text @@ tsquery,
>>> tsquery @@ tsvector etc. The text @@ text case is for instance
>>> handled by calling to_tsvector and plainto_tsquery on the input
>>> arguments.
>>>
>>> For a @@ restriction function, I need to have a tsquery and a
>>> tsvector, so in the text @@ text situation I'd end up calling
>>> plainto_tsquery during planning, which would consequently get called
>>> again during execution. Also, I'd need a not-so-elegant
>>> if-elsif-elsif sequence at the beginning of the function. Is this
>>> OK/unavoidable/easily avoided?
>>
>> I'm not following your point here. Sure, there are multiple flavors of
>> @@, but why shouldn't they each have their own oprrest function?
>
> Because they'll all boil down to the same function. Suppose I have an
> oprrest function for tsvector @@ tsquery. An oprrest for text @@ text
> would just be:
>
>     tv = DatumGetTSVector(DirectFunctionCall1(to_tsvector,
>                                               PG_GETARG_DATUM(0)));
>     tq = DatumGetTSQuery(DirectFunctionCall1(plainto_tsquery,
>                                              PG_GETARG_DATUM(1)));
>     res = DirectFunctionCall2(my_oprrest, TSVectorGetDatum(tv),
>                               TSQueryGetDatum(tq));
>     ...
>
> I thought I might avoid having to call to_tsvector and plainto_tsquery,
> because the arguments need to be transformed to tsvector and tsquery
> anyway during execution.

[thinks...]
OTOH, you often plan a query without executing it, so this doesn't make
sense. OK, please disregard that, I'm just beginning to see the depths
of my misunderstanding of the issue ;)

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin
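
Note: the closing point of the thread, that a query is often planned without being executed, can be made concrete with a toy model. This is a sketch under loud assumptions: `convert_query_text`, `plan_estimate` and `execute_scan` are hypothetical, the selectivity value is made up, and how often the real executor converts its arguments is not modelled. The counter only shows that the planning step needs its own conversion, because execution may never happen (as with a plain EXPLAIN).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts calls to the toy conversion (a stand-in for plainto_tsquery). */
int conversion_calls = 0;

const char *
convert_query_text(const char *text)
{
    conversion_calls++;
    return text;          /* toy: the "tsquery" is just the word itself */
}

/* Planning step: needs a converted query to produce any estimate. */
double
plan_estimate(const char *query_text)
{
    const char *query = convert_query_text(query_text);

    assert(query != NULL);
    return 0.5;           /* made-up selectivity */
}

/* Execution step: converts again, then counts the matching rows. */
int
execute_scan(const char *query_text, const char **rows, size_t nrows)
{
    const char *query = convert_query_text(query_text);
    int         matches = 0;
    size_t      i;

    for (i = 0; i < nrows; i++)
        if (strcmp(rows[i], query) == 0)
            matches++;
    return matches;
}
```

Calling only `plan_estimate` leaves `conversion_calls` at 1 with no rows scanned, so the planner cannot count on execution to supply the converted query.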