how to avoid repeating expensive computation in select

Started by Bob Priceabout 15 years ago3 messagesgeneral

rjp_email@yahoo.com

about 15 years ago

I have been searching through the docs and mailing list and haven't found a way to do this, so I thought I would ask the community.

I would like to know if there is a way in PostgreSQL to avoid repeating an expensive computation in a SELECT where the result is needed both as a returned value and as an expression in the WHERE clause.

As a simple example, consider the following query on a table with 'id' and 'value' columns, and an expensive computation represented as a function:

SELECT id, expensivefunc(value) AS score FROM mytable
WHERE id LIKE '%z%' AND expensivefunc(value) > 0.5;

It would be great if I could find a way to only compute expensivefunc(value) at most once per row, and not at all if the other WHERE constraints are not satisfied.

For this simple case I know that I could rewrite the SELECT as something like the following:

WITH other_where AS (
SELECT id, value FROM mytable WHERE id LIKE '%z%'
), calc_scores AS (
SELECT id, expensivefunc(value) AS score FROM other_where
)
SELECT id, score from calc_scores WHERE score > 0.5;

This works in this simple case, but my guess is that it probably adds a lot of overhead (is this true?), and I also have to deal with much more complicated scenarios with multiple expensive calculations that may not fit into this kind of rewrite.

Does anyone know of a simpler way to accomplish this?

For example, it would be great if there were a function that could reference the Nth select list item so it is only computed once, like:

SELECT id, expensivefunc(value) AS score FROM mytable
WHERE id LIKE '%z%' AND sel_list_item(2) > 0.5;

or if there were temporary variables in the WHERE expressions like:

SELECT id, tmp1 AS score FROM mytable
WHERE id LIKE '%z%' AND (tmp1 = expensivefunc(value)) > 0.5;

Any ideas anyone!

Thanks in advance!
Bob

Orhan Kavrakoglu

orhan@tart.com.tr

about 15 years ago

In reply to: Bob Price (#1)

Re: how to avoid repeating expensive computation in select

On 2011-02-03 18:07, Bob Price wrote:

I would like to know if there is a way in PostgreSQL to avoid repeating an expensive computation in a SELECT where the result is needed both as a returned value and as an expression in the WHERE clause.

I think I've seen it said here that PG avoids redundant multiple
calculations of an expression.

Even so, have you thought about using subqueries?

SELECT id, expensivefunc(value) AS score FROM mytable
WHERE id LIKE '%z%' AND expensivefunc(value)> 0.5;

SELECT id, expensivefunc(value) FROM (
(SELECT id, value FROM mytable WHERE id LIKE '%z%')
) WHERE expensivefunc(value) > 0.5;

or even

SELECT id, score FROM (
SELECT id, expensivefunc(value) AS score FROM (
(SELECT id, value FROM mytable WHERE id LIKE '%z%')
)
) WHERE score > 0.5

--
Orhan Kavrakoğlu
orhan@tart.com.tr

Tart New Media
w : http://www.tart.com.tr
t : +90 212 263 0 666 / ext: 142
f : TBA
a : TBA

Merlin Moncure

mmoncure@gmail.com

about 15 years ago

In reply to: Orhan Kavrakoglu (#2)

Re: how to avoid repeating expensive computation in select

On Tue, Mar 1, 2011 at 2:51 AM, Orhan Kavrakoglu <orhan@tart.com.tr> wrote:

I would like to know if there is a way in PostgreSQL to avoid repeating an
expensive computation in a SELECT where the result is needed both as a
returned value and as an expression in the WHERE clause.

I think I've seen it said here that PG avoids redundant multiple
calculations of an expression.

Even so, have you thought about using subqueries?

SELECT id, expensivefunc(value) AS score FROM mytable
WHERE id LIKE '%z%' AND expensivefunc(value)> 0.5;

SELECT id, expensivefunc(value) FROM (
(SELECT id, value FROM mytable WHERE id LIKE '%z%')
) WHERE expensivefunc(value) > 0.5;

or even

SELECT id, score FROM (
SELECT id, expensivefunc(value) AS score FROM (
(SELECT id, value FROM mytable WHERE id LIKE '%z%')
)
) WHERE score > 0.5

you missed the point: even when you use subqueries postgres can inline
them, 'unsubquerying' your query. I think the OP nailed probably the
best and most logical approach -- use a CTE. It's more formal, and
while not super efficient today, isn't terrible.

merlin