Query caching absent "query caching"

Started by Bexley Hallover 13 years ago6 messagesgeneral

bexley401@yahoo.com

over 13 years ago

Hi,

In the absence of query caching AND NOT WANTING TO FORCE
THE APPLICATION TO DO SO EXPLICITLY, I'm looking for ideas
as to how I should "future-safe" the design of some custom
user base types and functions thereon.

Specifically, I have several computationally expensive
functions that derive their results from specific values of
these base types. *Solely*. (For example, area() when
applied to a given "circle" always yields the same result...
though this is a trivial/inexpensive function, by comparison).

I can define the base types to set aside space to store
these results and cache them *in* the base type. Then, serve
up these cached results when they are needed, again. With
plan caching, this should (?) reduce the cost of repeated
queries significantly without the need/benefit for caching the
actual query results. (Is that true?)

To guard against future enhancements to the server (e.g., if
query caching is ever implemented, etc.), I assume that all
such functions should declare themselves as IMMUTABLE? Or,
does my update of the internal representation of the data
values (i.e., to include the cached results of each of these
functions) conflict with this declaration?

[I am working in a fixed, constrained resource environment so
the idea of explicitly building a temporary table to hold these
results in the absence of a smart query caching strategy is not
possible -- the builder(s) of such tables would have to know
too much about each other to not exhaust the resources available!]

Thanks!
--don

Pavel Stehule

pavel.stehule@gmail.com

over 13 years ago

In reply to: Bexley Hall (#1)

Re: Query caching absent "query caching"

Hello

you can try use plperl as cache

http://okbob.blogspot.cz/2007/12/using-shared-as-table-cache-in-plperl.html

Regards

Pavel Stehule

2012/11/25 Bexley Hall <bexley401@yahoo.com>:

Show quoted text

Hi,

In the absence of query caching AND NOT WANTING TO FORCE
THE APPLICATION TO DO SO EXPLICITLY, I'm looking for ideas
as to how I should "future-safe" the design of some custom
user base types and functions thereon.

Specifically, I have several computationally expensive
functions that derive their results from specific values of
these base types. *Solely*. (For example, area() when
applied to a given "circle" always yields the same result...
though this is a trivial/inexpensive function, by comparison).

I can define the base types to set aside space to store
these results and cache them *in* the base type. Then, serve
up these cached results when they are needed, again. With
plan caching, this should (?) reduce the cost of repeated
queries significantly without the need/benefit for caching the
actual query results. (Is that true?)

To guard against future enhancements to the server (e.g., if
query caching is ever implemented, etc.), I assume that all
such functions should declare themselves as IMMUTABLE? Or,
does my update of the internal representation of the data
values (i.e., to include the cached results of each of these
functions) conflict with this declaration?

[I am working in a fixed, constrained resource environment so
the idea of explicitly building a temporary table to hold these
results in the absence of a smart query caching strategy is not
possible -- the builder(s) of such tables would have to know
too much about each other to not exhaust the resources available!]

Thanks!
--don

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 13 years ago

In reply to: Pavel Stehule (#2)

Re: Query caching absent "query caching"

Bexley Hall wrote:

Specifically, I have several computationally expensive
functions that derive their results from specific values of
these base types. *Solely*. (For example, area() when
applied to a given "circle" always yields the same result...
though this is a trivial/inexpensive function, by comparison).

I can define the base types to set aside space to store
these results and cache them *in* the base type. Then, serve
up these cached results when they are needed, again. With
plan caching, this should (?) reduce the cost of repeated
queries significantly without the need/benefit for caching the
actual query results. (Is that true?)

To guard against future enhancements to the server (e.g., if
query caching is ever implemented, etc.), I assume that all
such functions should declare themselves as IMMUTABLE? Or,
does my update of the internal representation of the data
values (i.e., to include the cached results of each of these
functions) conflict with this declaration?

As long as a call to a given function with a specific set of
arguments always returns the same result, and there are no *user
visible* side effects of the internal caching, I don't see a
problem with declaring the functions immutable.

Out of curiosity, are you planning on using a process-local cache
(which would start empty for each new connection) or are you
planning to allocate shared memory somehow and coordinate access to
that?

-Kevin

Import Notes

Resolved by subject fallback

Bexley Hall

bexley401@yahoo.com

over 13 years ago

In reply to: Pavel Stehule (#2)

Re: Query caching absent "query caching"

Hi Pavel,

On 11/24/2012 9:47 PM, Pavel Stehule wrote:

Hello

you can try use plperl as cache

http://okbob.blogspot.cz/2007/12/using-shared-as-table-cache-in-plperl.html

But how is this any different than just creating a named/shared
table manually?

And, how do further/additional accesses (by other clients or
the same client) *augment* the shared table?

In terms of my "application":
- Assume client A does a query that evaluates expensive_function()
for rows 1, 5 and 93
- Client B does a query that evaluates expensive_function() for
rows 3, 5 and 97
- Client C does a query that evaluates expensive_function() for
rows 93, 95 and 97
(no one alters any of the data on which expensive_function() relies
in this time interval)

Then, A should bear the cost of computing the results for 1, 5 and 93.
B should bear the cost of computing 3 and 97 -- but should be able to
benefit from A's computation of 5. C should bear the cost of computing
95 but benefit from the previous computations of 93 and 97.

Thx,
--don

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Bexley Hall

bexley401@yahoo.com

over 13 years ago

In reply to: Kevin Grittner (#3)

Re: Query caching absent "query caching"

Hi Kevin,

On 11/25/2012 8:10 AM, Kevin Grittner wrote:

Bexley Hall wrote:

Specifically, I have several computationally expensive
functions that derive their results from specific values of
these base types. *Solely*. (For example, area() when
applied to a given "circle" always yields the same result...
though this is a trivial/inexpensive function, by comparison).

I can define the base types to set aside space to store
these results and cache them *in* the base type. Then, serve
up these cached results when they are needed, again. With
plan caching, this should (?) reduce the cost of repeated
queries significantly without the need/benefit for caching the
actual query results. (Is that true?)

To guard against future enhancements to the server (e.g., if
query caching is ever implemented, etc.), I assume that all
such functions should declare themselves as IMMUTABLE? Or,
does my update of the internal representation of the data
values (i.e., to include the cached results of each of these
functions) conflict with this declaration?

As long as a call to a given function with a specific set of
arguments always returns the same result, and there are no *user
visible* side effects of the internal caching, I don't see a
problem with declaring the functions immutable.

OK.

Out of curiosity, are you planning on using a process-local cache
(which would start empty for each new connection) or are you
planning to allocate shared memory somehow and coordinate access to
that?

I was planning on writing back the results of each successful
function evaluation into the data type's internal representation.
Ideally, back into PostgreSQL's "master copy" of the data
(though I would settle for hiding it in an anonymous table
behind a view, etc.)

The point is NEVER to have to RE-evaluate any of these functions
for the data on which they are evaluated once they have been
evaluated (assuming the data themselves do not change). And,
in doing so, make the results of each evaluation available to
other clients regardless of the query which caused them to
be evaluated.

Thx,
--don

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Pavel Stehule

pavel.stehule@gmail.com

over 13 years ago

In reply to: Bexley Hall (#4)

Re: Query caching absent "query caching"

2012/11/25 Bexley Hall <bexley401@yahoo.com>:

Hi Pavel,

On 11/24/2012 9:47 PM, Pavel Stehule wrote:

Hello

you can try use plperl as cache

http://okbob.blogspot.cz/2007/12/using-shared-as-table-cache-in-plperl.html

But how is this any different than just creating a named/shared
table manually?

access to memory is faster than access to table - but it is limited.

And, how do further/additional accesses (by other clients or
the same client) *augment* the shared table?

In terms of my "application":
- Assume client A does a query that evaluates expensive_function()
for rows 1, 5 and 93
- Client B does a query that evaluates expensive_function() for
rows 3, 5 and 97
- Client C does a query that evaluates expensive_function() for
rows 93, 95 and 97
(no one alters any of the data on which expensive_function() relies
in this time interval)

Then, A should bear the cost of computing the results for 1, 5 and 93.
B should bear the cost of computing 3 and 97 -- but should be able to
benefit from A's computation of 5. C should bear the cost of computing
95 but benefit from the previous computations of 93 and 97.

depends on implementation - probably you cannot to design a generic
solution, but for some not wide defined tasks, you can find effective
solutions.

Regards

Pavel

Thx,
--don

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general