Suggestions please: names for function cachability attributes
Since I'm about to have to edit pg_proc.h to add a namespace column,
I thought this would be a good time to revise the current proiscachable
column into the three-way cachability distinction we've discussed
before. But I need some names for the values, and I'm not satisfied
with the ideas I've had so far.
To refresh people's memory: what we want is to be able to distinguish
between functions that are:
1. Strictly cachable (a/k/a constant-foldable): given fixed input
values, the same result value will always be produced, for ever and
ever, amen. Examples: addition operator, sin(x). Given a call
of such a function with all-constant input values, the system is
entitled to fold the function call to a constant on sight.
2. Cachable within a single command: given fixed input values, the
result will not change if the function were to be repeatedly evaluated
within a single SQL command; but the result could change over time.
Examples: now(); datetime-related operations that depend on the current
timezone (or other SET-able variables); any function that looks in
database tables to determine its result.
3. Totally non-cachable: result may change from one call to the next,
even within a single SQL command. Examples: nextval(), random(),
timeofday(). (Yes, timeofday() and now() are in different categories.
See http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT)
Currently the system can only distinguish cases 1 and 3, so functions
that are really case 2 have to be labeled as case 3; this prevents a lot
of useful optimizations. In particular, it is safe to use expressions
involving only case-1 and case-2 functions as indexscan conditions,
whereas case-3 functions cannot be optimized into an indexscan. So this
is an important fix to make.
BTW, because of MVCC semantics, case 2 covers more ground than you might
think. We are interested in functions whose values cannot change during
a single "scan", ie, while the intra-transaction command counter does
not increment. So functions that do SELECTs are actually guaranteed to
be case 2, even if stuff outside the function is changing the table
being looked at.
My problem is picking names for the three categories of functions.
Currently we use "with (isCachable)" to identify category 1, but it
seems like this name might actually be more sensible for category 2.
I'm having a hard time picking simple names that convey these meanings
accurately, or even with a reasonable amount of suggestiveness.
Comments, ideas?
regards, tom lane
Tom Lane wrote:
BTW, because of MVCC semantics, case 2 covers more ground than you might
think. We are interested in functions whose values cannot change during
a single "scan", ie, while the intra-transaction command counter does
not increment. So functions that do SELECTs are actually guaranteed to
be case 2, even if stuff outside the function is changing the table
being looked at.My problem is picking names for the three categories of functions.
Currently we use "with (isCachable)" to identify category 1, but it
seems like this name might actually be more sensible for category 2.
I'm having a hard time picking simple names that convey these meanings
accurately, or even with a reasonable amount of suggestiveness.Comments, ideas?
How about:
case 1: Cachable
case 2: ScanCachable or Optimizable
case 3: NonCachable
Joe
My 2 cents.
Level 1. with (isCachableStatic)
Level 2. with (isCachableDynamic)
Level 3. default
In my mind (isCachable) sounds like level 1
Show quoted text
On Tuesday 02 April 2002 03:40 pm, Tom Lane wrote:
Since I'm about to have to edit pg_proc.h to add a namespace column,
I thought this would be a good time to revise the current proiscachable
column into the three-way cachability distinction we've discussed
before. But I need some names for the values, and I'm not satisfied
with the ideas I've had so far.To refresh people's memory: what we want is to be able to distinguish
between functions that are:1. Strictly cachable (a/k/a constant-foldable): given fixed input
values, the same result value will always be produced, for ever and
ever, amen. Examples: addition operator, sin(x). Given a call
of such a function with all-constant input values, the system is
entitled to fold the function call to a constant on sight.2. Cachable within a single command: given fixed input values, the
result will not change if the function were to be repeatedly evaluated
within a single SQL command; but the result could change over time.
Examples: now(); datetime-related operations that depend on the current
timezone (or other SET-able variables); any function that looks in
database tables to determine its result.3. Totally non-cachable: result may change from one call to the next,
even within a single SQL command. Examples: nextval(), random(),
timeofday(). (Yes, timeofday() and now() are in different categories.
See
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/functions-datet
ime.html#FUNCTIONS-DATETIME-CURRENT)Currently the system can only distinguish cases 1 and 3, so functions
that are really case 2 have to be labeled as case 3; this prevents a lot
of useful optimizations. In particular, it is safe to use expressions
involving only case-1 and case-2 functions as indexscan conditions,
whereas case-3 functions cannot be optimized into an indexscan. So this
is an important fix to make.BTW, because of MVCC semantics, case 2 covers more ground than you might
think. We are interested in functions whose values cannot change during
a single "scan", ie, while the intra-transaction command counter does
not increment. So functions that do SELECTs are actually guaranteed to
be case 2, even if stuff outside the function is changing the table
being looked at.My problem is picking names for the three categories of functions.
Currently we use "with (isCachable)" to identify category 1, but it
seems like this name might actually be more sensible for category 2.
I'm having a hard time picking simple names that convey these meanings
accurately, or even with a reasonable amount of suggestiveness.Comments, ideas?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
* Tom Lane (tgl@sss.pgh.pa.us) [020402 16:42]:
Since I'm about to have to edit pg_proc.h to add a namespace column,
I thought this would be a good time to revise the current proiscachable
column into the three-way cachability distinction we've discussed
before. But I need some names for the values, and I'm not satisfied
with the ideas I've had so far.
Invariant
Cachable
Noncachable
I am full agreement with proposal. I love it!!
(1) const or constant
(2) cacheable
(3) volatile
P.S.
Tom: My mail doesn't reach you. As an AT&T user, you block my machine's IP
address with the anti-spam blocking. :-(
Tom Lane wrote:
Show quoted text
Since I'm about to have to edit pg_proc.h to add a namespace column,
I thought this would be a good time to revise the current proiscachable
column into the three-way cachability distinction we've discussed
before. But I need some names for the values, and I'm not satisfied
with the ideas I've had so far.To refresh people's memory: what we want is to be able to distinguish
between functions that are:1. Strictly cachable (a/k/a constant-foldable): given fixed input
values, the same result value will always be produced, for ever and
ever, amen. Examples: addition operator, sin(x). Given a call
of such a function with all-constant input values, the system is
entitled to fold the function call to a constant on sight.2. Cachable within a single command: given fixed input values, the
result will not change if the function were to be repeatedly evaluated
within a single SQL command; but the result could change over time.
Examples: now(); datetime-related operations that depend on the current
timezone (or other SET-able variables); any function that looks in
database tables to determine its result.3. Totally non-cachable: result may change from one call to the next,
even within a single SQL command. Examples: nextval(), random(),
timeofday(). (Yes, timeofday() and now() are in different categories.
See http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT)Currently the system can only distinguish cases 1 and 3, so functions
that are really case 2 have to be labeled as case 3; this prevents a lot
of useful optimizations. In particular, it is safe to use expressions
involving only case-1 and case-2 functions as indexscan conditions,
whereas case-3 functions cannot be optimized into an indexscan. So this
is an important fix to make.BTW, because of MVCC semantics, case 2 covers more ground than you might
think. We are interested in functions whose values cannot change during
a single "scan", ie, while the intra-transaction command counter does
not increment. So functions that do SELECTs are actually guaranteed to
be case 2, even if stuff outside the function is changing the table
being looked at.My problem is picking names for the three categories of functions.
Currently we use "with (isCachable)" to identify category 1, but it
seems like this name might actually be more sensible for category 2.
I'm having a hard time picking simple names that convey these meanings
accurately, or even with a reasonable amount of suggestiveness.Comments, ideas?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
Tom Lane writes:
Since I'm about to have to edit pg_proc.h to add a namespace column,
I thought this would be a good time to revise the current proiscachable
column into the three-way cachability distinction we've discussed
before. But I need some names for the values, and I'm not satisfied
with the ideas I've had so far.
Well, for one thing, we might want to change the name to the correct
spelling "cacheable".
1. Strictly cachable (a/k/a constant-foldable): given fixed input
values, the same result value will always be produced, for ever and
ever, amen. Examples: addition operator, sin(x). Given a call
of such a function with all-constant input values, the system is
entitled to fold the function call to a constant on sight.
deterministic
(That's how SQL99 calls it.)
2. Cachable within a single command: given fixed input values, the
result will not change if the function were to be repeatedly evaluated
within a single SQL command; but the result could change over time.
Examples: now(); datetime-related operations that depend on the current
timezone (or other SET-able variables); any function that looks in
database tables to determine its result.
"cacheable" seems OK for this.
3. Totally non-cachable: result may change from one call to the next,
even within a single SQL command. Examples: nextval(), random(),
timeofday(). (Yes, timeofday() and now() are in different categories.
See http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT)
not deterministic, not cacheable
--
Peter Eisentraut peter_e@gmx.net
On Tue, 2 Apr 2002, Peter Eisentraut wrote:
Tom Lane writes:
Since I'm about to have to edit pg_proc.h to add a namespace column,
I thought this would be a good time to revise the current proiscachable
column into the three-way cachability distinction we've discussed
before. But I need some names for the values, and I'm not satisfied
with the ideas I've had so far.Well, for one thing, we might want to change the name to the correct
spelling "cacheable".1. Strictly cachable (a/k/a constant-foldable): given fixed input
values, the same result value will always be produced, for ever and
ever, amen. Examples: addition operator, sin(x). Given a call
of such a function with all-constant input values, the system is
entitled to fold the function call to a constant on sight.deterministic
(That's how SQL99 calls it.)
2. Cachable within a single command: given fixed input values, the
result will not change if the function were to be repeatedly evaluated
within a single SQL command; but the result could change over time.
Examples: now(); datetime-related operations that depend on the current
timezone (or other SET-able variables); any function that looks in
database tables to determine its result."cacheable" seems OK for this.
SQL99 suggests that there are only two types of user defined
routines: deterministic and 'possibly non-deterministic'. However, in
section 11.49 it defines
<deterministic characteristic> ::= DETERMINISTIC | NOT DETERMINISTIC
So the real problem is how to qualify this.
TRANSACTIONAL DETERMINISTIC
or
NOT DETERMINISTIC CACHEABLE
are the only ways that come to mind. I'll admit that I don't like either.
3. Totally non-cachable: result may change from one call to the next,
even within a single SQL command. Examples: nextval(), random(),
timeofday(). (Yes, timeofday() and now() are in different categories.
See http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT)not deterministic, not cacheable
Gavin
Peter Eisentraut <peter_e@gmx.net> writes:
Well, for one thing, we might want to change the name to the correct
spelling "cacheable".
Is that correct?
I looked in the Oxford English Dictionary, the Random House Dictionary,
and a couple other dictionaries of less substantial heft, and could not
find anything authoritative at all. RH gives the derived forms "cached"
and "caching"; OED offers nothing. I'd be interested to see an
authoritative reference for the spelling of the adjective form.
Possibly we should avoid the issue by using another word ;-)
regards, tom lane
On Tue, 02 Apr 2002 23:39:35 -0500
"Tom Lane" <tgl@sss.pgh.pa.us> wrote:
Peter Eisentraut <peter_e@gmx.net> writes:
Well, for one thing, we might want to change the name to the correct
spelling "cacheable".Is that correct?
Apparently, other people are confused as well:
http://www.xent.com/FoRK-archive/august97/0431.html
FWIW, google has ~30,000 results for -eable, and ~8,000 results for
-able. A couple other software projects (notably Apache Jakarta)
use -eable.
My preference would be for -eable, but that's just on the basis of
"it looks right", which is hardly authoritative.
Cheers,
Neil
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
I am full agreement with proposal. I love it!!
(1) const or constant
(2) cacheable
(3) volatile
P.S.
Tom: My mail doesn't reach you. As an AT&T user, you block my machine's IP
address with the anti-spam blocking. :-(
Tom Lane wrote:
mlw <markw@mohawksoft.com> writes:
Tom: My mail doesn't reach you. As an AT&T user, you block my machine's IP
address with the anti-spam blocking. :-(Sorry about that. I like 510sg's dnsbl list precisely because it's
aggressive, but sometimes it's too aggressive. I can whitelist you
if you have a stable IP address ... is 24.147.138.78 a permanently
assigned address, or not?
Alas we have the irony of me trying to respond to you via email, to give you
information on how to unblock me so I can respond via email. I am laughing.
I wish I could say I have a fixed IP, but I do not. It is a DHCP assigned AT&T
cable modem. Sorry.
I'm not sure I'm the only one, am I?
mlw <markw@mohawksoft.com> writes:
(1) const or constant
(2) cacheable
(3) volatile
I was wondering about "const" for case 1, also. I think there is some
precedent for using "const" with this meaning in other programming
languages. "volatile" for case 3 seems reasonable.
Tom: My mail doesn't reach you. As an AT&T user, you block my machine's IP
address with the anti-spam blocking. :-(
Sorry about that. I like 510sg's dnsbl list precisely because it's
aggressive, but sometimes it's too aggressive. I can whitelist you
if you have a stable IP address ... is 24.147.138.78 a permanently
assigned address, or not?
regards, tom lane
...
I'm not sure I'm the only one, am I?
No, I was also blocked from Tom's mail a while ago. I have a static IP,
but my ISP's entire block of addresses made it on to the spam list Tom
uses, and the strategy of the list maintainers seems to be to maximize
the collateral damage to force me to somehow force my ISP to change
their policies, whatever those are. If I researched it enough, I might
be able to find out what my ISP does or does not do, and what I'm
supposed to do or not do. What a pain...
Not sure if my status has changed. I'll bet not, since the anti-spam
folks have high enough standards that someone like me can't make the
grade. I suppose they don't rely on PostgreSQL for their database... ;)
That said, I'd like to block some spam myself. I'd rather find a spam
list which doesn't already have me disallowed however...
- Thomas
Tom Lane writes:
mlw <markw@mohawksoft.com> writes:
(1) const or constant
(2) cacheable
(3) volatileI was wondering about "const" for case 1, also. I think there is some
precedent for using "const" with this meaning in other programming
languages.
I think the meaning of "const" tends to be "cannot change the result" --
which may actually make sense in SQL in a future life if you can pass
around table descriptors or cursor references.
--
Peter Eisentraut peter_e@gmx.net
Thomas Lockhart <thomas@fourpalms.org> writes:
That said, I'd like to block some spam myself. I'd rather find a spam
list which doesn't already have me disallowed however...
In case it makes you feel better: my *own* address was on the 510sg list
for awhile last month. But I still use the list ;-). Nothing to stop
you from using some less-aggressive list though; see
http://relays.osirusoft.com/ for links to a dozen or more possibilities.
In practice, any DNSBL list can cause denial-of-service problems.
(The original and still most conservatively run one, MAPS RBL, had a
memorable episode where someone put 127.0.0.1 into the blacklist for
a few hours...) I deal with this by installing local whitelist
exceptions for people I talk to regularly. Otherwise, there's always
the mailing lists.
regards, tom lane
On Wed, Apr 03, 2002 at 11:11:28AM -0500, mlw wrote:
I'm not sure I'm the only one, am I?
Nope, we're on AT&T lines, too.
A
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M6K 3E3
+1 416 646 3304 x110
mlw <markw@mohawksoft.com> writes:
Tom Lane wrote:
Sorry about that. I like 510sg's dnsbl list precisely because it's
aggressive, but sometimes it's too aggressive. I can whitelist you
if you have a stable IP address ... is 24.147.138.78 a permanently
assigned address, or not?
I wish I could say I have a fixed IP, but I do not. It is a DHCP assigned AT&T
cable modem. Sorry.
Cable modem IPs are more stable than you might think --- a quick look in
the list archives shows you've had this one since early January. In
practice you'll keep the same IP as long as you don't lose connectivity.
I'll whitelist 24.147.138.* and hope for the best...
regards, tom lane
Peter Eisentraut wrote:
Tom Lane writes:
mlw <markw@mohawksoft.com> writes:
(1) const or constant
(2) cacheable
(3) volatileI was wondering about "const" for case 1, also. I think there is some
precedent for using "const" with this meaning in other programming
languages.I think the meaning of "const" tends to be "cannot change the result" --
which may actually make sense in SQL in a future life if you can pass
around table descriptors or cursor references.
I can buy that. Ok, const isn't a good name.
How about 'immutable' ?
On Wed, Apr 03, 2002 at 08:45:03AM -0800, Thomas Lockhart wrote:
...
I'm not sure I'm the only one, am I?
No, I was also blocked from Tom's mail a while ago. I have a static IP,
but my ISP's entire block of addresses made it on to the spam list Tom
uses, and the strategy of the list maintainers seems to be to maximize
the collateral damage to force me to somehow force my ISP to change
their policies, whatever those are. If I researched it enough, I might
be able to find out what my ISP does or does not do, and what I'm
supposed to do or not do. What a pain...
We had the same problem here. I spoke to the 5-10 list provider and
got our ISP delisted since they seem to have kicked the 3 or so spammers
off their network. It seems a little unreasonable to blacklist an entire
large ISP's netblock just because they have a very small number of spam
sites. It is also pretty unreasonable to think that any company is
going to switch providers because of one blacklist or somehow complain
to their ISP about the spammers the ISP is hosting without any more
detail than:
"Blacklist X says you provide spam support and/or have too many
spammers on your network. Please remove them so I can send
my email."
Martin
Martin Renters <martin@datafax.com> writes:
It is also pretty unreasonable to think that any company is
going to switch providers because of one blacklist or somehow complain
to their ISP about the spammers the ISP is hosting without any more
detail than:
"Blacklist X says you provide spam support and/or have too many
spammers on your network. Please remove them so I can send
my email."
FWIW, all the blacklists I use (and 510sg is only the first line of
defense ;-)) have documentation available about the reasons for listing
IP blocks. F'r instance, looking up Thomas' IP I get:
xo.com.spam-support.blackholes.five-ten-sg.com. 23h32m50s IN TXT "added 2002-01-05; spam support - dns server at 64.1.121.57 supporting http://www.poxteam2001.com"
xo.com.spam-support.blackholes.five-ten-sg.com. 23h32m50s IN TXT "added 2002-01-07; spam support - dns server at 64.1.121.57 supporting http://compower.numberop.com"
xo.com.spam-support.blackholes.five-ten-sg.com. 23h32m50s IN TXT "added 2002-03-07; spam support - hosting http://207.88.179.193 - terminated"
xo.com.spam-support.blackholes.five-ten-sg.com. 23h32m50s IN TXT "added 2002-03-10; spam support - hosting http://thecottagemonitor.com"
xo.com.spam-support.blackholes.five-ten-sg.com. 23h32m50s IN TXT "added 2002-03-13; spam support - hosting http://shortcuts2learning.com"
xo.com.spam-support.blackholes.five-ten-sg.com. 23h32m50s IN TXT "added 2002-03-24; spam support - hosting http://209.164.32.75/consumer_first_funding"
But this is getting pretty far off-topic for the PG lists.
regards, tom lane
FWIW, all the blacklists I use (and 510sg is only the first line of
defense ;-)) have documentation available about the reasons for listing
IP blocks. F'r instance, looking up Thomas' IP I get:
...
But this is getting pretty far off-topic for the PG lists.
I'll guess that the list of reasons for the blacklisting I find today is
different than the list I found a few months ago when this first came
up. What is relevant to me is that I absolutely cannot get my machine
removed from this blacklist, no matter what I do to secure that machine.
And that, istm, reduces the relevance of that particular blacklisting
strategy.
I was just using this as an example (I happen to send mail directly to
you so have run across it in this context).
I'm interested because spam has affected me in other contexts too, and
every time it takes time away from PostgreSQL.
We could sent up Yet Another List, say pgsql-spam-whiners, and I could
be a charter member, and maybe y'all would suggest I should also be on
pgsql-clueless-spam-whiners. But maybe it is better to have an
occasional discussion on topics that people find affecting their use of
the mailing list(s) ;)
I have to say that spam is bumming me out more now than it ever has in
the past. So let's hope that the blacklists *do* help somehow!
- Thomas