replace all with * in pg_hba.conf

Started by Andrew Dunstanover 22 years ago38 messageshackers
Jump to latest
#1Andrew Dunstan
andrew@dunslane.net

If people are happy with Tom's suggestion of using '*' instead of 'all'
in pg_hba.conf I will prepare a patch for it.

(I will also replace the ugly long IP6 localhost netmask with a CIDR mask).

cheers

andrew

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#1)
Re: replace all with * in pg_hba.conf

Andrew Dunstan wrote:

If people are happy with Tom's suggestion of using '*' instead of
'all' in pg_hba.conf I will prepare a patch for it.

Well, while we're breaking stuff in the name of improvement, what about
databases named "*" or databases with spaces in their names?

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#2)
Re: replace all with * in pg_hba.conf

Peter Eisentraut wrote:

Andrew Dunstan wrote:

If people are happy with Tom's suggestion of using '*' instead of
'all' in pg_hba.conf I will prepare a patch for it.

Well, while we're breaking stuff in the name of improvement, what about
databases named "*" or databases with spaces in their names?

Good point. Perhaps we need to provide for an escaping mechanism in the
routines that parse the file, although personally I have little sympathy
for anyone who names a database '*'. I think it comes into the category
of "Doctor, it hurts when I do this" ... "Then stop doing that." Spaces
are a more likely problem, especially when we get W32 native users.

cheers

andrew

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#3)
Re: replace all with * in pg_hba.conf

I wrote:

Peter Eisentraut wrote:

Andrew Dunstan wrote:

If people are happy with Tom's suggestion of using '*' instead of
'all' in pg_hba.conf I will prepare a patch for it.

Well, while we're breaking stuff in the name of improvement, what
about databases named "*" or databases with spaces in their names?

Good point. Perhaps we need to provide for an escaping mechanism in
the routines that parse the file, although personally I have little
sympathy for anyone who names a database '*'. I think it comes into
the category of "Doctor, it hurts when I do this" ... "Then stop doing
that." Spaces are a more likely problem, especially when we get W32
native users.

Looking at the code I discover that there is already provision covering
spaces etc., because you can quote names. It's even documented ;-)

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget the
'*' idea. Alternatively, do the same sort of thing, but replacing 'all'
with '*'. A patch for the first would be quite tiny - similar for '*'
except for extra doc and sample file changes.

cheers

andrew

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: replace all with * in pg_hba.conf

Andrew Dunstan <andrew@dunslane.net> writes:

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget the
'*' idea.

Probably we ought to go with that, on backwards-compatibility grounds.

regards, tom lane

#6The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#5)
Re: replace all with * in pg_hba.conf

On Fri, 19 Dec 2003, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget the
'*' idea.

Probably we ought to go with that, on backwards-compatibility grounds.

why not do both, but deprecate the use of all in the docs? say with an
eventual goal of removing the use of all altogether in 2 releases?

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
distinguish between all and "all" in pg_hba.conf

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget the
'*' idea.

Probably we ought to go with that, on backwards-compatibility grounds.

OK, here's the patch. Should we also do this for "sameuser" and
"samegroup" for the sake of completness?

cheers

andrew

Attachments:

quote-all.patchtext/plain; name=quote-all.patchDownload+22-6
#8Andrew Dunstan
andrew@dunslane.net
In reply to: The Hermit Hacker (#6)
Re: replace all with * in pg_hba.conf

Marc G. Fournier wrote:

On Fri, 19 Dec 2003, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget the
'*' idea.

Probably we ought to go with that, on backwards-compatibility grounds.

why not do both, but deprecate the use of all in the docs? say with an
eventual goal of removing the use of all altogether in 2 releases?

The extra value is fairly low IMNSHO if we can distinguish between a
magical and a non-magical 'all' - see the patch I just submitted.

cheers

andrew

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#8)
Re: replace all with * in pg_hba.conf

Andrew Dunstan <andrew@dunslane.net> writes:

Marc G. Fournier wrote:

why not do both, but deprecate the use of all in the docs? say with an
eventual goal of removing the use of all altogether in 2 releases?

The extra value is fairly low IMNSHO if we can distinguish between a
magical and a non-magical 'all' - see the patch I just submitted.

Also, your point about the special sameuser and samegroup keywords is a
good one. We should make those all work consistently (ie, quoting makes
it not a keyword). Going in the "*" direction would only make sense
if that were the only special case --- but it isn't. I don't think we
want to start choosing random symbols for sameuser, samegroup, and other
stuff we might think of in future.

regards, tom lane

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#9)
Re: replace all with * in pg_hba.conf

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Marc G. Fournier wrote:

why not do both, but deprecate the use of all in the docs? say with an
eventual goal of removing the use of all altogether in 2 releases?

The extra value is fairly low IMNSHO if we can distinguish between a
magical and a non-magical 'all' - see the patch I just submitted.

Also, your point about the special sameuser and samegroup keywords is a
good one. We should make those all work consistently (ie, quoting makes
it not a keyword). Going in the "*" direction would only make sense
if that were the only special case --- but it isn't. I don't think we
want to start choosing random symbols for sameuser, samegroup, and other
stuff we might think of in future.

Right. Revised patch sent to patches list.

cheers

andrew

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#7)
Re: distinguish between all and "all" in pg_hba.conf

I wrote:

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget
the '*' idea.

Probably we ought to go with that, on backwards-compatibility grounds.

OK, here's the patch. Should we also do this for "sameuser" and
"samegroup" for the sake of completness?

Revised patch for this as suggested by Tom.

cheers

andrew

Attachments:

quote-all.patchtext/plain; name=quote-all.patchDownload+42-18
#12Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#11)
Re: distinguish between all and "all" in pg_hba.conf

That IPv6 cleanup is major!

! host all all ::1 ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff trust

! host all all ::1/128 trust

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---------------------------------------------------------------------------

Andrew Dunstan wrote:

I wrote:

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget
the '*' idea.

Probably we ought to go with that, on backwards-compatibility grounds.

OK, here's the patch. Should we also do this for "sameuser" and
"samegroup" for the sake of completness?

Revised patch for this as suggested by Tom.

cheers

andrew

Index: hba.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/backend/libpq/hba.c,v
retrieving revision 1.118
diff -c -w -r1.118 hba.c
*** hba.c	5 Dec 2003 15:50:31 -0000	1.118
--- hba.c	19 Dec 2003 17:42:20 -0000
***************
*** 87,102 ****
*	 token or EOF, whichever comes first. If no more tokens on line,
*	 return null string as *buf and position file to beginning of
*	 next line or EOF, whichever comes first. Allow spaces in quoted
!  *	 strings. Terminate on unquoted commas. Handle comments.
*/
void
next_token(FILE *fp, char *buf, const int bufsz)
{
int			c;
char	   *start_buf = buf;
! 	char	   *end_buf = buf + (bufsz - 1);
bool		in_quote = false;
bool		was_quote = false;
/* Move over initial whitespace and commas */
while ((c = getc(fp)) != EOF && (pg_isblank(c) || c == ','))
--- 87,105 ----
*	 token or EOF, whichever comes first. If no more tokens on line,
*	 return null string as *buf and position file to beginning of
*	 next line or EOF, whichever comes first. Allow spaces in quoted
!  *	 strings. Terminate on unquoted commas. Handle comments. Treat
!  *   unquoted keywords that might be user names or database names 
!  *   specially, by appending a newline to them.
*/
void
next_token(FILE *fp, char *buf, const int bufsz)
{
int			c;
char	   *start_buf = buf;
! 	char	   *end_buf = buf + (bufsz - 2);
bool		in_quote = false;
bool		was_quote = false;
+ 	bool        saw_quote = false;
/* Move over initial whitespace and commas */
while ((c = getc(fp)) != EOF && (pg_isblank(c) || c == ','))
***************
*** 149,155 ****
--- 152,161 ----
was_quote = false;
if (c == '"')
+ 			{
in_quote = !in_quote;
+ 				saw_quote = true;
+ 			}
c = getc(fp);
}
***************
*** 161,167 ****
--- 167,188 ----
if (c != EOF)
ungetc(c, fp);
}
+ 
+ 
+ 	if ( !saw_quote && 
+ 	     (
+ 			 strncmp(start_buf,"all",3) == 0  ||
+ 			 strncmp(start_buf,"sameuser",8) == 0  ||
+ 			 strncmp(start_buf,"samegroup",9) == 0 
+ 		 )
+ 		)
+ 	{
+ 		/* append newline to a magical keyword */
+ 		*buf++ = '\n';
+ 	}
+ 
*buf = '\0';
+ 
}

/*
***************
*** 446,452 ****
return true;
}
else if (strcmp(tok, user) == 0 ||
! strcmp(tok, "all") == 0)
return true;
}

--- 467,473 ----
return true;
}
else if (strcmp(tok, user) == 0 ||
! 				 strcmp(tok, "all\n") == 0)
return true;
}

***************
*** 463,476 ****

for (tok = strtok(param_str, MULTI_VALUE_SEP); tok != NULL; tok = strtok(NULL, MULTI_VALUE_SEP))
{
! 		if (strcmp(tok, "all") == 0)
return true;
! 		else if (strcmp(tok, "sameuser") == 0)
{
if (strcmp(dbname, user) == 0)
return true;
}
! 		else if (strcmp(tok, "samegroup") == 0)
{
if (check_group(dbname, user))
return true;
--- 484,497 ----
for (tok = strtok(param_str, MULTI_VALUE_SEP); tok != NULL; tok = strtok(NULL, MULTI_VALUE_SEP))
{
! 		if (strcmp(tok, "all\n") == 0)
return true;
! 		else if (strcmp(tok, "sameuser\n") == 0)
{
if (strcmp(dbname, user) == 0)
return true;
}
! 		else if (strcmp(tok, "samegroup\n") == 0)
{
if (check_group(dbname, user))
return true;
***************
*** 1068,1074 ****
errmsg("cannot use Ident authentication without usermap field")));
found_entry = false;
}
! 	else if (strcmp(usermap_name, "sameuser") == 0)
{
if (strcmp(pg_user, ident_user) == 0)
found_entry = true;
--- 1089,1095 ----
errmsg("cannot use Ident authentication without usermap field")));
found_entry = false;
}
! 	else if (strcmp(usermap_name, "sameuser\n") == 0)
{
if (strcmp(pg_user, ident_user) == 0)
found_entry = true;
Index: pg_hba.conf.sample
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/backend/libpq/pg_hba.conf.sample,v
retrieving revision 1.47
diff -c -w -r1.47 pg_hba.conf.sample
*** pg_hba.conf.sample	13 Sep 2003 16:43:38 -0000	1.47
--- pg_hba.conf.sample	19 Dec 2003 17:42:20 -0000
***************
*** 35,40 ****
--- 35,45 ----
# encrypted passwords.  OPTION is the ident map or the name of the PAM
# service.
#
+ # Database and user names containing spaces, commas, quotes and other special
+ # characters can be quoted. Quoting one of the keywords "all", "sameuser" or
+ # "samegroup"  makes the name lose its special character, and just match a 
+ # database or username with that name.
+ #
# This file is read on server startup and when the postmaster receives
# a SIGHUP signal.  If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
***************
*** 59,62 ****
# IPv4-style local connections:
host    all         all         127.0.0.1         255.255.255.255   trust
# IPv6-style local connections:
! host    all         all         ::1               ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff        trust
--- 64,67 ----
# IPv4-style local connections:
host    all         all         127.0.0.1         255.255.255.255   trust
# IPv6-style local connections:
! host    all         all         ::1/128                             trust

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#13Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#11)
Re: distinguish between all and "all" in pg_hba.conf

Patch applied. Thanks.

---------------------------------------------------------------------------

Andrew Dunstan wrote:

I wrote:

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

The minimal disturbance change might be to teach the parser to
distinguish between a quoted 'all' and an unquoted 'all', and forget
the '*' idea.

Probably we ought to go with that, on backwards-compatibility grounds.

OK, here's the patch. Should we also do this for "sameuser" and
"samegroup" for the sake of completness?

Revised patch for this as suggested by Tom.

cheers

andrew

Index: hba.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/backend/libpq/hba.c,v
retrieving revision 1.118
diff -c -w -r1.118 hba.c
*** hba.c	5 Dec 2003 15:50:31 -0000	1.118
--- hba.c	19 Dec 2003 17:42:20 -0000
***************
*** 87,102 ****
*	 token or EOF, whichever comes first. If no more tokens on line,
*	 return null string as *buf and position file to beginning of
*	 next line or EOF, whichever comes first. Allow spaces in quoted
!  *	 strings. Terminate on unquoted commas. Handle comments.
*/
void
next_token(FILE *fp, char *buf, const int bufsz)
{
int			c;
char	   *start_buf = buf;
! 	char	   *end_buf = buf + (bufsz - 1);
bool		in_quote = false;
bool		was_quote = false;
/* Move over initial whitespace and commas */
while ((c = getc(fp)) != EOF && (pg_isblank(c) || c == ','))
--- 87,105 ----
*	 token or EOF, whichever comes first. If no more tokens on line,
*	 return null string as *buf and position file to beginning of
*	 next line or EOF, whichever comes first. Allow spaces in quoted
!  *	 strings. Terminate on unquoted commas. Handle comments. Treat
!  *   unquoted keywords that might be user names or database names 
!  *   specially, by appending a newline to them.
*/
void
next_token(FILE *fp, char *buf, const int bufsz)
{
int			c;
char	   *start_buf = buf;
! 	char	   *end_buf = buf + (bufsz - 2);
bool		in_quote = false;
bool		was_quote = false;
+ 	bool        saw_quote = false;
/* Move over initial whitespace and commas */
while ((c = getc(fp)) != EOF && (pg_isblank(c) || c == ','))
***************
*** 149,155 ****
--- 152,161 ----
was_quote = false;
if (c == '"')
+ 			{
in_quote = !in_quote;
+ 				saw_quote = true;
+ 			}
c = getc(fp);
}
***************
*** 161,167 ****
--- 167,188 ----
if (c != EOF)
ungetc(c, fp);
}
+ 
+ 
+ 	if ( !saw_quote && 
+ 	     (
+ 			 strncmp(start_buf,"all",3) == 0  ||
+ 			 strncmp(start_buf,"sameuser",8) == 0  ||
+ 			 strncmp(start_buf,"samegroup",9) == 0 
+ 		 )
+ 		)
+ 	{
+ 		/* append newline to a magical keyword */
+ 		*buf++ = '\n';
+ 	}
+ 
*buf = '\0';
+ 
}

/*
***************
*** 446,452 ****
return true;
}
else if (strcmp(tok, user) == 0 ||
! strcmp(tok, "all") == 0)
return true;
}

--- 467,473 ----
return true;
}
else if (strcmp(tok, user) == 0 ||
! 				 strcmp(tok, "all\n") == 0)
return true;
}

***************
*** 463,476 ****

for (tok = strtok(param_str, MULTI_VALUE_SEP); tok != NULL; tok = strtok(NULL, MULTI_VALUE_SEP))
{
! 		if (strcmp(tok, "all") == 0)
return true;
! 		else if (strcmp(tok, "sameuser") == 0)
{
if (strcmp(dbname, user) == 0)
return true;
}
! 		else if (strcmp(tok, "samegroup") == 0)
{
if (check_group(dbname, user))
return true;
--- 484,497 ----
for (tok = strtok(param_str, MULTI_VALUE_SEP); tok != NULL; tok = strtok(NULL, MULTI_VALUE_SEP))
{
! 		if (strcmp(tok, "all\n") == 0)
return true;
! 		else if (strcmp(tok, "sameuser\n") == 0)
{
if (strcmp(dbname, user) == 0)
return true;
}
! 		else if (strcmp(tok, "samegroup\n") == 0)
{
if (check_group(dbname, user))
return true;
***************
*** 1068,1074 ****
errmsg("cannot use Ident authentication without usermap field")));
found_entry = false;
}
! 	else if (strcmp(usermap_name, "sameuser") == 0)
{
if (strcmp(pg_user, ident_user) == 0)
found_entry = true;
--- 1089,1095 ----
errmsg("cannot use Ident authentication without usermap field")));
found_entry = false;
}
! 	else if (strcmp(usermap_name, "sameuser\n") == 0)
{
if (strcmp(pg_user, ident_user) == 0)
found_entry = true;
Index: pg_hba.conf.sample
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/backend/libpq/pg_hba.conf.sample,v
retrieving revision 1.47
diff -c -w -r1.47 pg_hba.conf.sample
*** pg_hba.conf.sample	13 Sep 2003 16:43:38 -0000	1.47
--- pg_hba.conf.sample	19 Dec 2003 17:42:20 -0000
***************
*** 35,40 ****
--- 35,45 ----
# encrypted passwords.  OPTION is the ident map or the name of the PAM
# service.
#
+ # Database and user names containing spaces, commas, quotes and other special
+ # characters can be quoted. Quoting one of the keywords "all", "sameuser" or
+ # "samegroup"  makes the name lose its special character, and just match a 
+ # database or username with that name.
+ #
# This file is read on server startup and when the postmaster receives
# a SIGHUP signal.  If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
***************
*** 59,62 ****
# IPv4-style local connections:
host    all         all         127.0.0.1         255.255.255.255   trust
# IPv6-style local connections:
! host    all         all         ::1               ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff        trust
--- 64,67 ----
# IPv4-style local connections:
host    all         all         127.0.0.1         255.255.255.255   trust
# IPv6-style local connections:
! host    all         all         ::1/128                             trust

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#14Michael Brusser
michael@synchronicity.com
In reply to: Andrew Dunstan (#1)
cache control?

Is there a way to force database to load
a frequently-accessed table into cache and keep it there?

Thanks,
Mike.

#15Reinoud van Leeuwen
reinoud.v@n.leeuwen.net
In reply to: Michael Brusser (#14)
Re: cache control?

On Fri, Jan 16, 2004 at 12:00:08PM -0500, Michael Brusser wrote:

Is there a way to force database to load
a frequently-accessed table into cache and keep it there?

If it is frequently accessed, I guess it would be in the cachke
permanently....

--
__________________________________________________
"Nothing is as subjective as reality"
Reinoud van Leeuwen reinoud.v@n.leeuwen.net
http://www.xs4all.nl/~reinoud
__________________________________________________

#16Neil Conway
neilc@samurai.com
In reply to: Michael Brusser (#14)
Re: cache control?

Michael Brusser <michael@synchronicity.com> writes:

Is there a way to force database to load a frequently-accessed table
into cache and keep it there?

No.

BTW, this idea has been suggested in the past, so check the archives
for the prior discussions on this topic. The usual response is that
the PostgreSQL bufmgr should already placing hot pages into the cache,
so there isn't really a need for this mechanism. (And if the bufmgr
doesn't do this well enough, we should improve the bufmgr -- as Jan
has done for 7.5)

-Neil

#17scott.marlowe
scott.marlowe@ihs.com
In reply to: Michael Brusser (#14)
Re: cache control?

On Fri, 16 Jan 2004, Michael Brusser wrote:

Is there a way to force database to load
a frequently-accessed table into cache and keep it there?

Nope. But there is a new cache buffer handler that may make it into 7.5
that would make that happen automagically.

#18Simon Riggs
simon@2ndQuadrant.com
In reply to: scott.marlowe (#17)
Re: cache control?

This discussion seems likely to have a major effect on DBT-3 (DSS-type)
performance from PostgreSQL...

On Fri, 16 Jan 2004, Michael Brusser wrote:

Is there a way to force database to load
a frequently-accessed table into cache and keep it there?

Scott Marlow replied...

Nope. But there is a new cache buffer handler that may make it into

7.5

that would make that happen automagically.

The important question here is "what forces blocks out of cache?" rather
than thinking about how to directly keep them there.

Jeroen T. Vermeulen wrote:
Sent: Friday, January 16, 2004 23:02
Subject: [HACKERS] Reverse scans?

Would it be doable, and would it be useful, to try to alternate the
directions of table and index scans every time each table/index was
fully scanned?

I was thinking that it could help cache performance at various levels
in cases where data at the end of a large table, say, that remained in
memory after a scan, would otherwise be flushed out by a new scan of

the

same table. If the next scan of the same table was to go in the other
direction, any remains of the last time around that were still in the
filesystem cache, buffer pool, hard disk cache etc. would stand a

greater

chance of being reused.

Jereon's idea is a good one when we consider the current behaviour,
which is
that large scans are placed into block buffer cache, which then forces
other data out. I would like to question the latter behaviour, so we can
address the cause and not just the symptom.

Earlier versions of Oracle had a parameter called something like
SMALL_TABLE_THRESHOLD. Any scan of a table larger than that threshold
was NOT placed into buffer cache, but was consumed directly by the
shadow process (the backend). Teradata also uses a similar buffer
control technique for large table scans.

If a table is too large to fit into buffer, it clearly wasn't going to
be cached properly in the first place; Jereon's idea only works well for
tables near to the size of the cache. If the table is MUCH bigger then
it will have very little gain. Good original thinking, though I'm not
sure its worth it.

Oracle 9i now offers some variety for buffer cache management (as does
DB2). You can specify at the tablespace and object level whether to use
one of two buffer pools, KEEP or RECYCLE. The idea is to separate out
the two types of blocks - ones that are there because they're well used
and other blocks that are there at the moment, but unlikely to stay.

My suggestion would be to:
- split the buffer cache into two, just as Oracle does: KEEP & RECYCLE.
This could default to KEEP=66% of total memory available, but could also
be settable by init parameter.
[changes to the memory management routines]
- if we do a scan on a table whose size in blocks is more than some
fraction (25%?) of KEEP bufferpool then we place the blocks into RECYCLE
bufferpool. This can be decided immediately following optimization,
rather than including it within the optimizer decision process since we
aren't going to change the way the statement executes, we're just going
to stop it from having an adverse effect on other current or future
statements.
[additional test to set parameter, then work out where to note it]

Notice that I haven't suggested that the KEEP/RECYCLE option could be
specified at table level. That optionality sounds like a lot of extra
work, when what is needed is the automatic avoidance of cache-spoiling
behaviour. (This would still mean that very large indexes with random
request patterns would still spoil cache...maybe implement that later?)

This would remove most reasons for spoiling the cache and blocks would
then leave the cache only when they were genuinely no longer wanted.

Any comments?? Takers?

#19Jan Wieck
JanWieck@Yahoo.com
In reply to: Simon Riggs (#18)
Re: cache control?

Simon,

have you read src/backend/storage/buffer/README of current CVS tip?

The algorithm in the new replacement strategy is an attempt to figure
that SMALL_TABLE_THRESHOLD automatically. Do you see anything that can
be improved in that algorithm?

Jan

Simon Riggs wrote:

This discussion seems likely to have a major effect on DBT-3 (DSS-type)
performance from PostgreSQL...

On Fri, 16 Jan 2004, Michael Brusser wrote:

Is there a way to force database to load
a frequently-accessed table into cache and keep it there?

Scott Marlow replied...

Nope. But there is a new cache buffer handler that may make it into

7.5

that would make that happen automagically.

The important question here is "what forces blocks out of cache?" rather
than thinking about how to directly keep them there.

Jeroen T. Vermeulen wrote:
Sent: Friday, January 16, 2004 23:02
Subject: [HACKERS] Reverse scans?

Would it be doable, and would it be useful, to try to alternate the
directions of table and index scans every time each table/index was
fully scanned?

I was thinking that it could help cache performance at various levels
in cases where data at the end of a large table, say, that remained in
memory after a scan, would otherwise be flushed out by a new scan of

the

same table. If the next scan of the same table was to go in the other
direction, any remains of the last time around that were still in the
filesystem cache, buffer pool, hard disk cache etc. would stand a

greater

chance of being reused.

Jereon's idea is a good one when we consider the current behaviour,
which is
that large scans are placed into block buffer cache, which then forces
other data out. I would like to question the latter behaviour, so we can
address the cause and not just the symptom.

Earlier versions of Oracle had a parameter called something like
SMALL_TABLE_THRESHOLD. Any scan of a table larger than that threshold
was NOT placed into buffer cache, but was consumed directly by the
shadow process (the backend). Teradata also uses a similar buffer
control technique for large table scans.

If a table is too large to fit into buffer, it clearly wasn't going to
be cached properly in the first place; Jereon's idea only works well for
tables near to the size of the cache. If the table is MUCH bigger then
it will have very little gain. Good original thinking, though I'm not
sure its worth it.

Oracle 9i now offers some variety for buffer cache management (as does
DB2). You can specify at the tablespace and object level whether to use
one of two buffer pools, KEEP or RECYCLE. The idea is to separate out
the two types of blocks - ones that are there because they're well used
and other blocks that are there at the moment, but unlikely to stay.

My suggestion would be to:
- split the buffer cache into two, just as Oracle does: KEEP & RECYCLE.
This could default to KEEP=66% of total memory available, but could also
be settable by init parameter.
[changes to the memory management routines]
- if we do a scan on a table whose size in blocks is more than some
fraction (25%?) of KEEP bufferpool then we place the blocks into RECYCLE
bufferpool. This can be decided immediately following optimization,
rather than including it within the optimizer decision process since we
aren't going to change the way the statement executes, we're just going
to stop it from having an adverse effect on other current or future
statements.
[additional test to set parameter, then work out where to note it]

Notice that I haven't suggested that the KEEP/RECYCLE option could be
specified at table level. That optionality sounds like a lot of extra
work, when what is needed is the automatic avoidance of cache-spoiling
behaviour. (This would still mean that very large indexes with random
request patterns would still spoil cache...maybe implement that later?)

This would remove most reasons for spoiling the cache and blocks would
then leave the cache only when they were genuinely no longer wanted.

Any comments?? Takers?

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#18)
Re: cache control?

"Simon Riggs" <simon@2ndquadrant.com> writes:

Any comments?? Takers?

Um, did you read the discussion of the ARC buffer management algorithm
that's already been implemented for 7.5?

The main objection I have to doing it Oracle's way is that that creates
more parameters that DBAs have to get right for reasonable performance.
Self-tuning algorithms are better, when available.

regards, tom lane

#21Simon Riggs
simon@2ndQuadrant.com
In reply to: Jan Wieck (#19)
#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#20)
#23Jan Wieck
JanWieck@Yahoo.com
In reply to: Simon Riggs (#21)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#22)
#25Neil Conway
neilc@samurai.com
In reply to: Tom Lane (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Conway (#25)
#27Simon Riggs
simon@2ndQuadrant.com
In reply to: Jan Wieck (#23)
#28Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#26)
#29Jan Wieck
JanWieck@Yahoo.com
In reply to: Simon Riggs (#27)
#30Simon Riggs
simon@2ndQuadrant.com
In reply to: Jan Wieck (#29)
#31Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#24)
#32Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#26)
#33Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#31)
#34Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#32)
#35Richard Huxton
dev@archonet.com
In reply to: Simon Riggs (#34)
#36Bruce Momjian
bruce@momjian.us
In reply to: Richard Huxton (#35)
#37Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#36)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#37)