Protect syscache from bloating with negative cache entries
Hello, recently one of my customers stumbled over immoderate
catcache bloat.
This is a known issue living on the Todo page in the PostgreSQL
wiki.
https://wiki.postgresql.org/wiki/Todo#Cache_Usage
Fix memory leak caused by negative catcache entries
/messages/by-id/51C0A1FF.2050404@vmware.com
This patch addresses two cases of syscache bloat using the
invalidation callback mechanism.
Overview of the patch
The bloat is caused by negative cache entries in catcaches. They are
crucial for performance, but the problem is that there is no way to
remove them: they last for the backend's lifetime.
The first patch provides a means to flush negative catcache entries,
then defines a relcache invalidation callback that flushes negative
entries in the syscaches for pg_statistic (STATRELATTINH) and
pg_attribute (ATTNAME, ATTNUM). The second patch implements a
syscache invalidation callback so that deletion of a schema causes a
flush of pg_class entries (RELNAMENSP).
Neither of the above is hard-coded; both are defined in cacheinfo
using four additional members.
Remaining problems
Still, the catcache can bloat from repeated accesses to non-existent
tables with unique names in a long-lived schema, but that seems a bit
too artificial (or malicious). Since such negative entries have no
trigger to remove them, caps would be needed to prevent them from
bloating the syscaches, but reasonable limits hardly seem
determinable.
Defects or disadvantages
This patch scans the whole target catcache to find negative entries
to remove, which might take a (comparatively) long time on a catcache
with many entries. With the second patch, unrelated negative entries
may be caught up in a flush, since entries are keyed by hash value,
not by the exact key values.
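To illustrate that last point, here is a toy sketch (not the patch's actual code; all names and structures are invented for this demo) of why flushing negative entries by hash value can sweep up unrelated ones: distinct keys can land in the same bucket.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBUCKETS 8
#define MAX_NEG  16

/* Hypothetical negative-entry table, keyed by a toy hash. */
typedef struct NegEntry
{
    char     key[32];
    unsigned hash;
    bool     present;
} NegEntry;

static NegEntry neg_entries[MAX_NEG];
static int      n_neg;

static unsigned toy_hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31 + (unsigned char) *s++;
    return h;
}

static void add_negative(const char *key)
{
    NegEntry *e = &neg_entries[n_neg++];
    strncpy(e->key, key, sizeof(e->key) - 1);
    e->key[sizeof(e->key) - 1] = '\0';
    e->hash = toy_hash(key);
    e->present = true;
}

/* Flush every negative entry landing in the same bucket, related or not. */
static void flush_by_hashvalue(unsigned hashvalue)
{
    for (int i = 0; i < n_neg; i++)
        if (neg_entries[i].hash % NBUCKETS == hashvalue % NBUCKETS)
            neg_entries[i].present = false;
}

static bool negative_present(const char *key)
{
    for (int i = 0; i < n_neg; i++)
        if (neg_entries[i].present && strcmp(neg_entries[i].key, key) == 0)
            return true;
    return false;
}
```

With NBUCKETS = 8, the keys "a" (hash 97) and "i" (hash 105) share bucket 1, so flushing by the hash of "a" also takes out the unrelated entry for "i".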
The attached files are the following.
1. 0001-Cleanup-negative-cache-of-pg_statistic-when-dropping.patch
Negative entry flushing by relcache invalidation using
relcache invalidation callback.
2. 0002-Cleanup-negative-cache-of-pg_class-when-dropping-a-s.patch
Negative entry flushing by catcache invalidation using
catcache invalidation callback.
3. gen.pl
a test script for STATRELATTINH bloating.
4. gen2.pl
a test script for RELNAMENSP bloating.
Scripts 3 and 4 are used as follows:
./gen.pl | psql postgres > /dev/null
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
1. 0001-Cleanup-negative-cache-of-pg_statistic-when-dropping.patch (text/x-patch; +263 -74)
2. 0002-Cleanup-negative-cache-of-pg_class-when-dropping-a-s.patch (text/x-patch; +284 -71)
3. gen.pl (text/plain)
4. gen2.pl (text/plain)
On Mon, Dec 19, 2016 at 6:15 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello, recently one of my customers stumbled over immoderate
catcache bloat.
This isn't only an issue for negative catcache entries. A long time
ago, there was a limit on the size of the relcache, which was removed
because if you have a workload where the working set of relations is
just larger than the limit, performance is terrible. But the problem
now is that backend memory usage can grow without bound, and that's
also bad, especially on systems with hundreds of long-lived backends.
In connection-pooling environments, the problem is worse, because
every connection in the pool eventually caches references to
everything of interest to any client.
Your patches seem to me to have some merit, but I wonder if we should
also consider having a time-based threshold of some kind. If, say, a
backend hasn't accessed a catcache or relcache entry for many minutes,
it becomes eligible to be flushed. We could implement this by having
some process, like the background writer,
SendProcSignal(PROCSIG_HOUSEKEEPING) to every process in the system
every 10 minutes or so. When a process receives this signal, it sets
a flag that is checked before going idle. When it sees the flag set,
it makes a pass over every catcache and relcache entry. All the ones
that are unmarked get marked, and all of the ones that are marked get
removed. Access to an entry clears any mark. So anything that's not
touched for more than 10 minutes starts dropping out of backend
caches.
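The marking scheme described above can be sketched as follows (a toy model, not PostgreSQL code; names are invented). Each housekeeping pass evicts entries that were already marked and marks the rest; any access clears the mark, so only entries untouched across two consecutive passes drop out:

```c
#include <assert.h>
#include <stdbool.h>

#define NENTRIES 4

typedef struct CacheEntry
{
    bool valid;   /* slot holds a live cache entry */
    bool marked;  /* entry was not accessed since the last pass */
} CacheEntry;

static CacheEntry cache[NENTRIES];

/* Any access to an entry clears its mark. */
static void cache_access(int i)
{
    cache[i].valid = true;
    cache[i].marked = false;
}

/* One pass, run e.g. when the hypothetical PROCSIG_HOUSEKEEPING flag is
 * seen before going idle: marked entries are removed, unmarked ones
 * become marked. */
static void housekeeping_pass(void)
{
    for (int i = 0; i < NENTRIES; i++)
    {
        if (!cache[i].valid)
            continue;
        if (cache[i].marked)
            cache[i].valid = false;   /* idle for two passes: evict */
        else
            cache[i].marked = true;
    }
}
```

The appeal of this design is that the per-access cost is a single flag clear; all the bookkeeping happens in the infrequent sweep.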
Anyway, that would be a much bigger change from what you are proposing
here, and what you are proposing here seems reasonable so I guess I
shouldn't distract from it. Your email just made me think of it,
because I agree that catcache/relcache bloat is a serious issue.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 20 December 2016 at 21:59, Robert Haas <robertmhaas@gmail.com> wrote:
We could implement this by having
some process, like the background writer,
SendProcSignal(PROCSIG_HOUSEKEEPING) to every process in the system
every 10 minutes or so.
... on a rolling basis.
Otherwise that'll be no fun at all, especially with some of those
lovely "we kept getting errors so we raised max_connections to 5000"
systems out there. But also on more sensibly configured ones that're
busy and want nice smooth performance without stalls.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes:
On 20 December 2016 at 21:59, Robert Haas <robertmhaas@gmail.com> wrote:
We could implement this by having
some process, like the background writer,
SendProcSignal(PROCSIG_HOUSEKEEPING) to every process in the system
every 10 minutes or so.
... on a rolling basis.
I don't understand why we'd make that a system-wide behavior at all,
rather than expecting each process to manage its own cache.
regards, tom lane
On Tue, Dec 20, 2016 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Craig Ringer <craig@2ndquadrant.com> writes:
On 20 December 2016 at 21:59, Robert Haas <robertmhaas@gmail.com> wrote:
We could implement this by having
some process, like the background writer,
SendProcSignal(PROCSIG_HOUSEKEEPING) to every process in the system
every 10 minutes or so.
... on a rolling basis.
I don't understand why we'd make that a system-wide behavior at all,
rather than expecting each process to manage its own cache.
Individual backends don't have a really great way to do time-based
stuff, do they? I mean, yes, there is enable_timeout() and friends,
but I think that requires quite a bit of bookkeeping.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Dec 20, 2016 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I don't understand why we'd make that a system-wide behavior at all,
rather than expecting each process to manage its own cache.
Individual backends don't have a really great way to do time-based
stuff, do they? I mean, yes, there is enable_timeout() and friends,
but I think that requires quite a bit of bookkeeping.
If I thought that "every ten minutes" was an ideal way to manage this,
I might worry about that, but it doesn't really sound promising at all.
Every so many queries would likely work better, or better yet make it
self-adaptive depending on how much is in the local syscache.
The bigger picture here though is that we used to have limits on syscache
size, and we got rid of them (commit 8b9bc234a, see also
/messages/by-id/5141.1150327541@sss.pgh.pa.us)
not only because of the problem you mentioned about performance falling
off a cliff once the working-set size exceeded the arbitrary limit, but
also because enforcing the limit added significant overhead --- and did so
whether or not you got any benefit from it, ie even if the limit is never
reached. Maybe the present patch avoids imposing a pile of overhead in
situations where no pruning is needed, but it doesn't really look very
promising from that angle in a quick once-over.
BTW, I don't see the point of the second patch at all? Surely, if
an object is deleted or updated, we already have code that flushes
related catcache entries. Otherwise the caches would deliver wrong
data.
regards, tom lane
On Tue, Dec 20, 2016 at 3:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Dec 20, 2016 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I don't understand why we'd make that a system-wide behavior at all,
rather than expecting each process to manage its own cache.
Individual backends don't have a really great way to do time-based
stuff, do they? I mean, yes, there is enable_timeout() and friends,
but I think that requires quite a bit of bookkeeping.
If I thought that "every ten minutes" was an ideal way to manage this,
I might worry about that, but it doesn't really sound promising at all.
Every so many queries would likely work better, or better yet make it
self-adaptive depending on how much is in the local syscache.
I don't think "every so many queries" is very promising at all.
First, it has the same problem as a fixed cap on the number of
entries: if you're doing a round-robin just slightly bigger than that
value, performance will be poor. Second, what's really important here
is to keep the percentage of wall-clock time spent populating the
system caches small. If a backend is doing 4000 queries/second and
each of those 4000 queries touches a different table, it really needs
a cache of at least 4000 entries or it will thrash and slow way down.
But if it's doing a query every 10 minutes and those queries
round-robin between 4000 different tables, it doesn't really need a
4000-entry cache. If those queries are long-running, the time to
repopulate the cache will only be a tiny fraction of runtime. If the
queries are short-running, then the effect is, percentage-wise, just
the same as for the high-volume system, but in practice it isn't
likely to be felt as much. I mean, if we keep a bunch of old cache
entries around on a mostly-idle backend, they are going to be pushed
out of CPU caches and maybe even paged out. One can't expect a
backend that is woken up after a long sleep to be quite as snappy as
one that's continuously active.
Which gets to my third point: anything that's based on number of
queries won't do anything to help the case where backends sometimes go
idle and sit there for long periods. Reducing resource utilization in
that case would be beneficial. Ideally I'd like to get rid of not
only the backend-local cache contents but the backend itself, but
that's a much harder project.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Thank you for the discussion.
At Tue, 20 Dec 2016 15:10:21 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in <23492.1482264621@sss.pgh.pa.us>
The bigger picture here though is that we used to have limits on syscache
size, and we got rid of them (commit 8b9bc234a, see also
/messages/by-id/5141.1150327541@sss.pgh.pa.us)
not only because of the problem you mentioned about performance falling
off a cliff once the working-set size exceeded the arbitrary limit, but
also because enforcing the limit added significant overhead --- and did so
whether or not you got any benefit from it, ie even if the limit is never
reached. Maybe the present patch avoids imposing a pile of overhead in
situations where no pruning is needed, but it doesn't really look very
promising from that angle in a quick once-over.
Indeed. As mentioned in the mail at the beginning of this thread,
it falls into a whole-cache scan if even one negative entry exists,
even one unrelated to the target relid, and that can take
significantly long on a fat cache.
Lists of negative entries, like CatCacheList, would help but need
additional memory.
BTW, I don't see the point of the second patch at all? Surely, if
an object is deleted or updated, we already have code that flushes
related catcache entries. Otherwise the caches would deliver wrong
data.
Maybe you've misread the patch. Negative entries currently won't be
flushed by any means. Deletion of a namespace causes cascaded
object deletion according to dependencies, which finally leads to
invalidation of the non-negative cache entries. But removal of
*negative entries* in RELNAMENSP won't happen.
The test script for this case (gen2.pl) does the following:
CREATE SCHEMA foo;
SELECT * FROM foo.invalid;
DROP SCHEMA foo;
Removing the schema foo leaves a negative cache entry for
'foo.invalid' in RELNAMENSP.
However, I'm not sure the above situation happens frequently enough
to be worth amending.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Wed, 21 Dec 2016 10:21:09 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20161221.102109.51106943.horiguchi.kyotaro@lab.ntt.co.jp>
At Tue, 20 Dec 2016 15:10:21 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in <23492.1482264621@sss.pgh.pa.us>
The bigger picture here though is that we used to have limits on syscache
size, and we got rid of them (commit 8b9bc234a, see also
/messages/by-id/5141.1150327541@sss.pgh.pa.us)
not only because of the problem you mentioned about performance falling
off a cliff once the working-set size exceeded the arbitrary limit, but
also because enforcing the limit added significant overhead --- and did so
whether or not you got any benefit from it, ie even if the limit is never
reached. Maybe the present patch avoids imposing a pile of overhead in
situations where no pruning is needed, but it doesn't really look very
promising from that angle in a quick once-over.
Indeed. As mentioned in the mail at the beginning of this thread,
it falls into a whole-cache scan if even one negative entry exists,
even one unrelated to the target relid, and that can take
significantly long on a fat cache.
Lists of negative entries, like CatCacheList, would help but need
additional memory.
BTW, I don't see the point of the second patch at all? Surely, if
an object is deleted or updated, we already have code that flushes
related catcache entries. Otherwise the caches would deliver wrong
data.
Maybe you've misread the patch. Negative entries currently won't be
flushed by any means. Deletion of a namespace causes cascaded
object deletion according to dependencies, which finally leads to
invalidation of the non-negative cache entries. But removal of
*negative entries* in RELNAMENSP won't happen.
The test script for this case (gen2.pl) does the following:
CREATE SCHEMA foo;
SELECT * FROM foo.invalid;
DROP SCHEMA foo;
Removing the schema foo leaves a negative cache entry for
'foo.invalid' in RELNAMENSP.
However, I'm not sure the above situation happens frequently enough
to be worth amending.
Since commit 1753b1b conflicts with this patch, I rebased it onto
the current master HEAD. I'll register this in the next CF.
The points of discussion are the following, I think.
1. The first patch seems to work well. It costs the time to scan
the whole of a catcache that has negative entries for other
reloids. However, such negative entries are created only by rather
unusual usage: accessing undefined columns, or accessing columns
on which no statistics have been created. The whole-catcache scan
occurs on ATTNAME, ATTNUM and STATRELATTINH for every invalidation
of a relcache entry.
2. The second patch also works, but flushing negative entries by
hash value is inefficient. It scans the bucket corresponding to
the given hash value for OIDs, then flushes negative entries by
iterating over all the collected OIDs. So this costs more time
than 1 and flushes entries that do not actually need to be
removed. If this feature is valuable but such side effects are
not acceptable, a new invalidation category based on a
cacheid-oid pair would be needed.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Dec 21, 2016 at 5:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
If I thought that "every ten minutes" was an ideal way to manage this,
I might worry about that, but it doesn't really sound promising at all.
Every so many queries would likely work better, or better yet make it
self-adaptive depending on how much is in the local syscache.
The bigger picture here though is that we used to have limits on syscache
size, and we got rid of them (commit 8b9bc234a, see also
/messages/by-id/5141.1150327541@sss.pgh.pa.us)
not only because of the problem you mentioned about performance falling
off a cliff once the working-set size exceeded the arbitrary limit, but
also because enforcing the limit added significant overhead --- and did so
whether or not you got any benefit from it, ie even if the limit is never
reached. Maybe the present patch avoids imposing a pile of overhead in
situations where no pruning is needed, but it doesn't really look very
promising from that angle in a quick once-over.
Have there ever been discussions about keeping catcache entries in a
shared memory area? This may not sound great performance-wise; I am
just wondering about the concept, and I cannot find references to such
discussions.
--
Michael
Michael Paquier <michael.paquier@gmail.com> writes:
Have there ever been discussions about keeping catcache entries in a
shared memory area? This may not sound great performance-wise; I am
just wondering about the concept, and I cannot find references to such
discussions.
I'm sure it's been discussed. Offhand I remember the following issues:
* A shared cache would create locking and contention overhead.
* A shared cache would have a very hard size limit, at least if it's
in SysV-style shared memory (perhaps DSM would let us relax that).
* Transactions that are doing DDL have a requirement for the catcache
to reflect changes that they've made locally but not yet committed,
so said changes mustn't be visible globally.
You could possibly get around the third point with a local catcache that's
searched before the shared one, but tuning that to be performant sounds
like a mess. Also, I'm not sure how such a structure could cope with
uncommitted deletions: delete A -> remove A from local catcache, but not
the shared one -> search for A in local catcache -> not found -> search
for A in shared catcache -> found -> oops.
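That failure sequence can be made concrete with a minimal model (invented names, purely illustrative, not a proposal): the uncommitted delete can only touch the local cache, since other backends must keep seeing A, so a local-then-shared search finds A again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one catalog object "A" with a local and a shared cache. */
static bool local_has_A = true;
static bool shared_has_A = true;    /* other backends must still see A */

/* Uncommitted DELETE: only the backend-local cache may be changed,
 * because the deletion must not be visible globally before commit. */
static void local_delete_A(void)
{
    local_has_A = false;
}

/* Local-then-shared lookup, as in the hypothetical two-level design. */
static bool lookup_A(void)
{
    if (local_has_A)
        return true;
    return shared_has_A;            /* falls through and finds A: oops */
}
```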
regards, tom lane
On Fri, Jan 13, 2017 at 8:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
Have there ever been discussions about keeping catcache entries in a
shared memory area? This may not sound great performance-wise; I am
just wondering about the concept, and I cannot find references to such
discussions.
I'm sure it's been discussed. Offhand I remember the following issues:
* A shared cache would create locking and contention overhead.
* A shared cache would have a very hard size limit, at least if it's
in SysV-style shared memory (perhaps DSM would let us relax that).
* Transactions that are doing DDL have a requirement for the catcache
to reflect changes that they've made locally but not yet committed,
so said changes mustn't be visible globally.
You could possibly get around the third point with a local catcache that's
searched before the shared one, but tuning that to be performant sounds
like a mess. Also, I'm not sure how such a structure could cope with
uncommitted deletions: delete A -> remove A from local catcache, but not
the shared one -> search for A in local catcache -> not found -> search
for A in shared catcache -> found -> oops.
I think the first of those concerns is the key one. If searching the
system catalogs costs $100 and searching the private catcache costs
$1, what's the cost of searching a hypothetical shared catcache? If
the answer is $80, it's not worth doing. If the answer is $5, it's
probably still not worth doing. If the answer is $1.25, then it's
probably worth investing some energy into trying to solve the other
problems you list. For some users, the memory cost of catcache and
syscache entries multiplied by N backends are a very serious problem,
so it would be nice to have some other options. But we do so many
syscache lookups that a shared cache won't be viable unless it's
almost as fast as a backend-private cache, or at least that's my
hunch.
I think it would be interested for somebody to build a prototype here
that ignores all the problems but the first and uses some
straightforward, relatively unoptimized locking strategy for the first
problem. Then benchmark it. If the results show that the idea has
legs, then we can try to figure out what a real implementation would
look like.
(One possible approach: use Thomas Munro's DHT stuff to build the shared cache.)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Jan 14, 2017 at 12:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jan 13, 2017 at 8:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
Have there ever been discussions about keeping catcache entries in a
shared memory area? This may not sound great performance-wise; I am
just wondering about the concept, and I cannot find references to such
discussions.
I'm sure it's been discussed. Offhand I remember the following issues:
* A shared cache would create locking and contention overhead.
* A shared cache would have a very hard size limit, at least if it's
in SysV-style shared memory (perhaps DSM would let us relax that).
* Transactions that are doing DDL have a requirement for the catcache
to reflect changes that they've made locally but not yet committed,
so said changes mustn't be visible globally.
You could possibly get around the third point with a local catcache that's
searched before the shared one, but tuning that to be performant sounds
like a mess. Also, I'm not sure how such a structure could cope with
uncommitted deletions: delete A -> remove A from local catcache, but not
the shared one -> search for A in local catcache -> not found -> search
for A in shared catcache -> found -> oops.
I think the first of those concerns is the key one. If searching the
system catalogs costs $100 and searching the private catcache costs
$1, what's the cost of searching a hypothetical shared catcache? If
the answer is $80, it's not worth doing. If the answer is $5, it's
probably still not worth doing. If the answer is $1.25, then it's
probably worth investing some energy into trying to solve the other
problems you list. For some users, the memory cost of catcache and
syscache entries multiplied by N backends are a very serious problem,
so it would be nice to have some other options. But we do so many
syscache lookups that a shared cache won't be viable unless it's
almost as fast as a backend-private cache, or at least that's my
hunch.
Being able to switch from one mode to another would be interesting.
Applications using extensive DDL that requires changing the catcache
with an exclusive lock would clearly pay the lock contention cost, but
do you think that would really be the case for a shared lock? A bunch
of applications that I work with deploy Postgres once, then don't
change the schema except when an upgrade happens, so this would be
beneficial for them. There are even some apps that do not use
pgbouncer, but drop sessions after a timeout of inactivity to avoid
memory bloat because of the problem of this thread. That won't solve
the problem of local catcache bloat, but some users using few DDLs may
be fine paying some extra concurrency cost if session handling gets
easier.
I think it would be interested for somebody to build a prototype here
that ignores all the problems but the first and uses some
straightforward, relatively unoptimized locking strategy for the first
problem. Then benchmark it. If the results show that the idea has
legs, then we can try to figure out what a real implementation would
look like.
(One possible approach: use Thomas Munro's DHT stuff to build the shared cache.)
Yeah, I'd bet on a couple of days of focus to sort that out.
--
Michael
Michael Paquier <michael.paquier@gmail.com> writes:
... There are even some apps that do not use pgbouncer, but drop
sessions after a timeout of inactivity to avoid a memory bloat because
of the problem of this thread.
Yeah, a certain company I used to work for had to do that, though their
problem had more to do with bloat in plpgsql's compiled-functions cache
(and ensuing bloat in the plancache), I believe.
Still, I'm pretty suspicious of anything that will add overhead to
catcache lookups. If you think the performance of those is not absolutely
critical, turning off the caches via -DCLOBBER_CACHE_ALWAYS will soon
disabuse you of the error.
I'm inclined to think that a more profitable direction to look in is
finding a way to limit the cache size. I know we got rid of exactly that
years ago, but the problems with it were (a) the mechanism was itself
pretty expensive --- a global-to-all-caches LRU list IIRC, and (b) there
wasn't a way to tune the limit. Possibly somebody can think of some
cheaper, perhaps less precise way of aging out old entries. As for
(b), this is the sort of problem we made GUCs for.
But, again, the catcache isn't the only source of per-process bloat
and I'm not even sure it's the main one. A more holistic approach
might be called for.
regards, tom lane
Hi,
On 2017-01-13 17:58:41 -0500, Tom Lane wrote:
But, again, the catcache isn't the only source of per-process bloat
and I'm not even sure it's the main one. A more holistic approach
might be called for.
It'd be helpful if we'd find a way to make it easy to get statistics
about the size of various caches in production systems. Right now that's
kinda hard, resulting in us having to make a lot of guesses...
Andres
On 01/14/2017 12:06 AM, Andres Freund wrote:
Hi,
On 2017-01-13 17:58:41 -0500, Tom Lane wrote:
But, again, the catcache isn't the only source of per-process bloat
and I'm not even sure it's the main one. A more holistic approach
might be called for.
It'd be helpful if we'd find a way to make it easy to get statistics
about the size of various caches in production systems. Right now
that's kinda hard, resulting in us having to make a lot of
guesses...
What about a simple C extension that could inspect those caches?
Assuming it could be loaded into a single backend, that should be a
relatively acceptable way (compared to loading it into all backends
via shared_preload_libraries).
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Jan 14, 2017 at 9:36 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
On 01/14/2017 12:06 AM, Andres Freund wrote:
On 2017-01-13 17:58:41 -0500, Tom Lane wrote:
But, again, the catcache isn't the only source of per-process bloat
and I'm not even sure it's the main one. A more holistic approach
might be called for.
It'd be helpful if we'd find a way to make it easy to get statistics
about the size of various caches in production systems. Right now
that's kinda hard, resulting in us having to make a lot of
guesses...
What about a simple C extension that could inspect those caches?
Assuming it could be loaded into a single backend, that should be a
relatively acceptable way (compared to loading it into all backends
via shared_preload_libraries).
This extension could do a small amount of work on a portion of the
syscache entries at each query loop, but I am wondering whether it
would not be nicer to get that in core and configurable, which is
basically the approach proposed by Horiguchi-san. At least it seems to
me that it has some merit, and if we could make that behavior
switchable, disabled by default, that would be a win for some class of
applications. What do others think?
--
Michael
On 12/26/16 2:31 AM, Kyotaro HORIGUCHI wrote:
The points of discussion are the following, I think.
1. The first patch seems to work well. It costs the time to scan
the whole of a catcache that has negative entries for other
reloids. However, such negative entries are created only by rather
unusual usage: accessing undefined columns, or accessing columns
on which no statistics have been created. The whole-catcache scan
occurs on ATTNAME, ATTNUM and STATRELATTINH for every invalidation
of a relcache entry.
I took a look at this. It looks sane, though I've got a few minor
comment tweaks:
+ * Remove negative cache tuples maching a partial key.
s/maching/matching/
+/* searching with a paritial key needs scanning the whole cache */
s/needs/means/
+ * a negative cache entry cannot be referenced so we can remove
s/referenced/referenced,/
I was wondering if there's a way to test the performance impact of
deleting negative entries.
2. The second patch also works, but flushing negative entries by
hash value is inefficient. It scans the bucket corresponding to
the given hash value for OIDs, then flushes negative entries by
iterating over all the collected OIDs. So this costs more time
than 1 and flushes entries that do not actually need to be
removed. If this feature is valuable but such side effects are
not acceptable, a new invalidation category based on a
cacheid-oid pair would be needed.
I glanced at this and it looks sane. Didn't go any farther since this
one's pretty up in the air. ISTM it'd be better to do some kind of aging
instead of patch 2.
The other (possibly naive) question I have is how useful negative
entries really are. Will Postgres regularly incur negative lookups, or
will these only happen due to user activity?
where an app would need to depend on fast negative lookup (in other
words, it should be considered a bug in the app). I can see where
getting rid of them completely might be problematic, but maybe we can
just keep a relatively small number of them around. I'm thinking a
simple LRU list of X number of negative entries; when that fills you
reuse the oldest one. You'd have to pay the LRU maintenance cost on
every negative hit, but if those shouldn't be that common it shouldn't
be bad.
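A minimal sketch of that idea (names and sizes invented, not a patch): a fixed array of negative-entry slots with a logical clock; a repeat hit refreshes the slot's timestamp, and when the array is full the least recently used slot is recycled.

```c
#include <assert.h>
#include <string.h>

#define NEG_MAX 3               /* the "X" cap, presumably a GUC */

typedef struct NegSlot
{
    char          key[32];
    unsigned long last_used;    /* logical clock value of the last hit */
    int           used;
} NegSlot;

static NegSlot       neg[NEG_MAX];
static unsigned long lru_clock;

/* Record a negative lookup result, recycling the LRU slot when full. */
static void negative_hit(const char *key)
{
    int victim = -1;

    /* repeat hit: just refresh the timestamp (the LRU maintenance cost) */
    for (int i = 0; i < NEG_MAX; i++)
        if (neg[i].used && strcmp(neg[i].key, key) == 0)
        {
            neg[i].last_used = ++lru_clock;
            return;
        }

    /* pick a free slot if any, else the oldest one */
    for (int i = 0; i < NEG_MAX; i++)
    {
        if (!neg[i].used)
        {
            victim = i;
            break;
        }
        if (victim < 0 || neg[i].last_used < neg[victim].last_used)
            victim = i;
    }

    strncpy(neg[victim].key, key, sizeof(neg[victim].key) - 1);
    neg[victim].key[sizeof(neg[victim].key) - 1] = '\0';
    neg[victim].used = 1;
    neg[victim].last_used = ++lru_clock;
}

static int negative_cached(const char *key)
{
    for (int i = 0; i < NEG_MAX; i++)
        if (neg[i].used && strcmp(neg[i].key, key) == 0)
            return 1;
    return 0;
}
```

With a cap of 3, hitting a, b, c, then refreshing a and adding d recycles b, the oldest slot.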
That might well necessitate another GUC, but it seems a lot simpler than
most of the other ideas.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
The other (possibly naive) question I have is how useful negative
entries really are. Will Postgres regularly incur negative lookups, or
will these only happen due to user activity?
It varies depending on the particular syscache, but in at least some
of them, negative cache entries are critical for performance.
See for example RelnameGetRelid(), which basically does a RELNAMENSP
cache lookup for each schema down the search path until it finds a
match. For any user table name with the standard search_path, there's
a guaranteed failure in pg_catalog before you can hope to find a match.
If we don't have negative cache entries, then *every invocation of this
function has to go to disk* (or at least to shared buffers).
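A toy model of that probe pattern (the catalog contents, counter, and lowercase function names are all invented for illustration; only negative results are cached, to isolate their effect): with search_path = pg_catalog, public, every lookup of a user table first misses in pg_catalog, and the negative entry is what spares the repeated catalog access.

```c
#include <assert.h>
#include <string.h>

static int catalog_reads;       /* counts simulated disk/buffer accesses */

/* Pretend system catalog: only public.t1 exists. */
static int catalog_lookup(const char *schema, const char *rel)
{
    catalog_reads++;            /* a real miss would hit disk or shared buffers */
    return strcmp(schema, "public") == 0 && strcmp(rel, "t1") == 0;
}

/* One cached negative result is enough for this demo. */
static char neg_schema[64], neg_rel[64];
static int  have_neg;

static int lookup_with_cache(const char *schema, const char *rel)
{
    if (have_neg &&
        strcmp(neg_schema, schema) == 0 && strcmp(neg_rel, rel) == 0)
        return 0;               /* cached "does not exist": no catalog read */
    if (catalog_lookup(schema, rel))
        return 1;
    strcpy(neg_schema, schema); /* remember the failure */
    strcpy(neg_rel, rel);
    have_neg = 1;
    return 0;
}

/* search_path = pg_catalog, public; returns a fake "relid" or 0. */
static int relname_get_relid(const char *rel)
{
    if (lookup_with_cache("pg_catalog", rel))
        return 1;
    if (lookup_with_cache("public", rel))
        return 2;
    return 0;
}
```

On the first lookup of t1 both probes reach the catalog; on the second, the pg_catalog miss is answered from the negative entry, so only the successful public probe costs a catalog read.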
It's possible that we could revise all our lookup patterns to avoid this
sort of thing. But I don't have much faith in that always being possible,
and exactly none that we won't introduce new lookup patterns that need it
in future. I spent some time, for instance, wondering if RelnameGetRelid
could use a SearchSysCacheList lookup instead, doing the lookup on table
name only and then inspecting the whole list to see which entry is
frontmost according to the current search path. But that has performance
failure modes of its own, for example if you have identical table names in
a boatload of different schemas. We do it that way for some other cases
such as function lookups, but I think it's much less likely that people
have identical function names in N schemas than that they have identical
table names in N schemas.
If you want to poke into this for particular test scenarios, building with
CATCACHE_STATS defined will yield a bunch of numbers dumped to the
postmaster log at each backend exit.
regards, tom lane
On 1/21/17 8:54 PM, Tom Lane wrote:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
The other (possibly naive) question I have is how useful negative
entries really are. Will Postgres regularly incur negative lookups, or
will these only happen due to user activity?
It varies depending on the particular syscache, but in at least some
of them, negative cache entries are critical for performance.
See for example RelnameGetRelid(), which basically does a RELNAMENSP
cache lookup for each schema down the search path until it finds a
match.
Ahh, I hadn't considered that. So one idea would be to only track
negative entries on caches where we know they're actually useful. That
might make the performance hit of some of the other ideas more
tolerable. Presumably you're much less likely to pollute the namespace
cache than some of the others.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)