stats_ext test fails with -DCATCACHE_FORCE_RELEASE

Started by Amit Langoteover 7 years ago5 messages
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
1 attachment(s)

Hi.

While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
stats_ext test failed with errors for multiple statements that looked like
this:

ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)

I figured it's because statext_dependencies_load() and
statext_ndistinct_build() both return a pointer that points directly into
a pg_statistics_ext tuple obtained by using SearchSysCache1 that has
uncertain lifetime after subsequent call to ReleaseSysCache before returning.

I think we should be calling statext_dependencies_deserialize() and
statext_ndistinct_deserialize(), respectively, *before* we perform
ReleaseSysCache on the tuple. Attached patch does that.

Thanks,
Amit

Attachments:

extended-stat-catalog-datum-1.patchtext/plain; charset=UTF-8; name=extended-stat-catalog-datum-1.patchDownload
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index d7305b7ab3..cbf3221ce9 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -631,6 +631,7 @@ dependency_implies_attribute(MVDependency *dependency, AttrNumber attnum)
 MVDependencies *
 statext_dependencies_load(Oid mvoid)
 {
+	MVDependencies *result;
 	bool		isnull;
 	Datum		deps;
 	HeapTuple	htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(mvoid));
@@ -642,9 +643,11 @@ statext_dependencies_load(Oid mvoid)
 						   Anum_pg_statistic_ext_stxdependencies, &isnull);
 	Assert(!isnull);
 
+	result = statext_dependencies_deserialize(DatumGetByteaP(deps));
+
 	ReleaseSysCache(htup);
 
-	return statext_dependencies_deserialize(DatumGetByteaP(deps));
+	return result;
 }
 
 /*
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 24e74d7a1b..069d3cd6ff 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -126,6 +126,7 @@ statext_ndistinct_build(double totalrows, int numrows, HeapTuple *rows,
 MVNDistinct *
 statext_ndistinct_load(Oid mvoid)
 {
+	MVNDistinct *result;
 	bool		isnull = false;
 	Datum		ndist;
 	HeapTuple	htup;
@@ -141,9 +142,11 @@ statext_ndistinct_load(Oid mvoid)
 			 "requested statistic kind %c is not yet built for statistics object %u",
 			 STATS_EXT_NDISTINCT, mvoid);
 
+	result = statext_ndistinct_deserialize(DatumGetByteaP(ndist));
+
 	ReleaseSysCache(htup);
 
-	return statext_ndistinct_deserialize(DatumGetByteaP(ndist));
+	return result;
 }
 
 /*
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#1)
Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
stats_ext test failed with errors for multiple statements that looked like
this:
ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)

Interesting. How come the buildfarm hasn't noticed this? I should
think that the CLOBBER_CACHE_ALWAYS animals, as well as the one(s)
using -DCATCACHE_FORCE_RELEASE, would have shown failures.

regards, tom lane

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#2)
Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE

On 2018/05/02 0:33, Tom Lane wrote:

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
stats_ext test failed with errors for multiple statements that looked like
this:
ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)

Interesting. How come the buildfarm hasn't noticed this? I should
think that the CLOBBER_CACHE_ALWAYS animals, as well as the one(s)
using -DCATCACHE_FORCE_RELEASE, would have shown failures.

I too wondered why. Fwiw, similar failure occurs in PG 10 branch.

Thanks,
Amit

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#3)
Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

On 2018/05/02 0:33, Tom Lane wrote:

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
stats_ext test failed with errors for multiple statements that looked like
this:
ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)

Interesting. How come the buildfarm hasn't noticed this? I should
think that the CLOBBER_CACHE_ALWAYS animals, as well as the one(s)
using -DCATCACHE_FORCE_RELEASE, would have shown failures.

I too wondered why. Fwiw, similar failure occurs in PG 10 branch.

Ah, after looking closer I understand that. First, there isn't any
buildfarm member using CATCACHE_FORCE_RELEASE --- what I was thinking
of is Andrew's prion, which uses RELCACHE_FORCE_RELEASE. Not the
same thing.

Second, the nature of the bug is that these functions are reading
from a catcache entry immediately after ReleaseSysCache, when they
should do that immediately before. CLOBBER_CACHE_ALWAYS does not
trigger the problem because it clobbers cache only at invalidation
opportunities. In the current implementation, ReleaseSysCache per
se is not an invalidation opportunity.

CATCACHE_FORCE_RELEASE does model a real-world hazard, which is
that if we get an invalidation signal for a catcache item *while
it's pinned*, it'd go away as soon as the last pin is released.
Evidently, these code paths do not contain any invalidation
opportunities occurring while the pin on the stats_ext catcache
entry is already held, so CCA can't trigger the problem. I think
this means that there's no production hazard here, just a violation
of coding convention. Nonetheless, we certainly should fix it,
since it's easy to imagine future changes that would create a live
hazard of the tuple going away during the ReleaseSysCache call.

tl;dr: we lack buildfarm coverage of CATCACHE_FORCE_RELEASE.
This is probably bad. It might be okay to just add that to
prion's configuration; I'm not sure whether there's any value
in testing it separately from RELCACHE_FORCE_RELEASE.

regards, tom lane

#5Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE

On Wed, May 2, 2018 at 11:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

tl;dr: we lack buildfarm coverage of CATCACHE_FORCE_RELEASE.
This is probably bad. It might be okay to just add that to
prion's configuration;

Will do that pronto.

cheers

andre

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services