valgrind vs. shared typmod registry

Started by Tomas Vondraover 8 years ago7 messages
#1Tomas Vondra
tomas.vondra@2ndquadrant.com
1 attachment(s)

Hi,

I've been running some regression tests under valgrind, and it seems
select_parallel triggers some uses of uninitialized values in dshash. If
I'm reading the reports right, it complains about hashtable->size_log2
being not being initialized in ensure_valid_bucket_pointers.

I've been running tests under valgrind not too long ago and I don't
recall such failures, so perhaps something broke it in the past few days.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

valgrind.logtext/x-log; name=valgrind.logDownload
#2Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Tomas Vondra (#1)
Re: valgrind vs. shared typmod registry

On Sun, Sep 17, 2017 at 12:30 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I've been running some regression tests under valgrind, and it seems
select_parallel triggers some uses of uninitialized values in dshash. If
I'm reading the reports right, it complains about hashtable->size_log2
being not being initialized in ensure_valid_bucket_pointers.

Thanks. Will investigate.

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Tomas Vondra (#1)
Re: valgrind vs. shared typmod registry

On Sat, Sep 16, 2017 at 5:30 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I've been running tests under valgrind not too long ago and I don't
recall such failures, so perhaps something broke it in the past few days.

That's what we have the buildfarm animal Skink for. It has indeed been
failing within select_parallel only following the commit that you
mentioned:

https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=skink&amp;br=HEAD

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#2)
Re: valgrind vs. shared typmod registry

On Sun, Sep 17, 2017 at 7:42 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Sun, Sep 17, 2017 at 12:30 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I've been running some regression tests under valgrind, and it seems
select_parallel triggers some uses of uninitialized values in dshash. If
I'm reading the reports right, it complains about hashtable->size_log2
being not being initialized in ensure_valid_bucket_pointers.

Thanks. Will investigate.

Yeah, it's a bug, I simply failed to initialise it.
ensure_valid_bucket_pointers() immediately fixes the problem (unless
the uninitialised memory had an unlikely value), explaining why it
works anyway. I'm a bit tied up today but will test and post a patch
tomorrow.

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#4)
1 attachment(s)
Re: valgrind vs. shared typmod registry

On Sun, Sep 17, 2017 at 8:49 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Sun, Sep 17, 2017 at 7:42 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Sun, Sep 17, 2017 at 12:30 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

I've been running some regression tests under valgrind, and it seems
select_parallel triggers some uses of uninitialized values in dshash. If
I'm reading the reports right, it complains about hashtable->size_log2
being not being initialized in ensure_valid_bucket_pointers.

Thanks. Will investigate.

Yeah, it's a bug, I simply failed to initialise it.
ensure_valid_bucket_pointers() immediately fixes the problem (unless
the uninitialised memory had an unlikely value), explaining why it
works anyway. I'm a bit tied up today but will test and post a patch
tomorrow.

Here is a patch to fix that.

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

0001-Fix-uninitialized-variable-in-dshash.c.patchapplication/octet-stream; name=0001-Fix-uninitialized-variable-in-dshash.c.patchDownload
From b11ef06d78758cc0446006f43cdf98d95250edb1 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 18 Sep 2017 17:28:05 +1200
Subject: [PATCH] Fix uninitialized variable in dshash.c.

A bugfix for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20.  The code
would have crashed if hashtable->size_log2 ever had the same value as
hashtable->control->size_log2 by coincidence.

Per Coverity.

Author: Thomas Munro
Reported-By: Tomas Vondra
Discussion: https://postgr.es/m/e72fb33c-4f31-f276-e972-263d9b59554d%402ndquadrant.com
---
 src/backend/lib/dshash.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/backend/lib/dshash.c b/src/backend/lib/dshash.c
index 448e0587253..dd875730670 100644
--- a/src/backend/lib/dshash.c
+++ b/src/backend/lib/dshash.c
@@ -249,6 +249,7 @@ dshash_create(dsa_area *area, const dshash_parameters *params, void *arg)
 	}
 	hash_table->buckets = dsa_get_address(area,
 										  hash_table->control->buckets);
+	hash_table->size_log2 = hash_table->control->size_log2;
 
 	return hash_table;
 }
@@ -280,6 +281,14 @@ dshash_attach(dsa_area *area, const dshash_parameters *params,
 	hash_table->find_exclusively_locked = false;
 	Assert(hash_table->control->magic == DSHASH_MAGIC);
 
+	/*
+	 * These will later be set to the correct values by
+	 * ensure_valid_bucket_pointers(), at which time we'll be holding a
+	 * partition lock for interlocking against concurrent resizing.
+	 */
+	hash_table->buckets = NULL;
+	hash_table->size_log2 = 0;
+
 	return hash_table;
 }
 
-- 
2.13.2

#6Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#5)
1 attachment(s)
Re: valgrind vs. shared typmod registry

On Mon, Sep 18, 2017 at 5:39 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Here is a patch to fix that.

Here's a better one (same code, corrected commit message).

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

0001-Fix-uninitialized-variable-in-dshash.c.patchapplication/octet-stream; name=0001-Fix-uninitialized-variable-in-dshash.c.patchDownload
From 0b4996a971e36bfe6ac2d8631b4dd0579dbd391c Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Mon, 18 Sep 2017 17:28:05 +1200
Subject: [PATCH] Fix uninitialized variable in dshash.c.

A bugfix for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20.  The code
would have crashed if hashtable->size_log2 ever had the same value as
hashtable->control->size_log2 by coincidence.

Per Valgrind.

Author: Thomas Munro
Reported-By: Tomas Vondra
Discussion: https://postgr.es/m/e72fb33c-4f31-f276-e972-263d9b59554d%402ndquadrant.com
---
 src/backend/lib/dshash.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/backend/lib/dshash.c b/src/backend/lib/dshash.c
index 448e0587253..dd875730670 100644
--- a/src/backend/lib/dshash.c
+++ b/src/backend/lib/dshash.c
@@ -249,6 +249,7 @@ dshash_create(dsa_area *area, const dshash_parameters *params, void *arg)
 	}
 	hash_table->buckets = dsa_get_address(area,
 										  hash_table->control->buckets);
+	hash_table->size_log2 = hash_table->control->size_log2;
 
 	return hash_table;
 }
@@ -280,6 +281,14 @@ dshash_attach(dsa_area *area, const dshash_parameters *params,
 	hash_table->find_exclusively_locked = false;
 	Assert(hash_table->control->magic == DSHASH_MAGIC);
 
+	/*
+	 * These will later be set to the correct values by
+	 * ensure_valid_bucket_pointers(), at which time we'll be holding a
+	 * partition lock for interlocking against concurrent resizing.
+	 */
+	hash_table->buckets = NULL;
+	hash_table->size_log2 = 0;
+
 	return hash_table;
 }
 
-- 
2.13.2

#7Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#6)
Re: valgrind vs. shared typmod registry

Hi,

On 2017-09-18 18:04:36 +1200, Thomas Munro wrote:

On Mon, Sep 18, 2017 at 5:39 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Here is a patch to fix that.

Here's a better one (same code, corrected commit message).

Pushed. For a second I was tempted to also replace the
palloc(sizeof(dshash_table)) with a palloc0 - but in the end it seems
actually not too bad either to be able to catch bugs like this with some
help. If you have a strong opinion either way...

Thanks,

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers