Configure with thread sanitizer fails the thread test

Started by Ewan Higgsover 10 years ago5 messageshackers
Jump to latest
#1Ewan Higgs
ewan_higgs@yahoo.co.uk

Hi all,I throught I would try sqlsmith with more sanitizers. So, I tried the ThreadSanitizer. Building off HEAD (522400a5198f63823406202e18fdaf3621619d98), it fails to complete the thread check since the thread test because there are some volatile ints being accessed between threads[1]Failed test in configure yields: ==================                                                              WARNING: ThreadSanitizer: data race (pid=31420)                                   Write of size 4 at 0x000000602100 by thread T2:                                   #0 func_call_2 src/test/thread/thread_test.c:416 (conftest+0x000000400db9)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Previous read of size 4 at 0x000000602100 by thread T1:                           #0 func_call_1 src/test/thread/thread_test.c:326 (conftest+0x000000400f3a)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Location is global 'errno2_set' of size 4 at 0x000000602100 (conftest+0x000000602100)                                                                                   Thread T2 (tid=31423, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:165 (conftest+0x0000004010e1)                                                                                           Thread T1 (tid=31422, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)                                                                                         SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:416 func_call_2 ==================                                                              WARNING: ThreadSanitizer: data race (pid=31420)                                   Read of size 4 at 0x000000602104 by thread T2:                                    #0 func_call_2 src/test/thread/thread_test.c:417 (conftest+0x000000400dcd)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Previous write of size 4 at 0x000000602104 by thread T1:                          #0 func_call_1 src/test/thread/thread_test.c:325 (conftest+0x000000400f26)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Location is global 'errno1_set' of size 4 at 0x000000602104 (conftest+0x000000602104)                                                                                   Thread T2 (tid=31423, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:165 (conftest+0x0000004010e1)                                                                                           Thread T1 (tid=31422, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)                                                                                         SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:417 func_call_2 ==================                                                              errno not thread-safe **                                                        exiting                                                                         ==================                                                              WARNING: ThreadSanitizer: data race (pid=31420)                                   Write of size 4 at 0x00000060210c by thread T1:                                   #0 func_call_1 src/test/thread/thread_test.c:381 (conftest+0x000000400fc8)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Previous read of size 4 at 0x00000060210c by main thread (mutexes: write M6):     #0 main src/test/thread/thread_test.c:176 (conftest+0x000000401123)                                                                                           Location is global 'thread1_done' of size 4 at 0x00000060210c (conftest+0x00000060210c)                                                                                   Mutex M6 (0x000000602120) created at:                                             #0 pthread_mutex_lock <null> (libtsan.so.0+0x000000037776)                      #1 main src/test/thread/thread_test.c:156 (conftest+0x000000401066)                                                                                           Thread T1 (tid=31422, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)                                                                                         SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:381 func_call_1 ==================                                                              ThreadSanitizer: reported 3 warnings . Volatile prevents caching of the variable in each thread context but it doesn't provide all the guarantees needed for multithreaded programming. This should probably use _Atomic (C11) on platforms with C11 compilers and whatever else is needed for platforms that refuse to update their compilers past C89.

So I changed volatile to _Atomic and continued (patch is in thread_test_atomic.patch). I then ran it against sqlsmith. The good news: I didn't happen to find any problems in normal use. The bad news: I did find a lot of warnings about improper use of functions like malloc and free from signal handlers. I attached the log under 'errors.log'.
I haven't investigated this in depth yet since I figured if I flagged it on the list someone might be in a better position to determine if it's worth fixing and what the fix might be.
My configure flags:./configure CFLAGS='-O1 -g -fsanitize=thread -fno-omit-frame-pointer -fno-optimize-sibling-calls' LDFLAGS='-ltsan -ldl' --enable-cassert --prefix=$HOME/pkg

Yours,Ewan

[1]: Failed test in configure yields: ==================                                                              WARNING: ThreadSanitizer: data race (pid=31420)                                   Write of size 4 at 0x000000602100 by thread T2:                                   #0 func_call_2 src/test/thread/thread_test.c:416 (conftest+0x000000400db9)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Previous read of size 4 at 0x000000602100 by thread T1:                           #0 func_call_1 src/test/thread/thread_test.c:326 (conftest+0x000000400f3a)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Location is global 'errno2_set' of size 4 at 0x000000602100 (conftest+0x000000602100)                                                                                   Thread T2 (tid=31423, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:165 (conftest+0x0000004010e1)                                                                                           Thread T1 (tid=31422, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)                                                                                         SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:416 func_call_2 ==================                                                              WARNING: ThreadSanitizer: data race (pid=31420)                                   Read of size 4 at 0x000000602104 by thread T2:                                    #0 func_call_2 src/test/thread/thread_test.c:417 (conftest+0x000000400dcd)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Previous write of size 4 at 0x000000602104 by thread T1:                          #0 func_call_1 src/test/thread/thread_test.c:325 (conftest+0x000000400f26)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Location is global 'errno1_set' of size 4 at 0x000000602104 (conftest+0x000000602104)                                                                                   Thread T2 (tid=31423, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:165 (conftest+0x0000004010e1)                                                                                           Thread T1 (tid=31422, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)                                                                                         SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:417 func_call_2 ==================                                                              errno not thread-safe **                                                        exiting                                                                         ==================                                                              WARNING: ThreadSanitizer: data race (pid=31420)                                   Write of size 4 at 0x00000060210c by thread T1:                                   #0 func_call_1 src/test/thread/thread_test.c:381 (conftest+0x000000400fc8)      #1 <null> <null> (libtsan.so.0+0x0000000235b9)                                                                                                                Previous read of size 4 at 0x00000060210c by main thread (mutexes: write M6):     #0 main src/test/thread/thread_test.c:176 (conftest+0x000000401123)                                                                                           Location is global 'thread1_done' of size 4 at 0x00000060210c (conftest+0x00000060210c)                                                                                   Mutex M6 (0x000000602120) created at:                                             #0 pthread_mutex_lock <null> (libtsan.so.0+0x000000037776)                      #1 main src/test/thread/thread_test.c:156 (conftest+0x000000401066)                                                                                           Thread T1 (tid=31422, running) created by main thread at:                         #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                          #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)                                                                                         SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:381 func_call_1 ==================                                                              ThreadSanitizer: reported 3 warnings 
==================                                                             
WARNING: ThreadSanitizer: data race (pid=31420)                                
  Write of size 4 at 0x000000602100 by thread T2:                              
    #0 func_call_2 src/test/thread/thread_test.c:416 (conftest+0x000000400db9) 
    #1 <null> <null> (libtsan.so.0+0x0000000235b9)                             
                                                                               
  Previous read of size 4 at 0x000000602100 by thread T1:                      
    #0 func_call_1 src/test/thread/thread_test.c:326 (conftest+0x000000400f3a) 
    #1 <null> <null> (libtsan.so.0+0x0000000235b9)                             
                                                                               
  Location is global 'errno2_set' of size 4 at 0x000000602100 (conftest+0x000000602100)
                                                                               
  Thread T2 (tid=31423, running) created by main thread at:                    
    #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                     
    #1 main src/test/thread/thread_test.c:165 (conftest+0x0000004010e1)        
                                                                               
  Thread T1 (tid=31422, running) created by main thread at:                    
    #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                     
    #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)        
                                                                               
SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:416 func_call_2
==================                                                             
WARNING: ThreadSanitizer: data race (pid=31420)                                
  Read of size 4 at 0x000000602104 by thread T2:                               
    #0 func_call_2 src/test/thread/thread_test.c:417 (conftest+0x000000400dcd) 
    #1 <null> <null> (libtsan.so.0+0x0000000235b9)                             
                                                                               
  Previous write of size 4 at 0x000000602104 by thread T1:                     
    #0 func_call_1 src/test/thread/thread_test.c:325 (conftest+0x000000400f26) 
    #1 <null> <null> (libtsan.so.0+0x0000000235b9)                             
                                                                               
  Location is global 'errno1_set' of size 4 at 0x000000602104 (conftest+0x000000602104)
                                                                               
  Thread T2 (tid=31423, running) created by main thread at:                    
    #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                     
    #1 main src/test/thread/thread_test.c:165 (conftest+0x0000004010e1)        
                                                                               
  Thread T1 (tid=31422, running) created by main thread at:                    
    #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                     
    #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)        
                                                                               
SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:417 func_call_2
==================                                                             
errno not thread-safe **                                                       
exiting                                                                        
==================                                                             
WARNING: ThreadSanitizer: data race (pid=31420)                                
  Write of size 4 at 0x00000060210c by thread T1:                              
    #0 func_call_1 src/test/thread/thread_test.c:381 (conftest+0x000000400fc8) 
    #1 <null> <null> (libtsan.so.0+0x0000000235b9)                             
                                                                               
  Previous read of size 4 at 0x00000060210c by main thread (mutexes: write M6):
    #0 main src/test/thread/thread_test.c:176 (conftest+0x000000401123)        
                                                                               
  Location is global 'thread1_done' of size 4 at 0x00000060210c (conftest+0x00000060210c)
                                                                               
  Mutex M6 (0x000000602120) created at:                                        
    #0 pthread_mutex_lock <null> (libtsan.so.0+0x000000037776)                 
    #1 main src/test/thread/thread_test.c:156 (conftest+0x000000401066)        
                                                                               
  Thread T1 (tid=31422, running) created by main thread at:                    
    #0 pthread_create <null> (libtsan.so.0+0x000000027a67)                     
    #1 main src/test/thread/thread_test.c:158 (conftest+0x00000040107e)        
                                                                               
SUMMARY: ThreadSanitizer: data race src/test/thread/thread_test.c:381 func_call_1
==================                                                             
ThreadSanitizer: reported 3 warnings 

Attachments:

errors.logtext/plainDownload
thread_test_atomic.patchtext/x-patchDownload+5-4
#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Ewan Higgs (#1)
Re: Configure with thread sanitizer fails the thread test

Ewan Higgs wrote:

So I changed volatile to _Atomic and continued (patch is in
thread_test_atomic.patch). I then ran it against sqlsmith. The good
news: I didn't happen to find any problems in normal use. The bad
news: I did find a lot of warnings about improper use of functions
like malloc and free from signal handlers.

There's a reason why we don't offer a threaded server ... The
postmaster process in particular runs in a rather unusual arrangement,
where most of the interesting stuff does happen in signal handlers. I
doubt there's any chance that we would make it run in a threaded
environment.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andres Freund
andres@anarazel.de
In reply to: Ewan Higgs (#1)
Re: Configure with thread sanitizer fails the thread test

On 2015-08-17 07:37:45 +0000, Ewan Higgs wrote:

So I changed volatile to _Atomic and continued (patch is in
thread_test_atomic.patch). I then ran it against sqlsmith. The good
news: I didn't happen to find any problems in normal use. The bad
news: I did find a lot of warnings about improper use of functions
like malloc and free from signal handlers. I attached the log under
'errors.log'.

Aren't pretty much all of those false positives? I checked the first few
and they all look like:

==================
WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=26767)
#0 free <null> (libtsan.so.0+0x000000025d09)
#1 AllocSetDelete /home/ehiggs/src/download/postgresql/src/backend/utils/mmgr/aset.c:643 (postgres+0x0000009ece3d)
#2 MemoryContextDelete /home/ehiggs/src/download/postgresql/src/backend/utils/mmgr/mcxt.c:226 (postgres+0x0000009ef7cc)
#3 MemoryContextDeleteChildren /home/ehiggs/src/download/postgresql/src/backend/utils/mmgr/mcxt.c:246 (postgres+0x0000009ef83b)
#4 MemoryContextDelete /home/ehiggs/src/download/postgresql/src/backend/utils/mmgr/mcxt.c:209 (postgres+0x0000009ef798)
#5 StartChildProcess /home/ehiggs/src/download/postgresql/src/backend/postmaster/postmaster.c:5211 (postgres+0x0000007b2e72)
#6 reaper /home/ehiggs/src/download/postgresql/src/backend/postmaster/postmaster.c:2717 (postgres+0x0000007b44ac)
#7 <null> <null> (libtsan.so.0+0x0000000236e3)
#8 ServerLoop /home/ehiggs/src/download/postgresql/src/backend/postmaster/postmaster.c:1631 (postgres+0x0000007b6f78)
#9 PostmasterMain /home/ehiggs/src/download/postgresql/src/backend/postmaster/postmaster.c:1274 (postgres+0x0000007b6f78)
#10 main /home/ehiggs/src/download/postgresql/src/backend/main/main.c:223 (postgres+0x0000006f2da9)

This is after a fork. And fork is a async-signal-safe function. So it
seems tsan doesn't properly recognize that we actually "escaped" the
signal handler and are in a new process.

#include <stdio.h>
#include <stdlib.h>
+#include <stdatomic.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
@@ -84,11 +85,11 @@ static void func_call_2(void);

static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;

-static volatile int thread1_done = 0;
-static volatile int thread2_done = 0;
+static _Atomic int thread1_done = 0;
+static _Atomic int thread2_done = 0;
-static volatile int errno1_set = 0;
-static volatile int errno2_set = 0;
+static _Atomic int errno1_set = 0;
+static _Atomic int errno2_set = 0;

#ifndef HAVE_STRERROR_R
static char *strerror_p1;

We can't do that because we have to work on older compilers as well. My
suggestion for now would be to disable tsan during configure and only
enable it during the actual build (you can do stuff like make
COPT='-fsanitize...'.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#2)
Re: Configure with thread sanitizer fails the thread test

On 2015-08-17 14:31:24 -0300, Alvaro Herrera wrote:

The postmaster process in particular runs in a rather unusual
arrangement, where most of the interesting stuff does happen in signal
handlers.

FWIW, I think it might be worthwhile to convert postmaster into a loop
over a process local latch, with that latch being set in signal
handlers. My feeling is that that'd simplify the code rather
significantly. I'm not 100% it's worth the code churn, but it'd
definitely be easier to understand. Thread sanitizer isn't the first
analysis tool that has problems coping with forks in signal handlers
btw, valgrind on amd64 for a long while had misaligned stacks in the
children afterwards leading to very odd crashes.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#4)
Re: Configure with thread sanitizer fails the thread test

On Mon, Aug 17, 2015 at 3:02 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-08-17 14:31:24 -0300, Alvaro Herrera wrote:

The postmaster process in particular runs in a rather unusual
arrangement, where most of the interesting stuff does happen in signal
handlers.

FWIW, I think it might be worthwhile to convert postmaster into a loop
over a process local latch, with that latch being set in signal
handlers. My feeling is that that'd simplify the code rather
significantly. I'm not 100% it's worth the code churn, but it'd
definitely be easier to understand. Thread sanitizer isn't the first
analysis tool that has problems coping with forks in signal handlers
btw, valgrind on amd64 for a long while had misaligned stacks in the
children afterwards leading to very odd crashes.

Yeah, I'm a little worried about whether we'd destabilize things by
changing them in that way, but if we could avoid that pitfall I
suspect we'd end up better off.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers