Segmentation fault in libpq

Started by Michal Novotny over 8 years ago, 11 messages
#1Michal Novotny
michal.novotny@greycortex.com

Hi all,

we've developed an application using libpq to access a table in a
PostgreSQL database, but we sometimes hit a segmentation fault in the
resetPQExpBuffer() function of libpq, called from PQexecParams() with a
prepared query.

PostgreSQL version is 9.6.3 and the backtrace is:

Core was generated by `/usr/ti/bin/status-monitor2 -m /usr/lib64/status-monitor2/modules'.
Program terminated with signal 11, Segmentation fault.
#0 resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
152 str->data[0] = '\0';

Thread 1 (Thread 0x7fdf68de3840 (LWP 3525)):
#0 resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
No locals.
#1 0x00007fdf66e0333d in PQsendQueryStart (conn=conn@entry=0x9f46d0) at fe-exec.c:1371
No locals.
#2 0x00007fdf66e044b9 in PQsendQueryParams (conn=conn@entry=0x9f46d0, command=command@entry=0x409a98 "SELECT min, hour, day, month, dow, sensor, module, params, priority,
rt_due FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=nParams@entry=1, paramTypes=paramTypes@entry=0x0, paramValues=paramValues@entry=0xa2b7b0, paramLengths=paramLengths@entry=0x0, paramFormats=paramFormats@entry=0x0, resultFormat=resultFormat@entry=0) at fe-exec.c:1192
No locals.
#3 0x00007fdf66e0552b in PQexecParams (conn=0x9f46d0, command=0x409a98 "SELECT min, hour, day, month, dow, sensor, module, params, priority,
rt_due FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=1, paramTypes=0x0, paramValues=0xa2b7b0, paramLengths=0x0, paramFormats=0x0, resultFormat=0) at fe-exec.c:1871
No locals.

Unfortunately we don't have any more information from the crash, at least
for now.

Is this a known issue and can you help me with this one?

Thanks,
Michal

--
Michal Novotny
System Development Lead
michal.novotny@greycortex.com

GREYCORTEX s.r.o.
Purkynova 127, 61200 Brno
Czech Republic
www.greycortex.com

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Michal Novotny (#1)
Re: [HACKERS] Segmentation fault in libpq

Is your application written in C? We would need to completely rule
out your code (say, by double freeing result or something else nasty)
before assuming the problem was within libpq itself, particularly in this
area of the code. How reproducible is the problem?

merlin

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Michal Novotny
michal.novotny@greycortex.com
In reply to: Merlin Moncure (#2)
Re: Segmentation fault in libpq

Hi,

comments inline ...

On 06/29/2017 03:08 PM, Merlin Moncure wrote:

Is your application written in C? We would need to completely rule
out your code (say, by double freeing result or something else nasty)
before assuming the problem was within libpq itself, particularly in this
area of the code. How reproducible is the problem?

merlin

The application is written in plain C. The issue is intermittent:
sometimes it happens and sometimes it doesn't. When it happens the
application crashes, but as it runs as a systemd unit with the
Restart=on-failure flag, it is restarted automatically.

What's being done is:
1) Ensure connection already exists and create a new one if it doesn't exist yet
2) Run PQexecParams() with specified $params that has $params_cnt elements:

res = PQexecParams(conn, prepared_query, params_cnt, NULL, (const char **)params, NULL, NULL, 0);

3) Check for result and report error and exit if "PQresultStatus(res) != PGRES_TUPLES_OK"
4) Do some processing with the result
5) Clear result using PQclear()

It usually works fine but sometimes it's crashing and I don't know how
to investigate further.

Could you please help me based on information provided above?

Thanks,
Michal

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Merlin Moncure
mmoncure@gmail.com
In reply to: Michal Novotny (#3)
Re: [HACKERS] Segmentation fault in libpq

On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny
<michal.novotny@greycortex.com> wrote:

Could you please help me based on information provided above?

You might want to run your code through some analysis tools (for
example, valgrind). Short of that, to get help here you need to post
the code for review. How big is your application?

merlin

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#4)
Re: [BUGS] Segmentation fault in libpq

Merlin Moncure <mmoncure@gmail.com> writes:

On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny
<michal.novotny@greycortex.com> wrote:

Could you please help me based on information provided above?

You might want to run your code through some analysis tools (for
example, valgrind).

Yeah, that's what I was about to suggest. pqexpbuffer.c is pretty
small and paranoid code; it's really hard to see how it could have
crashed there unless something else corrupted its data structure.
While it's always possible that the "something else" was a wild
store from elsewhere in libpq, the lack of similar reports from
others and the fact that you don't sound to be doing anything very
exotic in terms of libpq requests both weigh against that theory.
If I had to bet given this much evidence, I'd bet on a wild store
from somewhere in your application having corrupted the
conn->errorMessage before PQexecParams was entered. C is not a
language that does much to prevent that kind of bug for you.
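
To illustrate why a crash on that one line points at corruption: the sketch below is a simplified stand-in for an expandable string buffer, not the actual pqexpbuffer.c source. The names DemoBuffer, demo_init, demo_append, demo_reset and demo_free are hypothetical. It shows that a reset only stores through the data pointer, so a segfault on that store suggests the pointer (or the struct holding it) was already invalid when the function was entered.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an expandable string buffer. */
typedef struct DemoBuffer {
    char  *data;    /* NUL-terminated contents */
    size_t len;     /* current string length */
    size_t maxlen;  /* allocated size of data */
} DemoBuffer;

/* Allocate an empty buffer; returns 0 on success, -1 on out-of-memory. */
int demo_init(DemoBuffer *buf)
{
    assert(buf != NULL);
    buf->len = 0;
    buf->maxlen = 64;
    buf->data = malloc(buf->maxlen);
    if (buf->data == NULL) {
        buf->maxlen = 0;
        return -1;
    }
    buf->data[0] = '\0';
    return 0;
}

/* Append text, doubling the allocation until it fits. */
int demo_append(DemoBuffer *buf, const char *text)
{
    assert(buf != NULL && text != NULL);
    if (buf->data == NULL)
        return -1;
    size_t add = strlen(text);
    size_t need = buf->len + add + 1;
    if (need > buf->maxlen) {
        size_t newmax = buf->maxlen;
        while (newmax < need)
            newmax *= 2;
        char *p = realloc(buf->data, newmax);
        if (p == NULL)
            return -1;
        buf->data = p;
        buf->maxlen = newmax;
    }
    memcpy(buf->data + buf->len, text, add + 1);
    buf->len += add;
    return 0;
}

/* Reset to empty. The only memory written is buf->len and buf->data[0]. */
void demo_reset(DemoBuffer *buf)
{
    if (buf == NULL || buf->data == NULL)
        return;
    buf->len = 0;
    buf->data[0] = '\0';
}

/* Release the storage and leave the struct in a state demo_reset tolerates. */
void demo_free(DemoBuffer *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->len = 0;
    buf->maxlen = 0;
}
```

The NULL checks catch only the cases the buffer code can see. If the whole structure that embeds the buffer has been freed or overwritten by other code, data holds a garbage address and no check inside the reset function can detect that.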

valgrind is not a perfect tool for finding that kind of problem,
especially if you can't reproduce the crash reliably; but at least
valgrind is readily available and easy to use, so you might as
well start there and see if it finds anything. If you have access
to any sort of static analysis tool (eg, Coverity), that might be
more likely to help. Or you could fall back on manual code
auditing, if the program isn't very big.

regards, tom lane

#6Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#5)
Re: [BUGS] Segmentation fault in libpq

On Thu, Jun 29, 2017 at 9:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Merlin Moncure <mmoncure@gmail.com> writes:

On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny
<michal.novotny@greycortex.com> wrote:

Could you please help me based on information provided above?

You might want to run your code through some analysis tools (for
example, valgrind).

valgrind is not a perfect tool for finding that kind of problem,
especially if you can't reproduce the crash reliably; but at least
valgrind is readily available and easy to use, so you might as
well start there and see if it finds anything. If you have access
to any sort of static analysis tool (eg, Coverity), that might be
more likely to help. Or you could fall back on manual code
auditing, if the program isn't very big.

clang static analyzer is another good tool to check out

https://clang-analyzer.llvm.org/

merlin

#7Michal Novotný
michal.novotny@greycortex.com
In reply to: Merlin Moncure (#6)
Re: [HACKERS] Segmentation fault in libpq

Hi all,
thank you all for your advice. I've been investigating this a little more
and it turned out it's not a bug in libpq, although I got confused
by digging deep into several libpq functions. The bug was on our side:
we were using the connection pointer after calling PQfinish(). The code
is pretty complex so it took some time to investigate; however, I would like
to apologize for "blaming" libpq instead of our code.

Anyway, thank you all for valuable advice.
Have a great time,
Michal
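
One common guard against using a connection pointer after PQfinish() is to route every close through a single helper that also clears the caller's pointer. A minimal sketch of that pattern follows. So that it compiles without libpq, it uses a hypothetical FakeConn type with fake_connect() and fake_finish() standing in for PGconn, a connect call and PQfinish(); the helper names db_close and db_is_usable are likewise made up for illustration and are not part of libpq.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PGconn; real code would use libpq's type. */
typedef struct FakeConn {
    int socket_fd;
} FakeConn;

/* Stand-in for opening a connection; returns NULL on out-of-memory. */
FakeConn *fake_connect(void)
{
    FakeConn *c = malloc(sizeof *c);
    if (c != NULL)
        c->socket_fd = 3;
    return c;
}

/* Stand-in for PQfinish(): releases the handle, which must not be used afterwards. */
void fake_finish(FakeConn *c)
{
    free(c);
}

/* Close through a pointer-to-pointer so the caller's copy is cleared as well. */
void db_close(FakeConn **connp)
{
    assert(connp != NULL);
    if (*connp != NULL) {
        fake_finish(*connp);
        *connp = NULL;  /* later misuse sees NULL instead of a dangling pointer */
    }
}

/* Check before every query; the caller reconnects when this returns 0. */
int db_is_usable(const FakeConn *conn)
{
    return conn != NULL;
}
```

This protects only the one pointer passed to db_close(). Any other copy of the handle kept elsewhere (in another module or thread, say) still dangles, so a memory checker is still worth running.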

#8Andres Freund
andres@anarazel.de
In reply to: Michal Novotný (#7)
Re: [HACKERS] Segmentation fault in libpq

Hi,

On 2017-07-02 20:58:52 +0200, Michal Novotný wrote:

thank you all for your advice. I've been investigating this a little more
and it turned out it's not a bug in libpq, although I got confused
by digging deep into several libpq functions. The bug was on our side:
we were using the connection pointer after calling PQfinish(). The code
is pretty complex so it took some time to investigate; however, I would like
to apologize for "blaming" libpq instead of our code.

Usually a tool like valgrind is quite helpful for finding issues like
that, because it'll show you the call-stack accessing the memory and
*also* the call-stack that led to the memory being freed.

- Andres

#9Craig Ringer
craig@2ndquadrant.com
In reply to: Andres Freund (#8)
Re: [HACKERS] Segmentation fault in libpq

On 3 July 2017 at 03:12, Andres Freund <andres@anarazel.de> wrote:

Usually a tool like valgrind is quite helpful for finding issues like
that, because it'll show you the call-stack accessing the memory and
*also* the call-stack that led to the memory being freed.

Yep, huge help.

BTW, on Windows, the free tool DrMemory (now 64-bit too, yay) or
commercial Purify work great.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10Michal Novotny
michal.novotny@greycortex.com
In reply to: Andres Freund (#8)
Re: [BUGS] Segmentation fault in libpq

On 07/02/2017 09:12 PM, Andres Freund wrote:

Usually a tool like valgrind is quite helpful for finding issues like
that, because it'll show you the call-stack accessing the memory and
*also* the call-stack that led to the memory being freed.

- Andres

Well, I tried, but valgrind did not help me locate the issue, so I had to
investigate our code a little further before I was finally able to
find it.

Thanks again,
Michal

#11Michal Novotny
michal.novotny@greycortex.com
In reply to: Craig Ringer (#9)
Re: [BUGS] Segmentation fault in libpq

On 07/03/2017 04:58 AM, Craig Ringer wrote:

Yep, huge help.

BTW, on Windows, the free tool DrMemory (now 64-bit too, yay) or
commercial Purify work great.

Well, good to know about the Windows tools; however, we use Linux, so
that's not a big deal. Unfortunately it's easy to miss something in
valgrind if you have a multi-threaded library linked to libpq and that
library is used in conjunction with other libraries that share some of
their data with it.

Thanks once again,
Michal
