segfault in 9.5alpha - plpgsql function, implicit cast and IMMUTABLE cast function
Hi all
While doing some testing of 9.5a one of my colleagues (not on list) found a
reproducible server segfault.
We've broken it down to a minimal script to reproduce below.
Reproduced on both machines on which we've installed 9.5 so far (both built
from source since we don't have any RHEL7 machines in development):
RHEL5.3 (Linux 2.6.18-128.el5 i386), gcc version 4.6.4
CentOS 6.5 (Linux 2.6.32-431.el6.i686), gcc version 4.4.7-4
Script for psql:
============ cut ===============
CREATE OR REPLACE FUNCTION to_date(integer) RETURNS date LANGUAGE sql
IMMUTABLE AS $$
  SELECT $1::text::date
$$;

DROP CAST IF EXISTS (integer AS date);
CREATE CAST (integer AS date) WITH FUNCTION to_date(integer) AS IMPLICIT;

CREATE OR REPLACE FUNCTION newcrash(integer) RETURNS date LANGUAGE plpgsql
AS $$
BEGIN
  RETURN $1;
END
$$;

SELECT newcrash(20150202);
SELECT newcrash(20150203);
============ cut ===============
It doesn't crash the first time, but does consistently crash the second.
Given that if I remove IMMUTABLE from the function definition it doesn't
fail, it implies that there's a problem with the mechanism used to cache
function results - although the fact that the second function call doesn't
have to be the same value does suggest it's a problem with the code that
*searches* that result cache, rather than the section that retrieves it.
I tried cutting out the implicit CAST altogether and doing
RETURN to_date($1);
but this doesn't fail, which implies also that it's something related to
the implicit cast.
If I DECLARE a local DATE variable and SELECT INTO that (rather than just
using RETURN $1), it crashes at that point too.
Hope someone can get something useful from the above. Any questions, please
ask.
Geoff
On Fri, Jul 17, 2015 at 7:52 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
> While doing some testing of 9.5a one of my colleagues (not on list) found a
> reproducible server segfault.
> [...]
> Hope someone can get something useful from the above. Any questions, please
> ask.
A test case is more than enough to look at this issue and work out what
is happening, thanks! The issue can be reproduced on REL9_5_STABLE and
master. Judging by the stack trace, the problem is caused by an attempt
to delete a memory context that has already been freed:
* thread #1: tid = 0x0000, 0x0000000109f30dee postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 30 at mcxt.c:206, stop reason = signal SIGSTOP
    frame #0: 0x0000000109f30dee postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 30 at mcxt.c:206
   203  void
   204  MemoryContextDelete(MemoryContext context)
   205  {
-> 206      AssertArg(MemoryContextIsValid(context));
   207      /* We had better not be deleting TopMemoryContext ... */
   208      Assert(context != TopMemoryContext);
   209      /* And not CurrentMemoryContext, either */
(lldb) bt
* thread #1: tid = 0x0000, 0x0000000109f30dee postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 30 at mcxt.c:206, stop reason = signal SIGSTOP
  * frame #0: 0x0000000109f30dee postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 30 at mcxt.c:206
    frame #1: 0x0000000109b7e261 postgres`fmgr_sql(fcinfo=0x00007f84c28d5870) + 433 at functions.c:1044
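
For readers wondering why context=0x7f7f7f7f7f7f7f7f points at already-freed
memory: assert-enabled PostgreSQL builds overwrite freed chunks with 0x7F
bytes (CLOBBER_FREED_MEMORY), so a pointer read back from a freed struct
comes out as all-0x7F, which is presumably what fmgr_sql passed to
MemoryContextDelete here. Below is a minimal standalone sketch of that
effect. It is not PostgreSQL code: DemoCacheEntry and the demo_* functions
are hypothetical names, and the "free" is simulated with memset on live
storage so that the example stays well-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a long-lived cache entry that remembers a
 * pointer to a memory context it does not keep alive itself. */
typedef struct DemoCacheEntry
{
    void *context;
} DemoCacheEntry;

/* Mimic a clobber-on-free debug allocator: fill released bytes with 0x7F. */
static void
demo_clobber(void *ptr, size_t size)
{
    memset(ptr, 0x7F, size);
}

/* Return 1 if every byte of the stored pointer is 0x7F, else 0. */
static int
demo_pointer_is_clobbered(const DemoCacheEntry *entry)
{
    unsigned char bytes[sizeof(void *)];
    size_t i;

    /* Inspect the raw bytes; never dereference the suspect pointer. */
    memcpy(bytes, &entry->context, sizeof bytes);
    for (i = 0; i < sizeof bytes; i++)
    {
        if (bytes[i] != 0x7F)
            return 0;
    }
    return 1;
}

/* Build an entry, simulate freeing it, and report what a stale reader sees. */
static int
demo_stale_read(void)
{
    int live_object = 0;
    DemoCacheEntry entry;

    entry.context = &live_object;
    /* A real address should not consist solely of 0x7F bytes. */
    assert(!demo_pointer_is_clobbered(&entry));

    /* Simulated free: the storage stays valid, only its contents change. */
    demo_clobber(&entry, sizeof entry);
    return demo_pointer_is_clobbered(&entry);
}
```

Calling demo_stale_read() returns 1: after the simulated free, the stored
pointer is nothing but 0x7F bytes, the same pattern as the context argument
in frame #0.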
I am adding it to the list of Open Items for 9.5. I'll look into that
in the next couple of days (Tuesday at worst).
Regards,
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 17 July 2015 at 13:49, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Jul 17, 2015 at 7:52 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
>> While doing some testing of 9.5a one of my colleagues (not on list) found a
>> reproducible server segfault.
>> [...]
>> Hope someone can get something useful from the above. Any questions, please
>> ask.
> I am adding it to the list of Open Items for 9.5. I'll look into that
> in the next couple of days (Tuesday at worst).
Superb, thanks :)
Geoff
Geoff Winkless <pgsqladmin@geoff.dj> writes:
> While doing some testing of 9.5a one of my colleagues (not on list) found a
> reproducible server segfault.
Hm, looks like commit 1345cc67bbb014209714af32b5681b1e11eaf964 is to
blame: memory management for the plpgsql cast cache needs to be more
complicated than I realized :-(.
regards, tom lane
On Fri, Jul 17, 2015 at 11:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Geoff Winkless <pgsqladmin@geoff.dj> writes:
>> While doing some testing of 9.5a one of my colleagues (not on list) found a
>> reproducible server segfault.
> Hm, looks like commit 1345cc67bbb014209714af32b5681b1e11eaf964 is to
> blame: memory management for the plpgsql cast cache needs to be more
> complicated than I realized :-(.
And this issue is already fixed by 0fc94a5b.
--
Michael