SEGFAULT on SELECT * FROM view

Started by chris r.over 14 years ago6 messageshackers
Jump to latest
#1chris r.
chricki@gmx.net

Hi all,

When SELECTing from one particular view, I get reproducible SEGFAULTs in
my pg 9.1.2 production database. To test for hardware errors, I ran
memtest, which succeeded. I then asked for help in #postgres and got
advice to create stacktraces, but I couldn't find the problem yet.

Scenario: VIEW 'vwa' on TABLE 'tba' and some subselects and aggregates.
I've no idea what's wrong with this view, but a

SELECT * FROM vwa WHERE myid = 1000010 LIMIT 100 OFFSET 0;

and even EXPLAINing this statement SEGFAULTs the server.

gdb trace: http://pgsql.privatepaste.com/eacd1b6c5d
gdb "ec" output: http://pgsql.privatepaste.com/a61db5b564
smaps: http://pgsql.privatepaste.com/3c4f494015

Then I used the definition of vwa, and created vwb, and the very same
SELECT (just with vwb) works perfectly fine. I've a pastebin link
comparing the two view definitions, which I'm willing to share privately
if that helps. The views actually differ, although the look identical
with \d+ in the psql console, in that the newer view names more columns
that were added to the referenced tables lately.

So, you tell me, what's wrong with the old view?

Thanks a lot,
Chris

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: chris r. (#1)
Re: SEGFAULT on SELECT * FROM view

"chris r." <chricki@gmx.net> writes:

So, you tell me, what's wrong with the old view?

Nothing; that looks like a planner bug to me. Please submit a
self-contained test case.

regards, tom lane

#3chris r.
chricki@gmx.net
In reply to: Tom Lane (#2)
Re: SEGFAULT on SELECT * FROM view

So, you tell me, what's wrong with the old view?

Nothing; that looks like a planner bug to me. Please submit a
self-contained test case.

I ported the entire schema to my test DB server and could not reproduce
the error there. Note that probably recreating the view solves this
issue. Given this, how should I proceed to create a test case? Any
tutorial on this? (I'm not too familiar with all this yet.)

Chris

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: chris r. (#3)
Re: SEGFAULT on SELECT * FROM view

"chris r." <chricki@gmx.net> writes:

Nothing; that looks like a planner bug to me. Please submit a
self-contained test case.

I ported the entire schema to my test DB server and could not reproduce
the error there. Note that probably recreating the view solves this
issue. Given this, how should I proceed to create a test case? Any
tutorial on this? (I'm not too familiar with all this yet.)

It's possibly statistics-dependent; make sure you have the same stats
targets on both DBs, and try re-analyzing a few times. Check other
planner parameters are the same, too.

regards, tom lane

#5chris r.
chricki@gmx.net
In reply to: Tom Lane (#4)
Re: SEGFAULT on SELECT * FROM view

I ported the entire schema to my test DB server and could not reproduce
the error there. Note that probably recreating the view solves this
issue. Given this, how should I proceed to create a test case? Any
tutorial on this? (I'm not too familiar with all this yet.)

It's possibly statistics-dependent; make sure you have the same stats
targets on both DBs, and try re-analyzing a few times. Check other
planner parameters are the same, too.

In addition to the schema, I now also copied parts of the data in our
production system to the testing DB. (I cannot copy all data, as it's
too large for the testing DB.) After a couple of VACUUM FULL ANALYZEs,
the bug still doesn't reproduce.

In our production environment, I'm working with the re-created view that
doesn't show the SEGFAULT behavior. I can live with this workaround, as
only one particular view seems to be affected, but if you want me to
debug more on the error please let me know.

Thanks for your help until now!
Chris

#6Robert Haas
robertmhaas@gmail.com
In reply to: chris r. (#5)
Re: SEGFAULT on SELECT * FROM view

On Mon, Jan 2, 2012 at 1:42 AM, chris r. <chricki@gmx.net> wrote:

I ported the entire schema to my test DB server and could not reproduce
the error there. Note that probably recreating the view solves this
issue. Given this, how should I proceed to create a test case? Any
tutorial on this? (I'm not too familiar with all this yet.)

It's possibly statistics-dependent; make sure you have the same stats
targets on both DBs, and try re-analyzing a few times.  Check other
planner parameters are the same, too.

In addition to the schema, I now also copied parts of the data in our
production system to the testing DB. (I cannot copy all data, as it's
too large for the testing DB.) After a couple of VACUUM FULL ANALYZEs,
the bug still doesn't reproduce.

In our production environment, I'm working with the re-created view that
doesn't show the SEGFAULT behavior. I can live with this workaround, as
only one particular view seems to be affected, but if you want me to
debug more on the error please let me know.

It would be awfully nice to figure out what was going on. Another
thing that would help, if you can still reproduce the error, would be
to get a stack trace. The easiest way would probably be to enable
core dumps (pg_ctl -c start, or ulimit -c unlimited followed by pg_ctl
start). Then, when you get a core, which might end up inside $PGDATA
or perhaps elsewhere depending on your system configuration, do:

gdb /path/to/postgres /path/to/corefile

and then from within gdb:

bt

...to get a backtrace.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company