Thread: SEGFAULT on SELECT * FROM view

SEGFAULT on SELECT * FROM view

From
"chris r."
Date:
Hi all,

When SELECTing from one particular view, I get reproducible SEGFAULTs in
my pg 9.1.2 production database. To test for hardware errors, I ran
memtest, which succeeded. I then asked for help in #postgres and got
advice to create stacktraces, but I couldn't find the problem yet.

Scenario: VIEW 'vwa' on TABLE 'tba' and some subselects and aggregates.
I've no idea what's wrong with this view, but a
  SELECT * FROM vwa WHERE myid = 1000010 LIMIT 100 OFFSET 0;

and even EXPLAINing this statement SEGFAULTs the server.

gdb trace: http://pgsql.privatepaste.com/eacd1b6c5d
gdb "ec" output: http://pgsql.privatepaste.com/a61db5b564
smaps: http://pgsql.privatepaste.com/3c4f494015

Then I used the definition of vwa, and created vwb, and the very same
SELECT (just with vwb) works perfectly fine. I've a pastebin link
comparing the two view definitions, which I'm willing to share privately
if that helps. The views actually differ, although the look identical
with \d+ in the psql console, in that the newer view names more columns
that were added to the referenced tables lately.

So, you tell me, what's wrong with the old view?

Thanks a lot,
Chris


Re: SEGFAULT on SELECT * FROM view

From
Tom Lane
Date:
"chris r." <chricki@gmx.net> writes:
> So, you tell me, what's wrong with the old view?

Nothing; that looks like a planner bug to me.  Please submit a
self-contained test case.
        regards, tom lane


Re: SEGFAULT on SELECT * FROM view

From
"chris r."
Date:
>> So, you tell me, what's wrong with the old view?
> 
> Nothing; that looks like a planner bug to me.  Please submit a
> self-contained test case.
I ported the entire schema to my test DB server and could not reproduce
the error there. Note that probably recreating the view solves this
issue. Given this, how should I proceed to create a test case? Any
tutorial on this? (I'm not too familiar with all this yet.)

Chris


Re: SEGFAULT on SELECT * FROM view

From
Tom Lane
Date:
"chris r." <chricki@gmx.net> writes:
>> Nothing; that looks like a planner bug to me.  Please submit a
>> self-contained test case.

> I ported the entire schema to my test DB server and could not reproduce
> the error there. Note that probably recreating the view solves this
> issue. Given this, how should I proceed to create a test case? Any
> tutorial on this? (I'm not too familiar with all this yet.)

It's possibly statistics-dependent; make sure you have the same stats
targets on both DBs, and try re-analyzing a few times.  Check other
planner parameters are the same, too.
        regards, tom lane


Re: SEGFAULT on SELECT * FROM view

From
"chris r."
Date:
>> I ported the entire schema to my test DB server and could not reproduce
>> the error there. Note that probably recreating the view solves this
>> issue. Given this, how should I proceed to create a test case? Any
>> tutorial on this? (I'm not too familiar with all this yet.)
> 
> It's possibly statistics-dependent; make sure you have the same stats
> targets on both DBs, and try re-analyzing a few times.  Check other
> planner parameters are the same, too.
In addition to the schema, I now also copied parts of the data in our
production system to the testing DB. (I cannot copy all data, as it's
too large for the testing DB.) After a couple of VACUUM FULL ANALYZEs,
the bug still doesn't reproduce.

In our production environment, I'm working with the re-created view that
doesn't show the SEGFAULT behavior. I can live with this workaround, as
only one particular view seems to be affected, but if you want me to
debug more on the error please let me know.

Thanks for your help until now!
Chris


Re: SEGFAULT on SELECT * FROM view

From
Robert Haas
Date:
On Mon, Jan 2, 2012 at 1:42 AM, chris r. <chricki@gmx.net> wrote:
>>> I ported the entire schema to my test DB server and could not reproduce
>>> the error there. Note that probably recreating the view solves this
>>> issue. Given this, how should I proceed to create a test case? Any
>>> tutorial on this? (I'm not too familiar with all this yet.)
>>
>> It's possibly statistics-dependent; make sure you have the same stats
>> targets on both DBs, and try re-analyzing a few times.  Check other
>> planner parameters are the same, too.
> In addition to the schema, I now also copied parts of the data in our
> production system to the testing DB. (I cannot copy all data, as it's
> too large for the testing DB.) After a couple of VACUUM FULL ANALYZEs,
> the bug still doesn't reproduce.
>
> In our production environment, I'm working with the re-created view that
> doesn't show the SEGFAULT behavior. I can live with this workaround, as
> only one particular view seems to be affected, but if you want me to
> debug more on the error please let me know.

It would be awfully nice to figure out what was going on.  Another
thing that would help, if you can still reproduce the error, would be
to get a stack trace.  The easiest way would probably be to enable
core dumps (pg_ctl -c start, or ulimit -c unlimited followed by pg_ctl
start).  Then, when you get a core, which might end up inside $PGDATA
or perhaps elsewhere depending on your system configuration, do:

gdb /path/to/postgres /path/to/corefile

and then from within gdb:

bt

...to get a backtrace.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company