Re: BUG #14948: cost overflow - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #14948: cost overflow
Date
Msg-id 12680.1512670727@sss.pgh.pa.us
In response to Re: BUG #14948: cost overflow  (Jan Schulz <jasc@gmx.net>)
List pgsql-bugs
Jan Schulz <jasc@gmx.net> writes:
> just an update on "how" that happened. Our current hypothesis:

> * We have one parallel job that was growing so big that postgresql consumed
> too much memory (we use 'work_mem = 2GB'). This job is part of a
> process which creates a 'm_dim_next' schema which in the end would be
> switched to 'm_dim'. (note that the old 'm_dim' schema is not written
> to during the whole process which creates m_dim_next, it gets dropped
> after the schema switch in the last step in that process: m_dim ->
> m_dim_old, m_dim_next -> m_dim,  drop m_dim_old)
> * The OOM killer killed postgresql (please note that we have
> configured postgres with almost no data security)
> * This in turn would result "somehow" in some funny
> data/statistics/whatever on the table in m_dim
> * This in turn would result in wrong plans which in turn would result
> in OOM when processes run which touched the table in m_dim

I'm not entirely convinced by this theory.  In the first place, an OOM
kill shouldn't result in data corruption, no matter how you have your
installation configured.  You can turn off things that might result in
corruption after a power outage or other operating-system-level crash,
but not a process-level crash.  (Or at least that's the theory; there
could always be bugs of course.  But we developers crash Postgres
pretty regularly ;-), and we don't see corruption from that.)

In the second place, even if there were something wrong in pg_statistic,
that should manifest as a bogus rowcount estimate, which we're not seeing
in this EXPLAIN output.  The calculation that produces a cost estimate
given a rowcount estimate just doesn't have all that many other inputs,
which is why I didn't have very many theories about what could be wrong.
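
To illustrate how few inputs there are, here is a minimal sketch of the
documented sequential-scan cost formula, not the planner's actual code.
The names CostParams and seq_scan_cost are hypothetical; the parameter
names and default values are the standard planner cost settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostParams:
    """Planner cost settings with their default values (hypothetical container)."""
    seq_page_cost: float = 1.0
    cpu_tuple_cost: float = 0.01
    cpu_operator_cost: float = 0.0025


def seq_scan_cost(pages: int, rows: float, quals: int = 0,
                  params: CostParams = CostParams()) -> float:
    """Simplified total cost of a sequential scan.

    pages  -- number of disk pages in the table
    rows   -- number of rows scanned
    quals  -- number of simple filter operators evaluated per row
    """
    disk_cost = pages * params.seq_page_cost
    cpu_cost = rows * (params.cpu_tuple_cost + quals * params.cpu_operator_cost)
    return disk_cost + cpu_cost


if __name__ == "__main__":
    # A table of 358 pages and 10000 rows, without and with one filter clause.
    print(seq_scan_cost(358, 10000))
    print(seq_scan_cost(358, 10000, quals=1))
```

Under this simplified model, a sane rowcount can only produce an absurd
cost if the page count or one of the cost settings is itself absurd.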

It's puzzling ...

            regards, tom lane

