On 12/12/2017 02:54 PM, Jan Schulz wrote:
> Hello,
>
> in the meantime we managed to not trigger the OOM kill anymore and now
> also don't have this problem with the negative costs anymore. We still
> have one staging environment where this error is present, but we would
> like to reclaim it. Is there some guidelines I can follow to debug
> this?
>
I think you're probably looking for this:
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
Once you have GDB attached to the backend process, you can set
breakpoints at the interesting places.
I believe the most interesting function to inspect is btcostestimate,
which is what produces estimates for the bitmap index scans:
https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/selfuncs.c#L6880
In particular, we're interested in the parameter values passed to the
function when it produces a negative indexTotalCost (..-12884901880.97).
The value is updated in multiple places in the function, so you'll need
to step through and see at which point it becomes negative.
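For what it's worth, a session along these lines should get you there.
It's only a sketch: <PID> is a placeholder for the backend PID (e.g. from
SELECT pg_backend_pid() in the session that will run the query), and I'm
assuming a build with debug symbols, that the IndexPath argument is named
"path", and that the function accumulates the cost in a local variable
named "costs". Those names may differ in your version, so check "info args"
and "info locals" first.

```
$ gdb -p <PID>
(gdb) break btcostestimate
(gdb) continue
# now run EXPLAIN on the problematic query in that session
(gdb) info args
(gdb) info locals
(gdb) print *path->indexinfo
(gdb) watch costs.indexTotalCost
(gdb) continue
(gdb) bt
```

Each "continue" after the watchpoint is set should stop where the value
changes, printing the old and new value, so you can repeat it until the
cost turns negative and then grab the backtrace. If gdb reports values as
"optimized out", stepping with "next" and printing the variables by hand
is the fallback.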
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services