BUG #13561: "could not write to log file, Bad file descriptor" error when working with huge table - Mailing list pgsql-bugs

From manuel.pallier@beko.at
Subject BUG #13561: "could not write to log file, Bad file descriptor" error when working with huge table
Date
Msg-id 20150811054146.2693.12562@wrigleys.postgresql.org
Whole thread Raw
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      13561
Logged by:          Manuel Pallier
Email address:      manuel.pallier@beko.at
PostgreSQL version: 9.4.4
Operating system:   Windows Server 2008 SP2 x64
Description:

I'm having a big problem with a PostreSQL database on a specific machine.
Whenever I do some specific things with a huge table (over 9 million rows)
the whole postgres process crashes with the following entries in the log
file:

PANIC:  could not write to log file 000000010000000000000022 at offset
11673600, length 909312: Bad file descriptor
LOG:  WAL writer process (PID 17056) exited with exit code 3
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
CONTEXT:  COPY fund_records, line 2088885: "..."
LOG:  PID 31820 in cancel request did not match any process
LOG:  all server processes terminated; reinitializing
FATAL:  pre-existing shared memory block is still in use
HINT:  Check if there are any old server processes still running, and
terminate them.

The `CONTEXT` is of course differnt depending on what I did. The above
`COPY` command was logged when restoring a dump with the big table with
pg_restore. Other occurrences were `VACUUM` (when I run vacuum on either the
single big table or the whole database) or even `SELECT` (when I run a
simple 1 row result select joining the big table).

I can reproduce this crash 100% by either running vacuum on the table or
trying to restore the table data from a dump. Other things, like select
statements, do not reproduce always, but at least they crash often enough
that it is very annoying. It's always the `could not write to log file x at
offset y, length z: Bad file descriptor` error. I've already tried disabling
autovacuum during pg_resotre but it still fails.

This problem seems to be specific to one server, because the same table/dump
works fine when I import it on my local machine. The affected server is a
virtual (on Parallels Virtuozzo base) Windows Server 2008 SP2 x64 with 2 GB
RAM. There is only a single HDD partition and nearly 40 GB of free space on
it. Nothing other than PostgreSQL is running on the server, also no anti
virus software. But PostgreSQL is the only application on this server that
shows problems, all other operations I've tried work fine, so it seems like
it's not a general problem with this machine.

I've tried PostgreSQL server versions 8.4.4 (this was running for years
until it first showed the error), 8.4.22 and 9.4.4, all show the same error.
I've tried both upgrading from 8.4.4 to 8.4.22 and fresh installs of 8.4.22
and 9.4.4 (there I've also completely deleted the data folder) multiple
times.

Since I've changed some values in postgresql.conf away from their default
values, the the crash now happens much rarer. Simple SQL and DML statements
only crash very rarely (once every few days), but VACUUM still crashes 100%.
Here are the values I changed:

shared_buffers = 128MB
work_mem = 16MB
maintenance_work_mem = 64MB
wal_buffers = 4MB

Can someone please explain what's wrong here?

pgsql-bugs by date:

Previous
From: Michael Meskes
Date:
Subject: Re: BUG #13557: ESQL select issue
Next
From: alexander.a.rodionov@gmail.com
Date:
Subject: BUG #13562: BDR requires primary key to be defined on a view