Thread: Deleted WAL files held open by backends in Linux

Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
I ran across the subject issue and spent some time puzzling over it.
It seemed strange that the only backends which were holding open
deleted WAL files were ones where the connection was established
with a login which has no write permissions.  Eventually, I vaguely
recalled there was such an issue discussed in recent months and
found it in the archives.  It was this:
http://archives.postgresql.org/pgsql-bugs/2009-09/msg00144.php
As a DBA I feel that it's irritating and a potential time waster.
I guess it is a stretch to imagine that a database would have
enough read-only connections to exhaust resources by holding open
one deleted WAL file each; unless they have, say, 200 such
connections and they're cutting things so close that a wasted 3.2GB
of disk space at the WAL file location will run them out.
I'm not sure whether Tom's comment that "There is zero hope of
making that work." referred to the idea that we could close deleted
WAL files or to something else.  Is a fix feasible?
This was observed on:
PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu,
compiled by GCC gcc (GCC) 4.1.2 20070115 (SUSE Linux)
SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
PATCHLEVEL = 2
Linux version 2.6.16.60-0.39.3-smp (geeko@buildhost)
(gcc version 4.1.2 20070115 (SUSE Linux))
#1 SMP Mon May 11 11:46:34 UTC 2009
-Kevin


Re: Deleted WAL files held open by backends in Linux

From
Itagaki Takahiro
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

> I guess it is a stretch to imagine that a database would have
> enough read-only connections to exhaust resources by holding open
> one deleted WAL file each; unless they have, say, 200 such
> connections and they're cutting things so close that a wasted 3.2GB
> of disk space at the WAL file location will run them out.

AFAIK, we rename and reuse old WAL segment files.
So we don't waste disk space unless checkpoint_segments is too small.

Also, if you are worried about disk space,
how about adding an ftruncate() to 0 bytes before the unlink()?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: Deleted WAL files held open by backends in Linux

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I ran across the subject issue and spent some time puzzling over it.
> ...
> I'm not sure whether Tom's comment that "There is zero hope of
> making that work." referred to the idea that we could close deleted
> WAL files or to something else.  Is a fix feasible?

The reason I was dismissive of that is this scenario: a backend is told
to execute a data-modifying transaction (so it has to write and commit
some WAL records) and then the client promptly goes to sleep and stays
that way for a while.  The backend has nothing to do either, so it's just
sitting waiting on the client socket.  There is no place reasonable here
to make it decide to close its open WAL file.  We could possibly do a
pre-emptive close() after transaction commit, but that would result in a
useless distributed overhead of opening and closing the current WAL file
when a backend is engaged in a series of transactions.

On Linux this simply isn't an issue, anyway --- if you can't afford one
open file descriptor per backend for WAL, you're going to have a far
bigger problem with the backend's habit of caching descriptors for data
files.  On Windows it might be a problem, but I think we have got
workarounds for all the important cases.  The important point is that
we want any overhead for this to be pushed into non-performance-critical
code paths.

> It seemed strange that the only backends which were holding open
> deleted WAL files were ones where the connection was established
> with a login which has no write permissions.

A backend would never open a WAL file unless it had to write a WAL
record, so I'm having a hard time believing that these were totally
read-only transactions.  Can you give specifics?
        regards, tom lane


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> It seemed strange that the only backends which were holding open
>> deleted WAL files were ones where the connection was established
>> with a login which has no write permissions.
>
> A backend would never open a WAL file unless it had to write a WAL
> record, so I'm having a hard time believing that these were
> totally read-only transactions.  Can you give specifics?

Here's some information captured before the original post:

MILWAUKEE-PG:~ # lsof | grep deleted
COMMAND     PID    USER   FD      TYPE             DEVICE       SIZE     NODE NAME
postgres  13105    ccsa   10u      REG                8,4   16777216 
268442394 /var/pgsql/data/cc/pg_xlog/00000001000000E0000000BB (deleted)
postgres  30500    ccsa    4u      REG                8,4   16777216 
268442626 /var/pgsql/data/cc/pg_xlog/00000001000000E20000003F (deleted)
postgres  30501    ccsa    5u      REG                8,4   16777216 
283509014 /var/pgsql/data/cc/pg_xlog/00000001000000E200000016 (deleted)
MILWAUKEE-PG:~ # ps auxf|grep ^ccsa
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME
COMMAND
ccsa     30490  0.0  0.0 290988  8608 ?        S    Oct24   0:13
/usr/local/pgsql-8.3.7/bin/postgres -D /var/pgsql/data/cc
ccsa     30491  0.0  0.0  14524  1056 ?        Ss   Oct24   0:10  \_
postgres: logger process
ccsa     30493  0.0  1.0 291876 268760 ?       Ss   Oct24   1:36  \_
postgres: writer process
ccsa     30494  0.0  0.0 291120  1452 ?        Ss   Oct24   0:01  \_
postgres: wal writer process
ccsa     30495  0.0  0.0 291552  1564 ?        Ss   Oct24   0:00  \_
postgres: autovacuum launcher process
ccsa     30496  0.0  0.0  14612   952 ?        Ss   Oct24   0:01  \_
postgres: archiver process   last was 00000001000000E20000005B
ccsa     30497  0.0  0.0  15280  1576 ?        Ss   Oct24   6:37  \_
postgres: stats collector process
ccsa     30500  0.0  0.9 292448 231844 ?       Ss   Oct24  10:42  \_
postgres: viewer cc 127.0.0.1(36846) idle
ccsa     30501  0.0  0.8 292496 209428 ?       Ss   Oct24  10:28  \_
postgres: viewer cc 165.219.78.11(53940) idle
ccsa      3107  0.0  0.0 292508  5836 ?        Ss   Nov09   0:32  \_
postgres: cc cc 127.0.0.1(40442) idle
ccsa      3113  0.0  1.1 305304 286280 ?       Ss   Nov09   6:51  \_
postgres: cc cc 127.0.0.1(49969) idle
ccsa      3119  0.0  0.0 292508  5836 ?        Ss   Nov09   0:29  \_
postgres: cc cc 127.0.0.1(54127) idle
ccsa     13105  0.0  0.0 293396 23852 ?        Ss   Nov23   0:08  \_
postgres: viewer cc 165.219.90.178(51481) idle
ccsa      1485  0.8  1.1 312400 291508 ?       Ss   Nov24  14:18  \_
postgres: cc cc 127.0.0.1(42692) idle
ccsa     10752  0.3  1.1 312712 290856 ?       Ss   Nov24   4:54  \_
postgres: cc cc 127.0.0.1(45119) idle
ccsa     10908  1.3  1.1 313120 292836 ?       Ss   Nov24  20:03  \_
postgres: cc cc 127.0.0.1(42065) idle
ccsa     25099  0.7  1.1 312864 285048 ?       Ss   14:02   1:03  \_
postgres: cc cc 127.0.0.1(50361) idle
ccsa     26571  0.0  0.3 298912 89528 ?        Ss   14:29   0:06  \_
postgres: cc cc 127.0.0.1(52213) idle
ccsa     26809  0.1  0.3 298940 88816 ?        Ss   14:33   0:06  \_
postgres: cc cc 127.0.0.1(59145) idle
ccsa     26812  0.6  1.1 310060 274192 ?       Ss   14:33   0:44  \_
postgres: cc cc 127.0.0.1(59962) idle
MILWAUKEE-PG:~ # ll /var/pgsql/data/cc/pg_xlog/
total 1671172
-rw------- 1 ccsa users      253 2009-11-20 21:29
00000001000000E00000000A.00001810.backup
-rw------- 1 ccsa users 16777216 2009-11-25 16:10
00000001000000E20000005B
-rw------- 1 ccsa users 16777216 2009-11-25 16:22
00000001000000E20000005C
-rw------- 1 ccsa users 16777216 2009-11-25 04:15
00000001000000E20000005D
-rw------- 1 ccsa users 16777216 2009-11-25 05:15
00000001000000E20000005E
-rw------- 1 ccsa users 16777216 2009-11-25 06:15
00000001000000E20000005F
-rw------- 1 ccsa users 16777216 2009-11-25 07:15
00000001000000E200000060
-rw------- 1 ccsa users 16777216 2009-11-25 07:42
00000001000000E200000061
[...]
-rw------- 1 ccsa users 16777216 2009-11-25 15:34
00000001000000E2000000BB
-rw------- 1 ccsa users 16777216 2009-11-25 15:37
00000001000000E2000000BC
-rw------- 1 ccsa users 16777216 2009-11-25 15:44
00000001000000E2000000BD
-rw------- 1 ccsa users 16777216 2009-11-25 15:50
00000001000000E2000000BE
-rw------- 1 ccsa users 16777216 2009-11-25 15:57
00000001000000E2000000BF
-rw------- 1 ccsa users 16777216 2009-11-25 16:04
00000001000000E2000000C0
drwx------ 2 ccsa users       94 2009-11-25 16:12 archive_status

You will note that the connections logged in as "viewer" (and only
those) are holding open a deleted WAL file.  This user has not been
granted anything except SELECT permission on any table.  In
addition, immediately after creating the database, we ran:
  REVOKE CREATE ON DATABASE cc FROM public;
  REVOKE CREATE ON SCHEMA public FROM public;

Two of these connections are from software publishers, which only
issue SELECT statements against three tables (to which they only
have SELECT permission).  The other is from monitoring software,
which only runs SELECT against C functions from our fsutil software
(which can be seen on pgfoundry).  All are JDBC connections.  I
suspect that autoCommit is on; if you think it matters, I can go
review the code to confirm.

On one of our development machines, with a number of PostgreSQL
databases, I found about 60 such WAL files deleted and held open
by backends logged in under the read-only user.  It wasted my time
primarily because I was looking for situations where we might have
deleted a directory tree without first stopping the process using
it; while 60 file handles and 1GB of disk space are no big deal for
us, it caused me to investigate the issue to confirm there weren't
zombie PostgreSQL processes out there.

At a minimum, we should add the extra 16MB per connection that might
be taken on the WAL file system to the calculations people should do
when sizing that, just in case someone is trying to cut that thin
while planning on using a lot of connections.

-Kevin


Re: Deleted WAL files held open by backends in Linux

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A backend would never open a WAL file unless it had to write a WAL
>> record, so I'm having a hard time believing that these were
>> totally read-only transactions.  Can you give specifics?

> You will note that the connections logged in as "viewer" (and only
> those) are holding open a deleted WAL file.  This user has not been
> granted anything except SELECT permission on any table.

You sure it's not creating any temp tables?  You didn't mention
revoking TEMP privilege.

I can think of one code path that could result in a genuinely read-only
session having to write WAL: if it's forced to flush dirty buffers in
order to read in other pages, and such a buffer was dirtied by
as-yet-uncommitted transactions, it might have to flush WAL to be
allowed to write the dirty buffer.  But I think you'd have had to dial
back the bgwriter to the point of uselessness before this would be a
common occurrence.
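
In outline that path looks about like this -- a simplified illustration
with made-up names, not the real bufmgr/xlog code:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t Lsn;

static Lsn flushed_wal = 100;   /* how far WAL is known to be on disk */
static int wal_fd = -1;         /* the backend's cached WAL segment fd */

static void
flush_wal_up_to(Lsn lsn)
{
    if (wal_fd < 0)
    {
        /* Here is where an otherwise read-only backend ends up holding
         * an open WAL segment that may later be removed or recycled. */
        wal_fd = 42;            /* stand-in for open()ing the segment */
        printf("opened current WAL segment\n");
    }
    if (lsn > flushed_wal)
        flushed_wal = lsn;      /* stand-in for write() + fsync() */
}

static void
evict_buffer(bool dirty, Lsn page_lsn)
{
    if (dirty)
    {
        /* WAL-before-data: the record that dirtied the page must hit
         * disk before the page itself may be written out. */
        flush_wal_up_to(page_lsn);
        printf("flushed dirty page with LSN %llu\n",
               (unsigned long long) page_lsn);
    }
    /* ...then reuse the buffer for the page the SELECT wanted... */
}

int
main(void)
{
    /* A pure SELECT needs a free buffer, and the victim happens to have
     * been dirtied by somebody else's not-yet-committed transaction. */
    evict_buffer(true, 250);
    return 0;
}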

> At a minimum, we should add the extra 16MB per connection that might
> be taken on the WAL file system to the calculations people should do
> when sizing that, just in case someone is trying to cut that thin
> while planning on using a lot of connections.

In the first place, this is a complete non-issue except on Windows ---
on other platforms we can rename and recycle the files even if they're
being held open.  I rather doubt anyone would think they could predict a
Windows machine's disk usage that accurately anyway.  In the second
place, for each backend to be holding open a different dead WAL file
strains the limits of credulity.  Telling people to assume 16MB *
max_connections would be a gross overestimate.
        regards, tom lane


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> You sure it's not creating any temp tables?  You didn't mention
> revoking TEMP privilege.
They have not been revoked, but I am sure the software publisher
doesn't explicitly create any, and I'd be very surprised if the
monitoring software did.  The tables are small enough that it's hard
to believe that the 50MB work_mem would spill to disk, either (if
that matters).
> I can think of one code path that could result in a genuinely
> read-only session having to write WAL: if it's forced to flush
> dirty buffers in order to read in other pages, and such a buffer
> was dirtied by as-yet-uncommitted transactions, it might have to
> flush WAL to be allowed to write the dirty buffer.  But I think
> you'd have had to dial back the bgwriter to the point of
> uselessness before this would be a common occurrence.
#bgwriter_delay = 200ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0
>> At a minimum, we should add the extra 16MB per connection that
>> might be taken on the WAL file system to the calculations people
>> should do when sizing that, just in case someone is trying to
>> cut that thin while planning on using a lot of connections.
> 
> In the first place, this is a complete non-issue except on Windows
> --- on other platforms we can rename and recycle the files even
> if they're being held open.  I rather doubt anyone would think
> they could predict a Windows machine's disk usage that accurately
> anyway.  In the second place, for each backend to be holding open
> a different dead WAL file strains the limits of credulity. 
> Telling people to assume 16MB * max_connections would be a gross
> overestimate.
Pretty much every read only JDBC connection seems to be holding open
a deleted WAL file on my Linux box, but it would take pretty
pessimal timing for each connection to be holding open a different
one -- I see that many connections share a deleted WAL file.
project-db:~ # cat /proc/version
Linux version 2.6.16.60-0.39.3-smp (geeko@buildhost) (gcc version 4.1.2
20070115 (SUSE Linux)) #1 SMP Mon May 11 11:46:34 UTC 2009
project-db:~ # lsof | grep deleted                     
postgres   2189      ccefcirsa   20u      REG                8,3  
16777216 1610613340
/var/pgsql/data/ccefcir/pg_xlog/000000010000000A00000010 (deleted)
postgres   2195      ccefcirsa   43u      REG                8,3  
16777216 1610613340
/var/pgsql/data/ccefcir/pg_xlog/000000010000000A00000010 (deleted)
postgres   2511     jdashcirsa    1u      CHR              136,5                   7 /dev/pts/5 (deleted)
postgres   2511     jdashcirsa    2u      CHR              136,5                   7 /dev/pts/5 (deleted)
postgres   2514     jdashcirsa    3u      REG                8,3  
16777216 1610631538
/var/pgsql/data/jdashcir/pg_xlog/000000010000001100000076 (deleted)
postgres   2812           ccsa    3u      REG                8,3  
16777216 3763808807
/var/pgsql/data/epayment/pg_xlog/000000020000009B00000094 (deleted)
postgres   3647         ccefsa    1u      CHR              136,5                   7 /dev/pts/5 (deleted)
postgres   3647         ccefsa    2u      CHR              136,5                   7 /dev/pts/5 (deleted)
postgres   3650         ccefsa    3u      REG                8,3  
16777216  669441097
/var/pgsql/data/ccef/pg_xlog/000000020000004B000000C5 (deleted)    
postgres   4266  milwaukeeifsa    3u      REG                8,3  
16777216 2708846433
/var/pgsql/data/milwaukeeif/pg_xlog/000000020000006B00000014 (deleted)
postgres   4270     juryscansa    3u      REG                8,3  
16777216 1073742165
/var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)   
postgres   6100     juryscansa   41u      REG                8,3  
16777216 1073742165
/var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)   
postgres   6105     juryscansa   35u      REG                8,3  
16777216 1073742167
/var/pgsql/data/juryscan/pg_xlog/000000020000001A0000007E (deleted)   
postgres   6113     juryscansa   43u      REG                8,3  
16777216 1073742166
/var/pgsql/data/juryscan/pg_xlog/000000020000001A0000001D (deleted)   
postgres   6119     juryscansa   59u      REG                8,3  
16777216 1073742165
/var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)   
postgres   6121     juryscansa   62u      REG                8,3  
16777216 1073742165
/var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)   
postgres   6126     juryscansa   33u      REG                8,3  
16777216 1073742167
/var/pgsql/data/juryscan/pg_xlog/000000020000001A0000007E (deleted)   
postgres   7997   washrptcirsa   13u      REG                8,3  
16777216 2688967689
/var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted) 
postgres   8170   washrptcirsa   23u      REG                8,3  
16777216 2688967689
/var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted) 
postgres   8687   washrptcirsa   11u      REG                8,3  
16777216 2688967689
/var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted) 
postgres   8720   washrptcirsa    3u      REG                8,3  
16777216 2688967689
/var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted) 
postgres   8769   washrptcirsa    6u      REG                8,3  
16777216 2688967689
/var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted) 
postgres  10093   washrptcirsa   18u      REG                8,3  
16777216 2688967689
/var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted) 
postgres  10313        ccsagsa    3u      REG                8,3  
16777216 3758753714
/var/pgsql/data/ccsag/pg_xlog/0000000300000035000000EE (deleted)      
postgres  10660   datasourcesa   50u      REG                8,3  
16777216 3250939657
/var/pgsql/data/datasource/pg_xlog/0000000100000004000000C9 (deleted) 
postgres  10661   datasourcesa   23u      REG                8,3  
16777216 3250939657
/var/pgsql/data/datasource/pg_xlog/0000000100000004000000C9 (deleted)
postgres  10666   datasourcesa   20u      REG                8,3  
16777216 3250939679
/var/pgsql/data/datasource/pg_xlog/0000000100000004000000B3 (deleted)
postgres  10667   datasourcesa   20u      REG                8,3  
16777216 3250939679
/var/pgsql/data/datasource/pg_xlog/0000000100000004000000B3 (deleted)
postgres  11049        formssa    3u      REG                8,3  
16777216 1879057748
/var/pgsql/data/forms/pg_xlog/000000010000000B000000C1 (deleted)
postgres  11209  milwaukeeifsa   19u      REG                8,3  
16777216 2690371279
/var/pgsql/data/milwaukeeif/pg_xlog/000000020000006900000018 (deleted)
postgres  11217  milwaukeeifsa  131u      REG                8,3  
16777216 2690371279
/var/pgsql/data/milwaukeeif/pg_xlog/0000000200000069000000DE (deleted)
postgres  11221  milwaukeeifsa   34u      REG                8,3  
16777216 2690371282
/var/pgsql/data/milwaukeeif/pg_xlog/000000020000006A00000051 (deleted)
postgres  11223  milwaukeeifsa   16u      REG                8,3  
16777216 2690371282
/var/pgsql/data/milwaukeeif/pg_xlog/000000020000006A00000051 (deleted)
postgres  11659      jdashccsa    6u      REG                8,3  
16777216  401111458
/var/pgsql/data/jdashcc/pg_xlog/000000020000006400000080 (deleted)
postgres  12949      jdashccsa   22u      REG                8,3  
16777216  268437280
/var/pgsql/data/jdashcc/pg_xlog/0000000200000065000000BE (deleted)
postgres  12976        formssa   31u      REG                8,3  
16777216 1879057738
/var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres  12978        formssa   23u      REG                8,3  
16777216 1879057738
/var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres  12980        formssa   34u      REG                8,3  
16777216 1879057738
/var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres  12992        formssa   35u      REG                8,3  
16777216 1879057738
/var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres  12993        formssa   25u      REG                8,3  
16777216 1879057748
/var/pgsql/data/forms/pg_xlog/000000010000000E000000F4 (deleted)
postgres  12994        formssa   24u      REG                8,3  
16777216 1879057738
/var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres  14359      jdashccsa   22u      REG                8,3  
16777216  401111459
/var/pgsql/data/jdashcc/pg_xlog/0000000200000065000000BD (deleted)
postgres  15397   washrptcirsa   26u      REG                8,3  
16777216 2688967664
/var/pgsql/data/washrptcir/pg_xlog/000000010000002900000079 (deleted)
postgres  16422        ccsagsa   11u      REG                8,3  
16777216 3758531497
/var/pgsql/data/ccsag/pg_xlog/0000000300000030000000ED (deleted)
postgres  16490         q409sa    3u      REG                8,3  
16777216 3490994896
/var/pgsql/data/codesq409/pg_xlog/000000010000000B00000056 (deleted)
postgres  16533         q309sa    3u      REG                8,3  
16777216 1073749053
/var/pgsql/data/codesq309/pg_xlog/0000000100000007000000A4 (deleted)
postgres  16721           ccsa   27u      REG                8,3  
16777216 3763808825
/var/pgsql/data/epayment/pg_xlog/000000020000009B00000068 (deleted)
postgres  16725           ccsa   25u      REG                8,3  
16777216 3763808807
/var/pgsql/data/epayment/pg_xlog/000000020000009B00000094 (deleted)
postgres  16729           ccsa   17u      REG                8,3  
16777216 3763808824
/var/pgsql/data/epayment/pg_xlog/000000020000009A000000F0 (deleted)
postgres  16734           ccsa   34u      REG                8,3  
16777216 3763808807
/var/pgsql/data/epayment/pg_xlog/000000020000009B000000FA (deleted)
postgres  16741           ccsa   18u      REG                8,3  
16777216 3763808825
/var/pgsql/data/epayment/pg_xlog/000000020000009B00000068 (deleted)
postgres  16747           ccsa   38u      REG                8,3  
16777216 3763808807
/var/pgsql/data/epayment/pg_xlog/000000020000009B000000FA (deleted)
postgres  18316         ccefsa   13u      REG                8,3  
16777216  669441097
/var/pgsql/data/ccef/pg_xlog/000000020000004B000000C5 (deleted)
postgres  20568      washrptsa    3u      REG                8,3  
16777216       1463
/var/pgsql/data/washrpt/pg_xlog/000000020000004700000011 (deleted)
postgres  23040  milwaukeeifsa   47u      REG                8,3  
16777216 2690371282
/var/pgsql/data/milwaukeeif/pg_xlog/000000020000006A00000051 (deleted)
nscd      23041           root    7u      REG                8,2    
217016    1459569 /var/run/nscd/dbnP5QN9 (deleted)
nscd      23041           root    8r      REG                8,2    
217016    1459569 /var/run/nscd/dbnP5QN9 (deleted)
postgres  23322      ccefcirsa    1u      CHR              136,0                   2 /dev/pts/0 (deleted)
postgres  23322      ccefcirsa    2u      CHR              136,0                   2 /dev/pts/0 (deleted)
postgres  23339  milwaukeeifsa  136u      REG                8,3  
16777216 2708846433
/var/pgsql/data/milwaukeeif/pg_xlog/000000020000006B00000014 (deleted)
postgres  23603        jawebsa    1u      CHR              136,5                   7 /dev/pts/5 (deleted)
postgres  23603        jawebsa    2u      CHR              136,5                   7 /dev/pts/5 (deleted)
postgres  24367           mssa   34u      REG                8,3  
16777216  805314300 /var/pgsql/data/ms/pg_xlog/000000010000000A00000058
(deleted)
postgres  24368           mssa   37u      REG                8,3  
16777216  805314303 /var/pgsql/data/ms/pg_xlog/000000010000000A00000059
(deleted)
postgres  24923      encryptsa  335u      REG                8,3  
16777216  805310179
/var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres  24925      encryptsa   27u      REG                8,3  
16777216  805310179
/var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres  24931      encryptsa   16u      REG                8,3  
16777216  805310179
/var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres  24933      encryptsa   26u      REG                8,3  
16777216  805310179
/var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres  24942      encryptsa   16u      REG                8,3  
16777216  805310179
/var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres  26267   washrptcirsa    1u      CHR              136,3                   5 /dev/pts/3 (deleted)
postgres  26267   washrptcirsa    2u      CHR              136,3                   5 /dev/pts/3 (deleted)
postgres  26270   washrptcirsa    3u      REG                8,3  
16777216 2688967703
/var/pgsql/data/washrptcir/pg_xlog/000000010000002B00000070 (deleted)
postgres  27403  milwaukeeifsa   30u      REG                8,3  
16777216 2708846433
/var/pgsql/data/milwaukeeif/pg_xlog/000000020000006B00000014 (deleted)
postgres  28916      encryptsa    3u      REG                8,3  
16777216  805314462
/var/pgsql/data/encrypt/pg_xlog/000000020000003C000000E8 (deleted)
postgres  29217           mssa    3u      REG                8,3  
16777216  805314300 /var/pgsql/data/ms/pg_xlog/000000010000000A00000010
(deleted)
postgres  30151   datasourcesa    3u      REG                8,3  
16777216 3250939659
/var/pgsql/data/datasource/pg_xlog/000000010000000400000094 (deleted)
I'll test a bit to see what it takes to get a JDBC connection to
open a WAL file.
-Kevin


Re: Deleted WAL files held open by backends in Linux

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote: 
>> You sure it's not creating any temp tables?  You didn't mention
>> revoking TEMP privilege.
> They have not been revoked, but I am sure the software publisher
> doesn't explicitly create any, and I'd be very surprised if the
> monitoring software did.  The tables are small enough that it's hard
> to believe that the 50MB work_mem would spill to disk, either (if
> that matters).

It's not about the size of a temp table, because writes to the temp
table itself aren't WAL-logged.  However, the system catalog entries for
a temp table *are* WAL-logged.

> Pretty much every read only JDBC connection seems to be holding open
> a deleted WAL file on my Linux box, but it would take pretty
> pessimal timing for each connection to be holding open a different
> one -- I see that many connections share a deleted WAL file.

This still seems a bit improbable to me.  There has to be something
causing those sessions to touch WAL, and the dirty-buffer scenario
doesn't seem reliable enough.

[ thinks... ]  How about SELECT FOR UPDATE or SELECT FOR SHARE?
Those cause WAL writes.
        regards, tom lane


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It's not about the size of a temp table, because writes to the
> temp table itself aren't WAL-logged.  However, the system catalog
> entries for a temp table *are* WAL-logged.
Definitely not issuing any CREATE TEMP statements of any kind,
unless the JDBC driver is doing that under the covers.
>> Pretty much every read only JDBC connection seems to be holding
>> open a deleted WAL file on my Linux box, but it would take pretty
>> pessimal timing for each connection to be holding open a
>> different one -- I see that many connections share a deleted WAL
>> file.
> 
> This still seems a bit improbable to me.  There has to be
> something causing those sessions to touch WAL, and the
> dirty-buffer scenario doesn't seem reliable enough.
> 
> [ thinks... ]  How about SELECT FOR UPDATE or SELECT FOR SHARE?
> Those cause WAL writes.
Definitely not.
Probably best not to worry about it until I can play around with
some Java test code to see what it takes to cause the connection to
open the WAL.  I'll post when I've had a chance to try that.
-Kevin


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> There has to be something causing those sessions to touch WAL, and
> the dirty-buffer scenario doesn't seem reliable enough.
This is seeming fairly likely to be the cause, though.  It may be a
combination of the nightly VACUUM FREEZE ANALYZE we typically do on
every database and the monitoring software, which would directly run
a couple queries every five seconds or so, and would also exercise
the software publishers on a similar schedule.  My "smoking gun" is
that the monitoring software was restarted last night after the time
that the nightly vacuums would have completed, and its connections
are not showing as having a deleted WAL file open today.
I was unable to directly cause a backend to open a WAL file by
connecting through JDBC and running the same kind of queries used by
the software publisher, which does tend to point to some indirect
mechanism, like the dirty buffer flushing.
-Kevin


Re: Deleted WAL files held open by backends in Linux

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote: 
>> There has to be something causing those sessions to touch WAL, and
>> the dirty-buffer scenario doesn't seem reliable enough.
> This is seeming fairly likely to be the cause, though.  It may be a
> combination of the nightly VACUUM FREEZE ANALYZE we typically do on
> every database and the monitoring software, which would directly run
> a couple queries every five seconds or so, and would also exercise
> the software publishers on a similar schedule.  My "smoking gun" is
> that the monitoring software was restarted last night after the time
> that the nightly vacuums would have completed, and its connections
> are not showing as having a deleted WAL file open today.

Hmm.  If the read-only sessions are long-lived then you could expect
that the probability of having flushed a dirty block (and hence had to
write some WAL) increases over time and eventually approaches 1.
How old were the sessions you were looking at?

If we think this is worth doing something about (I'm not convinced yet)
then the answer would be to forcibly close a backend's open WAL file
in some reasonably seldom-used code path.  One possibility that comes
to mind is to do it in ProcessCatchupEvent(), which will be invoked in
approximately the right circumstances: a backend that is sitting idle
for a long time within an otherwise busy system.  That wouldn't be a
100% solution, because if the backend is handling a steady stream of
queries it will likely never run ProcessCatchupEvent().  But the places
that would be 100% (like transaction commit) would probably entail too
much of a performance hit from repeatedly opening and closing WAL files.
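
As a rough sketch (names invented here, not the actual xlog.c code),
the idle-time hook could be about this simple:

#include <unistd.h>

static int openLogFile = -1;    /* stand-in for the cached WAL segment fd */

/*
 * Called from some seldom-used, non-performance-critical place, e.g.
 * while an idle backend is processing catchup interrupts.  A busy
 * backend never gets here and keeps its descriptor open as it does today.
 */
static void
ReleaseIdleWalFile(void)
{
    if (openLogFile >= 0)
    {
        close(openLogFile);
        openLogFile = -1;       /* the next WAL write simply reopens it */
    }
}

int
main(void)
{
    ReleaseIdleWalFile();       /* no-op here: nothing is open yet */
    return 0;
}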
        regards, tom lane


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> How old were the sessions you were looking at?
Days to months old.
> If we think this is worth doing something about
> (I'm not convinced yet)
Once one knows about the issue, it's only a minor annoyance, and an
infrequent one at that, so it's not worth anything that would take
significant effort or cause any measurable performance hit.
> then the answer would be to forcibly close a backend's open WAL
> file in some reasonably seldom-used code path.  One possibility
> that comes to mind is to do it in ProcessCatchupEvent(), which
> will be invoked in approximately the right circumstances: a
> backend that is sitting idle for a long time within an otherwise
> busy system.  That wouldn't be a 100% solution, because if the
> backend is handling a steady stream of queries it will likely
> never run ProcessCatchupEvent().  But the places that would be
> 100% (like transaction commit) would probably entail too much of a
> performance hit from repeatedly opening and closing WAL files.
In our case there would tend to be bursts of activity on some of
these connections, and they would usually have a dribble of activity
from the monitoring system.  This dribble would consist of a couple
sub-millisecond SELECT statements once every five or ten seconds. 
(I have no clue whether that's frequent enough to suppress the
ProcessCatchupEvent() call.)
Is there a reasonably cheap way to check whether the backend has a
WAL file open and whether that one is the current append target?  As
you point out, the need to actually close such a file would be
infrequent, and it seems as though when the need exists, it wouldn't
be a matter of *whether* it would need to be closed, just *when*. 
Currently it sounds like we would often wind up doing it the next
time we try to run a query which has to flush dirty buffers, or when
the connection is being closed.  It seems like those might not be
the ideal time to add the overhead of the close, so we would just
need to find someplace which isn't worse.
-Kevin


Re: Deleted WAL files held open by backends in Linux

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Is there a reasonably cheap way to check whether the backend has a
> WAL file open and whether that one is the current append target?

Detecting whether we have a WAL file open is trivial (just look at
the static variable holding the file descriptor).  Determining whether
it's still the current append target is not so cheap though; it would
require examining shared-memory status which means taking a lock on
that status (and it's a high-traffic lock already).

We could have the open WAL file dropped if stale as a side-effect
anytime we have occasion to examine that shared state anyway.  But
in a nearly-read-only session such as your example I'm not sure that
would happen often enough to fix the problem.
        regards, tom lane


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Determining whether it's still the current append target is not so
> cheap though; it would require examining shared-memory status
> which means taking a lock on that status (and it's a high-traffic
> lock already).
I haven't reviewed the internal locking techniques, so this may well
be a dumb question, but...  Since we only care whether the value is
equal, and an occasional false report of equality wouldn't hurt
anything, couldn't we bypass the lock in this particular case?
-Kevin


Re: Deleted WAL files held open by backends in Linux

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Determining whether it's still the current append target is not so
>> cheap though; it would require examining shared-memory status
>> which means taking a lock on that status (and it's a high-traffic
>> lock already).
> I haven't reviewed the internal locking techniques, so this may well
> be a dumb question, but...  Since we only care whether the value is
> equal, and an occasional false report of equality wouldn't hurt
> anything, couldn't we bypass the lock in this particular case?

Perhaps, if you didn't mind sometimes getting a wrong answer.
I guess the cost of that would be pretty small in this particular
usage.
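
Perhaps something on the order of this sketch (hypothetical names, not
actual code); the unlocked read can be stale, which just means we
occasionally hang onto a dead segment a little longer:

#include <stdint.h>
#include <unistd.h>

typedef struct
{
    volatile uint64_t currentSegNo;   /* segment currently being appended to */
} SharedWalState;

static SharedWalState *walShared;     /* would live in shared memory */

static int      openLogFile = -1;     /* backend's cached WAL segment fd */
static uint64_t openLogSegNo;         /* which segment that fd refers to */

static void
MaybeCloseStaleWalFile(void)
{
    if (openLogFile < 0)
        return;

    /*
     * Peek at the shared value without taking the lock.  A momentarily
     * wrong answer is harmless: a false "still current" just keeps the
     * file open until the next check, which is today's behavior anyway.
     */
    if (walShared->currentSegNo != openLogSegNo)
    {
        close(openLogFile);
        openLogFile = -1;
    }
}

int
main(void)
{
    MaybeCloseStaleWalFile();   /* no-op: no WAL file is open here */
    return 0;
}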
        regards, tom lane


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Perhaps, if you didn't mind sometimes getting a wrong answer.
Well, it would be a heuristic which would close the deleted file
*almost* all the time.  When it didn't, the next check would
probably catch it.  Assuming that you would never get an indication
that it was different when it hadn't changed, the worst that could
happen is that you infrequently get the current behavior.
That all *seems* sane to me, if low priority.  Any objections to my
putting it on the TODO list?
-Kevin


Re: Deleted WAL files held open by backends in Linux

From
"Kevin Grittner"
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> Any objections to my putting it on the TODO list?
Hearing none, added.  (Apologies for missing the box where I should
have commented on what the change did.)
-Kevin