Michael Glaesemann <michael.glaesemann@myyearbook.com> writes:
> On Nov 8, 2010, at 16:03 , Tom Lane wrote:
>> That's very peculiar. Do you keep query logs? It would be useful to
>> try to correlate the temp files' PIDs and timestamps with the specific
>> queries that must have created them.
> We don't log all of them, but I checked those we did. It looks like it's happening when queries are timing out. I'm
seeingthis pattern pretty consistently:
> temporary file + query
> canceling statement due to statement timeout
> second temp file
> Here's a sample:
> pid | 877
> sess_id | 4ccf7257.36d
> sess_line | 16
> filename | pgsql_tmp877.0
> accessed_at | 2010-09-15 12:14:45-04
> modified_at | 2010-11-01 22:37:00-04
> logged_at | 2010-11-01 22:37:01.412-04
> error | LOG
> sql_state | 00000
> message | temporary file: path "pg_tblspc/16384/pgsql_tmp/pgsql_tmp877.0", size 87184416
Oh, so you've got log_temp_files enabled?
Hmm. If you look at FileClose() in fd.c, you'll discover that that
"temporary file" log message is emitted immediately before unlink'ing
the file. It looks pretty safe ... but, scratching around, I notice
that there's a CHECK_FOR_INTERRUPTS at the end of ereport(). So a
cancel that was caught by that exact CHECK_FOR_INTERRUPTS call could
provoke this symptom. The window for this is larger than it might seem
since the CHECK_FOR_INTERRUPTS could be responding to an interrupt that
came in sometime before that.
I think we need to re-order the operations there to ensure that the
unlink will still happen if the ereport gets interrupted.
regards, tom lane