Re: logical decoding : exceeded maxAllocatedDescs for .spill files - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Date
Msg-id CAA4eK1LbjMgPd9-qF1VRaRWyAdjFw6OK86STPBFyza+T6tXUFg@mail.gmail.com
Whole thread Raw
In response to Re: logical decoding : exceeded maxAllocatedDescs for .spill files  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: logical decoding : exceeded maxAllocatedDescs for .spill files  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Fri, Nov 22, 2019 at 11:00 AM Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> On Fri, 22 Nov 2019 at 09:08, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Have you tried before that fix , if not, can you once try by
> > temporarily reverting that fix in your environment and share the
> > output of each step?  After you get the error due to EOF, check that
> > you have .spill files in pg_replslot/<slot_name>/ and then again try
> > to get changes by pg_logical_slot_get_changes().   If you want, you
> > can use the test provided in Amit Khandekar's patch.
>
> On my Linux machine, I added elog() in ReorderBufferRestoreChanges(),
> just after FileRead() returns 0. This results in error. But the thing is, in
> ReorderBufferCommit(), the error is already handled using PG_CATCH :
>
> PG_CATCH();
> {
> .....
>    AbortCurrentTransaction();
> .......
>    if (using_subtxn)
>       RollbackAndReleaseCurrentSubTransaction();
> ........
> ........
>    /* remove potential on-disk data, and deallocate */
>    ReorderBufferCleanupTXN(rb, txn);
> }
>
> So ReorderBufferCleanupTXN() removes all the .spill files using unlink().
>
> And on Windows, what should happen is : unlink() should succeed
> because the file is opened using FILE_SHARE_DELETE. But the files
> should still remain there because these are still open. It is just
> marked for deletion until there is no one having opened the file. That
> is what is my conclusion from running a sample attached program test.c
>

I think this is exactly the reason for the problem.  In my test [1],
the error "permission denied" occurred when I second time executed
pg_logical_slot_get_changes() which means on first execution the
unlink would have been successful but the files are still not removed
as they were not closed. Then on second execution, it gets an error
"Permission denied" when it again tries to unlink files via
ReorderBufferCleanupSerializedTXNs().


.
> But what you are seeing is "Permission denied" errors. Not sure why
> unlink() is failing.
>

In your test program, if you try to unlink the file second time, you
should see the error "Permission denied".


[1] - https://www.postgresql.org/message-id/CAA4eK1%2Bcey6i6a0zD9kk_eaDXb4RPNZqu4UwXO9LbHAgMpMBkg%40mail.gmail.com



--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: pause recovery if pitr target not reached
Next
From: Amit Kapila
Date:
Subject: Re: adding partitioned tables to publications