Re: logical decoding : exceeded maxAllocatedDescs for .spill files - Mailing list pgsql-hackers

From Amit Khandekar
Subject Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Date
Msg-id CAJ3gD9cT8kjEiyjD6=vvW2AubpO3_wZjDVQHdLHLmUowQy-5Bw@mail.gmail.com
In response to Re: logical decoding : exceeded maxAllocatedDescs for .spill files  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: logical decoding : exceeded maxAllocatedDescs for .spill files  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers


On Fri, 22 Nov 2019 at 4:26 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 22, 2019 at 11:00 AM Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> On Fri, 22 Nov 2019 at 09:08, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Have you tried before that fix? If not, can you try once by
> > temporarily reverting that fix in your environment and share the
> > output of each step?  After you get the error due to EOF, check that
> > you have .spill files in pg_replslot/<slot_name>/ and then again try
> > to get changes by pg_logical_slot_get_changes().   If you want, you
> > can use the test provided in Amit Khandekar's patch.
>
> On my Linux machine, I added elog() in ReorderBufferRestoreChanges(),
> just after FileRead() returns 0. This results in an error. But the thing is, in
> ReorderBufferCommit(), the error is already handled using PG_CATCH:
>
> PG_CATCH();
> {
> .....
>    AbortCurrentTransaction();
> .......
>    if (using_subtxn)
>       RollbackAndReleaseCurrentSubTransaction();
> ........
> ........
>    /* remove potential on-disk data, and deallocate */
>    ReorderBufferCleanupTXN(rb, txn);
> }
>
> So ReorderBufferCleanupTXN() removes all the .spill files using unlink().
>
> And on Windows, what should happen is: unlink() should succeed
> because the file is opened using FILE_SHARE_DELETE. But the files
> should still remain there because they are still open. A file is only
> marked for deletion until no one has it open any more. That
> is my conclusion from running the attached sample program, test.c.
>

I think this is exactly the reason for the problem.  In my test [1],
the error "Permission denied" occurred when I executed
pg_logical_slot_get_changes() a second time, which means that on the first
execution the unlink would have succeeded but the files were still not
removed because they had not been closed. Then, on the second execution,
it gets a "Permission denied" error when it again tries to unlink the
files via ReorderBufferCleanupSerializedTXNs().


> But what you are seeing is "Permission denied" errors. Not sure why
> unlink() is failing.
>

In your test program, if you try to unlink the file a second time, you
should see the error "Permission denied".

I tested using the sample program, and indeed I got error 5 (access denied) when I called unlink() the second time.
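
For readers without the attachment, here is a minimal sketch of that kind of experiment. It is not the test.c attached earlier in the thread, which is not reproduced here. The function name unlink_twice_while_open() is made up for illustration, and DeleteFileA() is used as a stand-in for unlink() on Windows, which is an assumption. The function only records the two result codes. On the Windows setup used in this thread the second call reportedly failed with error 5, whereas on POSIX systems the first unlink() removes the name at once and the second fails with ENOENT.

```c
#include <errno.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <unistd.h>
#endif

/*
 * Hypothetical helper: create "path", keep it open, and try to remove
 * it twice while it is still open.
 *
 * *first and *second receive 0 on success, otherwise the OS error code
 * (GetLastError() on Windows, errno elsewhere).
 * Returns 0 if the experiment ran, -1 if the file could not be created.
 */
int
unlink_twice_while_open(const char *path, int *first, int *second)
{
#ifdef _WIN32
	/* FILE_SHARE_DELETE allows a delete request while this handle is open. */
	HANDLE		h = CreateFileA(path, GENERIC_READ | GENERIC_WRITE,
								FILE_SHARE_READ | FILE_SHARE_WRITE |
								FILE_SHARE_DELETE,
								NULL, CREATE_ALWAYS,
								FILE_ATTRIBUTE_NORMAL, NULL);

	if (h == INVALID_HANDLE_VALUE)
		return -1;

	/* Assumption: DeleteFileA() stands in for unlink() here. */
	*first = DeleteFileA(path) ? 0 : (int) GetLastError();
	*second = DeleteFileA(path) ? 0 : (int) GetLastError();

	CloseHandle(h);
#else
	int			fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);

	if (fd < 0)
		return -1;

	/* POSIX: the name disappears on the first unlink(), even while open. */
	*first = (unlink(path) == 0) ? 0 : errno;
	*second = (unlink(path) == 0) ? 0 : errno;

	close(fd);
#endif
	return 0;
}
```

Calling it from main() with some scratch file name and printing *first and *second is enough to compare the two platforms. The Windows outcome may depend on the Windows version and file system, so the sketch does not hard-code an expected value there.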


[1] - https://www.postgresql.org/message-id/CAA4eK1%2Bcey6i6a0zD9kk_eaDXb4RPNZqu4UwXO9LbHAgMpMBkg%40mail.gmail.com



--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
