Re: logical decoding : exceeded maxAllocatedDescs for .spill files - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: logical decoding : exceeded maxAllocatedDescs for .spill files |
Date | |
Msg-id | 20190912093055.exeatfemzr3f25yx@development Whole thread Raw |
In response to | Re: logical decoding : exceeded maxAllocatedDescs for .spill files (Alvaro Herrera from 2ndQuadrant <alvherre@alvh.no-ip.org>) |
Responses |
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Re: logical decoding : exceeded maxAllocatedDescs for .spill files |
List | pgsql-hackers |
On Wed, Sep 11, 2019 at 09:51:40AM -0300, Alvaro Herrera from 2ndQuadrant wrote: >On 2019-Sep-11, Amit Khandekar wrote: > >> I reproduced the error "exceeded maxAllocatedDescs (492) while trying >> to open file ...", which was also discussed about in the thread [1]. >> This issue is similar but not exactly the same as [1]. In [1], the >> file for which this error used to show up was >> "pg_logical/mappings/map...." , while here it's the .spill file. And >> here the issue , in short, seems to be : The .spill file does not get >> closed there and then, unlike in [1] where there was a file descriptor >> leak. > >Uh-oh :-( Thanks for the reproducer -- I confirm it breaks things. > >> Looking at the code, what might be happening is, >> ReorderBufferIterTXNInit()=>ReorderBufferRestoreChanges() opens the >> files, but leaves them open if end of file is not reached. Eventually >> if end of file is reached, it gets closed. The function returns back >> without closing the file descriptor if reorder buffer changes being >> restored are more than max_changes_in_memory. Probably later on, the >> rest of the changes get restored in another >> ReorderBufferRestoreChanges() call. But meanwhile, if there are a lot >> of such files opened, we can run out of the max files that PG decides >> to keep open (it has some logic that takes into account system files >> allowed to be open, and already opened). > >Makes sense. > >> Offhand, what I am thinking is, we need to close the file descriptor >> before returning from ReorderBufferRestoreChanges(), and keep track of >> the file offset and file path, so that next time we can resume reading >> from there. > >I think doing this all the time would make restore very slow -- there's a >reason we keep the files open, after all. How much slower? It certainly will have a hit, but maybe it's negligible compared to all the other stuff happening in this code? >It would be better if we can keep the descriptors open as much as >possible, and only close them if there's trouble. I was under the >impression that using OpenTransientFile was already taking care of that, >but that's evidently not the case. > I don't see how the current API could do that transparently - it does track the files, but the user only gets a file descriptor. With just a file descriptor, how could the code know to do reopen/seek when it's going just through the regular fopen/fclose? Anyway, I agree we need to do something, to fix this corner case (many serialized in-progress transactions). ISTM we have two options - either do something in the context of reorderbuffer.c, or extend the transient file API somehow. I'd say the second option is the right thing going forward, because it does allow doing it transparently and without leaking details about maxAllocatedDescs etc. There are two issues, though - it does require changes / extensions to the API, and it's not backpatchable. So maybe we should start with the localized fix in reorderbuffer, and I agree tracking offset seems reasonable. As a sidenote - in the other thread about streaming, one of the patches does change how we log subxact assignments. In the end, this allows using just a single file for the top-level transaction, instead of having one file per subxact. That would also solve this. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: