logical decoding : exceeded maxAllocatedDescs for .spill files - Mailing list pgsql-hackers
From: Amit Khandekar
Subject: logical decoding : exceeded maxAllocatedDescs for .spill files
Date:
Msg-id: CAJ3gD9c-sECEn79zXw4yBnBdOttacoE-6gAyP0oy60nfs_sabQ@mail.gmail.com
Responses: Re: logical decoding : exceeded maxAllocatedDescs for .spill files
List: pgsql-hackers
Hi,

I reproduced the error "exceeded maxAllocatedDescs (492) while trying to open file ...", which was also discussed in thread [1]. The issue here is similar to, but not exactly the same as, [1]: there the error showed up for "pg_logical/mappings/map...." files, while here it is for .spill files. And here the problem, in short, seems to be that the .spill file is not closed promptly after use, unlike in [1], where there was an outright file descriptor leak.

I could reproduce it using a transaction containing a long series of sub-transactions (it could possibly be reproduced without sub-transactions, but looking at the code, a script using sub-transactions was the easiest I could come up with):

create table tab(id int);

-- Function that does huge changes in a single transaction
create or replace function f(id int) returns int as $$
begin
    -- Iterate more than 492 times (the max number of transient file
    -- descriptors PG would typically allow). This creates that many
    -- sub-transactions due to the presence of the exception block.
    FOR i IN 1..600 LOOP
        BEGIN
            -- Iterate more than 4096 times (max_changes_in_memory),
            -- so that the changes spill to disk.
            FOR j IN 1..5000 LOOP
                insert into tab values (1);
            END LOOP;
        EXCEPTION when division_by_zero then
            perform 'dummy';
        END;
    END LOOP;
    return id;
end $$ language plpgsql;

SELECT * FROM pg_create_logical_replication_slot('logical', 'test_decoding');

begin;
select f(1); -- Do huge changes in a single transaction
commit;

\! pg_recvlogical -d postgres --slot=logical --verbose --start -f -
pg_recvlogical: starting log streaming at 0/0 (slot logical)
pg_recvlogical: streaming initiated
pg_recvlogical: confirming write up to 0/0, flush to 0/0 (slot logical)
BEGIN 1869
pg_recvlogical: confirming write up to 0/1B6D6E38, flush to 0/1B6D6E38 (slot logical)
pg_recvlogical: error: unexpected termination of replication stream: ERROR: exceeded maxAllocatedDescs (492) while trying to open file "pg_replslot/logical/xid-2362-lsn-0-24000000.spill"
pg_recvlogical: disconnected; waiting 5 seconds to try again

Looking at the code, what seems to be happening is that ReorderBufferIterTXNInit() => ReorderBufferRestoreChanges() opens the spill files but leaves them open if end of file is not reached; a file gets closed only once it has been read to the end. In particular, the function returns without closing the file descriptor if the reorder buffer changes being restored exceed max_changes_in_memory; the rest of the changes presumably get restored by a later ReorderBufferRestoreChanges() call. But meanwhile, if a lot of such files are left open, we run out of the maximum number of files PG decides to keep open (it has some logic that takes into account how many open files the system allows and how many are already open).

Offhand, what I am thinking is that we need to close the file descriptor before returning from ReorderBufferRestoreChanges(), and keep track of the file path and offset, so that next time we can resume reading from there. (A rough standalone sketch of that pattern is at the end of this mail.)

Comments?

[1] https://www.postgresql.org/message-id/flat/738a590a-2ce5-9394-2bef-7b1caad89b37%402ndquadrant.com

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
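PS: To make the idea a bit more concrete, below is a minimal standalone C sketch of the "close the fd between batches, remember path and offset" pattern. This is not reorderbuffer.c code: SpillReadState, restore_batch() and the /tmp/example.spill path are illustrative names made up for this mail, and the real change would of course have to live in the reorder buffer iterator state.

/*
 * Standalone illustration (not actual reorderbuffer.c code) of the idea
 * above: instead of keeping a spill file's fd open across batches, close
 * it after each batch and remember the path and offset so that the next
 * batch can reopen the file and resume where it left off.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct SpillReadState
{
	char	path[256];		/* which spill file to read from */
	off_t	offset;			/* where the next batch should resume */
	int		finished;		/* reached end of file? */
} SpillReadState;

/*
 * Read up to max_bytes starting at state->offset, then close the fd
 * unconditionally, so no descriptor stays allocated between batches.
 */
static ssize_t
restore_batch(SpillReadState *state, char *buf, size_t max_bytes)
{
	int			fd;
	ssize_t		nread;

	fd = open(state->path, O_RDONLY);
	if (fd < 0)
	{
		perror("open");
		exit(1);
	}
	if (lseek(fd, state->offset, SEEK_SET) < 0)
	{
		perror("lseek");
		exit(1);
	}
	nread = read(fd, buf, max_bytes);
	if (nread < 0)
	{
		perror("read");
		exit(1);
	}
	if (nread == 0)
		state->finished = 1;
	else
		state->offset += nread;

	close(fd);				/* the crucial bit: nothing stays open */
	return nread;
}

int
main(void)
{
	SpillReadState	state = {"/tmp/example.spill", 0, 0};
	char			buf[4096];
	FILE		   *f;
	int				i;

	/* Create a dummy "spill" file so the example actually runs. */
	f = fopen(state.path, "w");
	for (i = 0; i < 3000; i++)
		fprintf(f, "change %d\n", i);
	fclose(f);

	while (!state.finished)
	{
		ssize_t		n = restore_batch(&state, buf, sizeof(buf));

		printf("restored %zd bytes, resume offset now %lld\n",
			   n, (long long) state.offset);
	}
	unlink(state.path);
	return 0;
}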