Re: pgsql: Avoid creation of the free space map for small heap relations, t - Mailing list pgsql-hackers
From: Amit Kapila
Subject: Re: pgsql: Avoid creation of the free space map for small heap relations, t
Msg-id: CAA4eK1JtJp-DE6QWMHDdZoU5OJEJssheJ0ZDZSim0MzQv6tYpg@mail.gmail.com
In response to: Re: pgsql: Avoid creation of the free space map for small heap relations, t (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <akapila@postgresql.org> writes:
> > Avoid creation of the free space map for small heap relations, take 2.
>
> I think this patch still has some issues.  Note the following two
> recent buildfarm failures:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2019-02-20%2004%3A20%3A01
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2019-02-24%2022%3A48%3A02
>
> Both of them failed in an assertion added by this patch:
>
> TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind == 't') && fsm_local_map.map[oldPage] == 0x01)", File: "/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c", Line: 229)
> ...
> 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG:  server process (PID 17981) was terminated by signal 6: Aborted
> 2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL:  Failed process was running: INSERT INTO T VALUES ( $1 )
>
> I can even offer you a theory as to why we're managing to get through all
> the core regression tests and then failing in ecpg: the failing test is
> ecpg/test/thread/prep.pgc, and I think that may be the only place in our
> tests where we stress concurrent insertions into a single small table.
> So I think you've got race-condition issues here.
>

Right, I have a theory about what is going on here. There seems to be a narrow window during which this problem can occur (with concurrent insertions), where we try to use the local map for a block that doesn't exist in that map. During insertion (say by session-1), once we have exhausted all the pages in the local map without finding available space, we try to extend the relation. Note that by this time we haven't cleared the local map.
Now, say we didn't get the conditional relation extension lock (in RelationGetBufferForTuple, line 561), and before we could get the relation extension lock, another backend (say session-2) acquired it, extended the relation in bulk (via RelationAddExtraBlocks), and updated the FSM (as the total number of blocks now exceeds HEAP_FSM_CREATION_THRESHOLD). After that, session-1 got the lock and found a page in the FSM with some free space. Then, before it could validate whether the page has enough space to insert a new tuple, session-2 inserted a few more rows such that the page session-1 got from the FSM no longer has free space. Session-1 will then try to record the free space for this page and find a new page from the FSM; however, as our local map is still not cleared, it will hit the assertion shown above.

Based on the above theory, I have tried to reproduce this problem via a debugger. Here are the steps I followed.

Session-1
---------
1. Create table fsm_1(c1 int, c2 char(1500));
2. Attach a debugger and set a breakpoint on line 535 in hio.c (aka targetBlock = RecordAndGetPageWithFreeSpace(relation,)
3. insert into fsm_1 values(generate_series(1,10),'aaaaa');
4. Now, when you debug the code at line 535, you must get InvalidBlockNumber, and you can check that the local map still exists (the value of fsm_local_map.nblocks must be greater than 0).
5. Debug till line 561 (else if (!ConditionalLockRelationForExtension(relation, ExclusiveLock))).

Session-2
---------
1. Attach a debugger and set a breakpoint on line 599 in hio.c (aka buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate);)
2. insert into fsm_1 values(generate_series(11,37),'aaaaa');
3. Leave the debugger of this session at this stage; this ensures that session-2 holds the relation extension lock.

Session-1
---------
6. Step over line 561.
7.
Before acquiring the relation extension lock at line 564 (LockRelationForExtension(relation, ExclusiveLock);), allow session-2 to complete.

Session-2
---------
4. continue
5. vacuum fsm_1; -- this will allow the FSM to be created for this table.

Session-1
---------
8. Step through the debugger; you will get the block with free space from the FSM via GetPageWithFreeSpace.
9. Now, it will try that block again (via goto loop).
10. Leave the debugger in this state and allow session-2 to execute some statements so that the space in this block is used up.

Session-2
---------
6. insert into fsm_1 values(generate_series(41,50),'aaaaa');

Session-1
---------
11. Step through the debugger and it should hit the assertion in RecordAndGetPageWithFreeSpace (line 227, freespace.c).

I am looking at the code as of commit 2ab23445bc6af517bddb40d8429146d8ff8d7ff4.

I have also tried to reproduce this via the ecpg test. My colleague Kuntal Ghosh has helped me run that ecpg test while varying THREADS and REPEATS in ecpg/test/thread/prep.pgc on different machines, but no luck so far.

To fix this symptom, we can ensure that once we don't get any block from the local map, we clear it. See the attached patch. I will try to evaluate this code path to see if there is any similar race condition, and will also try to come up with a reproducer. I don't have any great idea for writing a reproducer for this issue apart from trying something similar to ecpg/test/thread/prep.pgc; if you can think of one, I am all ears.

John, others, can you review my findings and patch?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com