MVCC catalog access - Mailing list pgsql-hackers
From:      Robert Haas
Subject:   MVCC catalog access
Date:
Msg-id:    CA+TgmoaShP2ytBXC3J10dvHzgi8tiL33TxCsAXbJtqY7z9SFQw@mail.gmail.com
Responses: Re: MVCC catalog access
           Re: MVCC catalog access
List:      pgsql-hackers
We've had a number of discussions about the evils of SnapshotNow. As far as I can tell, nobody likes it and everybody wants it gone, but there is concern about the performance impact. I decided to do some testing to measure the impact. I was pleasantly surprised by the results.

The attached patch is a quick hack to provide for MVCC catalog access. It adds a GUC called "mvcc_catalog_access". When this GUC is set to true and heap_beginscan() or index_beginscan() is called with SnapshotNow, they call GetLatestSnapshot() and use the resulting snapshot in lieu of SnapshotNow. As a debugging double-check, I modified HeapTupleSatisfiesNow to elog(FATAL) if called while mvcc_catalog_access is true. Of course, both of these are dirty hacks. If we were actually to implement MVCC catalog access, I think we'd probably just go through and start replacing instances of SnapshotNow with GetLatestSnapshot(), but I wanted to make it easy to do perf testing. (A sketch of the substitution appears at the end of this mail.)

When I first made this change, I couldn't detect any real difference; indeed, it seemed that make check was running ever-so-slightly faster than before, although that may well have been a testing artifact. I wrote a test case that created a schema with 100,000 functions in it and then dropped the schema (I believe it was Tom who previously suggested this test case as a worst-case scenario for MVCC catalog access). That didn't seem to be adversely affected either, even though it must take ~700k additional MVCC snapshots with mvcc_catalog_access = true.

    MVCC Off: Create 8743.101 ms, Drop 9655.471 ms
    MVCC On:  Create 7462.882 ms, Drop 9515.537 ms
    MVCC Off: Create 7519.160 ms, Drop 9380.905 ms
    MVCC On:  Create 7517.382 ms, Drop 9394.857 ms

The first "Create" seems to be artificially slow because of some kind of backend startup overhead. Not sure exactly what.

After wracking my brain for a few minutes, I realized that the lack of any apparent performance regression was probably due to the lack of any concurrent connections, making the scans of the PGXACT array very cheap. So I wrote a little program to open a bunch of extra connections (a sketch of it is also at the end of this mail). My MacBook Pro grumbled when I tried to open more than about 600, so I had to settle for that number. That was enough to show up the cost of all those extra snapshots:

    MVCC Off: Create 9065.887 ms, Drop 9599.494 ms
    MVCC On:  Create 8384.065 ms, Drop 10532.909 ms
    MVCC Off: Create 7632.197 ms, Drop 9499.502 ms
    MVCC On:  Create 8215.443 ms, Drop 10033.499 ms

Now, I don't know about you, but I'm having a hard time getting agitated about those numbers. Most people are not going to drop 100,000 objects with a single cascaded drop. And most people are not going to have 600 connections open when they do. (The snapshot overhead should be roughly proportional to the product of the number of drops and the number of open connections, and the number of cases where the product is as high as 60 million has got to be pretty small.) But suppose that someone is in that situation. Well, then they will take a... 10% performance penalty? That sounds plenty tolerable to me, if it means we can start moving in the direction of allowing some concurrent DDL.

Am I missing an important test case here? Are these results worse than I think they are? Did I boot this testing somehow?
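First, the promised sketch of the SnapshotNow substitution. The attached patch is the authoritative version; the helper function below is invented purely for illustration (the real hack just conditions on the GUC where heap_beginscan() and index_beginscan() receive their snapshot argument), while GetLatestSnapshot() and SnapshotNow are the existing backend API:

    #include "postgres.h"
    #include "utils/snapmgr.h"          /* GetLatestSnapshot() */
    #include "utils/tqual.h"            /* SnapshotNow */

    extern bool mvcc_catalog_access;    /* the quick-hack GUC */

    /*
     * Hypothetical helper: swap a fresh MVCC snapshot in for
     * SnapshotNow whenever the experimental GUC is on; leave every
     * other snapshot type (MVCC, dirty, etc.) untouched.
     */
    static Snapshot
    maybe_mvcc_snapshot(Snapshot snapshot)
    {
        if (mvcc_catalog_access && snapshot == SnapshotNow)
            return GetLatestSnapshot();
        return snapshot;
    }

The debugging cross-check is just the mirror image: HeapTupleSatisfiesNow begins with if (mvcc_catalog_access) elog(FATAL, ...), so any SnapshotNow scan that slips past the substitution dies loudly instead of silently polluting the numbers.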
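Second, the idle-connection generator. Again, the attached program is the real one; a minimal libpq sketch along these lines reproduces the setup (build with something like cc conns.c -lpq):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <libpq-fe.h>

    int
    main(int argc, char **argv)
    {
        int         nconns = (argc > 1) ? atoi(argv[1]) : 600;
        PGconn    **conns = malloc(nconns * sizeof(PGconn *));
        int         i;

        for (i = 0; i < nconns; i++)
        {
            /* empty conninfo string: fall back to PG* environment defaults */
            conns[i] = PQconnectdb("");
            if (PQstatus(conns[i]) != CONNECTION_OK)
            {
                fprintf(stderr, "connection %d failed: %s",
                        i, PQerrorMessage(conns[i]));
                break;
            }
        }
        printf("%d connections open; sleeping\n", i);
        pause();                /* hold the connections open until killed */
        return 0;
    }

Each idle backend adds a PGXACT entry that every GetLatestSnapshot() call has to scan, which is what makes the per-drop snapshot cost visible at 600 connections.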
[MVCC catalog access patch, test program to create lots of idle connections, and pg_depend stress test case attached.]

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company