On Tue, May 21, 2013 at 10:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> [ MVCC catalog access seems to be pretty cheap ]
In discussions today, Tom Lane suggested testing the time to start up
a backend and run a simple query such as "SELECT 2+2" in the absence
of a relcache file.
I did this and can't measure any overhead as a result of MVCC catalog
access. I tried it with no active connections. I tried it with 600
idle active connections (to make taking MVCC snapshots more
expensive). I couldn't quite believe it made no difference, so I
tried doing it in a tight loop under pgbench. I still can't measure
any difference. I haven't tested carefully enough to rule out the
possibility of an effect <1/2% at 600 connections, but there certainly
isn't anything bigger than that and I don't even think there's that
much of a difference.
Andres Freund suggested creating a couple of simple tables and having
lots of short-lived backends select data from them.
rhaas=# create table af1 (x) as select g from generate_series(1,4) g;
SELECT 4
rhaas=# create table af2 (x) as select g from generate_series(4,7) g;
SELECT 4
Test query: SELECT * FROM af1, af2 WHERE af1.x = af2.x;
pgbench command: pgbench -T 10 -c 50 -j 50 -n -f f -C
With mvcc_catalog_access=off, I get ~1553 tps; with it on, I get ~1557
tps. Hmm... that could be because of the two-line debugging hunk my
patch addes to HeapTupleSatisfiesNow(). After removing that, I get
maybe a 1% regression with mvcc_catalog_access=on on this test, but
it's not very consistent. If I restart the database server a few
times, the overhead bounces around each time, and sometimes it's zero;
the highest I saw is 1.4%. But it's not much, and this is a pretty
brutal workload for PostgreSQL, since starting up >1500 connections
per second is not a workload for which we're well-suited in the first
place.
All in all, I'm still feeling optimistic.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company