Thread: Statistics visibility in SERIALIZABLE transactions
Updates to planner statistics appear to be visible in SERIALIZABLE
transactions even though updated data is not.  Is this intentional?
Could that adversely affect query plans?

CREATE TABLE test (x integer);
INSERT INTO test (x) SELECT i % 2 FROM generate_series(1, 100) AS g(i);
ANALYZE test;

Transaction 1:

BEGIN ISOLATION LEVEL SERIALIZABLE;
EXPLAIN ANALYZE SELECT * FROM test WHERE x = 1;
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..2.25 rows=50 width=4) (actual time=0.201..0.787 rows=50 loops=1)
   Filter: (x = 1)
 Total runtime: 1.169 ms
(3 rows)

Transaction 2:

BEGIN;
DELETE FROM test WHERE x = 1;
ANALYZE test;
COMMIT;

Transaction 1:

EXPLAIN ANALYZE SELECT * FROM test WHERE x = 1;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..1.62 rows=1 width=4) (actual time=0.499..1.090 rows=50 loops=1)
   Filter: (x = 1)
 Total runtime: 1.476 ms
(3 rows)

In Transaction 1's second query the planner uses an updated row count
estimate even though the old rows are still visible.  I think I
understand why statistics like the total relation size and total
number of tuples would help the planner, but is there a reason for
distribution statistics to be visible for data that itself isn't
visible?

Thanks.

-- 
Michael Fuhr
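[As an aside, the distribution statistics behind the changing rows= estimate can be inspected directly.  A sketch, assuming the standard pg_stats system view; the exact columns shown here may vary by PostgreSQL version:]

```sql
-- Inspect the planner's per-column statistics for test.x.
-- most_common_vals / most_common_freqs are what the planner uses
-- to derive the rows= estimate for the "x = 1" predicate.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'test' AND attname = 'x';
```

Running this in Transaction 1 before and after Transaction 2's ANALYZE should show the second transaction's statistics, matching the plan change above.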
Michael Fuhr <mike@fuhr.org> writes:
> Updates to planner statistics appear to be visible in SERIALIZABLE
> transactions even though updated data is not.  Is this intentional?

This is because the planner uses the catcache to read pg_statistic,
and that runs on SnapshotNow rules.  I'm not inclined to change it:
as a general rule you want to be using the latest stats available.
It is much more likely that the stats existing at the start of your
transaction are out-of-date than that they are a better representation
of the data in your snapshot than stats committed a bit later.  It's
an approximation either way...

(Actually, in general catalog changes apply immediately even in
SERIALIZABLE mode.  Try looking at a table that was ALTERed since
your transaction began.)

			regards, tom lane
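[A sketch of the ALTER experiment Tom describes, using the same test table; the column name y is illustrative.  Note that the serializable transaction must not have touched the table yet, or the ALTER would block on its lock:]

```sql
-- Session 1
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT now();            -- first query fixes the snapshot; test is not locked

-- Session 2 (no lock conflict, since Session 1 hasn't touched test)
ALTER TABLE test ADD COLUMN y integer;

-- Session 1: sees the post-ALTER table definition, new column included,
-- even though its data snapshot predates the ALTER
SELECT * FROM test LIMIT 1;
```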
Tom,

> Try looking at a table that was ALTERed since
> your transaction began.

Shouldn't the transaction fail, then?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco
Josh Berkus wrote:
> Tom,
>
>> Try looking at a table that was ALTERed since
>> your transaction began.
>
> Shouldn't the transaction fail, then?

Neil Conway and I opened this VERY BIG can of worms a couple of years
ago on hackers.  I can't find the references to it in the archives;
I think I need to search back further than the start of 2005, and I
can't remember how to do that.

It was in relation to CLUSTER on a table and how previously started
transactions would not see the correct view of the table.  I recall
it involved issues with reading the catalog before transaction startup,
and the problem that you need a transaction in order to use the catalog.
My memory of it is fuzzy.  If Neil or Tom remember the discussion they
may be able to post a link.

Sorry if this wasn't much help.

Russell Smith