Thread: Mismatched pg_class.reltuples between table and primary key
We recently experienced a hard crash of a dev server due to lack of resources (we think, still investigating).
That resulted in an interesting scenario where one of the tables was returning 12 rows from "select * from tbl", but "select * from tbl order by <pk column>" was only returning 11.
Looking at pg_class.reltuples, it could be clearly seen that the table in question had a value of 12, but the primary key had only 11.
My system catalog ignorance is probably showing, but I assumed that the table and PK pg_class.reltuples values should always be the same?
The initial observations were made by a developer:
1). After an ANALYZE, PK reltuples remained at 11.
2). After REINDEX, PK reltuples was still 11.
3). After VACUUM FULL, PK reltuples was reset to 12.
The db was quiescent at this point (rows were not being added or removed).
I would chalk this up to a localized index corruption issue, but earlier today I observed the same thing on a different table which did not originally show a difference between the table and PK reltuples values.
This is following a VACUUM FREEZE, and an ANALYZE run, as well as at least two restarts of the db cluster since the initial failure.
Can anyone suggest whether this is an index corruption issue, or maybe a misunderstanding on my part about what the reltuples attribute means? Or perhaps it is something else?
PostgreSQL 9.3.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54), 64-bit
default_statistics_target = 200
Here is a query I hacked together to raise a WARNING if the reltuples value of a table was different than the associated primary key or unique key. I visually ignored any values that were above the default_statistics_target value, to factor out sample size issues.That resulted in an interesting scenario where one of the tables was returning 12 rows from "select * from tbl", but "select * from tbl order by <pk column>" was only returning 11.
Looking at pg_class.reltuples, it could be clearly seen that the table in question had a value of 12, but the primary key had only 11.
My system catalog ignorance is probably showing, but I assumed that the table and PK pg_class.reltuples values should always be the same?
The initial observations were made by a developer:
1). After an ANALYZE, PK reltuples remained at 11.
2). After REINDEX, PK reltuples was still 11.
3). After VACUUM FULL, PK reltuples was reset to 12.
The db was quiescent at this point (rows were not being added or removed).
I would chalk this up to a localized index corruption issue, but earlier today I observed the same thing on a different table which did not originally show a difference between the table and PK reltuples values.
This is following a VACUUM FREEZE, and an ANALYZE run, as well as at least two restarts of the db cluster since the initial failure.
Can anyone suggest whether this is an index corruption issue, or maybe a misunderstanding on my part about what the reltuples attribute means? Or perhaps it is something else?
PostgreSQL 9.3.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54), 64-bit
default_statistics_target = 200
DO
$$
DECLARE
rec RECORD;
BEGIN
FOR rec IN
SELECT n.nspname schemaname,
c.relname as tablename,
c.reltuples::NUMERIC as tbl_tuples,
i.relname as idx_name,
i.reltuples::NUMERIC as idx_tuples
FROM pg_index x
JOIN pg_class c ON c.oid = x.indrelid
JOIN pg_class i ON i.oid = x.indexrelid
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'::"char"
AND i.relkind = 'i'::"char"
AND n.nspname NOT IN ('pg_catalog','information_schema')
AND (x.indisunique IS TRUE OR x.indisprimary IS TRUE)
ORDER BY 1,2,4
LOOP
IF ( rec.idx_name IS NOT NULL
AND rec.tbl_tuples IS DISTINCT FROM rec.idx_tuples ) THEN
RAISE WARNING 'Mismatched tuple counts - Table %, tuples: %, unique key: %, tuples: %',
rec.schemaname||'.'||rec.tablename, rec.tbl_tuples, rec.idx_name, rec.idx_tuples;
END IF;
END LOOP;
END
$$;
bricklen <bricklen@gmail.com> writes: > We recently experienced a hard crash of a dev server due to lack of > resources (we think, still investigating). > That resulted in an interesting scenario where one of the tables was > returning 12 rows from "select * from tbl", but "select * from tbl order by > <pk column>" was only returning 11. > Looking at pg_class.reltuples, it could be clearly seen that the table in > question had a value of 12, but the primary key had only 11. > My system catalog ignorance is probably showing, but I assumed that the > table and PK pg_class.reltuples values should always be the same? I wouldn't assume that --- they're only approximations, anyway. In particular there's some moving-average behavior in there that might prevent small errors in the value from ever going away completely. > The initial observations were made by a developer: > 1). After an ANALYZE, PK reltuples remained at 11. > 2). After REINDEX, PK reltuples was still 11. > 3). After VACUUM FULL, PK reltuples was reset to 12. (2) is a bit surprising, but I'm not sure that REINDEX recomputes the reltuples value for the index. regards, tom lane
bricklen escribió: > We recently experienced a hard crash of a dev server due to lack of > resources (we think, still investigating). > That resulted in an interesting scenario where one of the tables was > returning 12 rows from "select * from tbl", but "select * from tbl order by > <pk column>" was only returning 11. > Looking at pg_class.reltuples, it could be clearly seen that the table in > question had a value of 12, but the primary key had only 11. This sounds very similar to what was reported in http://www.postgresql.org/message-id/20131126123244.GI23284@alap2.anarazel.de In short, you probably want to make sure to upgrade to this week's release. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi Alvaro,
Thanks for the link.
On Mon, Dec 2, 2013 at 4:35 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
If it is indeed the same issue (and I see something similar in an actual production system running 9.2.5 which was recovered from a crashed LVM just the other day), are the chances of data corruption particularly high? From reading over that linked thread, it looked like the problem was likely to stem from wraparound.bricklen escribió:> We recently experienced a hard crash of a dev server due to lack ofThis sounds very similar to what was reported in
> resources (we think, still investigating).
> That resulted in an interesting scenario where one of the tables was
> returning 12 rows from "select * from tbl", but "select * from tbl order by
> <pk column>" was only returning 11.
> Looking at pg_class.reltuples, it could be clearly seen that the table in
> question had a value of 12, but the primary key had only 11.
http://www.postgresql.org/message-id/20131126123244.GI23284@alap2.anarazel.de
In short, you probably want to make sure to upgrade to this week's
release.
Once the patch is applied, will the existing data still be vulnerable? I haven't been paying close attention to that particular thread, so I might have missed a few things regarding the fixes -- will a full dump + reload be necessary?
Thanks,
Bricklen