> > DELETE FROM users_sessions WHERE changed < ('now'::timestamp - '1440
> > minutes'::interval) AND name = 'fhnid';
>
> What does EXPLAIN show as the plan for that query? I'm guessing an
> indexscan, and that the error was caused by reading a broken item
> pointer from the index. (1342198864 = hex 50005450, which sure looks
> like the upper 5 shouldn't be there ... how big is the table, anyway?)
NOTICE: QUERY PLAN:
Index Scan using users_sessions_cha_name_idx on users_sessions
(cost=0.00..12738.07 rows=1275 width=6) (actual time=231.74..239.39 rows=2
loops=1)
Total runtime: 239.81 msec
EXPLAIN
The size of the table:
canaveral# ls -al 44632
-rw------- 1 pgsql pgsql 357130240 Sep 19 18:52 44632
The size of the index:
canaveral# ls -al 7331245
-rw------- 1 pgsql pgsql 8151040 Sep 19 18:51 7331245
Holy crap - that table is huge. It's like it's never had a vacuum full sort
of thing. Going select count(*) takes _ages_ even though there's only 1451
rows in it - and not particularly large rows. Actually, the longest text
entry is 3832 characters and the average is 677.
The sessions table holds normal site session data, like a uid, username,
some other stuff, etc. However entries older than two hours or so get
deleted. We VACUUM everynight, so why is the on-disk relation growing so
huge?
> > However, I cannot repeat the error now. Is this a bug in postgres
> > somewhere.
>
> If the broken item pointer were indeed in the index, I'd expect it to be
> 100% repeatable. I'm wondering about flaky memory or some such. Have
> you run any hardware diagnostics?
No - the thought occured to me that there might be something wacky going on.
We've had problems with users_sessions before. Remember when I mailed about
vacuum failing on it before? You suggested doing a select for update on the
relation and that fixed it.
Chris