Re: Dead Lock problem with 8.1.3 - Mailing list pgsql-general

From Kai Hessing
Subject Re: Dead Lock problem with 8.1.3
Date
Msg-id 4o2bkgFclmdhU1@individual.net
Whole thread Raw
In response to Re: Dead Lock problem with 8.1.3  (Alban Hertroys <alban@magproductions.nl>)
Responses Re: Dead Lock problem with 8.1.3
Re: Dead Lock problem with 8.1.3
List pgsql-general
> EXPLAIN without ANALYZE locking up?!? Maybe some application is holding
> a lock on a record in your result set.

OK, this was a good hint. I got EXPLAIN working. My mistake was, that I
tried explain the last time, while the 'Killer-SQL' was running. Then
EXPLAIN didn't answer until I killed the CPU-eating process. Interesting
is, that it seems to be not possible to access pg_locks while this
process is running. A 'select * from pg_locks' takes also for... a very
long time...

> Did you try this query without
> any other applications connecting to that database?
>
> If you can't do that, you could dump that database and restore it in a
> different one for testing cases like this.

Yes, I surely have a testing environment with the same setup as the live
system. The Query does not work. But after terminating all running
queries the EXPLAIN did work. This is the result (The 8.0.8 database has
a dataset which is around 2 weeks older):

PostgreSQL 8.1.4 (The one which is taking... a very long time...)

Merge Join  (cost=7751.81..50026810.45 rows=7364 width=4)
  Merge Cond: ("outer".sid = "inner".sid)
  ->  Index Scan using stud_pkey on stud s  (cost=7751.81..49994210.01
rows=56607 width=4)
        Filter: (NOT (subplan))
        SubPlan
          ->  Materialize  (cost=7751.81..8497.94 rows=53613 width=4)
                ->  Seq Scan on stud_vera  (cost=0.00..7488.20
rows=53613 width=4)
                      Filter: (veraid = 2)
  ->  Index Scan using stud_vera_sid_veraid_idx on stud_vera v
(cost=0.00..37646.74 rows=14729 width=4)
        Index Cond: (veraid = 34)


PostgreSQL 8.0.8 (The one which takes only a few seconds... on a much
slower system...)

Merge Join  (cost=22724.05..23019.46 rows=731 width=4)
  Merge Cond: ("outer".sid = "inner".sid)
  ->  Sort  (cost=15223.02..15363.42 rows=56159 width=4)
        Sort Key: s.sid
        ->  Seq Scan on stud s  (cost=7427.89..10792.85 rows=56159 width=4)
              Filter: (NOT (hashed subplan))
              SubPlan
                ->  Seq Scan on stud_vera  (cost=0.00..7424.24 rows=1461
width=4)
                      Filter: (veraid = 2)
  ->  Sort  (cost=7501.03..7504.69 rows=1461 width=4)
        Sort Key: v.sid
        ->  Seq Scan on stud_vera v  (cost=0.00..7424.24 rows=1461 width=4)
              Filter: (veraid = 34)

> Lastly, considering the odd behaviour and the huge differences between
> minor versions of the database, you might have a corrupted index
> somewhere. You can fix those with REINDEX.

I did REINDEX and REINDEX FORCE and VACUUM FULL ANALYZE and all of this
doesn't take any noticable effect neither in calling the query nor in
explaining it.

Another thing I did was to do strace to the problematic process and
noticed that it is also repeatedly accessing a file called (for the
whole time): postgresql/8.1/main/base/1740468/pgsql_tmp/pgsql_tmp21938.0
It doesn't cost much disk performance but is permanently working.


pgsql-general by date:

Previous
From: "Guy Rouillier"
Date:
Subject: Re: How to create nightly backups in Linux
Next
From: Kai Hessing
Date:
Subject: Re: Dead Lock problem with 8.1.3