Re: slow DELETE on 12 M row table - Mailing list pgsql-performance

From Janet Jacobsen
Subject Re: slow DELETE on 12 M row table
Date
Msg-id 4A4D3937.6070509@lbl.gov
Whole thread Raw
In response to Re: slow DELETE on 12 M row table  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
Hi.  I posted a question about a very slow DELETE on a table
with 12 M rows last week, and I wanted to (1) thank everyone
who provided a reply since each clue helped to find the solution,
and (2) give the solution.

The slow DELETE was due to another user having a lock on
the table - which several people on this list pointed out must
be the case.  Since the user was only running SELECT on
the table (no inserts, deletes, or updates), it wasn't obvious at
first whether or how his process was locking the table.

Robert suggested the need for a commit or rollback, as well as
posting to the psycopg list.  Pasted below is the response that
I got from Federico Di Gregorio.

The user added a conn.rollback() to his script, and that solved
the problem.  Now it is possible to delete rows, create indexes,
etc. without having to kill the user's process.

Many thanks,
Janet


Robert Haas wrote:
> 2009/6/26 Janet Jacobsen <jsjacobsen@lbl.gov>:
>
>> Hi.  The user in question is using psycopg2, which he uses
>> psycopg2:
>>
>>> import psycopg2
>>> conn = psycopg2.connect("dbname=%s  user=%s host=%s password=%s port=%s" ...)
>>> pg_cursor = conn.cursor()
>>> pg_cursor.execute(<select string>)
>>> rows = pg_cursor.fetchall()
>>>
>> Note that
>> (1) he said that he does not set an isolation level, and
>> (2) he does not close the database connection after the
>> fetchall - instead he has a Python sleep command, so
>> he is checking the database every 60 s to see whether
>> new entries have been added to a given table.  (His
>> code is part of the analysis pipeline - we process the
>> image data and load it into the database, and other
>> groups fetch the data from the database and do some
>> analyses.)
>>
>> Yes, it is the case that the user's process shows up in
>> ps aux as "idle in transaction".
>>
>> What would you recommend in this case?  Should the
>> user set the isolation_level for psycopg, and if so to what?
>>
>> Is there any Postgres configuration parameter that I
>> should set?
>>
>> Should the user close the database connection after
>> every fetchall?
>>
>
> You need to COMMIT or ROLLBACK the in-process transaction and then not
> start a new transaction until you're ready to execute the next query.
> Possibly calling .commit() after executing your query might be all you
> need to do, but never having used psycopg2 I couldn't say.  You might
> try asking on the psycopg mailing list.
>
> ...Robert
>

> Il giorno lun, 29/06/2009 alle 12.26 -0700, Janet Jacobsen ha scritto:
> [snip]
>
>> > The user told me that he does not close the database connection
>> > after the fetchall - instead he has a Python sleep command, so that
>> > he is checking the database every 60 s to see whether new entries
>> > have been added to a given table
>> > His code is part of an analysis pipeline, whereas the part of the
>> > database that I work on is loading processed data into the
>> > database.
>> > Is there something missing from his code sample, like a commit or
>> > a set_isolation_level, that if added would prevent the "idle in
>> > transaction" from happening?
>>
>
> The user is wrong and you're right, the "idle in transaction" can be
> avoided by both a commit() (or rollback()) before going to sleep or by
> setting the transaction mode to "autocommit":
>
> conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
>
> Hope this helps,
> federico
>
> -- Federico Di Gregorio http://people.initd.org/fog Debian GNU/Linux
> Developer fog@debian.org INIT.D Developer fog@initd.org Sei una
> bergogna. Vergonga. Vergogna. -- Valentina

pgsql-performance by date:

Previous
From: Sergey Burladyan
Date:
Subject: regression ? 8.4 do not apply One-Time Filter to subquery
Next
From: Scott Marlowe
Date:
Subject: Re: slow DELETE on 12 M row table