MOVE command - Mailing list pgsql-performance

From PFC
Subject MOVE command
Date
Msg-id opskji1hmzth1vuj@musicbox
Whole thread Raw
In response to Re: Performance delay  (Hasnul Fadhly bin Hasan <hasnulfadhly.h@mimos.my>)
List pgsql-performance
    Hello,

    Here I'm implementing a session management, which has a connections table
partitioned between active and archived connections. A connection
represents a connection between a user and a chatroom.

    I use partitioning for performance reasons.

    The active table contains all the data for the active session : user_id,
chatroom_id, session start time, and other information.
    The archive table contains just the user_id, chatroom_id, session start
and end time, for logging purposes, and for displaying on the site, which
user was logged to which chatroom and from when to when.

    Thus, when a user disconnects from a chatroom, I must move one row from
the active to the archive table. This poses no problem as there is a
UNIQUE index (iser_id,chatroom_id) so I select the row FOR UPDATE, insert
it in the archive table, then delete it.

    Now, when a user logs out from the site, or when his session is purged by
the auto-expiration cron job, I must also expire ALL his open chatroom
connections.
    INSERT INTO archive (...) SELECT ... FROM active WHERE user_id = ...;
    DELETE FROM active WHERE user_id = ...;

    Now, if the user inserts a connection between the two queries above, the
thing will fail (the connection will just be deleted). I know that there
are many ways to do it right :
    - LOCK the table in exclusive mode
    - use an additional primary key on the active table which is not related
to the user_id and the chatroom_id, select the id's of the sessions to
expire in a temporary table, and use that
    - use an extra field in the table to mark that the rows are being
processed
    - use transaction isolation level SERIALIZABLE

    However, all these methods somehow don't feel right, and as this is an
often encountered problem, I'd really like to have a sql command, say
MOVE, or SELECT AND DELETE, whatever, which acts like a SELECT, returning
the rows, but deleting them as well. Then I'd just do INSERT INTO archive
(...) SELECT ... AND DELETE FROM active WHERE user_id = ...;

    which would have the following advantages :
    - No worries about locks :
        - less chance of bugs
        - higher performance because locks have to be waited on, by definition
    - No need to do the request twice (so, it is twice as fast !)
    - Simplicity and elegance

    There would be an hidden bonus, that if you acquire locks, you better
COMMIT the transaction as soon as possible to release them, whereas here,
you can happily continue in the transaction.

    I think this command would make a nice cousin to the also very popular
INSERT... OR UPDATE which tries to insert a row, and if it exists, UPDATES
it instead of inserting it !

    What do you think ?






pgsql-performance by date:

Previous
From: Hasnul Fadhly bin Hasan
Date:
Subject: Re: Performance delay
Next
From: "Jim C. Nasby"
Date:
Subject: Re: Performance delay