UPDATE on two large datasets is very slow - Mailing list pgsql-general

From Steve Gerhardt
Subject UPDATE on two large datasets is very slow
Msg-id 4611C8FE.8030407@ocean.fraknet.org
Responses Re: UPDATE on two large datasets is very slow  (Scott Marlowe <smarlowe@g2switchworks.com>)
Re: UPDATE on two large datasets is very slow  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: UPDATE on two large datasets is very slow  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
I've been working for the past few weeks on porting a closed source
BitTorrent tracker to use PostgreSQL instead of MySQL for storing
statistical data, but I've run into a rather large snag. The tracker in
question buffers its updates to the database, then applies them all at
once, sending anywhere from 1-3 MiB of query data. With MySQL, this is
accomplished using the INSERT INTO...ON DUPLICATE KEY UPDATE query,
which seems to handle the insert/update very quickly; generally it only
takes about a second for the entire set of new data to be merged.
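
For reference, the MySQL statement looks roughly like this (column list
trimmed and values invented for illustration; the real query carries the
full set of columns shown in the Postgres version below):

INSERT INTO peers2 (tid, uid, uploaded, downloaded, timespent, ip, port)
VALUES (7, 42, 1048576, 2097152, 60, '10.0.0.1', 6881)
ON DUPLICATE KEY UPDATE
    uploaded   = uploaded + VALUES(uploaded),
    downloaded = downloaded + VALUES(downloaded),
    timespent  = timespent + VALUES(timespent),
    ip         = VALUES(ip),
    port       = VALUES(port);

Because the table has a unique key on (uid, tid), MySQL turns any row
that collides with an existing peer into the UPDATE branch instead of
failing the insert.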

The problem I am encountering is that when I attempt to duplicate this
functionality in Postgres, it is terrifically slow, to the point of
utter unusability. The tracker currently handles around 10,000-40,000
client updates per minute, which translates to roughly the same number
of rows in the database. Part of the issue is that some of those rows
cannot be updated because they do not yet exist in the database, but the
ratio of updates to inserts is likely around 100:1.

After consulting with some of the folks on the PostgreSQL IRC channel on
freenode.net, I was left with this idea to try:

---------------------------------------------------------------------
BEGIN;

-- Empty clone of peers2 (matching column list), dropped automatically
-- at commit.
CREATE TEMP TABLE temp_p2 ON COMMIT DROP AS (SELECT tid, uid, uploaded,
downloaded, remaining, avg_up, avg_down, active, timespent, ip, port,
peer_id, blocked FROM peers2 WHERE FALSE);

-- Bulk-load the buffered client updates into the temp table.
COPY temp_p2 FROM STDIN WITH CSV QUOTE AS '''';
< the data is sent by the tracker using PQputCopyData >

-- Merge the buffered rows into the rows that already exist.
UPDATE peers2 AS p SET uploaded = p.uploaded + t.uploaded, downloaded =
p.downloaded + t.downloaded, remaining = t.remaining, avg_up = t.avg_up,
avg_down = t.avg_down, active = t.active, timespent = p.timespent +
t.timespent, ip = t.ip, port = t.port, blocked = t.blocked, timestamp =
CURRENT_TIMESTAMP FROM temp_p2 AS t
WHERE p.uid = t.uid AND p.tid = t.tid;

-- Insert the rows that had no existing match (anti-join).
INSERT INTO peers2 (tid, uid, uploaded, downloaded, remaining, avg_up,
avg_down, active, timespent, ip, port, peer_id, blocked)
SELECT t.* FROM temp_p2 AS t LEFT JOIN peers2 USING (uid, tid)
WHERE peers2.uid IS NULL AND peers2.tid IS NULL;

COMMIT;
---------------------------------------------------------------------

Initial attempts showed the UPDATE query was incredibly slow. After
sitting down at the psql command line, I managed to get the query plan
for it after much waiting.

# EXPLAIN ANALYZE UPDATE peers2...etc etc
QUERY PLAN
-------------------------------------------------------------------------
 Merge Join  (cost=262518.76..271950.65 rows=14933 width=153) (actual time=8477.422..9216.893 rows=26917 loops=1)
   Merge Cond: ((p.tid = t.tid) AND (p.uid = t.uid))
   ->  Sort  (cost=177898.12..180004.09 rows=842387 width=65) (actual time=7803.248..8073.817 rows=109732 loops=1)
         Sort Key: p.tid, p.uid
         ->  Seq Scan on peers2 p  (cost=0.00..25885.87 rows=842387 width=65) (actual time=0.043..4510.771 rows=647686 loops=1)
   ->  Sort  (cost=84620.64..85546.64 rows=370400 width=96) (actual time=641.438..761.893 rows=55393 loops=1)
         Sort Key: t.tid, t.uid
         ->  Seq Scan on temp_p2 t  (cost=0.00..11112.00 rows=370400 width=96) (actual time=0.093..275.110 rows=55393 loops=1)
 Total runtime: 192569.492 ms
(9 rows)

(Apologies if the formatting got ruined by my e-mail client.)

Essentially, it looks like what it's doing is sorting both tables on the
WHERE clause, then finding which positions correspond between the two.
The problem is that, as can be seen, peers2 has 600,000+ rows, so
sequential scanning and sorting it is a rather non-trivial operation. As
a side note, there is a unique index on peers2 (uid, tid), so any
lookups should be fully indexed.
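
For reference, the index is essentially the following (the index name
here is illustrative, not the actual name in the schema):

CREATE UNIQUE INDEX peers2_uid_tid_idx ON peers2 (uid, tid);

One other thing worth noting in the plan above: the estimate for temp_p2
(370,400 rows predicted vs. 55,393 actual) suggests the planner had no
statistics for the freshly-loaded temp table, so adding

ANALYZE temp_p2;

between the COPY and the UPDATE might at least give it accurate row
counts, though I can't say whether that alone would change the plan.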

After this method seemed to fail miserably, I took another approach and
wrote a stored procedure, which should in theory accomplish much the
same thing. I assumed this would be faster because it would iterate over
the temp_p2 table sequentially, and do a simple index lookup + update to
the peers2 table on each step, without any sorting or craziness
required. For this to work, the tracker needs to automatically
categorize client updates into "needs UPDATE" or "needs INSERT" buffers,
which would be handled separately. The inserts are lightning quick and
are not an issue, but the updates, as illustrated below, are not very good.

Here is the first version of the stored procedure:

CREATE OR REPLACE FUNCTION tracker_update() RETURNS integer AS $PROC$
DECLARE
   rec temp_p2%ROWTYPE;
BEGIN
   -- Walk the buffered rows one at a time and apply each to peers2
   -- with a single-row, index-qualified UPDATE.
   FOR rec IN SELECT * FROM temp_p2 LOOP
     UPDATE peers2 SET uploaded = uploaded + rec.uploaded,
        downloaded = downloaded + rec.downloaded,
        remaining = rec.remaining,
        avg_up = rec.avg_up,
        avg_down = rec.avg_down,
        active = rec.active,
        timespent = timespent + rec.timespent,
        ip = rec.ip,
        port = rec.port,
        peer_id = rec.peer_id,
        blocked = rec.blocked,
        timestamp = CURRENT_TIMESTAMP
        WHERE uid = rec.uid AND tid = rec.tid;
   END LOOP;
   RETURN 1;
END;
$PROC$ LANGUAGE plpgsql;

Here are the results. I ran it through EXPLAIN ANALYZE just to get the
timing information, since EXPLAIN cannot see inside a function to show
the plans of the queries it runs. This is on the same dataset as the
UPDATE above.

# EXPLAIN ANALYZE SELECT tracker_update();
QUERY PLAN
-------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=496676.127..496676.128 rows=1 loops=1)
 Total runtime: 496684.305 ms
(2 rows)

Realizing this took about 2.57 times as long as the original UPDATE
query, I ventured to write a second version that used EXECUTE instead,
so there would be no pre-planning and the UPDATEs would run individually
with constant values. Here is that second version:

CREATE OR REPLACE FUNCTION tracker_update2() RETURNS integer AS $PROC$
DECLARE
   rec temp_p2%ROWTYPE;
BEGIN
   FOR rec IN SELECT * FROM temp_p2 LOOP
     -- Build each UPDATE as a string with the values inlined, so
     -- EXECUTE plans it from scratch rather than reusing a generic
     -- prepared plan.
     EXECUTE 'UPDATE peers2 SET uploaded = uploaded + ' || rec.uploaded
        || ', downloaded = downloaded + ' || rec.downloaded
        || ', remaining = ' || rec.remaining
        || ', avg_up = ' || rec.avg_up
        || ', avg_down = ' || rec.avg_down
        || ', active = ' || rec.active
        || ', timespent = timespent + ' || rec.timespent
        || ', ip = ' || quote_literal(host(rec.ip))
        || ', port = ' || rec.port
        || ', peer_id = ' || quote_literal(encode(rec.peer_id, 'escape'))
        || ', blocked = ' || rec.blocked
        || ', timestamp = CURRENT_TIMESTAMP WHERE'
        || ' uid = ' || rec.uid
        || ' AND tid = ' || rec.tid;
   END LOOP;
   RETURN 1;
END;
$PROC$ LANGUAGE plpgsql;

As an aside, the quote_literal calls are needed for the ip and peer_id
fields, which are inet and bytea types, respectively.
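
So each generated statement ends up looking something like this (the
values here are invented for illustration):

UPDATE peers2 SET uploaded = uploaded + 1048576,
    downloaded = downloaded + 2097152, remaining = 0, avg_up = 12800,
    avg_down = 25600, active = true, timespent = 60, ip = '10.0.0.1',
    port = 6881, peer_id = '-AZ2304-abcdefghijkl', blocked = false,
    timestamp = CURRENT_TIMESTAMP WHERE uid = 42 AND tid = 7;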

Here is the execution time of this stored procedure on the same dataset
as the UPDATE above:
# EXPLAIN ANALYZE SELECT tracker_update2();
QUERY PLAN
------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=432153.101..432153.103 rows=1 loops=1)
 Total runtime: 432153.127 ms
(2 rows)

Hardly an improvement. Unfortunately, at this point I'm at a bit of a
loss. I cannot think of any approach that would be faster than the
original UPDATE method, yet even that method is far, far too slow: the
tracker sends a new buffer to Postgres every minute, so if each buffer
takes ~3 minutes to process, it will simply backlog itself into
oblivion, and I might as well not bother porting the code at all. I
don't mean to come down on Postgres, but it surprises me that, for all
the advanced features it has, MySQL utterly blows it out of the water in
this one respect. I'm absolutely willing to admit it's probably my fault
for approaching the problem the wrong way, which is why I'm seeking
help :)

Anyway, I admit I haven't done a great deal of configuration file tuning
for the Postgres setup, so if there is something obvious behind the poor
performance, I'll be happy to test out any proposed changes and report
back with the results. If anyone has *ANY* ideas as to why this is so
slow, or different approaches that may be faster than what I have tried,
please e-mail me back; I've struggled with this for weeks with
absolutely no results. If you need to see configuration files, sample
data, or anything else, don't hesitate to ask; I'll be happy to provide
as much as I can if it means there's a solution on the horizon.

For those of you that have actually read all of this, thank you for your
patience, I know it's a huge message.

Thanks much,

Steve Gerhardt
