David Mitchell wrote:
> We have the following function in our home grown mirroring package, but
> it isn't running as fast as we would like. We need to select statements
> from the pending_statement table, and we want to select all the
> statements for a single transaction (pending_trans) in one go (that is,
> we either select all the statements for a transaction, or none of them).
> We select as many blocks of statements as it takes to top the 100
> statement limit (so if the last transaction we pull has enough
> statements to put our count at 110, we'll still take it, but then we're
> done).
>
> Here is our function:
>
> CREATE OR REPLACE FUNCTION dbmirror.get_pending()
> RETURNS SETOF dbmirror.pending_statement AS
> $BODY$
>
> DECLARE
> count INT4;
> transaction RECORD;
> statement dbmirror.pending_statement;
> BEGIN
> count := 0;
>
> FOR transaction IN SELECT t.trans_id as ID
> FROM pending_trans AS t WHERE fetched = false
> ORDER BY trans_id LIMIT 50
> LOOP
> update pending_trans set fetched = true where trans_id =
> transaction.id;
>
> FOR statement IN SELECT s.id, s.transaction_id, s.table_name,
> s.op, s.data
> FROM dbmirror.pending_statement AS s
> WHERE s.transaction_id = transaction.id
> ORDER BY s.id ASC
> LOOP
> count := count + 1;
>
> RETURN NEXT statement;
> END LOOP;
>
> IF count > 100 THEN
> EXIT;
> END IF;
> END LOOP;
>
> RETURN;
> END;$BODY$
> LANGUAGE 'plpgsql' VOLATILE;
David,
I'm still a newbie and it may not affect performance but why are you
aliasing the tables? Can you not simply use
FOR transaction IN SELECT trans_id
FROM pending_trans
WHERE fetched = false
ORDER BY trans_id
LIMIT 50
and
FOR statement IN SELECT id,
transaction_id,
table_name,
op,
data
FROM dbmirror.pending_statement
WHERE pending_statement.transaction_id =
transaction.trans_id
ORDER BY pending_statement.id
I am pretty sure that the ORDER BY is slowing down both of these
queries. Since you are going to go through the whole table eventually
do you really need to sort the data at this point?
--
Kind Regards,
Keith