Thread: single transaction vs multiple transactions

single transaction vs multiple transactions

From
Sven Geisler
Date:
Hi,

I have to refactor a 'DELETE FROM x WHERE y IN (...)' because the IN
list got too many parameters => 'stack depth limit exceeded'.
I don't want to just increase the max_stack_depth parameter. It is
better to refactor because the number of arguments to IN may increase
in the future.

My approach is to do multiple 'DELETE FROM x WHERE y=...'.

My question now is: what is better for PostgreSQL from a performance
perspective?
1. all multiple deletes in one transaction
2. each delete in its own transaction

The number of arguments is around 10,000.

BTW: The arguments are generated in the application tier. I would have to
create a temporary table which I can use in 'DELETE FROM x WHERE y IN
(SELECT z FROM tmp)'.

Cheers
Sven

Re: single transaction vs multiple transactions

From
"Heikki Linnakangas"
Date:
Sven Geisler wrote:
> I have to refactor a 'DELETE FROM x WHERE y IN (...)' because the IN
> list got too many parameters => 'stack depth limit exceeded'.
> I don't want to just increase the max_stack_depth parameter. It is
> better to refactor because the number of arguments to IN may increase
> in the future.
>
> My approach is to do multiple 'DELETE FROM x WHERE y=...'.

You could also do something in between, issuing the deletes in batches
of, say, 100 each. But using a temporary table is much better.
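The in-between batching idea can be sketched in application code. This is a minimal illustration using the thread's table x and column y; the psycopg2-style %s placeholders are an assumption, and your driver's placeholder syntax may differ:

```python
def chunked(seq, size):
    """Yield successive slices of at most `size` items."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def batch_delete_statements(ids, batch_size=100):
    """Build one parameterized DELETE per batch instead of one giant
    IN-list (which is what triggers 'stack depth limit exceeded')."""
    stmts = []
    for batch in chunked(ids, batch_size):
        placeholders = ", ".join(["%s"] * len(batch))
        stmts.append((f"DELETE FROM x WHERE y IN ({placeholders})",
                      list(batch)))
    return stmts
```

Each statement would then be executed with its parameter batch, ideally all inside one transaction.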

> My question is now, what is better for PostgreSQL from a performance
> perspective?
> 1. all multiple deletes in one transaction
> 2. each delete in its own transaction

All in one transaction is definitely faster.

> The number of arguments is around 10,000.
>
> BTW: The arguments are generated in the application tier. I would have to
> create a temporary table which I can use in 'DELETE FROM x WHERE y IN
> (SELECT z FROM tmp)'.

I think that's exactly what you should do.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: single transaction vs multiple transactions

From
Sven Geisler
Date:
Hi Heikki

Heikki Linnakangas schrieb:
> Sven Geisler wrote:
>> I have to refactor a 'DELETE FROM x WHERE y IN (...)' because the IN
>> list got too many parameters => 'stack depth limit exceeded'.
>> I don't want to just increase the max_stack_depth parameter. It is
>> better to refactor because the number of arguments to IN may increase
>> in the future.
[...]
>>
>> BTW: The arguments are generated in the application tier. I would have to
>> create a temporary table which I can use in 'DELETE FROM x WHERE y IN
>> (SELECT z FROM tmp)'.
>
> I think that's exactly what you should do.

I have to insert my arguments into a temporary table first, because the
arguments are only known in the application tier.
Is doing multiple inserts into a temporary table plus a final 'DELETE
FROM x WHERE y IN (SELECT z FROM tmp)' faster than multiple deletes?

Sven.

Re: single transaction vs multiple transactions

From
"Heikki Linnakangas"
Date:
Sven Geisler wrote:
> I have to insert my arguments into a temporary table first, because the
> arguments are only known in the application tier.
> Is doing multiple inserts into a temporary table plus a final 'DELETE
> FROM x WHERE y IN (SELECT z FROM tmp)' faster than multiple deletes?

If the number of records is high, it most likely is faster. You should
try it with your data to be sure, but in general doing all the deletes
in one batch is faster when the number of records is high because it
allows using efficient merge joins or sequential scans.

Populating the temporary table with no indexes should be quite
inexpensive if you make sure you don't do it one record at a time. Use
the COPY command or batched inserts instead.
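A minimal end-to-end sketch of this approach, using Python's stdlib sqlite3 as a stand-in so the example runs without a PostgreSQL server (against PostgreSQL you would populate tmp with COPY, e.g. psycopg2's copy_from, rather than executemany; names x, y, tmp, z follow the thread):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (y INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO x (y, payload) VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(10)])

# Keys to delete, generated in the application tier.
ids_to_delete = [2, 3, 5, 7]

# 1. Create an index-less temporary table and populate it in one batch
#    (with PostgreSQL, use COPY here instead of executemany).
conn.execute("CREATE TEMP TABLE tmp (z INTEGER)")
conn.executemany("INSERT INTO tmp (z) VALUES (?)",
                 [(i,) for i in ids_to_delete])

# 2. One set-based DELETE instead of thousands of single-row DELETEs.
conn.execute("DELETE FROM x WHERE y IN (SELECT z FROM tmp)")
conn.commit()

remaining = [r[0] for r in conn.execute("SELECT y FROM x ORDER BY y")]
print(remaining)  # [0, 1, 4, 6, 8, 9]
```

Everything from the temp-table creation through the final DELETE runs on one connection in one transaction, matching the advice above.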


--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: single transaction vs multiple transactions

From
Sven Geisler
Date:
Hi,

Many thanks for the fast response.
I will use a temporary table with COPY.
Another issue I have is the connection between app and db.
I expect the COPY approach will also hold down the network I/O between
app and db, keeping in mind that I would otherwise produce 10,000+
statements.

Thx
Sven

Heikki Linnakangas schrieb:
> Sven Geisler wrote:
>> I have to insert my arguments into a temporary table first, because the
>> arguments are only known in the application tier.
>> Is doing multiple inserts into a temporary table plus a final 'DELETE
>> FROM x WHERE y IN (SELECT z FROM tmp)' faster than multiple deletes?
>
> If the number of records is high, it most likely is faster. You should
> try it with your data to be sure, but in general doing all the deletes
> in one batch is faster when the number of records is high because it
> allows using efficient merge joins or sequential scans.
>
> Populating the temporary table with no indexes should be quite
> inexpensive if you make sure you don't do it one record at a time. Use
> the COPY command or batched inserts instead.
>
>

Re: single transaction vs multiple transactions

From
Tom Lane
Date:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Sven Geisler wrote:
>> I have to refactor a 'DELETE FROM x WHERE y IN (...)' because the IN
>> list got too many parameters => 'stack depth limit exceeded'.
>> The number of arguments is around 10,000.
>> ...
>> BTW: The arguments are generated in the application tier. I would have to
>> create a temporary table which I can use in 'DELETE FROM x WHERE y IN
>> (SELECT z FROM tmp)'.

> I think that's exactly what you should do.

Also, if you're planning to update to 8.2 soon, the tradeoffs will
change completely.  8.2 should avoid the stack depth problem, and you
can get something closely approximating the plan you'd get for a join
against a temp table using VALUES:

regression=# explain select * from tenk1 where unique2 in (1,2,3,4,6,8);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Bitmap Heap Scan on tenk1  (cost=24.01..45.79 rows=6 width=244)
   Recheck Cond: (unique2 = ANY ('{1,2,3,4,6,8}'::integer[]))
   ->  Bitmap Index Scan on tenk1_unique2  (cost=0.00..24.01 rows=6 width=0)
         Index Cond: (unique2 = ANY ('{1,2,3,4,6,8}'::integer[]))
(4 rows)

regression=# explain select * from tenk1 where unique2 in (values(1),(2),(3),(4),(6),(8));
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Nested Loop  (cost=4.10..48.34 rows=6 width=244)
   ->  HashAggregate  (cost=0.09..0.15 rows=6 width=4)
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.08 rows=6 width=4)
   ->  Bitmap Heap Scan on tenk1  (cost=4.01..8.02 rows=1 width=244)
         Recheck Cond: (tenk1.unique2 = "*VALUES*".column1)
         ->  Bitmap Index Scan on tenk1_unique2  (cost=0.00..4.01 rows=1 width=0)
               Index Cond: (tenk1.unique2 = "*VALUES*".column1)
(7 rows)
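On 8.2 the application could therefore build the VALUES form directly instead of staging a temp table. A sketch of the statement construction (hypothetical helper; psycopg2-style %s placeholders are an assumption):

```python
def delete_with_values(ids):
    """Build 'DELETE FROM x WHERE y IN (VALUES (%s), (%s), ...)',
    which 8.2 plans much like a join against a transient table."""
    values = ", ".join(["(%s)"] * len(ids))
    return f"DELETE FROM x WHERE y IN (VALUES {values})", list(ids)

sql, params = delete_with_values([1, 2, 3])
print(sql)  # DELETE FROM x WHERE y IN (VALUES (%s), (%s), (%s))
```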


            regards, tom lane

Re: single transaction vs multiple transactions

From
"Jens Schipkowski"
Date:
Hi,

PostgreSQL offers some proprietary SQL parameters and commands which
easily solve such problems.
If you are sure PostgreSQL is the DB for your app forever ;), you could
use these parameters and commands.

Here is a possible solution to your problem:
DELETE FROM x USING tmp WHERE x.y = tmp.z;

Please read the extensive PostgreSQL documentation first, before
posting.

CU,
Jens

On Tue, 05 Dec 2006 16:26:55 +0100, Sven Geisler <sgeisler@aeccom.com>
wrote:

> Hi,
>
> I have to refactor a 'DELETE FROM x WHERE y IN (...)' because the IN
> list got too many parameters => 'stack depth limit exceeded'.
> I don't want to just increase the max_stack_depth parameter. It is
> better to refactor because the number of arguments to IN may increase
> in the future.
>
> My approach is to do multiple 'DELETE FROM x WHERE y=...'.
>
> My question is now, what is better for PostgreSQL from a performance
> perspective?
> 1. all multiple deletes in one transaction
> 2. each delete in its own transaction
>
> The number of arguments is around 10,000.
>
> BTW: The arguments are generated in the application tier. I would have to
> create a temporary table which I can use in 'DELETE FROM x WHERE y IN
> (SELECT z FROM tmp)'.
>
> Cheers
> Sven
>



--
**
Dipl.-Designer Jens Schipkowski
APUS Software GmbH