Re: Delete query takes exorbitant amount of time - Mailing list pgsql-performance

From Simon Riggs
Subject Re: Delete query takes exorbitant amount of time
Date
Msg-id 1112044045.11750.919.camel@localhost.localdomain
Whole thread Raw
In response to Delete query takes exorbitant amount of time  (Karim Nassar <Karim.Nassar@acm.org>)
Responses Re: Delete query takes exorbitant amount of time
List pgsql-performance
On Mon, 2005-03-28 at 13:03 -0700, Karim A Nassar wrote:
> > Well, based upon the evidence so far, the Optimizer got it right:
>
> Agreed. So, this means that the answer to my original question is "that
> delete gonna take a long time"?
>
> Seems that there is still something wrong. From what I can tell from
> everyones questions, the FK constraint on measurement is causing multiple
> seq scans for each value deleted from int_sensor_meas_type. However, when
> deleting a single value, the FK check should use the index, so my ~190
> deletes *should* be fast, no?

No.

> > IndexScan, value=1    elapsed= 29ms    cost=883881
>
> 190 * 29ms is much less than 40 minutes. What am I missing here?

It all depends upon your data.

There are *no* values in *your* table that take 29ms to delete...

> > Karim,
> > Please do:
> >
> > select id_int_sensor_meas_type, count(*)
> > from measurement
> > group by id_int_sensor_meas_type
> > order by count(*) desc;
>
> id_int_sensor_meas_type | count
> -------------------------+--------
>                       31 | 509478
>                       30 | 509478
>                      206 | 509478
>                      205 | 509478
>                      204 | 509478
>                       40 | 509478
>                       39 | 509478
>                      197 | 509478
>                       35 | 509478
>                       34 | 509478
>                       33 | 509478
>                       32 | 509478
>                       41 | 509477
>
> This sample dataset has 13 measurements from a weather station over 3
> years, hence the even distribution.

Each value has 1/13th of the table, which is too many rows per value to
make an IndexScan an efficient way of deleting rows from the table.

Thats it.

If you have more values when measurement is bigger, the delete will
eventually switch plans (if you reconnect) and use the index. But not
yet.

There's a few ways to (re)design around it, but the distribution of your
data is not *currently* conducive to the using an index.

Best Regards, Simon Riggs


pgsql-performance by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: Preventing query from hogging server
Next
From: Greg Stark
Date:
Subject: Re: which dual-CPU hardware/OS is fastest for PostgreSQL?