Thread: Delete query takes an exorbitant amount of time
v8.0.1 on a Sun v20Z running Gentoo Linux, 1 CPU, 1 GB RAM, one 10k SCSI disk.

I have a (fairly) newly rebuilt database. In the last month it has undergone
extensive testing, hence thousands of inserts and deletes in the table in
question. After each mass unload/load cycle, I vacuum full analyze verbose.

I tried to build a test case to isolate the issue, but the problem does not
manifest itself, so I think I have somehow made postgresql angry. I could
drop the whole db and start over, but I am interested in not reproducing
this issue.

Here is the statement:

orfs=# explain analyze DELETE FROM int_sensor_meas_type WHERE
id_meas_type IN (SELECT * FROM meas_type_ids);
                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=11.53..42.06 rows=200 width=6) (actual time=1.564..2.840 rows=552 loops=1)
   Hash Cond: ("outer".id_meas_type = "inner".id_meas_type)
   ->  Seq Scan on int_sensor_meas_type  (cost=0.00..25.36 rows=636 width=10) (actual time=0.005..0.828 rows=748 loops=1)
   ->  Hash  (cost=11.03..11.03 rows=200 width=4) (actual time=1.131..1.131 rows=0 loops=1)
         ->  HashAggregate  (cost=11.03..11.03 rows=200 width=4) (actual time=0.584..0.826 rows=552 loops=1)
               ->  Seq Scan on meas_type_ids  (cost=0.00..9.42 rows=642 width=4) (actual time=0.002..0.231 rows=552 loops=1)
 Total runtime: 2499616.216 ms
(7 rows)

Yes, that's *40 minutes*. It drives CPU (as viewed in top) to 99%+ for the
entire duration of the query, but %mem hangs at 1% or lower. meas_type_ids
is a temp table with the ids I want to nuke.

Here is a similar query behaving as expected:

orfs=# explain analyze DELETE FROM int_station_sensor WHERE
id_sensor IN (SELECT * FROM sensor_ids);
                                                         QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=4.18..21.13 rows=272 width=6) (actual time=0.479..0.847 rows=169 loops=1)
   Hash Cond: ("outer".id_sensor = "inner".id_sensor)
   ->  Seq Scan on int_station_sensor  (cost=0.00..11.49 rows=549 width=10) (actual time=0.007..0.265 rows=267 loops=1)
   ->  Hash  (cost=3.68..3.68 rows=200 width=4) (actual time=0.325..0.325 rows=0 loops=1)
         ->  HashAggregate  (cost=3.68..3.68 rows=200 width=4) (actual time=0.177..0.256 rows=169 loops=1)
               ->  Seq Scan on sensor_ids  (cost=0.00..3.14 rows=214 width=4) (actual time=0.003..0.057 rows=169 loops=1)
 Total runtime: 1.340 ms
(7 rows)

I have posted my tables, data and test cases here:
http://ccl.cens.nau.edu/~kan4/testing/long-delete

Where do I go from here?

Thanks in advance,
--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
Karim Nassar <Karim.Nassar@acm.org> writes:
> Here is the statement:

> orfs=# explain analyze DELETE FROM int_sensor_meas_type WHERE
> id_meas_type IN (SELECT * FROM meas_type_ids);
>                           QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=11.53..42.06 rows=200 width=6) (actual
> time=1.564..2.840 rows=552 loops=1)
> ...
>  Total runtime: 2499616.216 ms
> (7 rows)

Notice that the actual join is taking 2.8 ms. The other ~40 minutes is
in operations that we cannot see in this plan, but we can surmise are ON
DELETE triggers.

> Where do I go from here?

Look at what your triggers are doing. My private bet is that you have
unindexed foreign keys referencing this table, and so each deletion
forces a seqscan of some other, evidently very large, table(s).

			regards, tom lane
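One rough way to hunt for such keys is a catalog query along these lines (a
sketch, not from the thread: it only handles single-column foreign keys, and
only checks whether some index on the referencing table leads with the
referencing column):

-- List single-column FKs whose referencing column has no index on it.
SELECT c.conrelid::regclass AS referencing_table,
       c.conname            AS constraint_name
  FROM pg_constraint c
 WHERE c.contype = 'f'
   AND array_upper(c.conkey, 1) = 1        -- single-column keys only
   AND NOT EXISTS (SELECT 1
                     FROM pg_index i
                    WHERE i.indrelid = c.conrelid
                      AND i.indkey[0] = c.conkey[1]);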
Tom,

I've got a similar problem with deletes taking a very long time. I know
that there are lots of foreign keys referencing this table, and other
foreign keys referencing those tables, etc. I've been curious, is there
a way to find out how long the foreign key checks take for each
dependent table?

-- Mark Lewis

On Thu, 2005-03-24 at 16:52, Tom Lane wrote:
> Karim Nassar <Karim.Nassar@acm.org> writes:
> > Here is the statement:
>
> > orfs=# explain analyze DELETE FROM int_sensor_meas_type WHERE
> > id_meas_type IN (SELECT * FROM meas_type_ids);
> >                           QUERY PLAN
> > -----------------------------------------------------------------------------------------------------------------------------
> >  Hash Join  (cost=11.53..42.06 rows=200 width=6) (actual
> > time=1.564..2.840 rows=552 loops=1)
> > ...
> >  Total runtime: 2499616.216 ms
> > (7 rows)
>
> Notice that the actual join is taking 2.8 ms. The other ~40 minutes is
> in operations that we cannot see in this plan, but we can surmise are ON
> DELETE triggers.
>
> > Where do I go from here?
>
> Look at what your triggers are doing. My private bet is that you have
> unindexed foreign keys referencing this table, and so each deletion
> forces a seqscan of some other, evidently very large, table(s).
>
> 			regards, tom lane
On Thu, 2005-03-24 at 19:52 -0500, Tom Lane wrote:
> Notice that the actual join is taking 2.8 ms. The other ~40 minutes is
> in operations that we cannot see in this plan, but we can surmise are ON
> DELETE triggers.

There are no DELETE triggers (that I have created).

> > Where do I go from here?
>
> Look at what your triggers are doing. My private bet is that you have
> unindexed foreign keys referencing this table, and so each deletion
> forces a seqscan of some other, evidently very large, table(s).

Almost. I have a large table (6.3 million rows) with a foreign key
reference to this one (which has 749 rows); however, it is indexed. I
deleted the FK, ran the delete, then recreated the foreign key, all in
about 15 seconds. Thanks!

Problem now is: I expect this referencing table to grow to about 110
million rows in the next 2 months, then by 4 million rows per month
thereafter. I expect that the time for recreating the foreign key will
grow linearly with size.

Is this just the kind of thing I need to watch out for? Any other
suggestions for dealing with tables of this size? What can I do to my
indexes to make them mo' betta?
--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
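The whole cycle can be wrapped in a single transaction so that no committed
state ever lacks the constraint - a sketch using the DELETE from the
original post and the constraint name that appears later in the thread:

BEGIN;
-- Drop the FK so the delete doesn't trigger per-row checks:
ALTER TABLE measurement
    DROP CONSTRAINT measurement_id_int_sensor_meas_type_fkey;
DELETE FROM int_sensor_meas_type
 WHERE id_meas_type IN (SELECT * FROM meas_type_ids);
-- Re-adding revalidates every referencing row in one pass; this is the
-- step whose cost grows with the size of the referencing table:
ALTER TABLE measurement
    ADD CONSTRAINT measurement_id_int_sensor_meas_type_fkey
    FOREIGN KEY (id_int_sensor_meas_type)
    REFERENCES int_sensor_meas_type (id_int_sensor_meas_type);
COMMIT;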
Karim Nassar <karim.nassar@NAU.EDU> writes:
>> Look at what your triggers are doing. My private bet is that you have
>> unindexed foreign keys referencing this table, and so each deletion
>> forces a seqscan of some other, evidently very large, table(s).

> Almost. I have a large table (6.3 million rows) with a foreign key
> reference to this one (which has 749 rows); however, it is indexed.

In that case there's a datatype mismatch between the referencing and
referenced columns, which prevents the index from being used for the
FK check.

			regards, tom lane
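For anyone tripping over this, a sketch of how such a mismatch arises and
one way to repair it (the tables here are hypothetical; the
numeric-versus-integer pairing is the "comparable but not indexable" case
that comes up later in the thread, and ALTER COLUMN ... TYPE is new in 8.0):

CREATE TABLE parent (id integer PRIMARY KEY);
-- The types are comparable, so the FK is accepted, but an int4 index
-- can't be used to probe with a numeric value:
CREATE TABLE child  (parent_id numeric REFERENCES parent (id));

-- Make the two sides match so indexes are usable for the RI checks:
ALTER TABLE child ALTER COLUMN parent_id TYPE integer;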
Mark Lewis <mark.lewis@mir3.com> writes:
> I've got a similar problem with deletes taking a very long time. I know
> that there are lots of foreign keys referencing this table, and other
> foreign keys referencing those tables, etc. I've been curious, is there
> a way to find out how long the foreign key checks take for each
> dependent table?

There is not any easy way at the moment.

Hmm ... I wonder how hard it would be to teach EXPLAIN ANALYZE to show
the runtime expended in each trigger when the statement is of a kind
that has triggers. We couldn't break down the time *within* the
triggers, but even this info would help a lot in terms of finger
pointing ...

	Seq Scan on ... (nn.nnn ms)
	Trigger foo: nn.mmm ms
	Trigger bar: nn.mmm ms
	Total time: nn.mmm ms

			regards, tom lane
Watch your pg_stat_* views before and after the delete and check which
related tables have had lots of seqscans.

Chris

Mark Lewis wrote:
> Tom,
>
> I've got a similar problem with deletes taking a very long time. I know
> that there are lots of foreign keys referencing this table, and other
> foreign keys referencing those tables, etc. I've been curious, is there
> a way to find out how long the foreign key checks take for each
> dependent table?
>
> -- Mark Lewis
>
> On Thu, 2005-03-24 at 16:52, Tom Lane wrote:
>> Look at what your triggers are doing. My private bet is that you have
>> unindexed foreign keys referencing this table, and so each deletion
>> forces a seqscan of some other, evidently very large, table(s).
>>
>> 			regards, tom lane
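Concretely, something along these lines works (a sketch; it assumes the
statistics collector and row-level stats are enabled so the per-table
counters are populated):

-- Snapshot the counters, run the DELETE, then re-run and compare: a table
-- whose seq_scan count jumps by one per deleted row is the culprit.
SELECT relname, seq_scan, seq_tup_read, idx_scan
  FROM pg_stat_user_tables
 ORDER BY seq_scan DESC;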
> In that case there's a datatype mismatch between the referencing and
> referenced columns, which prevents the index from being used for the
> FK check.

Is creating such a foreign key a WARNING yet?

Chris
> Hmm ... I wonder how hard it would be to teach EXPLAIN ANALYZE to show
> the runtime expended in each trigger when the statement is of a kind
> that has triggers. We couldn't break down the time *within* the
> triggers, but even this info would help a lot in terms of finger
> pointing ...
>
> 	Seq Scan on ... (nn.nnn ms)
> 	Trigger foo: nn.mmm ms
> 	Trigger bar: nn.mmm ms
> 	Total time: nn.mmm ms

That would be really cool...
On Mar 24, 2005, at 10:38 PM, Christopher Kings-Lynne wrote:
>> In that case there's a datatype mismatch between the referencing and
>> referenced columns, which prevents the index from being used for the
>> FK check.
>
> Is creating such a foreign key a WARNING yet?

I recall getting such a warning when importing my schema from a 7.4 to
an 8.0 server. I had one table with char and the other with varchar.
Karim,

> Problem now is: I expect this referencing table to grow to about 110
> million rows in the next 2 months, then by 4 million rows per month
> thereafter. I expect that the time for recreating the foreign key will
> grow linearly with size.
>
> Is this just the kind of thing I need to watch out for? Any other
> suggestions for dealing with tables of this size? What can I do to my
> indexes to make them mo' betta?

How about getting some decent disk support? A single 10K SCSI disk is a
bit sub-par for a database with 100's of millions of records. Too bad you
didn't get a v40z ...

Beyond that, you'll want to do the same thing whenever you purge the
referencing table; drop keys, delete, re-create keys. Or think about why
it is you need to delete batches of records from this FKed table at all.

--
Josh Berkus
Aglio Database Solutions
San Francisco
On Thu, 24 Mar 2005, Tom Lane wrote:
> There is not any easy way at the moment.
>
> Hmm ... I wonder how hard it would be to teach EXPLAIN ANALYZE to show
> the runtime expended in each trigger when the statement is of a kind
> that has triggers. We couldn't break down the time *within* the
> triggers, but even this info would help a lot in terms of finger
> pointing ...
>
> 	Seq Scan on ... (nn.nnn ms)
> 	Trigger foo: nn.mmm ms
> 	Trigger bar: nn.mmm ms
> 	Total time: nn.mmm ms

and if you add

	Index foo_idx: nn.mm ss
	Heap foo_tbl: nn.mm ss

Regards,
	Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> Hmm ... I wonder how hard it would be to teach EXPLAIN ANALYZE to show
>> the runtime expended in each trigger when the statement is of a kind
>> that has triggers.

> Could SPI "know" that an explain analyze is being run and add their
> output and timings to the output?

If it did, we'd be double-counting the time.

			regards, tom lane
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> In that case there's a datatype mismatch between the referencing and
>> referenced columns, which prevents the index from being used for the
>> FK check.

> Is creating such a foreign key a WARNING yet?

I believe so as of 8.0. It's a bit tricky since 8.0 does allow some
cross-type cases to be indexed, but IIRC we have a test that understands
about that...

			regards, tom lane
On Fri, 2005-03-25 at 01:58 -0500, Tom Lane wrote:
> > Is creating such a foreign key a WARNING yet?
>
> I believe so as of 8.0. It's a bit tricky since 8.0 does allow some
> cross-type cases to be indexed, but IIRC we have a test that understands
> about that...

src/backend/commands/tablecmds.c, line 3966 in CVS tip:

	/*
	 * Check that the found operator is compatible with the PK index,
	 * and generate a warning if not, since otherwise costly seqscans
	 * will be incurred to check FK validity.
	 */
	if (!op_in_opclass(oprid(o), opclasses[i]))
		ereport(WARNING,
				(errmsg("foreign key constraint \"%s\" "
						"will require costly sequential scans",
						fkconstraint->constr_name),
				 errdetail("Key columns \"%s\" and \"%s\" "
						   "are of different types: %s and %s.",
						   strVal(list_nth(fkconstraint->fk_attrs, i)),
						   strVal(list_nth(fkconstraint->pk_attrs, i)),
						   format_type_be(fktypoid[i]),
						   format_type_be(pktypoid[i]))));

So, yes to the WARNING. Not sure about the cross-type cases...

Karim: Did this happen? If not, can you drop and re-create the constraint
and confirm that you get the WARNING? If not, we have problems.

I vote to make this an ERROR in 8.1 - I see little benefit in allowing
this situation to continue. If users do create a FK like this, it just
becomes another performance problem on list...

Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> I vote to make this an ERROR in 8.1 - I see little benefit in allowing
> this situation to continue.

Other than spec compliance, you mean? SQL99 says

	... The declared type of each referencing column shall be
	comparable to the declared type of the corresponding referenced
	column.

It doesn't say that it has to be indexable, and most definitely not that
there has to be an index.

			regards, tom lane
On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I vote to make this an ERROR in 8.1 - I see little benefit in allowing
> > this situation to continue.
>
> Other than spec compliance, you mean? SQL99 says
>
> 	... The declared type of each referencing column shall be
> 	comparable to the declared type of the corresponding referenced
> 	column.
>
> It doesn't say that it has to be indexable, and most definitely not that
> there has to be an index.

Specs at dawn, eh?

Well, SQL:2003 Foundation, p.550 clause 3a) states that the
<reference columns> in the referencing table must match a unique
constraint on the referenced table, or the PRIMARY KEY if the columns are
not specified. Either way, the referenced columns are a unique constraint
(which makes perfect sense from a logical data perspective).

We implement unique constraints via an index, so for PostgreSQL the
clause implies that it must refer to an index.

Touche, Monsieur Lane, and Happy Easter :-)

But even without that, there is little benefit in allowing it...
WARNING -> ERROR, please.

Best Regards, Simon Riggs
On Fri, 25 Mar 2005, Simon Riggs wrote:
> On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
> > Other than spec compliance, you mean? SQL99 says
> >
> > 	... The declared type of each referencing column shall be
> > 	comparable to the declared type of the corresponding referenced
> > 	column.
> >
> > It doesn't say that it has to be indexable, and most definitely not
> > that there has to be an index.
>
> Well, SQL:2003 Foundation, p.550 clause 3a) states that the
> <reference columns> in the referencing table must match a unique
> constraint on the referenced table, or the PRIMARY KEY if the columns
> are not specified. Either way, the referenced columns are a unique
> constraint (which makes perfect sense from a logical data perspective).
>
> We implement unique constraints via an index, so for PostgreSQL the
> clause implies that it must refer to an index.

IMHO, that reference is irrelevant. Yes, there must be an index due to
our implementation; however, that doesn't imply that the types must be
the same, nor even that the index must be usable for the cross-table
comparison.
Karim,

> I guess I was wondering if there is other general tuning advice for such
> large table indexes, such as increasing statistics, etc.

Well, your index use problem is being explained by Tom, Stephan and
Simon; basically your FKed data types are incompatible for index use
purposes, so the system *can't* use an index while loading.

If you're going with the drop/load/recreate option, then I'd suggest
increasing work_mem for the duration. Hmmm ... or maintenance_work_mem?
What gets used for FK checks? Simon?

--
Josh Berkus
Aglio Database Solutions
San Francisco
On Fri, 2005-03-25 at 08:23 -0800, Stephan Szabo wrote:
> On Fri, 25 Mar 2005, Simon Riggs wrote:
> > Well, SQL:2003 Foundation, p.550 clause 3a) states that the
> > <reference columns> in the referencing table must match a unique
> > constraint on the referenced table, or the PRIMARY KEY if the columns
> > are not specified. Either way, the referenced columns are a unique
> > constraint (which makes perfect sense from a logical data
> > perspective).
> >
> > We implement unique constraints via an index, so for PostgreSQL the
> > clause implies that it must refer to an index.
>
> IMHO, that reference is irrelevant.

Tom had said SQL99 required this; I have pointed out that SQL:2003, which
supersedes the SQL99 standard, does not require this.

Leading us back to my original point - what is the benefit of continuing
with having a WARNING when that leads people into trouble later?

> Yes, there must be an index due to our implementation; however, that
> doesn't imply that the types must be the same

No, it doesn't imply it, but what benefit do you see from the
interpretation that they are allowed to differ? That interpretation
currently leads to many mistakes and to poor performance.

There is clear benefit from forcing them to be the same. In logical data
terms, they *should* be the same. I don't check fruit.apple_grade against
fruit_type.orange_grade. When would I want to make a check of that
nature? If there is a reason, that's great - let's keep the status quo
then.

I respect the effort and thought that has already gone into the
implementation; I seek only to offer a very minor improvement based upon
recent list issues.

> nor even that the index must be usable for the cross-table comparison.

That's a separate discussion, possibly the next one.

Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
>>> Other than spec compliance, you mean? SQL99 says
>>>
>>> ... The declared type of each referencing column shall be
>>> comparable to the declared type of the corresponding referenced
>>> column.

> Tom had said SQL99 required this; I have pointed out that SQL:2003,
> which supersedes the SQL99 standard, does not require this.

You're reading the wrong part of SQL:2003. 11.8 <referential constraint
definition> syntax rule 9 still has the text I quoted.

> Leading us back to my original point - what is the benefit of continuing
> with having a WARNING when that leads people into trouble later?

Accepting spec-compliant schemas.

			regards, tom lane
On Fri, 2005-03-25 at 13:47 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Tom had said SQL99 required this; I have pointed out that SQL:2003,
> > which supersedes the SQL99 standard, does not require this.
>
> You're reading the wrong part of SQL:2003. 11.8 <referential constraint
> definition> syntax rule 9 still has the text I quoted.

So, we have this from SQL:2003 section 11.8 p.550:
- 3a) requires us to have an index
- 9) requires the data types to be "comparable"

In the name of spec-compliance we wish to accept an interpretation of the
word "comparable" that means we will accept two datatypes that are not
actually the same.

So we are happy to enforce having the index, but not happy to ensure the
index is actually usable for the task?

> > Leading us back to my original point - what is the benefit of
> > continuing with having a WARNING when that leads people into trouble
> > later?
>
> Accepting spec-compliant schemas.

I definitely want this too - as you know, I have worked on documenting
compliance previously.

Is the word "comparable" defined elsewhere in the standard? Currently,
datatypes with similar type categories are comparable and yet (in 8.0)
will not use the index. So, we are taking comparable to include fairly
radically different datatypes?

Could it be that, because PostgreSQL has a very highly developed sense of
datatype comparison, we might be taking this to extremes? Would any other
RDBMS consider two different datatypes to be comparable?

Please consider this.

Best Regards, Simon Riggs
On Fri, 25 Mar 2005, Simon Riggs wrote:
> So, we have this from SQL:2003 section 11.8 p.550:
> - 3a) requires us to have an index
> - 9) requires the data types to be "comparable"
>
> In the name of spec-compliance we wish to accept an interpretation of
> the word "comparable" that means we will accept two datatypes that are
> not actually the same.
>
> So we are happy to enforce having the index, but not happy to ensure the
> index is actually usable for the task?

The index's "usability" only applies to the purpose of guaranteeing
uniqueness, which doesn't depend on the referencing type AFAICS.

> > > Leading us back to my original point - what is the benefit of
> > > continuing with having a WARNING when that leads people into
> > > trouble later?
> >
> > Accepting spec-compliant schemas.
>
> Is the word "comparable" defined elsewhere in the standard?

Yes. And at least in SQL99, there's a bunch of statements in 4.* about
what is comparable.

> Currently, datatypes with similar type categories are comparable and
> yet (in 8.0) will not use the index. So, we are taking comparable to
> include fairly radically different datatypes?

Not entirely. I believe a referenced column of int and a referencing
column of numeric currently displays that warning, but appears to be
allowed by the spec (as the numeric types are considered mutually
comparable).

> Could it be that, because PostgreSQL has a very highly developed sense
> of datatype comparison, we might be taking this to extremes? Would any
> other RDBMS consider two different datatypes to be comparable?

We do have a broader "comparable" than the spec. However, if we were to
limit it to the spec, then many of the implicit casts and cross-type
comparison operators we have would be invalid as well, since the
comparison between those types would have to fail, unless we treat the
"comparable" used by <comparison predicate> differently.
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Fri, 25 Mar 2005, Simon Riggs wrote:
>> Could it be that, because PostgreSQL has a very highly developed sense
>> of datatype comparison, we might be taking this to extremes? Would any
>> other RDBMS consider two different datatypes to be comparable?

> We do have a broader "comparable" than the spec.

However, the set of comparisons that we can presently support *with
indexes* is narrower than the spec, so rejecting nonindexable cases would
be a problem.

It's worth noting also that the test being discussed checks whether the
PK index is usable for testing the RI constraint. In the problem that
started this thread, the difficulty is lack of a usable index on the FK
column, not the PK (because that's the table that has to be searched to
do a delete in the PK table). We cannot enforce that there be a usable
index on the FK column (since indexes on the FK table may not have been
built yet when the constraint is declared), and shouldn't anyway because
there are reasonable usage patterns where you don't need one.

			regards, tom lane
On Fri, 2005-03-25 at 15:10 +0000, Simon Riggs wrote:
> Karim: Did this happen? If not, can you drop and re-create the
> constraint and confirm that you get the WARNING? If not, we have
> problems.

No. Nor do I think that I should. SERIAL is a shortcut for INTEGER, no? I
think there is some other (TBD) problem causing my big seq scan.

orfs=# ALTER TABLE measurement DROP CONSTRAINT measurement_id_int_sensor_meas_type_fkey;
ALTER TABLE
orfs=# ALTER TABLE ONLY measurement ADD CONSTRAINT measurement_id_int_sensor_meas_type_fkey
orfs-#     FOREIGN KEY (id_int_sensor_meas_type) REFERENCES int_sensor_meas_type(id_int_sensor_meas_type);
ALTER TABLE
orfs=#

The ADD CONSTRAINT statement comes directly from a pg_dump. For clarity,
the tables/indexes were created as follows:

CREATE TABLE int_sensor_meas_type(
    id_int_sensor_meas_type SERIAL PRIMARY KEY,
    id_sensor integer NOT NULL REFERENCES sensor,
    id_meas_type integer NOT NULL REFERENCES meas_type UNIQUE);

CREATE TABLE measurement (
    id_measurement SERIAL PRIMARY KEY,
    id_int_sensor_meas_type integer NOT NULL REFERENCES int_sensor_meas_type,
    datetime timestamp WITH TIME ZONE NOT NULL,
    value numeric(15,5) NOT NULL,
    created timestamp with time zone NOT NULL DEFAULT now(),
    created_by TEXT NOT NULL REFERENCES public.person(id_person));

CREATE INDEX measurement__id_int_sensor_meas_type_idx
    ON measurement(id_int_sensor_meas_type);

Regards,
--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
On Fri, 2005-03-25 at 16:25 -0500, Tom Lane wrote:
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > We do have a broader "comparable" than the spec.
>
> However, the set of comparisons that we can presently support *with
> indexes* is narrower than the spec, so rejecting nonindexable cases
> would be a problem.

OK. Can we have a TODO item then?

* Ensure that all SQL:2003 comparable datatypes are also indexable when
  compared

...or something like that.

> It's worth noting also that the test being discussed checks whether the
> PK index is usable for testing the RI constraint. In the problem that
> started this thread, the difficulty is lack of a usable index on the FK
> column, not the PK (because that's the table that has to be searched to
> do a delete in the PK table). We cannot enforce that there be a usable
> index on the FK column (since indexes on the FK table may not have been
> built yet when the constraint is declared), and shouldn't anyway
> because there are reasonable usage patterns where you don't need one.

Yes, I agree that for CASCADE we wouldn't always want an index. Alright
then, time to leave it there.

I want to write up some additional comments for performance tips (a
sketch of both follows below):
- Consider defining RI constraints after tables have been loaded
- Remember to add an index on the referencing table if the constraint is
  defined as CASCADEing

Have a good Easter, all, wherever you are and whatever you believe in.

Best Regards, Simon Riggs
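Putting those two tips together (a sketch, not from the thread itself: the
COPY file path is hypothetical, with table names as used earlier):

-- 1. Load the bulk data with no FK constraint in the way:
COPY measurement FROM '/tmp/measurement.dat';

-- 2. Index the referencing column, since deletes on the referenced table
--    (and any CASCADE actions) must search it:
CREATE INDEX measurement__id_int_sensor_meas_type_idx
    ON measurement (id_int_sensor_meas_type);

-- 3. Attach the constraint last; its one-time validation is a single pass,
--    governed in 8.0 by maintenance_work_mem (see Simon's note later on):
ALTER TABLE measurement
    ADD FOREIGN KEY (id_int_sensor_meas_type)
    REFERENCES int_sensor_meas_type (id_int_sensor_meas_type);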
> There is clear benefit from forcing them to be the same. In logical data
> terms, they *should* be the same. I don't check fruit.apple_grade
> against fruit_type.orange_grade. When would I want to make a check of
> that nature? If there is a reason, that's great - let's keep the status
> quo then.
>
> I respect the effort and thought that has already gone into the
> implementation; I seek only to offer a very minor improvement based upon
> recent list issues.

The main problem would be people getting errors when upgrading their
databases, or restoring from a backup, say.

Chris
On Fri, 25 Mar 2005, Karim Nassar wrote:
> No. Nor do I think that I should. SERIAL is a shortcut for INTEGER, no?
> I think there is some other (TBD) problem causing my big seq scan.
>
> orfs=# ALTER TABLE measurement DROP CONSTRAINT measurement_id_int_sensor_meas_type_fkey;
> ALTER TABLE
> orfs=# ALTER TABLE ONLY measurement ADD CONSTRAINT measurement_id_int_sensor_meas_type_fkey
> orfs-#     FOREIGN KEY (id_int_sensor_meas_type) REFERENCES int_sensor_meas_type(id_int_sensor_meas_type);
> ALTER TABLE
> orfs=#
>
> The ADD CONSTRAINT statement comes directly from a pg_dump. For clarity,
> the tables/indexes were created as follows:
>
> CREATE TABLE int_sensor_meas_type(
>     id_int_sensor_meas_type SERIAL PRIMARY KEY,
>     id_sensor integer NOT NULL REFERENCES sensor,
>     id_meas_type integer NOT NULL REFERENCES meas_type UNIQUE);
>
> CREATE TABLE measurement (
>     id_measurement SERIAL PRIMARY KEY,
>     id_int_sensor_meas_type integer NOT NULL REFERENCES int_sensor_meas_type,
>     datetime timestamp WITH TIME ZONE NOT NULL,
>     value numeric(15,5) NOT NULL,
>     created timestamp with time zone NOT NULL DEFAULT now(),
>     created_by TEXT NOT NULL REFERENCES public.person(id_person));
>
> CREATE INDEX measurement__id_int_sensor_meas_type_idx
>     ON measurement(id_int_sensor_meas_type);

That seems like it should be okay. Hmm, what does something like:

PREPARE test(int) AS SELECT 1 from measurement where
    id_int_sensor_meas_type = $1 FOR UPDATE;
EXPLAIN ANALYZE EXECUTE test(1);

give you as the plan?
On Sat, 2005-03-26 at 07:55 -0800, Stephan Szabo wrote:
> That seems like it should be okay. Hmm, what does something like:
>
> PREPARE test(int) AS SELECT 1 from measurement where
>     id_int_sensor_meas_type = $1 FOR UPDATE;
> EXPLAIN ANALYZE EXECUTE test(1);
>
> give you as the plan?

                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on measurement  (cost=0.00..164559.16 rows=509478 width=6) (actual time=11608.402..11608.402 rows=0 loops=1)
   Filter: (id_int_sensor_meas_type = $1)
 Total runtime: 11608.441 ms
(3 rows)

--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
On Sat, 26 Mar 2005, Karim Nassar wrote:
>                                                        QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------
>  Seq Scan on measurement  (cost=0.00..164559.16 rows=509478 width=6) (actual time=11608.402..11608.402 rows=0 loops=1)
>    Filter: (id_int_sensor_meas_type = $1)
>  Total runtime: 11608.441 ms
> (3 rows)

Hmm, has measurement been analyzed recently? You might want to see if
raising the statistics target on measurement.id_int_sensor_meas_type and
reanalyzing changes the estimated rows down from 500k.
On Sat, 2005-03-26 at 15:18 -0800, Stephan Szabo wrote:
> Hmm, has measurement been analyzed recently? You might want to see if
> raising the statistics target on measurement.id_int_sensor_meas_type
> and reanalyzing changes the estimated rows down from 500k.

orfs=# ALTER TABLE measurement ALTER COLUMN id_int_sensor_meas_type SET STATISTICS 1000;
ALTER TABLE
orfs=# VACUUM FULL ANALYZE VERBOSE;
<snip>
INFO:  free space map: 52 relations, 13501 pages stored; 9760 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 300000 pages = 1864 kB shared memory.
VACUUM
orfs=# PREPARE test(int) AS SELECT 1 from measurement where
orfs-#     id_int_sensor_meas_type = $1 FOR UPDATE;
PREPARE
orfs=# EXPLAIN ANALYZE EXECUTE test(1);
                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on measurement  (cost=0.00..164559.16 rows=509478 width=6) (actual time=8948.452..8948.452 rows=0 loops=1)
   Filter: (id_int_sensor_meas_type = $1)
 Total runtime: 8948.494 ms
(3 rows)

orfs=# EXPLAIN ANALYZE EXECUTE test(1);
                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on measurement  (cost=0.00..164559.16 rows=509478 width=6) (actual time=3956.616..3956.616 rows=0 loops=1)
   Filter: (id_int_sensor_meas_type = $1)
 Total runtime: 3956.662 ms
(3 rows)

Some improvement, and even better once it's cached, but the row estimate
didn't change. Is this the best I can expect? Are there any other
optimizations I am missing?

TIA,
--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
On Sat, 26 Mar 2005, Karim Nassar wrote:
> orfs=# ALTER TABLE measurement ALTER COLUMN id_int_sensor_meas_type SET STATISTICS 1000;
> ALTER TABLE
> orfs=# VACUUM FULL ANALYZE VERBOSE;
> <snip>
> orfs=# EXPLAIN ANALYZE EXECUTE test(1);
>  Seq Scan on measurement  (cost=0.00..164559.16 rows=509478 width=6) (actual time=8948.452..8948.452 rows=0 loops=1)
> ...
>  Total runtime: 3956.662 ms
>
> Some improvement, and even better once it's cached, but the row estimate
> didn't change. Is this the best I can expect? Are there any other
> optimizations I am missing?

I'm not sure, really. Running a seq scan for each removed row in the
referenced table doesn't seem like a particularly good plan in general
though, especially if the average number of rows being referenced isn't
on the order of 500k per value. I don't know what to look at next though.
On Thu, 2005-03-24 at 20:48 -0500, Tom Lane wrote:
> In that case there's a datatype mismatch between the referencing and
> referenced columns, which prevents the index from being used for the
> FK check.

Can I have more words on this? Here is how I created the tables:

CREATE TABLE int_sensor_meas_type(
    id_int_sensor_meas_type SERIAL PRIMARY KEY,
    id_sensor integer NOT NULL REFERENCES sensor,
    id_meas_type integer NOT NULL REFERENCES meas_type UNIQUE);

CREATE TABLE measurement (
    id_measurement SERIAL PRIMARY KEY,
    id_int_sensor_meas_type integer NOT NULL REFERENCES int_sensor_meas_type,
    datetime timestamp WITH TIME ZONE NOT NULL,
    value numeric(15,5) NOT NULL,
    created timestamp with time zone NOT NULL DEFAULT now(),
    created_by TEXT NOT NULL REFERENCES public.person(id_person));

CREATE INDEX measurement__id_int_sensor_meas_type_idx
    ON measurement(id_int_sensor_meas_type);

Do I need to cast the id_int_sensor_meas_type column when creating the
index? Both referrer and referenced look like INTEGER to me...

http://www.postgresql.org/docs/8.0/interactive/datatype.html#DATATYPE-SERIAL
says: "The type names serial and serial4 are equivalent: both create
integer columns."

TIA,
--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
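One way to check from the catalogs rather than the DDL (a sketch; the
regtype cast just prints the type name):

SELECT a.attrelid::regclass AS rel, a.attname, a.atttypid::regtype AS coltype
  FROM pg_attribute a
 WHERE (a.attrelid = 'measurement'::regclass
        AND a.attname = 'id_int_sensor_meas_type')
    OR (a.attrelid = 'int_sensor_meas_type'::regclass
        AND a.attname = 'id_int_sensor_meas_type');

If both rows come back as "integer", the declared types really do match and
the mismatch theory can be ruled out.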
On Thu, 2005-03-24 at 21:24 -0800, Josh Berkus wrote:
> Karim,
> How about getting some decent disk support? A single 10K SCSI disk is a
> bit sub-par for a database with 100's of millions of records. Too bad
> you didn't get a v40z ...

Hehe. I have one I am setting up that will be dedicated to postgresql,
hence my question about a week ago about disk partitioning/striping :-)

> Beyond that, you'll want to do the same thing whenever you purge the
> referencing table; drop keys, delete, re-create keys. Or think about why
> it is you need to delete batches of records from this FKed table at all.

The database is for weather data from multiple sources. When adding a new
dataset, I have to create/test/delete/recreate the config in the FKed
table. Users don't have this power, but I need it. Drop/delete/recreate
is a totally acceptable solution for this scenario.

I guess I was wondering if there is other general tuning advice for such
large table indexes, such as increasing statistics, etc.

Thanks,
--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
On Sun, 2005-03-27 at 07:05 -0800, Stephan Szabo wrote:
> On Sat, 26 Mar 2005, Karim Nassar wrote:
> > Some improvement, and even better once it's cached, but the row
> > estimate didn't change. Is this the best I can expect? Are there any
> > other optimizations I am missing?
>
> I'm not sure, really. Running a seq scan for each removed row in the
> referenced table doesn't seem like a particularly good plan in general
> though, especially if the average number of rows being referenced isn't
> on the order of 500k per value. I don't know what to look at next
> though.

Karim, please... run the EXPLAIN after doing

	SET enable_seqscan = off

Thanks,

Best Regards, Simon Riggs
On Mon, 28 Mar 2005, Karim A Nassar wrote:
> On Mon, 28 Mar 2005, Simon Riggs wrote:
> > run the EXPLAIN after doing
> > SET enable_seqscan = off
>
> The results I previously supplied were searching for a non-existent
> value, so I have provided output for both cases.
>
> ***
> *** Searching for non-existent value
> ***
>
> orfs=# PREPARE test2(int) AS SELECT 1 from measurement where
> orfs-#     id_int_sensor_meas_type = $1 FOR UPDATE;
> PREPARE
> orfs=# EXPLAIN ANALYZE EXECUTE test2(1);
>                                QUERY PLAN
> --------------------------------------------------------------------------
>  Seq Scan on measurement
>    (cost=0.00..164559.16 rows=509478 width=6)
>    (actual time=6421.849..6421.849 rows=0 loops=1)
>    Filter: (id_int_sensor_meas_type = $1)
>  Total runtime: 6421.917 ms
> (3 rows)
>
> orfs=# SET enable_seqscan = off;

I think you have to prepare with enable_seqscan = off, because it affects
how the query is planned and prepared.
Tom Lane wrote:
> Hmm ... I wonder how hard it would be to teach EXPLAIN ANALYZE to show
> the runtime expended in each trigger when the statement is of a kind
> that has triggers. We couldn't break down the time *within* the
> triggers, but even this info would help a lot in terms of finger
> pointing ...
>
> 	Seq Scan on ... (nn.nnn ms)
> 	Trigger foo: nn.mmm ms
> 	Trigger bar: nn.mmm ms
> 	Total time: nn.mmm ms

So I got the latest from CVS on Friday night to see how hard it would be
to implement this, but it turns out that Tom has already committed the
improvement, so I'm in Tom's fan club today. I imported my test dataset
and was almost immediately able to track down the cause of my performance
problem.

Thanks!
Mark Lewis
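With that commit (the feature shipped in the 8.1 release), the
finger-pointing becomes a one-liner; the trigger lines below are an
approximation of the output shape, not verbatim output:

EXPLAIN ANALYZE DELETE FROM int_sensor_meas_type
 WHERE id_meas_type IN (SELECT * FROM meas_type_ids);

--  ...plan nodes as usual, followed by something like:
--  Trigger for constraint measurement_id_int_sensor_meas_type_fkey:
--    time=..., calls=...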
On Mon, 2005-03-28 at 09:37 -0700, Karim A Nassar wrote:
> On Mon, 28 Mar 2005, Stephan Szabo wrote:
> > I think you have to prepare with enable_seqscan = off, because it
> > affects how the query is planned and prepared.
>
> orfs=# SET enable_seqscan = off;
> SET
> orfs=# PREPARE test2(int) AS SELECT 1 from measurement where
> orfs-#     id_int_sensor_meas_type = $1 FOR UPDATE;
> PREPARE
> orfs=# EXPLAIN ANALYZE EXECUTE test2(1); -- non-existent
>                                QUERY PLAN
> -------------------------------------------------------------------------
>  Index Scan using measurement__id_int_sensor_meas_type_idx on measurement
>    (cost=0.00..883881.49 rows=509478 width=6)
>    (actual time=29.207..29.207 rows=0 loops=1)
>    Index Cond: (id_int_sensor_meas_type = $1)
>  Total runtime: 29.277 ms
> (3 rows)
>
> orfs=# EXPLAIN ANALYZE EXECUTE test2(197); -- existing value
>                                QUERY PLAN
> -------------------------------------------------------------------------
>  Index Scan using measurement__id_int_sensor_meas_type_idx on measurement
>    (cost=0.00..883881.49 rows=509478 width=6)
>    (actual time=12.903..37478.167 rows=509478 loops=1)
>    Index Cond: (id_int_sensor_meas_type = $1)
>  Total runtime: 38113.338 ms
> (3 rows)

"That process starts upon the supposition that when you have eliminated
all which is impossible, then whatever remains, however improbable, must
be the truth." - Sherlock Holmes

Well, based upon the evidence so far, the Optimizer got it right:

Normal:
	SeqScan,   value=1	elapsed=  6.4s	cost=164559
	SeqScan,   value=197	elapsed= 28.1s	cost=164559

enable_seqscan = off:
	IndexScan, value=1	elapsed=  29ms	cost=883881
	IndexScan, value=197	elapsed= 38.1s	cost=883881

With enable_seqscan = off the index is used, proving that it has been
correctly defined for use in queries.

The FK CASCADE delete onto measurement will only be triggered by the
deletion of a real row, so the timing for an existing value is the one
that matters. That is longer than a SeqScan, so the Optimizer is correct.

My guess is that measurement has a greatly non-uniform distribution of
values and that 197 is one of the main values. Other values exist in the
lookup table, but occur very infrequently in the larger table.

Karim, please do:

select id_int_sensor_meas_type, count(*)
from measurement
group by id_int_sensor_meas_type
order by count(*) desc;

Best Regards, Simon Riggs
On Mon, 2005-03-28 at 13:03 -0700, Karim A Nassar wrote:
> > Well, based upon the evidence so far, the Optimizer got it right:
>
> Agreed. So, this means that the answer to my original question is "that
> delete gonna take a long time"?
>
> Seems that there is still something wrong. From what I can tell from
> everyone's questions, the FK constraint on measurement is causing
> multiple seq scans for each value deleted from int_sensor_meas_type.
> However, when deleting a single value, the FK check should use the
> index, so my ~190 deletes *should* be fast, no?

No.

> > IndexScan, value=1 elapsed= 29ms cost=883881
>
> 190 * 29ms is much less than 40 minutes. What am I missing here?

It all depends upon your data. There are *no* values in *your* table that
take 29ms to delete...

> > Karim, please do:
> >
> > select id_int_sensor_meas_type, count(*)
> > from measurement
> > group by id_int_sensor_meas_type
> > order by count(*) desc;
>
>  id_int_sensor_meas_type | count
> -------------------------+--------
>                       31 | 509478
>                       30 | 509478
>                      206 | 509478
>                      205 | 509478
>                      204 | 509478
>                       40 | 509478
>                       39 | 509478
>                      197 | 509478
>                       35 | 509478
>                       34 | 509478
>                       33 | 509478
>                       32 | 509478
>                       41 | 509477
>
> This sample dataset has 13 measurements from a weather station over 3
> years, hence the even distribution.

Each value has 1/13th of the table, which is too many rows per value to
make an IndexScan an efficient way of deleting rows from the table.
That's it.

If you have more values when measurement is bigger, the delete will
eventually switch plans (if you reconnect) and use the index. But not
yet.

There are a few ways to (re)design around it, but the distribution of
your data is not *currently* conducive to using an index.

Best Regards, Simon Riggs
On Fri, 2005-03-25 at 09:38 -0800, Josh Berkus wrote:
> > I guess I was wondering if there is other general tuning advice for
> > such large table indexes, such as increasing statistics, etc.
>
> If you're going with the drop/load/recreate option, then I'd suggest
> increasing work_mem for the duration. Hmmm ... or maintenance_work_mem?
> What gets used for FK checks? Simon?

In 8.0, maintenance_work_mem is used for index creation, vacuum, and the
initial validation of FK constraints at the time they are created.
Everything else uses work_mem as the limit.

Best Regards, Simon Riggs
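So for the drop/load/recreate cycle, the knob to raise for the session is
maintenance_work_mem. A sketch (note that 8.0 takes the value in kilobytes;
unit suffixes like '256MB' arrived in a later release):

SET maintenance_work_mem = 262144;  -- 256 MB for index build and FK check
ALTER TABLE measurement
    ADD CONSTRAINT measurement_id_int_sensor_meas_type_fkey
    FOREIGN KEY (id_int_sensor_meas_type)
    REFERENCES int_sensor_meas_type (id_int_sensor_meas_type);
RESET maintenance_work_mem;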
On Tue, 2005-03-29 at 01:48 -0700, Karim A Nassar wrote:
> > Each value has 1/13th of the table, which is too many rows per value
> > to make an IndexScan an efficient way of deleting rows from the table.
>
> But, the original question was that the delete that was taking a long
> time was on a different table. I tried to delete 150 rows from a table
> with 750 rows, which is FK referenced from this large table. If I
> understand correctly, Tom suggested that the length of time was due to a
> sequential scan being done on the large table for each value being
> deleted from the small one.
>
> For this FK check, there only need be one referring id to invalidate the
> delete. ISTM that for any delete with a FK reference, the index could
> always be used to search for a single value in the referring table
> (excepting very small tables). Why then must a sequential scan be
> performed in this case, and/or in general?

My understanding was that you were doing a DELETE on the smaller table
and that this was doing a DELETE on the measurement table because you had
the FK defined as ON DELETE CASCADE. You are right - only a single row is
sufficient to RESTRICT the DELETE. But for an ON UPDATE/DELETE action of
CASCADE you will want to touch all rows referenced, so a SeqScan is a
perfectly valid consequence of such actions. I think now that you are
using the default action, rather than specifically requesting CASCADE?

Stephan, Tom:
The SQL generated for RI checking by the RI triggers currently applies a
limit at execution time, not at prepare time, i.e. there is no LIMIT
clause in the SQL.

We know whether the check will be limit 1 or limit 0 at prepare time, so
why not add a LIMIT clause to the SQL so it changes the plan, not just
the number of rows returned when the check query executes? (I note that
PREPARE does allow you to supply a LIMIT 1 clause.)

That *ought* to have some effect on the plan used by the RI check
queries. In costsize.c:cost_index we would have tuples_fetched == 1, and
it would be hard (but not impossible) for the index cost to ever be more
than the cost of a SeqScan.

...but, I see no way for OidFunctionCall8 to ever return an answer of
"always just 1 row, no matter how big the relation"... so tuples_fetched
is always proportional to the size of the relation. Are unique indexes
treated just as very-low-selectivity indexes? They're a very similar
situation in terms of forcing an absolute, not relative, number of rows
returned.

Best Regards, Simon Riggs
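For concreteness, the no-action check could then be prepared along these
lines (a sketch of the proposal using this thread's tables; the real RI
trigger SQL lives in src/backend/utils/adt/ri_triggers.c):

PREPARE ri_check(int) AS
    SELECT 1 FROM ONLY measurement x
     WHERE id_int_sensor_meas_type = $1
     LIMIT 1 FOR UPDATE OF x;
EXPLAIN ANALYZE EXECUTE ri_check(197);

With the LIMIT visible at plan time, an index probe that fetches a single
row has a chance of winning the costing against a full seqscan.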
On Tue, 29 Mar 2005, Simon Riggs wrote:
> You are right - only a single row is sufficient to RESTRICT the DELETE.
> But for an ON UPDATE/DELETE action of CASCADE you will want to touch
> all rows referenced, so a SeqScan is a perfectly valid consequence of
> such actions. I think now that you are using the default action, rather
> than specifically requesting CASCADE?
>
> Stephan, Tom:
> The SQL generated for RI checking by the RI triggers currently applies
> a limit at execution time, not at prepare time, i.e. there is no LIMIT
> clause in the SQL.
>
> We know whether the check will be limit 1 or limit 0 at prepare time,
> so why not add a LIMIT clause to the SQL so it changes the plan, not
> just the number of rows returned when the check query executes?

Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
nicely together: you could sometimes get a result where, if the first row
was locked, the FOR UPDATE would wait on it, but if it was deleted by the
other transaction you could get 0 rows back in the trigger.
On Tue, 29 Mar 2005, Stephan Szabo wrote:
> On Tue, 29 Mar 2005, Simon Riggs wrote:
> > We know whether the check will be limit 1 or limit 0 at prepare time,
> > so why not add a LIMIT clause to the SQL so it changes the plan, not
> > just the number of rows returned when the check query executes?
>
> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together: you could sometimes get a result where, if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted
> by the other transaction you could get 0 rows back in the trigger.

If there were some way to pass a "limit" into SPI_prepare that was
treated similarly to a LIMIT clause for planning purposes, but didn't
actually change the output plan to only return that number of rows, we
could use that.
Stephan Szabo <sszabo@megazone.bigpanda.com> writes: > On Tue, 29 Mar 2005, Simon Riggs wrote: >> The SQL generated for RI checking by the RI triggers currently applies a >> limit at execution time, not at prepare time. i.e. there is no LIMIT >> clause in the SQL. >> >> We know whether the check will be limit 1 or limit 0 at prepare time, so >> why not add a LIMIT clause to the SQL so it changes the plan, not just >> the number of rows returned when the check query executes? > Because IIRC, FOR UPDATE and LIMIT at least historically didn't play > nicely together, so you could sometimes get a result where if the first > row was locked, the FOR UPDATE would wait on it, but if it was deleted by > the other transaction you could get 0 rows back in the trigger. Yeah, this is still true. It would probably be a good idea to change it but I haven't looked into exactly what would be involved. The basic problem is that the FOR UPDATE filter needs to execute before LIMIT instead of after, so presumably the FOR UPDATE shenanigans in execMain.c would need to be pushed into a separate plan node that could go underneath the LIMIT node. Originally this would have led to even more broken behavior --- locks taken on rows that weren't returned --- because the original coding of the LIMIT node tended to pull one more row from the lower plan than it would actually return. But we fixed that. I think having such a node might allow us to support FOR UPDATE in subqueries, as well, but I haven't looked at the details. (Whether that is a good idea is another question --- the problem of pulling rows that aren't nominally necessary, and thereby locking them, would apply in spades.) regards, tom lane
Simon Riggs <simon@2ndquadrant.com> writes: > ...but, I see no way for OidFunctionCall8 to ever return an answer of > "always just 1 row, no matter how big the relation"...so tuples_fetched > is always proportional to the size of the relation. Are unique indexes > treated just as very-low-selectivity indexes? Yeah. It is not the job of amcostestimate to estimate the number of rows, only the index access cost. (IIRC there is someplace in the planner that explicitly considers unique indexes as a part of developing selectivity estimates ... but it's not that part.) regards, tom lane
Stephan Szabo <sszabo@megazone.bigpanda.com> writes: > If there were some way to pass a "limit" into SPI_prepare that was treated > similarly to a LIMIT clause for planning purposes but didn't actually > change the output plan to only return that number of rows, we could use > that. Hmm ... the planner does have the ability to do that sort of thing (we use it for cursors). SPI_prepare doesn't expose the capability. Perhaps adding a SPI_prepare variant that does expose it would be the quickest route to a solution. I get a headache every time I look at the RI triggers ;-). Do they always know at the time of preparing a plan which way it will be used? regards, tom lane
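As a user-visible illustration of the planner capability Tom mentions: a cursor is planned on the assumption that only part of the result will be fetched, which can change the chosen plan. A sketch against Karim's tables (the cursor name is made up, and whether the plan actually flips depends on the statistics):

    BEGIN;
    -- Planned expecting a partial fetch, so a fast-start plan
    -- (e.g. an index scan) becomes more attractive:
    DECLARE fk_check CURSOR FOR
        SELECT 1 FROM measurement WHERE id_int_sensor_meas_type = 197;
    FETCH 1 FROM fk_check;  -- only the first row is wanted
    CLOSE fk_check;
    COMMIT;

SPI_prepare exposes no equivalent knob, which is the gap being pointed at here.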
On Tue, 2005-03-29 at 05:50 -0800, Stephan Szabo wrote:
> On Tue, 29 Mar 2005, Simon Riggs wrote:
> > We know whether the check will be limit 1 or limit 0 at prepare time, so
> > why not add a LIMIT clause to the SQL so it changes the plan, not just
> > the number of rows returned when the check query executes?
>
> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together, so you could sometimes get a result where if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted by
> the other transaction you could get 0 rows back in the trigger.

Well, sorry to ask more...

...but surely we only need the FOR UPDATE clause if we are performing a
CASCADE action, whereas we only want the LIMIT 1 clause if we are NOT
performing a CASCADE action? That way the two clauses are mutually
exclusive and the problem you outline should never (need to) occur.

The current code doesn't seem to vary the check query according to the
requested FK action...

Best Regards, Simon Riggs
On Tue, 2005-03-29 at 09:56 -0500, Tom Lane wrote:
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > If there were some way to pass a "limit" into SPI_prepare that was treated
> > similarly to a LIMIT clause for planning purposes but didn't actually
> > change the output plan to only return that number of rows, we could use
> > that.
>
> Hmm ... the planner does have the ability to do that sort of thing (we
> use it for cursors). SPI_prepare doesn't expose the capability.
> Perhaps adding a SPI_prepare variant that does expose it would be the
> quickest route to a solution.
>
> I get a headache every time I look at the RI triggers ;-). Do they
> always know at the time of preparing a plan which way it will be used?

If action is NO ACTION or RESTRICT then
we need to SELECT at most 1 row that matches the criteria
which means we can use LIMIT 1

If action is CASCADE, SET NULL, SET DEFAULT then
we need to UPDATE or DELETE all rows that match the criteria
which means we mustn't use LIMIT and need to use FOR UPDATE

We know that at CONSTRAINT creation time, which always occurs before
plan preparation time.

Best Regards, Simon Riggs
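In SQL terms, the two shapes being distinguished would be roughly the following (again using Karim's tables; the LIMIT variant is Simon's hypothetical, not what the triggers currently emit):

    -- NO ACTION / RESTRICT: an existence check -- one matching row is
    -- enough to answer it, so a LIMIT 1 could in principle apply:
    SELECT 1 FROM ONLY measurement
        WHERE id_int_sensor_meas_type = $1 LIMIT 1;

    -- ON DELETE CASCADE: every matching row must be removed,
    -- so no LIMIT can apply:
    DELETE FROM ONLY measurement
        WHERE id_int_sensor_meas_type = $1;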
Simon Riggs <simon@2ndquadrant.com> writes:
> If action is NO ACTION or RESTRICT then
> we need to SELECT at most 1 row that matches the criteria
> which means we can use LIMIT 1

> If action is CASCADE, SET NULL, SET DEFAULT then
> we need to UPDATE or DELETE all rows that match the criteria
> which means we mustn't use LIMIT and need to use FOR UPDATE

Huh? UPDATE/DELETE don't use FOR UPDATE. I think you have failed
to break down the cases sufficiently. In particular it matters which
side of the RI constraint you are working from ...

regards, tom lane
On Tue, 2005-03-29 at 10:31 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > If action is NO ACTION or RESTRICT then
> > we need to SELECT at most 1 row that matches the criteria
> > which means we can use LIMIT 1
>
> > If action is CASCADE, SET NULL, SET DEFAULT then
> > we need to UPDATE or DELETE all rows that match the criteria
> > which means we mustn't use LIMIT and need to use FOR UPDATE
>
> Huh? UPDATE/DELETE don't use FOR UPDATE. I think you have failed
> to break down the cases sufficiently. In particular it matters which
> side of the RI constraint you are working from ...

OK... too quick, sorry. I'll hand over to Stephan for a better and more
exhaustive explanation/analysis... but AFAICS we *can* always know the
correct formulation of the query at prepare time, whether or not we do
currently.

Best Regards, Simon Riggs
On Tue, 29 Mar 2005, Simon Riggs wrote:
> On Tue, 2005-03-29 at 10:31 -0500, Tom Lane wrote:
> > Huh? UPDATE/DELETE don't use FOR UPDATE. I think you have failed
> > to break down the cases sufficiently. In particular it matters which
> > side of the RI constraint you are working from ...
>
> OK... too quick, sorry. I'll hand over to Stephan for a better and more
> exhaustive explanation/analysis... but AFAICS we *can* always know the
> correct formulation of the query at prepare time, whether or not we do
> currently.

We currently use FOR UPDATE on the NO ACTION check, because otherwise we
might get back a row that's already marked for deletion by a concurrent
transaction. I think that's intended to wait and succeed, not fail.
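A two-session sketch of the case Stephan describes (hypothetical key value; the wait-then-succeed behavior is as he states it):

    -- Session 1: delete a referencing row, but don't commit yet
    BEGIN;
    DELETE FROM measurement WHERE id_int_sensor_meas_type = 42;

    -- Session 2: a DELETE on the referenced table fires the NO ACTION
    -- check, which blocks on session 1's locked row:
    SELECT 1 FROM ONLY measurement
        WHERE id_int_sensor_meas_type = 42 FOR UPDATE;
    -- If session 1's commit removes the only matching row, the check
    -- correctly finds no remaining reference and the DELETE succeeds.
    -- With a LIMIT 1 baked into the plan, the vanished row could yield
    -- 0 rows even when other references remain -- a wrong answer.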
On Tue, 29 Mar 2005, Tom Lane wrote: > Stephan Szabo <sszabo@megazone.bigpanda.com> writes: > > If there were some way to pass a "limit" into SPI_prepare that was treated > > similarly to a LIMIT clause for planning purposes but didn't actually > > change the output plan to only return that number of rows, we could use > > that. > > Hmm ... the planner does have the ability to do that sort of thing (we > use it for cursors). SPI_prepare doesn't expose the capability. > Perhaps adding a SPI_prepare variant that does expose it would be the > quickest route to a solution. > > I get a headache every time I look at the RI triggers ;-). Do they Me too, honestly. > always know at the time of preparing a plan which way it will be used? I believe so. I think each saved plan pretty much lives for a single trigger type/argument set and is basically used in only one place.
On Tue, 2005-03-29 at 09:40 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > ...but, I see no way for OidFunctionCall8 to ever return an answer of
> > "always just 1 row, no matter how big the relation"...so tuples_fetched
> > is always proportional to the size of the relation. Are unique indexes
> > treated just as very-low-selectivity indexes?
>
> Yeah. It is not the job of amcostestimate to estimate the number of
> rows, only the index access cost. (IIRC there is someplace in the
> planner that explicitly considers unique indexes as a part of developing
> selectivity estimates ... but it's not that part.)

Well, I mention this because costsize.c:cost_index *does* calculate the
number of rows returned. If unique indexes are handled elsewhere then
this would not cause problems for them... but what about LIMIT queries?

cost_index gets the selectivity, then multiplies that by the number of
tuples in the relation to calculate tuples_fetched, so it can use that in
the Mackert & Lohman formula. There is no consideration of the query's
LIMIT.

That implies to me that LIMIT queries are not considered correctly in
the M&L formula, and thus we are more likely to calculate a too-high cost
for using an index in those circumstances... and thus more likely to
SeqScan for medium-sized relations?

Best Regards, Simon Riggs
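For reference, the Mackert & Lohman approximation as described in the costsize.c comments is roughly the following (quoted from memory, so treat the exact form as an assumption); T is the number of pages in the relation, N the number of tuples fetched, and b the number of buffer pages available:

    pages_fetched = min( 2*T*N / (2*T + N), T )    when T <= b

Since N (tuples_fetched) is derived purely from selectivity times the relation's row count, a LIMIT never shrinks it.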
Simon Riggs <simon@2ndquadrant.com> writes:
> That implies to me that LIMIT queries are not considered correctly in
> the M&L formula, and thus we are more likely to calculate a too-high cost
> for using an index in those circumstances... and thus more likely to
> SeqScan for medium-sized relations?

You misunderstand how LIMIT is handled. The plan structure is

	LIMIT ...
		regular plan ...

and so the strategy is to plan and cost the regular plan as though it
would be carried out in full, and then take an appropriate fraction
of that at the LIMIT stage.

regards, tom lane
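That structure is visible in EXPLAIN output; a sketch of the shape (costs and row counts here are illustrative, not actual output from Karim's database):

    EXPLAIN SELECT 1 FROM measurement
        WHERE id_int_sensor_meas_type = 197 LIMIT 1;

     Limit  (cost=0.00..3.47 rows=1 width=0)
       ->  Seq Scan on measurement  (cost=0.00..1768398.00 rows=509478 width=0)
             Filter: (id_int_sensor_meas_type = 197)

The Seq Scan is costed for a full pass; the Limit node's cost is the fraction of that total corresponding to one row out of the estimated 509478.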
On Tue, 2005-03-29 at 12:31 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > That implies to me that LIMIT queries are not considered correctly in
> > the M&L formula, and thus we are more likely to calculate a too-high cost
> > for using an index in those circumstances... and thus more likely to
> > SeqScan for medium-sized relations?
>
> You misunderstand how LIMIT is handled.

Huh? Well, not this time. (Though my error rate is admittedly high.)

> The plan structure is
>
> 	LIMIT ...
> 		regular plan ...
>
> and so the strategy is to plan and cost the regular plan as though it
> would be carried out in full, and then take an appropriate fraction
> of that at the LIMIT stage.

To cost it as if it would be carried out in full, and then not execute it
in full, is the same as saying the actual execution cost is overestimated.
That can lead to selecting a SeqScan plan when the IndexScan would have
been cheaper, all things considered.

It could work like this instead:

	LIMIT ...
		regular plan (chosen knowing that a LIMIT follows)

so that the LIMIT would be considered in the M&L formula.

Not that I am driven by how other systems work, but both DB2 and Oracle
allow this form of optimization. There's not a huge benefit in sending
LIMIT 1 through on the FK check queries unless it would be taken into
account during planning.

Anyway, I'm not saying I know how to do this yet/ever, just that it is
possible to use the information available to better effect. This looks
like a TODO item to me? Thoughts?

Best Regards, Simon Riggs
> Each value has 1/13th of the table, which is too many rows per value to > make an IndexScan an efficient way of deleting rows from the table. But, the original question was that the delete that was taking a long time was on a different table. I tried to delete 150 rows from a table with 750 rows, which is FK referenced from this large table. If I understand correctly, Tom suggested that the length of time was due to a sequential scan being done on the large table for each value being deleted from the small one. (I have no formal training in database administration nor database theory, so please excuse me if I am being dumb.) For this FK check, there only need be one referring id to invalidate the delete. ISTM that for any delete with a FK reference, the index could always be used to search for a single value in the referring table (excepting very small tables). Why then must a sequential scan be performed in this case, and/or in general? -- Karim Nassar Department of Computer Science Box 15600, College of Engineering and Natural Sciences Northern Arizona University, Flagstaff, Arizona 86011 Office: (928) 523-5868 -=- Mobile: (928) 699-9221
> Well, based upon the evidence so far, the Optimizer got it right:

Agreed. So, this means that the answer to my original question is "that
delete gonna take a long time"?

Seems that there is still something wrong. From what I can tell from
everyone's questions, the FK constraint on measurement is causing
multiple seq scans for each value deleted from int_sensor_meas_type.
However, when deleting a single value, the FK check should use the
index, so my ~190 deletes *should* be fast, no?

> IndexScan, value=1 elapsed= 29ms cost=883881

190 * 29ms is much less than 40 minutes. What am I missing here?

> Karim,
> Please do:
>
> select id_int_sensor_meas_type, count(*)
> from measurement
> group by id_int_sensor_meas_type
> order by count(*) desc;

 id_int_sensor_meas_type | count
-------------------------+--------
                      31 | 509478
                      30 | 509478
                     206 | 509478
                     205 | 509478
                     204 | 509478
                      40 | 509478
                      39 | 509478
                     197 | 509478
                      35 | 509478
                      34 | 509478
                      33 | 509478
                      32 | 509478
                      41 | 509477

This sample dataset has 13 measurements from a weather station over 3
years, hence the even distribution.

Continued thanks,
--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
On Mon, 28 Mar 2005, Stephan Szabo wrote:
> On Mon, 28 Mar 2005, Simon Riggs wrote:
> > run the EXPLAIN after doing
> > SET enable_seqscan = off
...
> I think you have to prepare with enable_seqscan=off, because it
> affects how the query is planned and prepared.

orfs=# SET enable_seqscan = off;
SET
orfs=# PREPARE test2(int) AS SELECT 1 from measurement where
orfs-# id_int_sensor_meas_type = $1 FOR UPDATE;
PREPARE
orfs=# EXPLAIN ANALYZE EXECUTE TEST2(1); -- non-existent

                               QUERY PLAN
-------------------------------------------------------------------------
 Index Scan using measurement__id_int_sensor_meas_type_idx on measurement
   (cost=0.00..883881.49 rows=509478 width=6)
   (actual time=29.207..29.207 rows=0 loops=1)
   Index Cond: (id_int_sensor_meas_type = $1)
 Total runtime: 29.277 ms
(3 rows)

orfs=# EXPLAIN ANALYZE EXECUTE TEST2(197); -- existing value

                               QUERY PLAN
-------------------------------------------------------------------------
 Index Scan using measurement__id_int_sensor_meas_type_idx on measurement
   (cost=0.00..883881.49 rows=509478 width=6)
   (actual time=12.903..37478.167 rows=509478 loops=1)
   Index Cond: (id_int_sensor_meas_type = $1)
 Total runtime: 38113.338 ms
(3 rows)

--
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University, Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221
On Tue, Mar 29, 2005 at 01:48:48 -0700, Karim A Nassar <Karim.Nassar@NAU.EDU> wrote: > > For this FK check, there only need be one referring id to invalidate the > delete. ISTM that for any delete with a FK reference, the index could > always be used to search for a single value in the referring table > (excepting very small tables). Why then must a sequential scan be > performed in this case, and/or in general? First the index needs to exist. It isn't created automatically because not everyone wants such an index. Second, you need to have analyzed the referencing table so that the planner will know it is big enough that using an indexed search is worthwhile. The planner is getting better about dealing with size changes without reanalyzing, but it seems there are still some gotchas in 8.0.
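Putting Bruno's two points together as a recipe (the index name below matches the one visible in Karim's EXPLAIN output upthread, so his schema evidently already has step 1; in a schema without it, both steps apply):

    -- 1. Index the referencing column, so each per-row FK check can use
    --    an index scan instead of a seqscan of the large table:
    CREATE INDEX measurement__id_int_sensor_meas_type_idx
        ON measurement (id_int_sensor_meas_type);

    -- 2. Refresh statistics so the planner knows the table is large
    --    enough for the index to pay off:
    ANALYZE measurement;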
Mark Lewis wrote: > I imported my test dataset > and was almost immediately able to track down the cause of my > performance problem. Why don't you tell us what the problem was :-) ? Regards Gaetano Mendola