Thread: Except operation

Except operation

From
Satyajeet Seth
Date:
Hi
The query:
select * from webdata except select * from webdata1;
 takes abysmally long .How can I optimise it?
The particulars are:

tracedb=> \d webdata
Table    = webdata
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                |
Length|
+----------------------------------+----------------------------------+-------+
| tid                              | int4                             |
4 |
| itemid                           | int4                             |
4 |
| ordering                         | int4                             |
4 |
+----------------------------------+----------------------------------+-------+

tracedb=> \d webdata1
Table    = webdata1
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                |
Length|
+----------------------------------+----------------------------------+-------+
| tid                              | int4                             |
4 |
| itemid                           | int4                             |
4 |
| ordering                         | int4                             |
4 |
+----------------------------------+----------------------------------+-------+
tracedb=> select count(*) from webdata;
count
-----
91155
(1 row)

tracedb=> select count(*) from webdata1;
count
-----
 9550
(1 row)

Regards
Satya





Re: [GENERAL] Except operation

From
Adriaan Joubert
Date:
Satyajeet Seth wrote:
>
> Hi
> The query:
> select * from webdata except select * from webdata1;
>  takes abysmally long .How can I optimise it?
> The particulars are:

You could try

  select * from webdata w
  where not exists
    (select * from webdata1 w1
     where w1.tid=w.tid
      ...
        )

If you have the correct indexes on webdata1 this can be quite fast.

Adriaan

Re: [GENERAL] Except operation

From
Date:
how about I'm a sql92 lover? :-)

the idea is to use index. but the engine only use index when it sees
"where". so, how about add "where w.tid = w.tid" to the first (or
both) select on the two sides of the except?

not tested. if work, please let us know, thanks.

Kai
On Thu, 2 Dec 1999, Adriaan Joubert wrote:

> Satyajeet Seth wrote:
> >
> > Hi
> > The query:
> > select * from webdata except select * from webdata1;
> >  takes abysmally long .How can I optimise it?
> > The particulars are:
>
> You could try
>
>   select * from webdata w
>   where not exists
>     (select * from webdata1 w1
>      where w1.tid=w.tid
>       ...
>         )
>
> If you have the correct indexes on webdata1 this can be quite fast.
>
> Adriaan
>
> ************
>


Re: [GENERAL] Except operation

From
Mike Mascari
Date:
kaiq@realtyideas.com wrote:

> how about I'm a sql92 lover? :-)
>
> the idea is to use index. but the engine only use index when it sees
> "where". so, how about add "where w.tid = w.tid" to the first (or
> both) select on the two sides of the except?
>
> not tested. if work, please let us know, thanks.
>
> Kai

That's not the problem. The problem is that the INTERSECT/EXCEPT code
uses the query rewriter to automatically rewrite the query to something
like:

SELECT * FROM webdata w
WHERE (<webdata.fieldlist>) NOT IN (
    SELECT * FROM webdata1);

and 'IN' clauses in PostgreSQL can't use indices.  The result is always a
sequential scan on the IN values. INTERSECT/EXCEPT should have been
written to rewrite the query using correlated subqueries and the EXISTS
test, as Adriaan suggests below. In fact, one of the possible solutions
for the un-usability of IN clauses is to have PostgreSQL rewrite those as
EXISTS....but no one's done that yet.

Mike

>
> On Thu, 2 Dec 1999, Adriaan Joubert wrote:
>
> > Satyajeet Seth wrote:
> > >
> > > Hi
> > > The query:
> > > select * from webdata except select * from webdata1;
> > >  takes abysmally long .How can I optimise it?
> > > The particulars are:
> >
> > You could try
> >
> >   select * from webdata w
> >   where not exists
> >       (select * from webdata1 w1
> >        where w1.tid=w.tid
> >         ...
> >         )
> >
> > If you have the correct indexes on webdata1 this can be quite fast.
> >
> > Adriaan