Re: tricky query - Mailing list pgsql-performance

From Cosimo Streppone
Subject Re: tricky query
Date
Msg-id 42C16AAD.4070903@streppone.it
Whole thread Raw
In response to tricky query  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
List pgsql-performance
Merlin Moncure wrote:

> I need a fast way (sql only preferred) to solve the following problem:
> I need the smallest integer that is greater than zero that is not in the
> column of a table.
>
> I've already worked out a query using generate_series (not scalable) and
> pl/pgsql.  An SQL only solution would be preferred, am I missing
> something obvious?

Probably not, but I thought about this "brute-force" approach... :-)
This should work well provided that:

- you have a finite number of integers. Your column should have a biggest
   integer value with a reasonable maximum like 100,000 or 1,000,000.
   #define YOUR_MAX 99999

- you can accept that query execution time depends on smallest integer found.
   The bigger the found integer, the slower execution you get.

Ok, so:

Create a relation "integers" (or whatever) with every single integer from 1 to
YOUR_MAX:

   CREATE TABLE integers (id integer primary key);
   INSERT INTO integers (id) VALUES (1);
   INSERT INTO integers (id) VALUES (2);
   ...
   INSERT INTO integers (id) VALUES (YOUR_MAX);

Create your relation:

   CREATE TABLE merlin (id integer primary key);
   <and fill it with values>

Query is simple now:

   SELECT a.id FROM integers a
     LEFT JOIN merlin b ON a.id=b.id
         WHERE b.id IS NULL
      ORDER BY a.id LIMIT 1;

Execution times with 100k tuples in "integers" and
99,999 tuples in "merlin":

   >\timing
   Timing is on.
   >select i.id from integers i left join merlin s on i.id=s.id where s.id is
null order by i.id limit 1;
    99999

   Time: 233.618 ms
   >insert into merlin (id) values (99999);
   INSERT 86266614 1
   Time: 0.579 ms
   >delete from merlin where id=241;
   DELETE 1
   Time: 0.726 ms
   >select i.id from integers i left join merlin s on i.id=s.id where s.id is
null order by i.id limit 1;
    241

   Time: 1.336 ms
   >

--
Cosimo


pgsql-performance by date:

Previous
From: Michael Stone
Date:
Subject: read block size
Next
From: John A Meinel
Date:
Subject: Re: read block size