Thread: Need advice to avoid ORDER BY

Need advice to avoid ORDER BY

From
Condor
Date:
Hello,

I have one query in my postgresql 9.2.3 that took 137 ms to me executed
and looking a way
what I can do to optimize it. I have one table generated numbers from 1
to 1 000 000 and
I need to get first free id, meanwhile id's when is taken can be free
(deleted data and id
is free for next job). Table is simple:


id serial,
jobid text,
valids int default 0

(Yes, I have index).


my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC
LIMIT 1

I need the first id only.

My question is: Is there a way how I can avoid using ORDER BY to
receive the first
free id from mytable ?



Cheers,
Condor



Re: Need advice to avoid ORDER BY

From
Merlin Moncure
Date:
On Thu, Apr 4, 2013 at 4:32 PM, Condor <condor@stz-bg.com> wrote:
> Hello,
>
> I have one query in my postgresql 9.2.3 that took 137 ms to me executed and
> looking a way
> what I can do to optimize it. I have one table generated numbers from 1 to 1
> 000 000 and
> I need to get first free id, meanwhile id's when is taken can be free
> (deleted data and id
> is free for next job). Table is simple:
>
>
> id serial,
> jobid text,
> valids int default 0
>
> (Yes, I have index).
>
>
> my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC
> LIMIT 1
>
> I need the first id only.
>
> My question is: Is there a way how I can avoid using ORDER BY to receive the
> first
> free id from mytable ?

well, you can (via EXISTS()), but you can really optimize this with
partial index.

CREATE INDEX ON mytable (id) WHERE valids = 0;

then,

 SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1;

should return in zero time since btree indexes can optimize order by
expressions and the partial index will bypass having to wade through
the rows you don't want.

merlin


Re: Need advice to avoid ORDER BY

From
Condor
Date:
On 2013-04-05 00:38, Merlin Moncure wrote:
> On Thu, Apr 4, 2013 at 4:32 PM, Condor <condor@stz-bg.com> wrote:
>> Hello,
>>
>> I have one query in my postgresql 9.2.3 that took 137 ms to me
>> executed and
>> looking a way
>> what I can do to optimize it. I have one table generated numbers from
>> 1 to 1
>> 000 000 and
>> I need to get first free id, meanwhile id's when is taken can be free
>> (deleted data and id
>> is free for next job). Table is simple:
>>
>>
>> id serial,
>> jobid text,
>> valids int default 0
>>
>> (Yes, I have index).
>>
>>
>> my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id
>> ASC
>> LIMIT 1
>>
>> I need the first id only.
>>
>> My question is: Is there a way how I can avoid using ORDER BY to
>> receive the
>> first
>> free id from mytable ?
>
> well, you can (via EXISTS()), but you can really optimize this with
> partial index.
>
> CREATE INDEX ON mytable (id) WHERE valids = 0;
>
> then,
>
>  SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1;
>
> should return in zero time since btree indexes can optimize order by
> expressions and the partial index will bypass having to wade through
> the rows you don't want.
>
> merlin


Hm,
I only can say: Thank You!
Your solution is work, but Im now a little confused. I has a index
CREATE INDEX ON mytable (valids) USING BTREE (valids) and the
query to find valids = 0 tooks 137 ms.

Why, your solution is worked ? Yes, it's worked.


Cheers,
Condor


Re: Need advice to avoid ORDER BY

From
Merlin Moncure
Date:
On Thu, Apr 4, 2013 at 4:49 PM, Condor <condor@stz-bg.com> wrote:
>>  SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1;
>>
>> should return in zero time since btree indexes can optimize order by
>> expressions and the partial index will bypass having to wade through
>> the rows you don't want.
>>
>> merlin
>
>
>
> Hm,
> I only can say: Thank You!
> Your solution is work, but Im now a little confused. I has a index
> CREATE INDEX ON mytable (valids) USING BTREE (valids) and the
> query to find valids = 0 tooks 137 ms.

problem is that you are looking for needles (valids = 0) in the
haystack.   the problem wasn't really the order, but the fact that you
had to scan an arbitrary amount of rows before finding a candidate
record.  so the partial index manages this problem by creating index
entries *only for records that match a criteria*, and the planner
recognizes this and prefers that index when the criteria is also
present in the query.  In other words, index only the needles.

merlin


Re: Need advice to avoid ORDER BY

From
John R Pierce
Date:
On 4/4/2013 2:49 PM, Condor wrote:
> Your solution is work, but Im now a little confused. I has a index
> CREATE INDEX ON mytable (valids) USING BTREE (valids) and the
> query to find valids = 0 tooks 137 ms.

the query can't use that index, and the separate index on id at the same
time, it has to pick one.

Merlin's partial index is how you get around this.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Need advice to avoid ORDER BY

From
Tom Lane
Date:
Merlin Moncure <mmoncure@gmail.com> writes:
> problem is that you are looking for needles (valids = 0) in the
> haystack.   the problem wasn't really the order, but the fact that you
> had to scan an arbitrary amount of rows before finding a candidate
> record.  so the partial index manages this problem by creating index
> entries *only for records that match a criteria*, and the planner
> recognizes this and prefers that index when the criteria is also
> present in the query.  In other words, index only the needles.

The other way to fix it is a two-column index on (valids, id), which
will be more useful if sometimes you need the minimum/maximum id
for some nonzero value of valids.

The real point here is that you want the index to contain consecutive
entries for the rows with the particular valids value you want, *in
order by id*.  Then the planner knows the first/last such index entry
contains the answer.  When you index only valids, it has to collect all
the matching rows and sort them by id.

            regards, tom lane


Re: Need advice to avoid ORDER BY

From
Merlin Moncure
Date:
On Thu, Apr 4, 2013 at 5:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> problem is that you are looking for needles (valids = 0) in the
>> haystack.   the problem wasn't really the order, but the fact that you
>> had to scan an arbitrary amount of rows before finding a candidate
>> record.  so the partial index manages this problem by creating index
>> entries *only for records that match a criteria*, and the planner
>> recognizes this and prefers that index when the criteria is also
>> present in the query.  In other words, index only the needles.
>
> The other way to fix it is a two-column index on (valids, id), which
> will be more useful if sometimes you need the minimum/maximum id
> for some nonzero value of valids.

right -- that's a more general solution -- here we are exploiting that
A: the OP only needs access to "=0" rows and especially B: "=0" rows
are a tiny fraction of the overall set (we know this because otherwise
the query would have returned quickly anyways).  So we get to squeak
out with a tiny index pointing to only the candidate rows.

Partial indexes are an underutilized trick -- the efficiency savings
can be enormous.  They are often useful when coding ad hoc queue
operations in the database where the queued items are intermixed with
items that have been resolved.

merlin


Re: Need advice to avoid ORDER BY

From
Scott Marlowe
Date:
Try an index like:

create index yada on mytable (id) where valids=0;

then

select max(jobid) from mytable where valids=0;


On Thu, Apr 4, 2013 at 3:32 PM, Condor <condor@stz-bg.com> wrote:
Hello,

I have one query in my postgresql 9.2.3 that took 137 ms to me executed and looking a way
what I can do to optimize it. I have one table generated numbers from 1 to 1 000 000 and
I need to get first free id, meanwhile id's when is taken can be free (deleted data and id
is free for next job). Table is simple:


id serial,
jobid text,
valids int default 0

(Yes, I have index).


my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC LIMIT 1

I need the first id only.

My question is: Is there a way how I can avoid using ORDER BY to receive the first
free id from mytable ?



Cheers,
Condor



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
To understand recursion, one must first understand recursion.

Re: Need advice to avoid ORDER BY

From
Condor
Date:
On 2013-04-05 01:54, Merlin Moncure wrote:
> On Thu, Apr 4, 2013 at 5:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Merlin Moncure <mmoncure@gmail.com> writes:
>>> problem is that you are looking for needles (valids = 0) in the
>>> haystack.   the problem wasn't really the order, but the fact that
>>> you
>>> had to scan an arbitrary amount of rows before finding a candidate
>>> record.  so the partial index manages this problem by creating index
>>> entries *only for records that match a criteria*, and the planner
>>> recognizes this and prefers that index when the criteria is also
>>> present in the query.  In other words, index only the needles.
>>
>> The other way to fix it is a two-column index on (valids, id), which
>> will be more useful if sometimes you need the minimum/maximum id
>> for some nonzero value of valids.
>
> right -- that's a more general solution -- here we are exploiting that
> A: the OP only needs access to "=0" rows and especially B: "=0" rows
> are a tiny fraction of the overall set (we know this because otherwise
> the query would have returned quickly anyways).  So we get to squeak
> out with a tiny index pointing to only the candidate rows.
>
> Partial indexes are an underutilized trick -- the efficiency savings
> can be enormous.  They are often useful when coding ad hoc queue
> operations in the database where the queued items are intermixed with
> items that have been resolved.
>
> merlin


Thank you for every one for suggestions. I'll try to make
changes tomorrow night to see what will be happened.


Cheers,
Condor


Re: Need advice to avoid ORDER BY

From
Jasen Betts
Date:
On 2013-04-04, Condor <condor@stz-bg.com> wrote:
> Hello,
>
> I have one query in my postgresql 9.2.3 that took 137 ms to me executed
> and looking a way
> what I can do to optimize it. I have one table generated numbers from 1
> to 1 000 000 and
> I need to get first free id, meanwhile id's when is taken can be free
> (deleted data and id
> is free for next job). Table is simple:
>
>
> id serial,
> jobid text,
> valids int default 0
>
> (Yes, I have index).
>
>
> my query is: SELECT jobid FROM mytable WHERE valids = 0 ORDER BY id ASC
> LIMIT 1
>
> I need the first id only.
>
> My question is: Is there a way how I can avoid using ORDER BY to
> receive the first
> free id from mytable ?

create index freejobs on mytable(id) where valids = 0 ;

retry the same query.

--
⚂⚃ 100% natural