Thread: help optimise this ?

help optimise this ?

From
"Peter Galbavy"
Date:
I have a table of image 'instances' where the columns include:

md5 char(32),                    -- the 'original' image md5 key
file_md5 char(32) primary key,   -- the md5 of each version of an image
image_width int,
image_length int

I want to then find either the largest (max) or smallest (min) version of an
image that falls within some range of sizes:

e.g.

select file_md5 from image_instance
where image_width =       (select min(image_width) from image_instance where md5 =
'546b94e94851a56ee721f3b755f58462')   and image_length =       (select min(image_length) from image_instance where md5
=
'546b94e94851a56ee721f3b755f58462')   and md5 = '546b94e94851a56ee721f3b755f58462'   and image_width between 0 and 160
and image_length between 0 and 160;
 

Now, having to do three selects on 'md5' to limit the search seems a little
unoptimal to me. Note that the test tables are small and I have no other
indexes apart from the 'primary key' constraint yet - this is not my primary
concern at this point, I would just like cleaner SQL.

All I want back is (for some definition) the 'file_md5' that best matches my
min/max criteria.

I have not - and will leave for now - the case where a cropped image results
in a scale change between width and length such that the min/max test
returns a different set of rows for each dimension. Argh.

And help given is greatly appreciated.

rgds,
--
Peter



Re: help optimise this ?

From
"Henshall, Stuart - Design & Print"
Date:
<p><font size="2">Peter Galbavy wrote:</font><br /><font size="2">> I have a table of image 'instances' where the
columnsinclude:</font><br /><font size="2">> </font><br /><font size="2">> md5 char(32),                    --
the'original' image md5 key</font><br /><font size="2">> file_md5 char(32) primary key,   -- the md5 of each version
ofan</font><br /><font size="2">> image image_width int,</font><br /><font size="2">> image_length int</font><br
/><fontsize="2">> </font><br /><font size="2">> I want to then find either the largest (max) or smallest
(min)</font><br/><font size="2">> version of an image that falls within some range of sizes:</font><br /><font
size="2">></font><br /><font size="2">> e.g.</font><br /><font size="2">> </font><br /><font size="2">>
selectfile_md5 from image_instance</font><br /><font size="2">> where image_width =</font><br /><font
size="2">>        (select min(image_width) from image_instance where md5 =</font><br /><font size="2">>
'546b94e94851a56ee721f3b755f58462')</font><br/><font size="2">>     and image_length =</font><br /><font
size="2">>        (select min(image_length) from image_instance where md5 =</font><br /><font size="2">>
'546b94e94851a56ee721f3b755f58462')</font><br/><font size="2">>     and md5 =
'546b94e94851a56ee721f3b755f58462'</font><br/><font size="2">>     and image_width between 0 and 160</font><br
/><fontsize="2">>     and image_length between 0 and 160;</font><br /><font size="2">> </font><br /><font
size="2">>Now, having to do three selects on 'md5' to limit the search seems a</font><br /><font size="2">>
littleunoptimal to me. Note that the test tables are small and I</font><br /><font size="2">> have no other indexes
apartfrom the 'primary key' constraint yet -</font><br /><font size="2">> this is not my primary concern at this
point,I would just like</font><br /><font size="2">> cleaner SQL. </font><br /><font size="2">> </font><br
/><fontsize="2">> All I want back is (for some definition) the 'file_md5' that best</font><br /><font size="2">>
matchesmy min/max criteria.</font><br /><font size="2">> </font><br /><font size="2">> I have not - and will
leavefor now - the case where a cropped image</font><br /><font size="2">> results in a scale change between width
andlength such that the</font><br /><font size="2">> min/max test returns a different set of rows for each
dimension.Argh.</font><br /><font size="2">> </font><br /><font size="2">> And help given is greatly
appreciated.</font><br/><font size="2">> </font><br /><font size="2">> rgds,</font><br /><font size="2">>
--</font><br/><font size="2">> Peter</font><br /><font size="2">> </font><br /><font size="2">If you are willing
touse pgsqlism how about:</font><br /><font size="2">select file_md5 from image_instance WHERE </font><br />       
<fontsize="2">md5 = '546b94e94851a56ee721f3b755f58462' AND</font><br />        <font size="2">image_width between 0 and
160AND </font><br />        <font size="2">image_length between 0 and 160 AND</font><br />        <font size="2">ORDER
BYimage_width::int8*image_length::int8 LIMIT 1</font><p><font size="2">This should get the smallest overall image size
withinyour bounds.</font><br /><font size="2">It might be faster to do ORDER BY image_width,image_length LIMIT
1</font><br/><font size="2">but this wouldn't necessarily give the smallest if the aspect ratio changed</font><br
/><fontsize="2">hth,</font><br /><font size="2">- Stuart</font><br />         

Re: help optimise this ?

From
Tom Lane
Date:
"Peter Galbavy" <peter.galbavy@knowtion.net> writes:
> I want to then find either the largest (max) or smallest (min) version of an
> image that falls within some range of sizes:

Depends on how you want to define "largest" and "smallest", but if
"area" is a good enough definition, seems like this would work:

select file_md5 from image_instance
where md5 = '546b94e94851a56ee721f3b755f58462'   and image_width between 0 and 160   and image_length between 0 and
160
order by image_width * image_length
limit 1;

Probably an index on md5 would be sufficient to make this go fast ---
I assume you're not going to be storing a vast number of sizes of
the same image.
        regards, tom lane


Re: help optimise this ?

From
"Peter Galbavy"
Date:
Wow. Three people have replied with an effectively identical solution.

Why didn't I think of this ? Answers on a postcard to...

Thanks to all that have replied.

Peter
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Peter Galbavy" <peter.galbavy@knowtion.net>
Cc: <pgsql-sql@postgresql.org>
Sent: Thursday, November 21, 2002 3:46 PM
Subject: Re: [SQL] help optimise this ?


> "Peter Galbavy" <peter.galbavy@knowtion.net> writes:
> > I want to then find either the largest (max) or smallest (min) version
of an
> > image that falls within some range of sizes:
>
> Depends on how you want to define "largest" and "smallest", but if
> "area" is a good enough definition, seems like this would work:
>
> select file_md5 from image_instance
> where md5 = '546b94e94851a56ee721f3b755f58462'
>     and image_width between 0 and 160
>     and image_length between 0 and 160
> order by image_width * image_length
> limit 1;
>
> Probably an index on md5 would be sufficient to make this go fast ---
> I assume you're not going to be storing a vast number of sizes of
> the same image.
>
> regards, tom lane
>