Thread: DONT_CARE Aggregate

DONT_CARE Aggregate

From
Robert James
Date:
Is there an aggregate that will return an arbitrary instance? That is,
not necessarily the max or min, just any one? (Which might perform
better than max or min)

More importantly:
Is there one which will return an arbitrary instance as long as it's not NULL

And even better:
An aggregate which will return the first instance that meets a certain
predicate?


Re: DONT_CARE Aggregate

From
Sergey Konoplev
Date:
On Wed, Dec 19, 2012 at 5:28 PM, Robert James <srobertjames@gmail.com> wrote:
> And even better:
> An aggregate which will return the first instance that meets a certain
> predicate?

Take a look at DISTINCT ON.

--
Sergey Konoplev
Database and Software Architect
http://www.linkedin.com/in/grayhemp

Phones:
USA +1 415 867 9984
Russia, Moscow +7 901 903 0499
Russia, Krasnodar +7 988 888 1979

Skype: gray-hemp
Jabber: gray.ru@gmail.com


Re: DONT_CARE Aggregate

From
Robert James
Date:
Sergey - That's an interesting option, but I'm not sure how to use it
as an aggregate.  Could you give an example?

On 12/20/12, Sergey Konoplev <gray.ru@gmail.com> wrote:
> On Wed, Dec 19, 2012 at 5:28 PM, Robert James <srobertjames@gmail.com>
> wrote:
>> And even better:
>> An aggregate which will return the first instance that meets a certain
>> predicate?
>
> Take a look at DISTINCT ON.
>
> --
> Sergey Konoplev
> Database and Software Architect
> http://www.linkedin.com/in/grayhemp
>
> Phones:
> USA +1 415 867 9984
> Russia, Moscow +7 901 903 0499
> Russia, Krasnodar +7 988 888 1979
>
> Skype: gray-hemp
> Jabber: gray.ru@gmail.com
>


Re: DONT_CARE Aggregate

From
Richard Broersma
Date:
On Thu, Dec 20, 2012 at 5:45 AM, Robert James <srobertjames@gmail.com> wrote:
Sergey - That's an interesting option, but I'm not sure how to use it
as an aggregate.  Could you give an example?

Here is an example:

buildinghac=>
  SELECT itemnbr, buildingnbr
    FROM Actionitems
ORDER BY buildingnbr
   LIMIT 10;
 itemnbr | buildingnbr
---------+-------------
    1181 | B-0106
     363 | B-0106
     185 | B-0106
     483 | B-0106
      67 | B-0106
     125 | B-0106
     303 | B-0106
     245 | B-0106
      68 | B-0107
     304 | B-0107
(10 rows)


buildinghac=>

  SELECT DISTINCT ON ( buildingnbr )
            itemnbr, buildingnbr
    FROM Actionitems
ORDER BY buildingnbr
   LIMIT 10;
 itemnbr | buildingnbr
---------+-------------
     245 | B-0106
     364 | B-0107
    1170 | B-0111
     361 | B-0112
     128 | B-0116
    1013 | B-0117
     129 | B-0118
     368 | B-0300
    1141 | B-0307
      74 | B-0423
(10 rows)



--
Regards,
Richard Broersma Jr.

Re: DONT_CARE Aggregate

From
Robert James
Date:
I see.  What if I need to do this along with an Aggregate Query.  Eg
something like:

SELECT x,y,z, MAX(a), MAX(b), DONT_CARE_AS_LONG_AS_NOT_NULL(c),
DONT_CAR_AS_LONG_AS_P_IS_TRUE(d,p)
...
GROUP BY x,y,z


On 12/20/12, Richard Broersma <richard.broersma@gmail.com> wrote:
> On Thu, Dec 20, 2012 at 5:45 AM, Robert James
> <srobertjames@gmail.com>wrote:
>
>> Sergey - That's an interesting option, but I'm not sure how to use it
>> as an aggregate.  Could you give an example?
>>
>
> Here is an example:
>
> buildinghac=>
>   SELECT itemnbr, buildingnbr
>     FROM Actionitems
> ORDER BY buildingnbr
>    LIMIT 10;
>  itemnbr | buildingnbr
> ---------+-------------
>     1181 | B-0106
>      363 | B-0106
>      185 | B-0106
>      483 | B-0106
>       67 | B-0106
>      125 | B-0106
>      303 | B-0106
>      245 | B-0106
>       68 | B-0107
>      304 | B-0107
> (10 rows)
>
>
> buildinghac=>
>
>   SELECT DISTINCT ON ( buildingnbr )
>             itemnbr, buildingnbr
>     FROM Actionitems
> ORDER BY buildingnbr
>    LIMIT 10;
>  itemnbr | buildingnbr
> ---------+-------------
>      245 | B-0106
>      364 | B-0107
>     1170 | B-0111
>      361 | B-0112
>      128 | B-0116
>     1013 | B-0117
>      129 | B-0118
>      368 | B-0300
>     1141 | B-0307
>       74 | B-0423
> (10 rows)
>
>
> --
> Regards,
> Richard Broersma Jr.
>


Re: DONT_CARE Aggregate

From
Chris Curvey
Date:

On Thu, Dec 20, 2012 at 12:00 PM, Robert James <srobertjames@gmail.com> wrote:
I see.  What if I need to do this along with an Aggregate Query.  Eg
something like:

SELECT x,y,z, MAX(a), MAX(b), DONT_CARE_AS_LONG_AS_NOT_NULL(c),
DONT_CAR_AS_LONG_AS_P_IS_TRUE(d,p)
...
GROUP BY x,y,z


ah, I get what you're trying to do.  If you truly don't care about the value of C, then just use MIN() or MAX().  

for the conditional part, use a CASE statement, along with MAX or MIN, like this:

SELECT MIN(CASE WHEN P=TRUE THEN D ELSE NULL END)

Because MIN() and MAX() ignore NULL values.  (Except for the special case where all the values are null.)


--
e-Mail is the equivalent of a postcard written in pencil.  This message may not have been sent by me, or intended for you.  It may have been read or even modified while in transit.  e-Mail disclaimers have the same force in law as a note passed in study hall.  If your corporate attorney says that you need an disclaimer in your signature, you need a new corporate attorney.

Re: DONT_CARE Aggregate

From
Marti Raudsepp
Date:
On Thu, Dec 20, 2012 at 3:28 AM, Robert James <srobertjames@gmail.com> wrote:
> Is there an aggregate that will return an arbitrary instance? That is,
> not necessarily the max or min, just any one? (Which might perform
> better than max or min)
>
> More importantly:
> Is there one which will return an arbitrary instance as long as it's not NULL

There's an extension on PGXN which implements first()/last()
aggregates in C: http://pgxn.org/dist/first_last_agg/

It should be slightly faster than min()/max(), but the difference is
probably not significant in more complex queries.

Regards,
Marti


Re: DONT_CARE Aggregate

From
Tom Lane
Date:
Marti Raudsepp <marti@juffo.org> writes:
> On Thu, Dec 20, 2012 at 3:28 AM, Robert James <srobertjames@gmail.com> wrote:
>> Is there an aggregate that will return an arbitrary instance? That is,
>> not necessarily the max or min, just any one? (Which might perform
>> better than max or min)
>>
>> More importantly:
>> Is there one which will return an arbitrary instance as long as it's not NULL

> There's an extension on PGXN which implements first()/last()
> aggregates in C: http://pgxn.org/dist/first_last_agg/

> It should be slightly faster than min()/max(), but the difference is
> probably not significant in more complex queries.

Another thing to consider is that the presence of any "generic"
aggregate forces a full-table scan, since the system doesn't know that
the aggregate has any particular behavior.  MIN/MAX on the other hand
can be optimized into index probes, if they are on indexed columns.
If the query otherwise uses only MIN/MAX aggregates, it's not hard
to believe that adding a FIRST() or LAST() instead of a MIN/MAX
aggregate could make the query significantly slower, not faster.

However, if you're targeting queries containing a variety of aggregates,
or if any of them are on unindexed columns, then this special case may
not be of much interest.

            regards, tom lane