Re: Embarassing GROUP question - Mailing list pgsql-general

From Corey Tisdale
Subject Re: Embarassing GROUP question
Date
Msg-id AACC696E-4D4D-4DD2-BE34-3A6A158ED32D@eyewantmedia.com
Whole thread Raw
In response to Re: Embarassing GROUP question  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Embarassing GROUP question
List pgsql-general
You may have nailed it. Everythig would have been indexed I. The order
it was grouped by, so perhaps the order in which things are indexed
and accesse is the kicker, or perhaps we've been consistantly lucky.

We also weren't adding image data to blobs, we were bit mapping
faceted data to blob and shifting to allow people to shop by artist or
color or subject matter across millions of posters. Normalized tables
just weren't cutting it, and bit shifting up to 32 bit was crazy fast.
After we rolled it out in production, we found mysql converts blobs to
32 bit unsigned ints before shifting. Postgres appears to not do this
at all, or our arbitrarily large test data did not trigger it on
postgres.

After the last few days, it is becoming apparent how much of a joke
mysql has been. Thanks again for such quick direction!

Corey Tisdale

On Oct 3, 2009, at 5:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Corey Tisdale <corey@eyewantmedia.com> writes:
>> We're coming from mysql 4, and changing the sort order changes the
>> values of all columns as you would expect, given that you would
>> expect
>> a sort statement to affect grouping.  This certainly isn't the only
>> time I've used this syntax. I've been mysql user for ten years, and
>> the outcome has been consistant across hundreds of tables and
>> millions
>> of rows and thousands of queries. If you ever have to use or modify a
>> mysql db, just keep this in mind in case it saves you some time.
>
> Okay, I got sufficiently interested to drag out the nearest copy of
> mysql and try it ...
>
> mysql> create table t (f1 int, f2 int, f3 int);
> Query OK, 0 rows affected (0.00 sec)
>
> mysql> insert into t values(1,11,111), (1,22,222), (1,44,444),
> (1,33,333);
> Query OK, 4 rows affected (0.00 sec)
> Records: 4  Duplicates: 0  Warnings: 0
>
> mysql> insert into t values(2,55,555), (2,22,222), (2,44,444),
> (2,33,333);
> Query OK, 4 rows affected (0.00 sec)
> Records: 4  Duplicates: 0  Warnings: 0
>
> mysql> insert into t values(3,55,555), (3,22,222), (3,44,444),
> (3,77,777);
> Query OK, 4 rows affected (0.00 sec)
> Records: 4  Duplicates: 0  Warnings: 0
>
> mysql> select * from t group by f1 order by f2;
> +------+------+------+
> | f1   | f2   | f3   |
> +------+------+------+
> |    1 |   11 |  111 |
> |    2 |   55 |  555 |
> |    3 |   55 |  555 |
> +------+------+------+
> 3 rows in set (0.00 sec)
>
> mysql> select * from t group by f1 order by f2 desc;
> +------+------+------+
> | f1   | f2   | f3   |
> +------+------+------+
> |    2 |   55 |  555 |
> |    3 |   55 |  555 |
> |    1 |   11 |  111 |
> +------+------+------+
> 3 rows in set (0.00 sec)
>
> Looks to me like we're arbitrarily getting the physically-first row in
> each f1 group.  It's certainly not looking for the minimum or
> maximum f2.
>
> The above is with 5.1.37, but I find essentially the same wording in
> the 3.x/4.x manual as in the 5.1 manual.
>
> Now it's certainly possible that in particular circumstances you might
> happen to get the right results --- for example, a scan that was using
> an index might happen to deliver the rows in the right order.  But I
> don't see any evidence that mysql is reliably producing groupwise
> minimums or maximums with this syntax.  The long discussions in the
> comments here:
> http://dev.mysql.com/doc/refman/4.1/en/example-maximum-column-group-row.html
> don't suggest that anyone else believes it works, either.
>
>            regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Embarassing GROUP question
Next
From: Sam Mason
Date:
Subject: Re: Procedure for feature requests?