Thread: problem with distinct rows

problem with distinct rows

From
tony
Date:
Hello,

I am having a problem with returning distinct rows this is probably a
newbie question but here goes:

Tables are artist, created_by and works
the join is on created_by.work_id and created_by.artist_id

A work of art can have two or more artists listed as creators in created
by. In the inventory lists we don't care we only need one reference to
each work AND (this is the part that hurts) they must be ordered
alphabetically by the _first_ artists name.

example:
artist_name                 : title            :    inventory_number
Bernadette Genée et Alain Le Borgne     : Pièce à conviction    : 004090101

Should be after "F" and before "H"

But if I do DISTINCT ON inventory_number I must order by
inventory_number then artist_name which totally defeats my purpose. I
have also played with GROUP BY and HAVING which

Clues much appreciated

Tony Grant




Re: problem with distinct rows

From
John Sidney-Woollett
Date:
If the created_by table includes an artist number/position to indicate
the first, second artist etc, eg

create table created_by (
   work_id  integer,
   artist_id  integer,
   position   integer,
   primary key (work_id, artist_id, position)
);

then you can simply use the following query

select a.artist_name, w.title, w.inventory_number
from artist a, works w, created_by cb
where w.work_id = cb.work_id
and a.artist_id = cb.artist_id
and cb.position = 1
order by a.artist_name, w.title;

If you don't have a position or similar field in the created_by table,
you will have more difficulty as you're finding.

An alternative approach is to create a function which arbitrarily
returns one artist name for a work, and then sort on that but it wont be
as efficient.

select artist_name, title, inventory_number from (
   select GetArtistName(w.inventory_number) as artist_name, w.title,
     w.inventory_number
   from works w
) as t
order by artist_name, title

BTW, I haven't checked any of this, but hopefully it will give you some
pointers or ideas.

John Sidney-Woollett

tony wrote:
> Hello,
>
> I am having a problem with returning distinct rows this is probably a
> newbie question but here goes:
>
> Tables are artist, created_by and works
> the join is on created_by.work_id and created_by.artist_id
>
> A work of art can have two or more artists listed as creators in created
> by. In the inventory lists we don't care we only need one reference to
> each work AND (this is the part that hurts) they must be ordered
> alphabetically by the _first_ artists name.
>
> example:
> artist_name                 : title            :    inventory_number
> Bernadette Genée et Alain Le Borgne     : Pièce à conviction    : 004090101
>
> Should be after "F" and before "H"
>
> But if I do DISTINCT ON inventory_number I must order by
> inventory_number then artist_name which totally defeats my purpose. I
> have also played with GROUP BY and HAVING which
>
> Clues much appreciated
>
> Tony Grant
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match

Re: problem with distinct rows

From
Martijn van Oosterhout
Date:
You can put the DISTINCT ON() query as a subquery, and an ORDER BY in
the outer query.

Sorting by surname is tricky since you need to tell the computer how to
find it...

Hope this helps,

On Tue, Mar 08, 2005 at 10:03:48AM +0100, tony wrote:
> Hello,
>
> I am having a problem with returning distinct rows this is probably a
> newbie question but here goes:
>
> Tables are artist, created_by and works
> the join is on created_by.work_id and created_by.artist_id
>
> A work of art can have two or more artists listed as creators in created
> by. In the inventory lists we don't care we only need one reference to
> each work AND (this is the part that hurts) they must be ordered
> alphabetically by the _first_ artists name.
>
> example:
> artist_name                 : title            :    inventory_number
> Bernadette Genée et Alain Le Borgne     : Pièce à conviction    : 004090101
>
> Should be after "F" and before "H"
>
> But if I do DISTINCT ON inventory_number I must order by
> inventory_number then artist_name which totally defeats my purpose. I
> have also played with GROUP BY and HAVING which
>
> Clues much appreciated
>
> Tony Grant
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: problem with distinct rows

From
John Sidney-Woollett
Date:
Add a SORTNAME field to the artist table and use that for sorting. This
will help you deal with diacrtics and accented characters by transposing
them to a regular character instead.

The sortname value for Genée would be "genee".

With a few changes to your data model (and possibly application) you
will probably end up making your life a little easier.

John Sidney-Woollett

Martijn van Oosterhout wrote:
> You can put the DISTINCT ON() query as a subquery, and an ORDER BY in
> the outer query.
>
> Sorting by surname is tricky since you need to tell the computer how to
> find it...
>
> Hope this helps,
>
> On Tue, Mar 08, 2005 at 10:03:48AM +0100, tony wrote:
>
>>Hello,
>>
>>I am having a problem with returning distinct rows this is probably a
>>newbie question but here goes:
>>
>>Tables are artist, created_by and works
>>the join is on created_by.work_id and created_by.artist_id
>>
>>A work of art can have two or more artists listed as creators in created
>>by. In the inventory lists we don't care we only need one reference to
>>each work AND (this is the part that hurts) they must be ordered
>>alphabetically by the _first_ artists name.
>>
>>example:
>>artist_name                 : title            :    inventory_number
>>Bernadette Genée et Alain Le Borgne     : Pièce à conviction    : 004090101
>>
>>Should be after "F" and before "H"
>>
>>But if I do DISTINCT ON inventory_number I must order by
>>inventory_number then artist_name which totally defeats my purpose. I
>>have also played with GROUP BY and HAVING which
>>
>>Clues much appreciated
>>
>>Tony Grant
>>
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 9: the planner will ignore your desire to choose an index scan if your
>>      joining column's datatypes do not match
>
>

Re: problem with distinct rows

From
tony
Date:
Le mardi 08 mars 2005 à 11:04 +0100, Martijn van Oosterhout a écrit :
> You can put the DISTINCT ON() query as a subquery, and an ORDER BY in
> the outer query.

That was my first guess

select number.inventorynumber, name, first_name
from (select distinct on (inventorynumber) inventorynumber from works)
as number, artist, blah
order by name

???

> Sorting by surname is tricky since you need to tell the computer how to
> find it...

in artist table we have name, first_name etc so no problem there

Tony




Re: problem with distinct rows

From
Martijn van Oosterhout
Date:
On Tue, Mar 08, 2005 at 12:22:35PM +0100, tony wrote:
> Le mardi 08 mars 2005 à 11:04 +0100, Martijn van Oosterhout a écrit :
> > You can put the DISTINCT ON() query as a subquery, and an ORDER BY in
> > the outer query.
>
> That was my first guess
>
> select number.inventorynumber, name, first_name
> from (select distinct on (inventorynumber) inventorynumber from works)
> as number, artist, blah
> order by name

and, does it work? Obviously you need to put the joins in and such. But
the outer query should be last, like:

SELECT * FROM
  (... subquery ...) AS x
ORDER BY name;

The outer query is only for ordering, the joins, etc should all be in
the subquery...

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: problem with distinct rows

From
tony
Date:
Le mardi 08 mars 2005 à 12:29 +0100, Martijn van Oosterhout a écrit :

> and, does it work? Obviously you need to put the joins in and such. But
> the outer query should be last, like:
>
> SELECT * FROM
>   (... subquery ...) AS x
> ORDER BY name;
>
> The outer query is only for ordering, the joins, etc should all be in
> the subquery...

You are my man!!!!!

Thanks for that

Beer =:-D

Tony