Home > mailing lists

Re: how do i count() similar items - Mailing list pgsql-general

From	Darren Duncan
Subject	Re: how do i count() similar items
Date	September 8, 2010 19:24:00
Msg-id	4C87DC98.7000302@darrenduncan.net Whole thread Raw
In response to	how do i count() similar items (jackassplus <jackassplus@gmail.com>)
List	pgsql-general

Tree view

jackassplus wrote:
> lets say I hve the following in the 'fruits' table:
>
> Round orange
> Sunkist orange
> navel orange
> strawberry
> blueberry
> sunkist orange
> apple
>
> how would I get something like the following:
>
> count as c | Fruit type
> ---------------------------------
> 4              | orange
> 2              | berry
> 1              | apple

Your best solution is to have separate database fields for your 2 levels of
detail, say call them "fruit type" and "variety", or maybe other fields
depending on purpose.

This is because the organization of fruit is somewhat arbitrary and isn't easily
encoded into variety names without kludges.  Also, many fruit with similar names
are actually very different.  So simple textual analysis of the field you have
often won't be very useful.

For example, a kiwifruit is a berry, and a strawberry is very different from a
blueberry, the first not actually being a berry at all.  There are also various
citrus which are crosses between oranges and other citrus.  There are also stone
fruits which are crosses between plums and apricots.  Also, "Sunkist" isn't a
variety of orange but rather is a brand name used for multiple varieties.

(Yes, members of my family cultivate fruit trees, so I have a lot of second-hand
experience with this.)

If you still want to go by textual analysis as you suggest, it will be
nontrivial and involve pattern matching for common suffixes where some are
separate words and some aren't, and you can do this pattern matching in an extra
select-list item which you then group by.

-- Darren Duncan

pgsql-general by date:

From: Scott Marlowe
Date: 08 September 2010, 19:02:00
Subject: Re: Memory Errors

From: Tom Lane
Date: 08 September 2010, 19:30:25
Subject: Re: Memory Errors

Re: how do i count() similar items - Mailing list pgsql-general

Previous

Next