Home > mailing lists

Re: extracting min date and grouping - Mailing list pgsql-novice

From	Michael Fuhr
Subject	Re: extracting min date and grouping
Date	December 22, 2004 06:53:06
Msg-id	20041222065240.GA95986@winnie.fuhr.org Whole thread Raw
In response to	extracting min date and grouping ("Keith Worthington" <keithw@narrowpathinc.com>)
List	pgsql-novice

Tree view

On Tue, Dec 21, 2004 at 10:37:56PM -0500, Keith Worthington wrote:

>    scan_timestamp    | item_id | quantity | employee_id | void
> ---------------------+---------+----------+-------------+------
>  2004-12-20 16:09:47 | SEB12   |      555 | 116         | f
>  2004-12-20 16:10:03 | B346.0  |      555 | 116         | f
>  2004-12-20 16:10:11 | B346.5  |      888 | 116         | f
>  2004-12-20 16:09:33 | SAC38   |       66 | 116         | f
>  2004-12-19 09:05:29 | SNAP50  |     2255 | 116         | f
>  2004-12-19 09:05:39 | RSN2222 |      525 | 116         | f
>  2004-12-19 09:05:49 | SAC38   |      658 | 116         | f

It would be helpful to see the SQL statements used to create and
populate the sample table -- that way readers can paste them into
their database and experiment with the queries you're after.

> I can get the min date but isn't there an easier/faster way?
> IPADB=# SELECT CAST(extract(year from min(scan_timestamp)) || extract(month
> from min(scan_timestamp)) || extract(day from min(scan_timestamp)) AS date) AS
> inv_date FROM inventory.tbl_scanner;

SELECT MIN(scan_timestamp)::DATE AS inv_date FROM inventory.tbl_scanner;

If the table is large then you could speed up the operation by
creating an index on DATE(scan_timestamp) and using ORDER BY and
LIMIT (a PostgreSQL trick to speed up certain aggregate queries):

CREATE INDEX tbl_scanner_date_idx
  ON inventory.tbl_scanner (DATE(scan_timestamp));

SELECT DATE(scan_timestamp) AS inv_date
FROM inventory.tbl_scanner
ORDER BY DATE(scan_timestamp) LIMIT 1;

If you use EXPLAIN ANALYZE on the various queries, you should see
that the last one uses the index while the others do a sequential
scan.

> The group and the sum is straightforward but I get this
> IPADB=# SELECT CAST(extract(year from min(scan_timestamp)) || extract(month
> from min(scan_timestamp)) || extract(day from min(scan_timestamp)) AS date) AS
> inventory_date, item_id, sum(quantity) as total FROM inventory.tbl_scanner
> GROUP BY item_id;
>  inventory_date | item_id | total
> ----------------+---------+-------
>  2004-12-20     | B346.0  |   555
>  2004-12-20     | B346.5  |   888
>  2004-12-19     | RSN2222 |   525
>  2004-12-19     | SAC38   |   724
>  2004-12-20     | SEB12   |   555
>  2004-12-19     | SNAP50  |  2255
> (6 rows)
>
> When what I really want is this.
>  inventory_date | item_id | total
> ----------------+---------+-------
>  2004-12-19     | B346.0  |   555
>  2004-12-19     | B346.5  |   888
>  2004-12-19     | RSN2222 |   525
>  2004-12-19     | SAC38   |   724
>  2004-12-19     | SEB12   |   555
>  2004-12-19     | SNAP50  |  2255

Here's one way to get that result for your example data set -- it
gets the table's minimum date in a subselect and cross joins it to
the table's records:

SELECT inventory_date, item_id, SUM(quantity) AS total
FROM (SELECT MIN(scan_timestamp)::DATE AS inventory_date
      FROM inventory.tbl_scanner) AS s
CROSS JOIN inventory.tbl_scanner
GROUP BY inventory_date, item_id
ORDER BY item_id;

Why do you want all records to have an inventory_date of 2004-12-19?
Because that's the beginning of an inventory period?  If so, how
is that beginning date determined, i.e., how do you know when one
period ends and another begins?  If we knew more about what you're
trying to do then maybe we could suggest improvements.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

pgsql-novice by date:

From: "Keith Worthington"
Date: 22 December 2004, 05:55:16
Subject: Problem with ALIAS?

From: Michael Fuhr
Date: 22 December 2004, 07:07:23
Subject: Re: Problem with ALIAS?

Re: extracting min date and grouping - Mailing list pgsql-novice

Previous

Next