Re: extracting min date and grouping - Mailing list pgsql-novice
From | Michael Fuhr |
---|---|
Subject | Re: extracting min date and grouping |
Date | |
Msg-id | 20041222065240.GA95986@winnie.fuhr.org Whole thread Raw |
In response to | extracting min date and grouping ("Keith Worthington" <keithw@narrowpathinc.com>) |
List | pgsql-novice |
On Tue, Dec 21, 2004 at 10:37:56PM -0500, Keith Worthington wrote: > scan_timestamp | item_id | quantity | employee_id | void > ---------------------+---------+----------+-------------+------ > 2004-12-20 16:09:47 | SEB12 | 555 | 116 | f > 2004-12-20 16:10:03 | B346.0 | 555 | 116 | f > 2004-12-20 16:10:11 | B346.5 | 888 | 116 | f > 2004-12-20 16:09:33 | SAC38 | 66 | 116 | f > 2004-12-19 09:05:29 | SNAP50 | 2255 | 116 | f > 2004-12-19 09:05:39 | RSN2222 | 525 | 116 | f > 2004-12-19 09:05:49 | SAC38 | 658 | 116 | f It would be helpful to see the SQL statements used to create and populate the sample table -- that way readers can paste them into their database and experiment with the queries you're after. > I can get the min date but isn't there an easier/faster way? > IPADB=# SELECT CAST(extract(year from min(scan_timestamp)) || extract(month > from min(scan_timestamp)) || extract(day from min(scan_timestamp)) AS date) AS > inv_date FROM inventory.tbl_scanner; SELECT MIN(scan_timestamp)::DATE AS inv_date FROM inventory.tbl_scanner; If the table is large then you could speed up the operation by creating an index on DATE(scan_timestamp) and using ORDER BY and LIMIT (a PostgreSQL trick to speed up certain aggregate queries): CREATE INDEX tbl_scanner_date_idx ON inventory.tbl_scanner (DATE(scan_timestamp)); SELECT DATE(scan_timestamp) AS inv_date FROM inventory.tbl_scanner ORDER BY DATE(scan_timestamp) LIMIT 1; If you use EXPLAIN ANALYZE on the various queries, you should see that the last one uses the index while the others do a sequential scan. > The group and the sum is straightforward but I get this > IPADB=# SELECT CAST(extract(year from min(scan_timestamp)) || extract(month > from min(scan_timestamp)) || extract(day from min(scan_timestamp)) AS date) AS > inventory_date, item_id, sum(quantity) as total FROM inventory.tbl_scanner > GROUP BY item_id; > inventory_date | item_id | total > ----------------+---------+------- > 2004-12-20 | B346.0 | 555 > 2004-12-20 | B346.5 | 888 > 2004-12-19 | RSN2222 | 525 > 2004-12-19 | SAC38 | 724 > 2004-12-20 | SEB12 | 555 > 2004-12-19 | SNAP50 | 2255 > (6 rows) > > When what I really want is this. > inventory_date | item_id | total > ----------------+---------+------- > 2004-12-19 | B346.0 | 555 > 2004-12-19 | B346.5 | 888 > 2004-12-19 | RSN2222 | 525 > 2004-12-19 | SAC38 | 724 > 2004-12-19 | SEB12 | 555 > 2004-12-19 | SNAP50 | 2255 Here's one way to get that result for your example data set -- it gets the table's minimum date in a subselect and cross joins it to the table's records: SELECT inventory_date, item_id, SUM(quantity) AS total FROM (SELECT MIN(scan_timestamp)::DATE AS inventory_date FROM inventory.tbl_scanner) AS s CROSS JOIN inventory.tbl_scanner GROUP BY inventory_date, item_id ORDER BY item_id; Why do you want all records to have an inventory_date of 2004-12-19? Because that's the beginning of an inventory period? If so, how is that beginning date determined, i.e., how do you know when one period ends and another begins? If we knew more about what you're trying to do then maybe we could suggest improvements. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
pgsql-novice by date: