Hi,
I've got a table which includes complete versioning and history for each
entry. The fields in question are:
- createdate (date of creation of this row)
- vno (version number, increased by 1 for each new version)
- active ('Y' for newest version, 'N' for all older ones, simply set to
'N' for all versions, if a row is deleted)
- dam_id (key value, but not unique, same value for each
version of an entry)
I want to select the following data:
(1) All the changes, which were made since a specific date in the past,
ordered by their date of creation descending and a name field in ascending
way.
(2) As I want to set a special mark in the output if that was a new entry
or just the change of an old existing entry, I need to check the version
number for each date of creation.
(3) As there can be multiple versions on one day, I need to check for the
lowest version number on each date of creation to see, wether it really
was a change or a new entry.
(4) Finally I need to exclude all the entries, if they don't have a valid
entry on the current day any more, which means they were deleted after the
date the last entry was made -> short: only entrys which have a active='Y'
entry are allowed to be selected.
Thought quite a while about it and came to the following conclusion:
SELECT d.field1,
d.field2, ...
d.createdate,
(SELECT d4.createdate
FROM dam d4
WHERE d.dam_id = d4.dam_id
ORDER BY createdate ASC LIMIT 1) AS firstdate
FROM dam d
WHERE createdate >= '1999-06-01'
AND vno=(SELECT d2.vno
FROM dam d2
WHERE d2.dam_id=d.dam_id
AND d2.createdate=d.createdate
ORDER BY vno ASC LIMIT 1)
AND EXISTS
(SELECT *
FROM dam d3
WHERE d3.dam_id=d.dam_id
AND d3.active=d.active
AND d.active='Y')
ORDER BY
createdate DESC,
dam ASC
LIMIT 200;
The explain plan looks like:
NOTICE: QUERY PLAN:
Limit (cost=137625.77..137625.77 rows=200 width=73)
-> Sort (cost=137625.77..137625.77 rows=395 width=73)
-> Seq Scan on dam d (cost=0.00..137608.74 rows=395 width=73)
SubPlan
-> Limit (cost=7.24..7.24 rows=1 width=4)
-> Sort (cost=7.24..7.24 rows=1 width=4)
-> Index Scan using ix_dam_dam_id on dam d4
(cost=0.00..7.23 rows=1 width=4)
-> Limit (cost=7.24..7.24 rows=1 width=4)
-> Sort (cost=7.24..7.24 rows=1 width=4)
-> Index Scan using ix_dam_dam_id on dam d2
(cost=0.00..7.23 rows=1 width=4)
-> Result (cost=0.00..7.23 rows=1 width=287)
-> Index Scan using ix_dam_dam_id on dam d3
(cost=0.00..7.23 rows=1 width=287)
Maybe someone wants to discuss about that approach with me or wants to
point out possible errors. All opinions are welcome.
Kind regards
... Ralph ...