I have below a simplified version of what I'm trying to do. Basically, I am
trying to get both an aggregate (an average) and "most recent" value.
g | v | ts
---+----+----------------------------
1 | 10 | 2003-08-20 16:00:27.010769
1 | 20 | 2003-08-20 16:00:30.380476
2 | 40 | 2003-08-20 16:00:37.399717
2 | 80 | 2003-08-20 16:00:40.265717
I would like, as output, something like this:
g | v | avg | ts
---+----+--------------------+----------------------------
1 | 20 | 15.000000000000000 | 2003-08-20 16:00:30.380476
2 | 80 | 60.000000000000000 | 2003-08-20 16:00:40.265717
which I got by a query like:
SELECT
t2.g,t2.v,t1.avg,t2.ts
FROM
(SELECT
g,avg(v)
FROM t
GROUP BY g
) t1,
(SELECT
DISTINCT ON (g)
* FROM t
ORDER BY g,ts DESC
) t2
WHERE t1.g = t2.g;
That produces the results that I need, but it seems inefficient to join a
table with itself like that. My real query (not this simplified example)
takes 5+ seconds and I suspect this join is why.
Is there a better way?
For my real query, it's using index scans where I'd expect, and I frequently
VACUUM ANALYZE the big table and I have all the stats turned on. Also, I have
more shared buffers than needed to put everything in RAM.
Right now I'm using 7.2.1. Any improvements in 7.3 or 7.4 that would help this
issue?
Regards,
Jeff Davis