On Wed, Jan 14, 2009 at 07:36:03PM -0800, mailinglists@net-virtual.com wrote:
> CREATE TABLE listings (
> trans_id SERIAL,
> mode CHAR(1),
> listing_id INT,
> region_id INT,
> category INT
> );
>
> "SELECT * FROM listings ORDER BY region_id, category, listing_id,
> trans_id"
> [...] what I want to do is get only the last transaction for
> a given listing_id, because the earlier ones don't matter.
If you have an index on (region_id,category,listing_id,trans_id) you
should be able to do:
SELECT region_id,category,listing_id,MAX(trans_id)
FROM listings
GROUP BY region_id,category,listing_id;
And have PG answer this using the index (it'll only do this if it thinks
there are many transactions for each group though).
> On top of
> that, each region_id and category_id has its own index. I need to be able
> to process the indexes in-full, one-at-a-time because there are too many
> to hold that many open filehandles/processes at one time.
Not sure what you mean by "index" here; I'm assuming you're talking
about something outside PG, or am I missing some context?
If it's inside PG, then you do know that every index you have will slow
down every modification of the table?
> So, my question is, is there some way to return the rows in a
> deterministic order, without actually having to do an explicit sort on the
> data? What I mean is, I don't care if category_id 4 / region_id 10 /
> listing_id 10000 comes before category_id 1 / region_id 1 / lisitng_id 1
> -- I just need them returned to me in that sort of grouped order (although
> sorted by trans_id).
If you want to know all the transaction ids then you need to do the
sort, if you only want the largest/latest then you're probably better
off telling PG that's what you want (i.e. using GROUP BY and MAX
aggregate and letting it make an appropiate decision).
--
Sam http://samason.me.uk/