Re: A challenge for the SQL gurus out there... - Mailing list pgsql-general

From Uwe C. Schroeder
Subject Re: A challenge for the SQL gurus out there...
Date
Msg-id 200809071231.49379.uwe@oss4u.com
Whole thread Raw
In response to Re: A challenge for the SQL gurus out there...  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-general

On Sunday 07 September 2008, Gregory Stark wrote:
> "Uwe C. Schroeder" <uwe@oss4u.com> writes:
> > I want to get a list looking like
> >
> > forum id    thread_id    post_id
> > 1        6    443
> > 2        9    123
> > 3        3    557
>
> ...
>
> > It all boils down to me not being able to come up with a query that gives
> > me the latest post per forum_id.
>
> In a situation like this I would probably denormalize the tables slightly
> by adding a form_id key to the individual posts. That would make it hard to
> ever move a thread from one forum to another, though not impossible, but
> would help in this case as well as any other time you want to do an
> operation on all posts in a forum regardless of thread.
>
> If you add that column then you could index <form_id,date> and get the
> result you're looking for instantly with a DISTINCT ON query (which is a
> Postgres SQL extension).
>
> SELECT DISTINCT ON (form_id)
>        forum_id, thread_id, post_id
>   FROM thread
>  ORDER BY forum_id, date DESC
>
> (actually you would have to make the index on <form_id, date DESC> or make
> both columns DESC in the query and then re-order them in an outer query)
>
> Alternatively you could have a trigger on posts which updates a
> last_updated field on every thread (and possibly a recent_post_id) then you
> could have a query on forums which pulls the most recently updated thread
> directly without having to join on form_post at all. That would slow down
> inserts but speed up views -- possibly a good trade-off for a forum system.

Thanks Gregory.
Just to put my final solution on the list: I ended up with a combined approach
of what you suggested:
I added the forum_id to the posts table and created 2 triggers: one that sets
the forum_id in the posts table to the forum_id in the threads table on
insert (therefor no change in the application was necessary).
The second trigger is to overcome the downside of adding the forum_id to the
posts table. On an update to forum_thread.forum_id the trigger updates all
posts in that thread to reflect the change in forum_id. That way one can just
move the whole thread by changing the forum_id and the posts are moved along
by the trigger.

Very nice! The query time is now 198ms instead of up to 48seconds !!!

Thanks for the idea

Uwe

pgsql-general by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [HACKERS] New shapshot RPMs (Sep 7 2008) are ready for testing
Next
From: "c k"
Date:
Subject: secure connections