Thread: query to select a linked list

query to select a linked list

From
Louis-David Mitterrand
Date:
Hi,

To build a threaded forum application I came up the following schema:

forum
------
id_forum | integer| not null  default nextval('forum_id_forum_seq'::regclass)
id_parent| integer| 
subject  | text   | not null
message  | text   | 

Each message a unique id_forum and an id_parent pointing to the replied 
post (empty if first post).

How can I build an elegant query to select all messages in a thread?

Thanks,


Re: query to select a linked list

From
Gregory Stark
Date:
"Louis-David Mitterrand" <vindex+lists-pgsql-sql@apartia.org> writes:

> Each message a unique id_forum and an id_parent pointing to the replied 
> post (empty if first post).
>
> How can I build an elegant query to select all messages in a thread?

You would need recursive queries which Postgres doesn't support. There is a
patch out there to add support but I don't think it's up-to-date with 8.2 and
in any case the resulting queries can be quite intense.

I would recommend you look into the contrib module named "ltree". It's easy to
use and works well with the gist indexes. It does require changing your data
model denormalizing it slightly which makes it hard to "reparent" children,
but if that isn't an operation you have to support I think it makes most other
operations you might want to do much easier to support.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: query to select a linked list

From
Louis-David Mitterrand
Date:
On Wed, May 09, 2007 at 02:24:22PM +0100, Gregory Stark wrote:
> "Louis-David Mitterrand" <vindex+lists-pgsql-sql@apartia.org> writes:
> 
> > Each message a unique id_forum and an id_parent pointing to the replied 
> > post (empty if first post).
> >
> > How can I build an elegant query to select all messages in a thread?
> 
> You would need recursive queries which Postgres doesn't support. There is a
> patch out there to add support but I don't think it's up-to-date with 8.2 and
> in any case the resulting queries can be quite intense.
> 
> I would recommend you look into the contrib module named "ltree". It's easy to
> use and works well with the gist indexes. It does require changing your data
> model denormalizing it slightly which makes it hard to "reparent" children,
> but if that isn't an operation you have to support I think it makes most other
> operations you might want to do much easier to support.

After looking around a little I came to the same conclusions.

Thanks for you help,

Cheers,


Re: query to select a linked list

From
"Aaron Bono"
Date:
On 5/9/07, Louis-David Mitterrand <vindex+lists-pgsql-sql@apartia.org> wrote:
Hi,

To build a threaded forum application I came up the following schema:

forum
------
id_forum | integer| not null  default nextval('forum_id_forum_seq'::regclass)
id_parent| integer|
subject  | text   | not null
message  | text   |

Each message a unique id_forum and an id_parent pointing to the replied
post (empty if first post).

How can I build an elegant query to select all messages in a thread?

Thanks,

Unlike Oracle, PostgreSQL doesn't have anything like a connect by so you would need to write your own stored procedure for that (if new versions of PostgreSQL will have connect by, let me know guys).

What I did was add a little redundancy to my forum tables and had a table structure kind of like this:

forum
   forum_id BIGSERIAL PK,
   name VARCHAR(50)

forum_topic
   forum_topic_id BIGSERIAL PK,
   forum_id BIGINT FK to forum

forum_post
   forum_post_id BIGSERIAL PK,
   create_dt TIMESTAMP,
   subject VARCHAR(255),
   message TEXT,
   forum_topic_id BIGINT FK to forum_topic

and if you want threading, you add a parent_forum_post_id to forum_post (this is where you get the redundancy since only the top forum_post record needs a reference to forum_topic and forum_topic wouldn't even really be needed.

-Aaron
  

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================

Re: query to select a linked list

From
Achilleas Mantzios
Date:
Στις Τετάρτη 09 Μάιος 2007 15:55, ο/η Louis-David Mitterrand έγραψε:
> Hi,
>
> To build a threaded forum application I came up the following schema:
>
> forum
> ------
> id_forum | integer| not null  default
> nextval('forum_id_forum_seq'::regclass) id_parent| integer|
> subject  | text   | not null
> message  | text   |
>
> Each message a unique id_forum and an id_parent pointing to the replied
> post (empty if first post).
>
> How can I build an elegant query to select all messages in a thread?
>

Your question is about storing hierarchies in postgresql.
One way is to use the genealogical approach, where we store for
any node the path to its root.
I have used this technique to store description of tanker vessels machinery
(over 1M items) and the plan maintenance on them, and the performance is very
good, while the representation is highly intuitive and flexible,
unlike some wierd approcahes i have hit on.
When i did a small research on the complexity/index usage on various
operations (UPDATE, INSERT, DELETE, SELECT), the performance
was at least as good as the "nested pair" approch that many seemed to
promote.

You add a column "parents" (rather than just the parentid) as an integer[].
For every node you store the path to the root node starting from the most
immediate ancestor.

Then you just make an index on this column using the intarray contrib package.
Then you can easily query for nodes under a specific node, or for nodes just
one level below a specific node, nodes with no descendents (leaf nodes)
etc...

Of course you could do smth simpler, but in the long run,
representing data in the correct way will certainly pay off.

> Thanks,
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly

--
Achilleas Mantzios


Re: query to select a linked list

From
Louis-David Mitterrand
Date:
On Wed, May 09, 2007 at 02:55:20PM +0200, Louis-David Mitterrand wrote:
> Hi,
> 
> To build a threaded forum application I came up the following schema:
> 
> forum
> ------
> id_forum | integer| not null  default nextval('forum_id_forum_seq'::regclass)
> id_parent| integer| 
> subject  | text   | not null
> message  | text   | 
> 
> Each message a unique id_forum and an id_parent pointing to the replied 
> post (empty if first post).
> 
> How can I build an elegant query to select all messages in a thread?

I am trying to write a recursive pl/sql function to return all thread 
children:

create or replace function forum_children(integer) returns setof forum as $$
declare   rec record;
begin
   for rec in select * from forum where $1 in (id_parent,id_forum) loop
       select * from forum_children(rec.id_forum);       return next rec;
   end loop;
   return;

end;
$$ language 'plpgsql';


But it does not work as intended (infinite loop?).

What did I miss?


Re: query to select a linked list

From
Louis-David Mitterrand
Date:
On Wed, May 09, 2007 at 04:30:21PM +0200, Louis-David Mitterrand wrote:
> On Wed, May 09, 2007 at 02:55:20PM +0200, Louis-David Mitterrand wrote:
> > Hi,
> > 
> > To build a threaded forum application I came up the following schema:
> > 
> > forum
> > ------
> > id_forum | integer| not null  default nextval('forum_id_forum_seq'::regclass)
> > id_parent| integer| 
> > subject  | text   | not null
> > message  | text   | 
> > 
> > Each message a unique id_forum and an id_parent pointing to the replied 
> > post (empty if first post).
> > 
> > How can I build an elegant query to select all messages in a thread?
> 
> I am trying to write a recursive pl/sql function to return all thread 
> children:
> 
> create or replace function forum_children(integer) returns setof forum as $$
> declare
>     rec record;
> begin
> 
>     for rec in select * from forum where $1 in (id_parent,id_forum) loop

Oops, I meant :
for rec in select * from forum where id_parent=$1 loop

which works fine.

Sorry,

>         select * from forum_children(rec.id_forum);
>         return next rec;
> 
>     end loop;
> 
>     return;
> 
> end;
> $$ language 'plpgsql';
> 
> 
> But it does not work as intended (infinite loop?).
> 
> What did I miss?
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend


Re: query to select a linked list

From
Scott Marlowe
Date:
On Wed, 2007-05-09 at 08:24, Gregory Stark wrote:
> "Louis-David Mitterrand" <vindex+lists-pgsql-sql@apartia.org> writes:
> 
> > Each message a unique id_forum and an id_parent pointing to the replied 
> > post (empty if first post).
> >
> > How can I build an elegant query to select all messages in a thread?
> 
> You would need recursive queries which Postgres doesn't support. There is a
> patch out there to add support but I don't think it's up-to-date with 8.2 and
> in any case the resulting queries can be quite intense.
> 
> I would recommend you look into the contrib module named "ltree". It's easy to
> use and works well with the gist indexes. It does require changing your data
> model denormalizing it slightly which makes it hard to "reparent" children,
> but if that isn't an operation you have to support I think it makes most other
> operations you might want to do much easier to support.

Are you sure the tablefunc functions, which include both connectby and
crosstab functions, aren't up to date with 8.2?  They certainly are up
to 8.1, where I'm running them right now on my workstation.  They built
for 8.2 and installed, but I haven't tried using them.

I would think that connectby is at least worth looking into.


Re: query to select a linked list

From
Scott Marlowe
Date:
On Wed, 2007-05-09 at 08:29, Aaron Bono wrote:
> On 5/9/07, Louis-David Mitterrand <vindex+lists-pgsql-sql@apartia.org>
> wrote:
>         Hi,
>         
>         To build a threaded forum application I came up the following
>         schema:
>         
>         forum
>         ------
>         id_forum | integer| not null  default
>         nextval('forum_id_forum_seq'::regclass)
>         id_parent| integer|
>         subject  | text   | not null 
>         message  | text   |
>         
>         Each message a unique id_forum and an id_parent pointing to
>         the replied
>         post (empty if first post).
>         
>         How can I build an elegant query to select all messages in a
>         thread?
>         
>         Thanks, 
> 
> Unlike Oracle, PostgreSQL doesn't have anything like a connect by so
> you would need to write your own stored procedure for that (if new
> versions of PostgreSQL will have connect by, let me know guys). 

Take a look at the tablefunc in contrib, that seems to provide
connectby.  I've only ever used the crosstab functions in there, but I
can't see what would have broken in connectby with 8.2 or anything.


Re: query to select a linked list

From
Gregory Stark
Date:
"Scott Marlowe" <smarlowe@g2switchworks.com> writes:

> Are you sure the tablefunc functions, which include both connectby and
> crosstab functions, aren't up to date with 8.2?  They certainly are up
> to 8.1, where I'm running them right now on my workstation.  They built
> for 8.2 and installed, but I haven't tried using them.
>
> I would think that connectby is at least worth looking into.

Uhm, no, I guess I'm not sure. I didn't realize it was in the tablefunc module
either.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: query to select a linked list

From
Robert Edwards
Date:
Hi Louis-David,

I also have written a forum application using PostgreSQL.

My schema has a "threadid" for each posting, which is actually also the
"messageid" of the first posting in the thread, but that is irrelevant.

I can then just select all messages belonging to that thread. The actual
hierarchy of messages (which posting is in response to which) is dealt
with by a "parentid", identifying the messageid of the post being
responded to. Sorting that out is done by the middleware (PHP in this
case) - the SQL query simply returns all messages in the thread in a
single query. Because our database is somewhat busy, I have opted to
keep the queries to the database simple and let the middleware sort
out the heirarchical structure (which it is quite good at).

I hope this helps.

Bob Edwards.

Louis-David Mitterrand wrote:
> Hi,
> 
> To build a threaded forum application I came up the following schema:
> 
> forum
> ------
> id_forum | integer| not null  default nextval('forum_id_forum_seq'::regclass)
> id_parent| integer| 
> subject  | text   | not null
> message  | text   | 
> 
> Each message a unique id_forum and an id_parent pointing to the replied 
> post (empty if first post).
> 
> How can I build an elegant query to select all messages in a thread?
> 
> Thanks,
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly



Re: query to select a linked list

From
Louis-David Mitterrand
Date:
On Thu, May 10, 2007 at 09:49:32AM +1000, Robert Edwards wrote:
> 
> Hi Louis-David,
> 
> I also have written a forum application using PostgreSQL.
> 
> My schema has a "threadid" for each posting, which is actually also the
> "messageid" of the first posting in the thread, but that is irrelevant.
> 
> I can then just select all messages belonging to that thread. The actual
> hierarchy of messages (which posting is in response to which) is dealt
> with by a "parentid", identifying the messageid of the post being
> responded to. Sorting that out is done by the middleware (PHP in this
> case) - the SQL query simply returns all messages in the thread in a
> single query. Because our database is somewhat busy, I have opted to
> keep the queries to the database simple and let the middleware sort
> out the heirarchical structure (which it is quite good at).
> 
> I hope this helps.

This helps a lot, thanks.

I just wrote a little pl/sql function to compensate for the absence of a 
threadid in my schema:
create or replace function forum_children(integer) returns setof forum as $$declare    rec record;    subrec
record;begin   for rec in select * from forum where id_parent=$1 loop        return next rec;        for subrec in
select* from forum_children(rec.id_forum) loop            return next subrec;        end loop;    end loop;
return;end;$$language 'plpgsql';
 

But in the end it might just be more convenient and clear to have that 
threadid column as you did. 

Sorting in middleware (perl in my case) also seems like good compromise.

Cheers,