Thread: query help/sugestions

query help/sugestions

From
"John Cavacas"
Date:
Hello,
 
I'm developing a small web application that essentialy lets users post articles and then lets other users post comments regarding those articles. This application is already running actualy, i'm working on a new version. The current version runs on PHP+MySQL the new version will be using J2EE with PostgreSQL.
 
Anyway, i'm trying to solve a problem which I had with my previous version. The problem boils down to how to efficiently display an article in either in its full form or in a article listing page (think front page of slashdot) while at the same time finding out what the comment count is for that article.
 
I have 2 tables a "content" table which holds the articles, and a "comments" table that you guessed it holds the comments. In the past what I did was to create one query that returned all of the articles (remember I was using MySQL), then while I looped through the results of that query, in PHP I would create another SQL object and issue another query that went out and did a select count(newsid) for each article. This did what I wanted but I always thought it was a bit of a hack and it did not perform well under stress.
 
Is there a better way to this? I have a few ideas...
 
I was thinking of using a SQL sub select, but to be honest I can't seem to construct the proper query to give me what I need, which would be a result set that would look something like this:
 
newsid | user | date | title | intro | commentcount
 
Another idea I had was to just create a count column in my content table, which gets incremented each time a comment is posted.
 
Taking the above idea a step forward, i could problably create a trigger to keep that column counter updated whenver a new comment is inserted. Now that I think of it, it problably wouldn't work as the trigger could not be generic.
 
Anyway, I was just wondering if anyone out here had any ideas or past experience that wouldn't mind sharing.
 
Thanks a bunch,
John

Re: query help/sugestions

From
Roberto Mello
Date:
On Sat, Feb 08, 2003 at 12:50:20AM -0500, John Cavacas wrote:

Can you please fix your e-mail client so it breaks lines at 72 characters?

> Anyway, i'm trying to solve a problem which I had with my previous version. The problem boils down to how to
efficientlydisplay an article in either in its full form or in a article listing page (think front page of slashdot)
whileat the same time finding out what the comment count is for that article.
 
> 
> I have 2 tables a "content" table which holds the articles, and a "comments" table that you guessed it holds the
comments.In the past what I did was to create one query that returned all of the articles (remember I was using MySQL),
thenwhile I looped through the results of that query, in PHP I would create another SQL object and issue another query
thatwent out and did a select count(newsid) for each article. This did what I wanted but I always thought it was a bit
ofa hack and it did not perform well under stress.
 

Well, no wonder. For each article you'd issue a new query that had to be
sent to, parsed, planned, optimized, executed, and returned by the
database. All you needed was to join the tables, use a group by or some
other technique.

I suggest you read C.J. Date's "An Introduction to Database Systems".

> Is there a better way to this? I have a few ideas...

Definitely.

> I was thinking of using a SQL sub select, but to be honest I can't seem to construct the proper query to give me what
Ineed, which would be a result set that would look something like this:
 
> 
> newsid | user | date | title | intro | commentcount

You could use a GROUP BY but that could become expensive.

> Another idea I had was to just create a count column in my content table, which gets incremented each time a comment
isposted.
 
> 
> Taking the above idea a step forward, i could problably create a trigger to keep that column counter updated whenver
anew comment is inserted. Now that I think of it, it problably wouldn't work as the trigger could not be generic.
 
> 
> Anyway, I was just wondering if anyone out here had any ideas or past experience that wouldn't mind sharing.

This has been solved many times by several open source projects (OpenACS
being one of them).

You could, for the page that shoms articles with comment count, do a query
that would join with the comments table, doing a COUNT on it (sorry you
didn't send your table definitions).

For the full article page, you could query for the comment and all the
comments, caching part or all of it in ram if you foresee it'll be
requested often.

-Roberto
-- 
+----|        Roberto Mello   -    http://www.brasileiro.net/  |------+
+       Computer Science Graduate Student, Utah State University      +
+       USU Free Software & GNU/Linux Club - http://fslc.usu.edu/     +
if(crash) grab_ankles();kiss_xxx_goodbye()


Re: query help/sugestions

From
"John Cavacas"
Date:
Roberto,

Thanks for the sugestions. I was actually starting to look down that
path. Here are my table defs:

CREATE TABLE "content" ("id"                int4 DEFAULT nextval('public.content_id_seq')  NOT
NULL ,"userid"            int4 NOT NULL ,"catid"             int4 NOT NULL ,"type"              int4 NOT NULL ,"status"
          int4 NOT NULL ,"datecreated"       timestamp without time zone DEFAULT now()  NOT NULL
 
,"lastmodified"      timestamp without time zone DEFAULT now()  NOT NULL
,"title"             text NOT NULL ,"intro"             text NOT NULL ,"body"              text NOT NULL ,
CONSTRAINT "userid" FOREIGN KEY ("userid") REFERENCES users(id) MATCH
FULL ,
CONSTRAINT "categories_content_fkey" FOREIGN KEY ("catid") REFERENCES
categories(id) MATCH FULL ,
CONSTRAINT "types_content_fkey" FOREIGN KEY ("type") REFERENCES
types(type) MATCH FULL ,
CONSTRAINT "status_content_fkey" FOREIGN KEY ("status") REFERENCES
status(id) MATCH FULL ,
CONSTRAINT "content_pkey" PRIMARY KEY ("id")
);

CREATE TABLE "comments" ("id"                int4 DEFAULT nextval('public.comments_id_seq')  NOT
NULL ,"cid"               int4 NOT NULL ,"pid"               int4 NOT NULL ,"userid"            int4 NOT NULL
,"datecreated"      timestamp without time zone DEFAULT now()  NOT NULL
 
,"lastmodified"      timestamp without time zone DEFAULT now()  NOT NULL
,"title"             text NOT NULL ,"commentbody"       text NOT NULL ,"ip"                varchar(16) NOT NULL ,
CONSTRAINT "content_comments_fkey" FOREIGN KEY ("cid") REFERENCES
content(id) MATCH FULL ,
CONSTRAINT "comments_pkey" PRIMARY KEY ("id")
);

I set my client to break lines at 72 chars, i hope it worked...

This query:

select c.id, c.title, COUNT(cc.cid) AS ccount FROM content c
LEFT JOIN comments cc ON (c.id=cc.cid)  GROUP BY  c.id, c.title

Does what I need, however I need to query 3 other tables as well to get
the data that I need to display, trying to make that work now.

Is that what you ment?

Thanks a bunch,
John



----- Original Message -----
From: "Roberto Mello" <rmello@cc.usu.edu>
To: "John Cavacas" <oogly@rogers.com>
Cc: <pgsql-sql@postgresql.org>
Sent: February 8, 2003 1:35 AM
Subject: Re: [SQL] query help/sugestions


> On Sat, Feb 08, 2003 at 12:50:20AM -0500, John Cavacas wrote:
>
> Can you please fix your e-mail client so it breaks lines at 72
characters?
>
> > Anyway, i'm trying to solve a problem which I had with my previous
version. The problem boils down to how to efficiently display an article
in either in its full form or in a article listing page (think front
page of slashdot) while at the same time finding out what the comment
count is for that article.
> >
> > I have 2 tables a "content" table which holds the articles, and a
"comments" table that you guessed it holds the comments. In the past
what I did was to create one query that returned all of the articles
(remember I was using MySQL), then while I looped through the results of
that query, in PHP I would create another SQL object and issue another
query that went out and did a select count(newsid) for each article.
This did what I wanted but I always thought it was a bit of a hack and
it did not perform well under stress.
>
> Well, no wonder. For each article you'd issue a new query that had to
be
> sent to, parsed, planned, optimized, executed, and returned by the
> database. All you needed was to join the tables, use a group by or
some
> other technique.
>
> I suggest you read C.J. Date's "An Introduction to Database Systems".
>
> > Is there a better way to this? I have a few ideas...
>
> Definitely.
>
> > I was thinking of using a SQL sub select, but to be honest I can't
seem to construct the proper query to give me what I need, which would
be a result set that would look something like this:
> >
> > newsid | user | date | title | intro | commentcount
>
> You could use a GROUP BY but that could become expensive.
>
> > Another idea I had was to just create a count column in my content
table, which gets incremented each time a comment is posted.
> >
> > Taking the above idea a step forward, i could problably create a
trigger to keep that column counter updated whenver a new comment is
inserted. Now that I think of it, it problably wouldn't work as the
trigger could not be generic.
> >
> > Anyway, I was just wondering if anyone out here had any ideas or
past experience that wouldn't mind sharing.
>
> This has been solved many times by several open source projects
(OpenACS
> being one of them).
>
> You could, for the page that shoms articles with comment count, do a
query
> that would join with the comments table, doing a COUNT on it (sorry
you
> didn't send your table definitions).
>
> For the full article page, you could query for the comment and all the
> comments, caching part or all of it in ram if you foresee it'll be
> requested often.
>
> -Roberto
>
> --
> +----|        Roberto Mello   -    http://www.brasileiro.net/
|------+
> +       Computer Science Graduate Student, Utah State University
+
> +       USU Free Software & GNU/Linux Club - http://fslc.usu.edu/
+
> if(crash) grab_ankles();kiss_xxx_goodbye()



Re: query help/sugestions

From
"John Cavacas"
Date:
Roberto,

Here's my solution.

CREATE VIEW viewcontent
(id,statusid,title,datecreated,lastmodified,userid,username,catid,catego
ry,intro,body,typeid,content,commentcount) as

SELECT a.id as id,      a.status as statusid,      a.title as title,      a.datecreated as datecreated,
a.lastmodifiedas lastmodified,      b.id as userid,      b.username as username,      c.id as catid,      c.longname as
category,     a.intro as intro,      a.body as body,      d.type as typeid,      d.name as content,      COUNT(e.cid)
ascommentcount
 
FROM content a LEFT JOIN comments e ON (a.id = e.cid) , users b,
categories c, types d
WHERE a.id = b.idAND a.catid = c.idAND a.type = d.typeAND a.status = 1
GROUP BY
a.id,      a.status,      a.title,      a.datecreated,      a.lastmodified,      b.id,      b.username,      c.id,
c.longname,     a.intro,      a.body,      d.type,      d.name
 

ORDER BY a.datecreated DESC ;

Is creating the view a good idea in terms of performance? This is a
general query, I would for example not include the main body of the text
needed for the front page listing.

Thanks again,
John


----- Original Message -----
From: "Roberto Mello" <rmello@cc.usu.edu>
To: "John Cavacas" <oogly@rogers.com>
Cc: <pgsql-sql@postgresql.org>
Sent: February 8, 2003 1:35 AM
Subject: Re: [SQL] query help/sugestions


> On Sat, Feb 08, 2003 at 12:50:20AM -0500, John Cavacas wrote:
>
> Can you please fix your e-mail client so it breaks lines at 72
characters?
>
> > Anyway, i'm trying to solve a problem which I had with my previous
version. The problem boils down to how to efficiently display an article
in either in its full form or in a article listing page (think front
page of slashdot) while at the same time finding out what the comment
count is for that article.
> >
> > I have 2 tables a "content" table which holds the articles, and a
"comments" table that you guessed it holds the comments. In the past
what I did was to create one query that returned all of the articles
(remember I was using MySQL), then while I looped through the results of
that query, in PHP I would create another SQL object and issue another
query that went out and did a select count(newsid) for each article.
This did what I wanted but I always thought it was a bit of a hack and
it did not perform well under stress.
>
> Well, no wonder. For each article you'd issue a new query that had to
be
> sent to, parsed, planned, optimized, executed, and returned by the
> database. All you needed was to join the tables, use a group by or
some
> other technique.
>
> I suggest you read C.J. Date's "An Introduction to Database Systems".
>
> > Is there a better way to this? I have a few ideas...
>
> Definitely.
>
> > I was thinking of using a SQL sub select, but to be honest I can't
seem to construct the proper query to give me what I need, which would
be a result set that would look something like this:
> >
> > newsid | user | date | title | intro | commentcount
>
> You could use a GROUP BY but that could become expensive.
>
> > Another idea I had was to just create a count column in my content
table, which gets incremented each time a comment is posted.
> >
> > Taking the above idea a step forward, i could problably create a
trigger to keep that column counter updated whenver a new comment is
inserted. Now that I think of it, it problably wouldn't work as the
trigger could not be generic.
> >
> > Anyway, I was just wondering if anyone out here had any ideas or
past experience that wouldn't mind sharing.
>
> This has been solved many times by several open source projects
(OpenACS
> being one of them).
>
> You could, for the page that shoms articles with comment count, do a
query
> that would join with the comments table, doing a COUNT on it (sorry
you
> didn't send your table definitions).
>
> For the full article page, you could query for the comment and all the
> comments, caching part or all of it in ram if you foresee it'll be
> requested often.
>
> -Roberto
>
> --
> +----|        Roberto Mello   -    http://www.brasileiro.net/
|------+
> +       Computer Science Graduate Student, Utah State University
+
> +       USU Free Software & GNU/Linux Club - http://fslc.usu.edu/
+
> if(crash) grab_ankles();kiss_xxx_goodbye()



Re: query help/sugestions

From
Roberto Mello
Date:
On Sat, Feb 08, 2003 at 02:33:58AM -0500, John Cavacas wrote:
> 
> CREATE VIEW viewcontent
> (id,statusid,title,datecreated,lastmodified,userid,username,catid,catego
> ry,intro,body,typeid,content,commentcount) as
> 
> SELECT a.id as id,
>        a.status as statusid,
>        a.title as title,
>        a.datecreated as datecreated,
...

Have you run the query above under EXPLAIN to see what kind of execution
plan you're getting?

> Is creating the view a good idea in terms of performance? This is a
> general query, I would for example not include the main body of the text
> needed for the front page listing.

The performance will be as good as your underlying query is. The VIEW is
just an "alias" if you will, and they are generally a good idea.
-Roberto

P.S.: Please trim the messages to which you respond. For this message you
left dozens of lines of cruft hanging after your message (replying at the
top is a bad idea too, because you lose context). Not only this is bad for
the archives, but it wastes bandwidth and shows lack of respect for
other people's inbox.

-- 
+----|        Roberto Mello   -    http://www.brasileiro.net/  |------+
+       Computer Science Graduate Student, Utah State University      +
+       USU Free Software & GNU/Linux Club - http://fslc.usu.edu/     +
Boss spelled backwards is "double-SOB"


Re: query help/sugestions

From
"John Cavacas"
Date:
> Have you run the query above under EXPLAIN to see what kind of
execution
> plan you're getting?
>
> > Is creating the view a good idea in terms of performance? This is a
> > general query, I would for example not include the main body of the
text
> > needed for the front page listing.
>
> The performance will be as good as your underlying query is. The VIEW
is
> just an "alias" if you will, and they are generally a good idea.

I just started using PostgreSQL recently, and as such I am not sure how
to properlly evaluate the results of EXPLAIN on that query. I started
looking at the documentation last night regarding EXPLAIN but I haven't
gotten my head around it. Any sugestions in that area?

Thanks again,
John





Re: query help/sugestions

From
Roberto Mello
Date:
On Sat, Feb 08, 2003 at 12:27:34PM -0500, John Cavacas wrote:
> 
> I just started using PostgreSQL recently, and as such I am not sure how
> to properlly evaluate the results of EXPLAIN on that query. I started
> looking at the documentation last night regarding EXPLAIN but I haven't
> gotten my head around it. Any sugestions in that area?

Do you have any specific questions? In general you're trying to get rid of
sequential scans (Seq Scan).

-Roberto

-- 
+----|        Roberto Mello   -    http://www.brasileiro.net/  |------+
+       Computer Science Graduate Student, Utah State University      +
+       USU Free Software & GNU/Linux Club - http://fslc.usu.edu/     +
File not found. Try again? Try again? Ah, come'on, pleeeeeease!