Thread: query help/sugestions
Hello,
I'm developing a small web application that essentialy lets users post articles and then lets other users post comments regarding those articles. This application is already running actualy, i'm working on a new version. The current version runs on PHP+MySQL the new version will be using J2EE with PostgreSQL.
Anyway, i'm trying to solve a problem which I had with my previous version. The problem boils down to how to efficiently display an article in either in its full form or in a article listing page (think front page of slashdot) while at the same time finding out what the comment count is for that article.
I have 2 tables a "content" table which holds the articles, and a "comments" table that you guessed it holds the comments. In the past what I did was to create one query that returned all of the articles (remember I was using MySQL), then while I looped through the results of that query, in PHP I would create another SQL object and issue another query that went out and did a select count(newsid) for each article. This did what I wanted but I always thought it was a bit of a hack and it did not perform well under stress.
Is there a better way to this? I have a few ideas...
I was thinking of using a SQL sub select, but to be honest I can't seem to construct the proper query to give me what I need, which would be a result set that would look something like this:
newsid | user | date | title | intro | commentcount
Another idea I had was to just create a count column in my content table, which gets incremented each time a comment is posted.
Taking the above idea a step forward, i could problably create a trigger to keep that column counter updated whenver a new comment is inserted. Now that I think of it, it problably wouldn't work as the trigger could not be generic.
Anyway, I was just wondering if anyone out here had any ideas or past experience that wouldn't mind sharing.
Thanks a bunch,
John
On Sat, Feb 08, 2003 at 12:50:20AM -0500, John Cavacas wrote: Can you please fix your e-mail client so it breaks lines at 72 characters? > Anyway, i'm trying to solve a problem which I had with my previous version. The problem boils down to how to efficientlydisplay an article in either in its full form or in a article listing page (think front page of slashdot) whileat the same time finding out what the comment count is for that article. > > I have 2 tables a "content" table which holds the articles, and a "comments" table that you guessed it holds the comments.In the past what I did was to create one query that returned all of the articles (remember I was using MySQL), thenwhile I looped through the results of that query, in PHP I would create another SQL object and issue another query thatwent out and did a select count(newsid) for each article. This did what I wanted but I always thought it was a bit ofa hack and it did not perform well under stress. Well, no wonder. For each article you'd issue a new query that had to be sent to, parsed, planned, optimized, executed, and returned by the database. All you needed was to join the tables, use a group by or some other technique. I suggest you read C.J. Date's "An Introduction to Database Systems". > Is there a better way to this? I have a few ideas... Definitely. > I was thinking of using a SQL sub select, but to be honest I can't seem to construct the proper query to give me what Ineed, which would be a result set that would look something like this: > > newsid | user | date | title | intro | commentcount You could use a GROUP BY but that could become expensive. > Another idea I had was to just create a count column in my content table, which gets incremented each time a comment isposted. > > Taking the above idea a step forward, i could problably create a trigger to keep that column counter updated whenver anew comment is inserted. Now that I think of it, it problably wouldn't work as the trigger could not be generic. > > Anyway, I was just wondering if anyone out here had any ideas or past experience that wouldn't mind sharing. This has been solved many times by several open source projects (OpenACS being one of them). You could, for the page that shoms articles with comment count, do a query that would join with the comments table, doing a COUNT on it (sorry you didn't send your table definitions). For the full article page, you could query for the comment and all the comments, caching part or all of it in ram if you foresee it'll be requested often. -Roberto -- +----| Roberto Mello - http://www.brasileiro.net/ |------+ + Computer Science Graduate Student, Utah State University + + USU Free Software & GNU/Linux Club - http://fslc.usu.edu/ + if(crash) grab_ankles();kiss_xxx_goodbye()
Roberto, Thanks for the sugestions. I was actually starting to look down that path. Here are my table defs: CREATE TABLE "content" ("id" int4 DEFAULT nextval('public.content_id_seq') NOT NULL ,"userid" int4 NOT NULL ,"catid" int4 NOT NULL ,"type" int4 NOT NULL ,"status" int4 NOT NULL ,"datecreated" timestamp without time zone DEFAULT now() NOT NULL ,"lastmodified" timestamp without time zone DEFAULT now() NOT NULL ,"title" text NOT NULL ,"intro" text NOT NULL ,"body" text NOT NULL , CONSTRAINT "userid" FOREIGN KEY ("userid") REFERENCES users(id) MATCH FULL , CONSTRAINT "categories_content_fkey" FOREIGN KEY ("catid") REFERENCES categories(id) MATCH FULL , CONSTRAINT "types_content_fkey" FOREIGN KEY ("type") REFERENCES types(type) MATCH FULL , CONSTRAINT "status_content_fkey" FOREIGN KEY ("status") REFERENCES status(id) MATCH FULL , CONSTRAINT "content_pkey" PRIMARY KEY ("id") ); CREATE TABLE "comments" ("id" int4 DEFAULT nextval('public.comments_id_seq') NOT NULL ,"cid" int4 NOT NULL ,"pid" int4 NOT NULL ,"userid" int4 NOT NULL ,"datecreated" timestamp without time zone DEFAULT now() NOT NULL ,"lastmodified" timestamp without time zone DEFAULT now() NOT NULL ,"title" text NOT NULL ,"commentbody" text NOT NULL ,"ip" varchar(16) NOT NULL , CONSTRAINT "content_comments_fkey" FOREIGN KEY ("cid") REFERENCES content(id) MATCH FULL , CONSTRAINT "comments_pkey" PRIMARY KEY ("id") ); I set my client to break lines at 72 chars, i hope it worked... This query: select c.id, c.title, COUNT(cc.cid) AS ccount FROM content c LEFT JOIN comments cc ON (c.id=cc.cid) GROUP BY c.id, c.title Does what I need, however I need to query 3 other tables as well to get the data that I need to display, trying to make that work now. Is that what you ment? Thanks a bunch, John ----- Original Message ----- From: "Roberto Mello" <rmello@cc.usu.edu> To: "John Cavacas" <oogly@rogers.com> Cc: <pgsql-sql@postgresql.org> Sent: February 8, 2003 1:35 AM Subject: Re: [SQL] query help/sugestions > On Sat, Feb 08, 2003 at 12:50:20AM -0500, John Cavacas wrote: > > Can you please fix your e-mail client so it breaks lines at 72 characters? > > > Anyway, i'm trying to solve a problem which I had with my previous version. The problem boils down to how to efficiently display an article in either in its full form or in a article listing page (think front page of slashdot) while at the same time finding out what the comment count is for that article. > > > > I have 2 tables a "content" table which holds the articles, and a "comments" table that you guessed it holds the comments. In the past what I did was to create one query that returned all of the articles (remember I was using MySQL), then while I looped through the results of that query, in PHP I would create another SQL object and issue another query that went out and did a select count(newsid) for each article. This did what I wanted but I always thought it was a bit of a hack and it did not perform well under stress. > > Well, no wonder. For each article you'd issue a new query that had to be > sent to, parsed, planned, optimized, executed, and returned by the > database. All you needed was to join the tables, use a group by or some > other technique. > > I suggest you read C.J. Date's "An Introduction to Database Systems". > > > Is there a better way to this? I have a few ideas... > > Definitely. > > > I was thinking of using a SQL sub select, but to be honest I can't seem to construct the proper query to give me what I need, which would be a result set that would look something like this: > > > > newsid | user | date | title | intro | commentcount > > You could use a GROUP BY but that could become expensive. > > > Another idea I had was to just create a count column in my content table, which gets incremented each time a comment is posted. > > > > Taking the above idea a step forward, i could problably create a trigger to keep that column counter updated whenver a new comment is inserted. Now that I think of it, it problably wouldn't work as the trigger could not be generic. > > > > Anyway, I was just wondering if anyone out here had any ideas or past experience that wouldn't mind sharing. > > This has been solved many times by several open source projects (OpenACS > being one of them). > > You could, for the page that shoms articles with comment count, do a query > that would join with the comments table, doing a COUNT on it (sorry you > didn't send your table definitions). > > For the full article page, you could query for the comment and all the > comments, caching part or all of it in ram if you foresee it'll be > requested often. > > -Roberto > > -- > +----| Roberto Mello - http://www.brasileiro.net/ |------+ > + Computer Science Graduate Student, Utah State University + > + USU Free Software & GNU/Linux Club - http://fslc.usu.edu/ + > if(crash) grab_ankles();kiss_xxx_goodbye()
Roberto, Here's my solution. CREATE VIEW viewcontent (id,statusid,title,datecreated,lastmodified,userid,username,catid,catego ry,intro,body,typeid,content,commentcount) as SELECT a.id as id, a.status as statusid, a.title as title, a.datecreated as datecreated, a.lastmodifiedas lastmodified, b.id as userid, b.username as username, c.id as catid, c.longname as category, a.intro as intro, a.body as body, d.type as typeid, d.name as content, COUNT(e.cid) ascommentcount FROM content a LEFT JOIN comments e ON (a.id = e.cid) , users b, categories c, types d WHERE a.id = b.idAND a.catid = c.idAND a.type = d.typeAND a.status = 1 GROUP BY a.id, a.status, a.title, a.datecreated, a.lastmodified, b.id, b.username, c.id, c.longname, a.intro, a.body, d.type, d.name ORDER BY a.datecreated DESC ; Is creating the view a good idea in terms of performance? This is a general query, I would for example not include the main body of the text needed for the front page listing. Thanks again, John ----- Original Message ----- From: "Roberto Mello" <rmello@cc.usu.edu> To: "John Cavacas" <oogly@rogers.com> Cc: <pgsql-sql@postgresql.org> Sent: February 8, 2003 1:35 AM Subject: Re: [SQL] query help/sugestions > On Sat, Feb 08, 2003 at 12:50:20AM -0500, John Cavacas wrote: > > Can you please fix your e-mail client so it breaks lines at 72 characters? > > > Anyway, i'm trying to solve a problem which I had with my previous version. The problem boils down to how to efficiently display an article in either in its full form or in a article listing page (think front page of slashdot) while at the same time finding out what the comment count is for that article. > > > > I have 2 tables a "content" table which holds the articles, and a "comments" table that you guessed it holds the comments. In the past what I did was to create one query that returned all of the articles (remember I was using MySQL), then while I looped through the results of that query, in PHP I would create another SQL object and issue another query that went out and did a select count(newsid) for each article. This did what I wanted but I always thought it was a bit of a hack and it did not perform well under stress. > > Well, no wonder. For each article you'd issue a new query that had to be > sent to, parsed, planned, optimized, executed, and returned by the > database. All you needed was to join the tables, use a group by or some > other technique. > > I suggest you read C.J. Date's "An Introduction to Database Systems". > > > Is there a better way to this? I have a few ideas... > > Definitely. > > > I was thinking of using a SQL sub select, but to be honest I can't seem to construct the proper query to give me what I need, which would be a result set that would look something like this: > > > > newsid | user | date | title | intro | commentcount > > You could use a GROUP BY but that could become expensive. > > > Another idea I had was to just create a count column in my content table, which gets incremented each time a comment is posted. > > > > Taking the above idea a step forward, i could problably create a trigger to keep that column counter updated whenver a new comment is inserted. Now that I think of it, it problably wouldn't work as the trigger could not be generic. > > > > Anyway, I was just wondering if anyone out here had any ideas or past experience that wouldn't mind sharing. > > This has been solved many times by several open source projects (OpenACS > being one of them). > > You could, for the page that shoms articles with comment count, do a query > that would join with the comments table, doing a COUNT on it (sorry you > didn't send your table definitions). > > For the full article page, you could query for the comment and all the > comments, caching part or all of it in ram if you foresee it'll be > requested often. > > -Roberto > > -- > +----| Roberto Mello - http://www.brasileiro.net/ |------+ > + Computer Science Graduate Student, Utah State University + > + USU Free Software & GNU/Linux Club - http://fslc.usu.edu/ + > if(crash) grab_ankles();kiss_xxx_goodbye()
On Sat, Feb 08, 2003 at 02:33:58AM -0500, John Cavacas wrote: > > CREATE VIEW viewcontent > (id,statusid,title,datecreated,lastmodified,userid,username,catid,catego > ry,intro,body,typeid,content,commentcount) as > > SELECT a.id as id, > a.status as statusid, > a.title as title, > a.datecreated as datecreated, ... Have you run the query above under EXPLAIN to see what kind of execution plan you're getting? > Is creating the view a good idea in terms of performance? This is a > general query, I would for example not include the main body of the text > needed for the front page listing. The performance will be as good as your underlying query is. The VIEW is just an "alias" if you will, and they are generally a good idea. -Roberto P.S.: Please trim the messages to which you respond. For this message you left dozens of lines of cruft hanging after your message (replying at the top is a bad idea too, because you lose context). Not only this is bad for the archives, but it wastes bandwidth and shows lack of respect for other people's inbox. -- +----| Roberto Mello - http://www.brasileiro.net/ |------+ + Computer Science Graduate Student, Utah State University + + USU Free Software & GNU/Linux Club - http://fslc.usu.edu/ + Boss spelled backwards is "double-SOB"
> Have you run the query above under EXPLAIN to see what kind of execution > plan you're getting? > > > Is creating the view a good idea in terms of performance? This is a > > general query, I would for example not include the main body of the text > > needed for the front page listing. > > The performance will be as good as your underlying query is. The VIEW is > just an "alias" if you will, and they are generally a good idea. I just started using PostgreSQL recently, and as such I am not sure how to properlly evaluate the results of EXPLAIN on that query. I started looking at the documentation last night regarding EXPLAIN but I haven't gotten my head around it. Any sugestions in that area? Thanks again, John
On Sat, Feb 08, 2003 at 12:27:34PM -0500, John Cavacas wrote: > > I just started using PostgreSQL recently, and as such I am not sure how > to properlly evaluate the results of EXPLAIN on that query. I started > looking at the documentation last night regarding EXPLAIN but I haven't > gotten my head around it. Any sugestions in that area? Do you have any specific questions? In general you're trying to get rid of sequential scans (Seq Scan). -Roberto -- +----| Roberto Mello - http://www.brasileiro.net/ |------+ + Computer Science Graduate Student, Utah State University + + USU Free Software & GNU/Linux Club - http://fslc.usu.edu/ + File not found. Try again? Try again? Ah, come'on, pleeeeeease!