Thread: Re: [pgsql-advocacy] Me And My Database
Moving to -general. On Fri, Jun 02, 2006 at 10:45:14PM +0200, Leif B. Kristensen wrote: > Don't know if it's relevant here, but I've recently launched the third > article in a series on my transition from shrink-wrapped genealogy > software on Windows to my own custom PostgreSQL/PHP application running > on Linux. The new article is almost all about PostgreSQL, along with > some rantings about commercial genealogy software and a description on > my "structured document" (or WYSIWYG) PHP interface. Along with this, > I've created a blog about the project. > > I wonder if anyone here would want to review the stuff and maybe post > some comments. It's all at <http://solumslekt.org/forays/>, and the > latest article is of course the > <http://solumslekt.org/forays/exodus.php>. The second article in the > series <http://solumslekt.org/forays/blue.php> was my preliminary > PostgreSQL data definition. A lot of it is outdated, but it might be > interesting for someone as well. > > Eventually I'll publish the full code, but it's still rather immature. From the bottom of that page: SELECT * FROM sources INTO src WHERE source_id = $1; SELECT * is generally something to avoid. You end up shoving around more data than needed. Granted, in this case it's only getting shoved down into plpgsql, but it's still extra work for the server. Also, the commentary about how MySQL is faster isn't very clear. Are you using MySQL as some kind of result cache? When you get to running actual concurrent access on the website, you could well find yourself very disappointed with the performance of MyISAM and it's table-level locking. There's probably also some gains to be had on the PostgreSQL performance. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Wednesday 7. June 2006 00:10, Jim C. Nasby wrote: >Moving to -general. From the bottom of that page: > > SELECT * FROM sources INTO src WHERE source_id = $1; > >SELECT * is generally something to avoid. You end up shoving around > more data than needed. Granted, in this case it's only getting shoved > down into plpgsql, but it's still extra work for the server. I know that. But the table is only four columns wide, and all the columns matter in this query. Eventually I'll remove such things as "SELECT * FROM ..." which really is a bad habit. >Also, the commentary about how MySQL is faster isn't very clear. Are > you using MySQL as some kind of result cache? When you get to running > actual concurrent access on the website, you could well find yourself > very disappointed with the performance of MyISAM and it's table-level > locking. There's probably also some gains to be had on the PostgreSQL > performance. I may have been a little unclear here. My production database is PostgreSQL, as it quite clearly is the better choice of the two, in particular wrt data integrity. My Web presentation software is quite a different matter. It's running at a web hotel that's only offering MySQL for a database. I find MySQL with MyISAM quite sufficient for that use, as its only purpose is to serve up simple selects quickly. The reason why the generation of eg. the family sheet is faster in the MySQL web context than in my production environment, is that I'm really comparing apples and potatoes here. The Web database has a much flatter and denormalized structure, due to the fact that there's no editing. The entire Web database is repopulated from scratch every time I do an update. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE
On Tuesday 06 June 2006 18:44, Leif B. Kristensen wrote: > On Wednesday 7. June 2006 00:10, Jim C. Nasby wrote: > >Moving to -general. > >From the bottom of that page: > > > > SELECT * FROM sources INTO src WHERE source_id = $1; > > > >SELECT * is generally something to avoid. You end up shoving around > > more data than needed. Granted, in this case it's only getting shoved > > down into plpgsql, but it's still extra work for the server. > > I know that. But the table is only four columns wide, and all the > columns matter in this query. Eventually I'll remove such things > as "SELECT * FROM ..." which really is a bad habit. > > >Also, the commentary about how MySQL is faster isn't very clear. Are > > you using MySQL as some kind of result cache? When you get to running > > actual concurrent access on the website, you could well find yourself > > very disappointed with the performance of MyISAM and it's table-level > > locking. There's probably also some gains to be had on the PostgreSQL > > performance. > > I may have been a little unclear here. My production database is > PostgreSQL, as it quite clearly is the better choice of the two, in > particular wrt data integrity. My Web presentation software is quite a > different matter. It's running at a web hotel that's only offering > MySQL for a database. I find MySQL with MyISAM quite sufficient for > that use, as its only purpose is to serve up simple selects quickly. > I'd think sqlite would be even faster, though it sounds like that might not be an option for you. > The reason why the generation of eg. the family sheet is faster in the > MySQL web context than in my production environment, is that I'm really > comparing apples and potatoes here. The Web database has a much flatter > and denormalized structure, due to the fact that there's no editing. > The entire Web database is repopulated from scratch every time I do an > update. If you going through this kind of step now, why not just generate the whole site from the pg database as html pages and then push those out to the client? That way you eliminate any dbms overhead and reduce load on your webservers (and eliminate the need for a 2nd db schema) -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
On Wednesday 7. June 2006 06:26, Robert Treat wrote: >On Tuesday 06 June 2006 18:44, Leif B. Kristensen wrote: >> The reason why the generation of eg. the family sheet is faster in >> the MySQL web context than in my production environment, is that I'm >> really comparing apples and potatoes here. The Web database has a >> much flatter and denormalized structure, due to the fact that >> there's no editing. The entire Web database is repopulated from >> scratch every time I do an update. > >If you going through this kind of step now, why not just generate the > whole site from the pg database as html pages and then push those out > to the client? That way you eliminate any dbms overhead and reduce > load on your webservers (and eliminate the need for a 2nd db schema) Ouch. The method I'm using today, is quick, easy, and works like a charm. It's one local script that runs in a few seconds, generating SQL command files which are tarred and gzipped to a 1.5MB file, and scp'ed to the server, and then a serverside load script which takes a couple of minutes. Generating 40000+ static HTML pages, each of up to 10K, would fill up my disk quota faster than I can spell postgresql. And how would you write a name search for static pages? It ain't broken, and I ain't gonna fix it. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE
On Wednesday 7. June 2006 00:10, Jim C. Nasby wrote: >Also, the commentary about how MySQL is faster isn't very clear. Are > you using MySQL as some kind of result cache? When you get to running > actual concurrent access on the website, you could well find yourself > very disappointed with the performance of MyISAM and it's table-level > locking. There's probably also some gains to be had on the PostgreSQL > performance. I've rewritten that passage to make it clearer what it's about. I've also included the complete table definitions, along with my views and functions. The article is still at <http://solumslekt.org/forays/exodus.php>. One question: When I have a function like this: CREATE OR REPLACE FUNCTION get_source_text(INTEGER) RETURNS TEXT AS $$ DECLARE src sources%ROWTYPE; mystring TEXT; BEGIN SELECT * FROM sources INTO src WHERE source_id = $1; mystring := src.large_text; IF src.parent_id <> 0 THEN mystring := get_source_text(src.parent_id) || ' ' || mystring; END IF; RETURN mystring; END; $$ LANGUAGE plpgsql; What do you suggest that I write instead of "SELECT * FROM sources INTO src", when src is defined as sources%ROWTYPE? The table sources is defined as: CREATE TABLE sources ( source_id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL REFERENCES sources (source_id), small_text VARCHAR(50) NOT NULL DEFAULT '', large_text TEXT NOT NULL DEFAULT '' ); I only need (source_id, parent_id, large_text) in the query. The small_text column is largely unused, but holds at most 50 chars. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE
On Wed, Jun 14, 2006 at 11:04:19AM +0200, Leif B. Kristensen wrote: > One question: When I have a function like this: > > CREATE OR REPLACE FUNCTION get_source_text(INTEGER) RETURNS TEXT AS $$ > DECLARE > src sources%ROWTYPE; <snip> > What do you suggest that I write instead of "SELECT * FROM sources INTO > src", when src is defined as sources%ROWTYPE? The table sources is > defined as: IIRC, if you just declare src as type "record" you can select any fields you like. AIUI, declaring a row to be of a specific type is only really important if you plan to return it or pass it to another function. Have a ncie day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On Wednesday 14. June 2006 11:09, Martijn van Oosterhout wrote: >IIRC, if you just declare src as type "record" you can select any >fields you like. AIUI, declaring a row to be of a specific type is > only really important if you plan to return it or pass it to another > function. I tried: CREATE OR REPLACE FUNCTION get_source_text(integer) RETURNS TEXT AS $$ DECLARE src RECORD; mystring TEXT; BEGIN SELECT (source_id, parent_id, large_text) FROM sources INTO src WHERE source_id = $1; mystring := src.large_text; IF src.parent_id <> 0 THEN mystring := get_source_text(src.parent_id) || ' ' || mystring; END IF; RETURN mystring; END; $$ LANGUAGE plpgsql; But now I get this error message: Query failed: ERROR: record "src" has no field "large_text" CONTEXT: PL/pgSQL function "get_source_text" line 7 at assignment PostgreSQL version 8.0.8. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE
On Wed, Jun 14, 2006 at 11:35:12AM +0200, Leif B. Kristensen wrote: > On Wednesday 14. June 2006 11:09, Martijn van Oosterhout wrote: > >IIRC, if you just declare src as type "record" you can select any > >fields you like. AIUI, declaring a row to be of a specific type is > > only really important if you plan to return it or pass it to another > > function. > > I tried: > > CREATE OR REPLACE FUNCTION get_source_text(integer) RETURNS TEXT AS $$ > DECLARE > src RECORD; > mystring TEXT; > BEGIN > SELECT (source_id, parent_id, large_text) > FROM sources INTO src WHERE source_id = $1; Why did you put parenthesis there? It looks like you're making a record within a record. You wouldn't have parenthesis there for a normal select statement, would you? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On Wednesday 14. June 2006 11:38, Martijn van Oosterhout wrote: >Why did you put parenthesis there? It looks like you're making a > record within a record. You wouldn't have parenthesis there for a > normal select statement, would you? s**t. When I remove the parentheses, it runs fine. This is a little contrary to common programmer philosophy, where putting in extra parentheses for clarity is considered Good Practice[TM]. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE
On Wednesday 14. June 2006 11:38, Martijn van Oosterhout wrote: >Have a nice day, I forgot to say thank you. And a nice day to you too. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE
On Wed, Jun 14, 2006 at 11:45:03AM +0200, Leif B. Kristensen wrote: > On Wednesday 14. June 2006 11:38, Martijn van Oosterhout wrote: > >Why did you put parenthesis there? It looks like you're making a > > record within a record. You wouldn't have parenthesis there for a > > normal select statement, would you? > > s**t. When I remove the parentheses, it runs fine. > > This is a little contrary to common programmer philosophy, where putting > in extra parentheses for clarity is considered Good Practice[TM]. Except in cases where it changes the meaning, obviously. I suppose if the SQL standard had chosen something else for row constructors (like [] or {}) what you typed would've been a syntax error rather than being valid. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.