Thread: Re: [pgsql-advocacy] Me And My Database

Re: [pgsql-advocacy] Me And My Database

From

"Jim C. Nasby"

Date:

06 June 2006, 19:11:06

Moving to -general.

On Fri, Jun 02, 2006 at 10:45:14PM +0200, Leif B. Kristensen wrote:
> Don't know if it's relevant here, but I've recently launched the third
> article in a series on my transition from shrink-wrapped genealogy
> software on Windows to my own custom PostgreSQL/PHP application running
> on Linux. The new article is almost all about PostgreSQL, along with
> some rantings about commercial genealogy software and a description on
> my "structured document" (or WYSIWYG) PHP interface. Along with this,
> I've created a blog about the project.
>
> I wonder if anyone here would want to review the stuff and maybe post
> some comments. It's all at <http://solumslekt.org/forays/>, and the
> latest article is of course the
> <http://solumslekt.org/forays/exodus.php>. The second article in the
> series <http://solumslekt.org/forays/blue.php> was my preliminary
> PostgreSQL data definition. A lot of it is outdated, but it might be
> interesting for someone as well.
>
> Eventually I'll publish the full code, but it's still rather immature.

From the bottom of that page:

    SELECT * FROM sources INTO src WHERE source_id = $1;

SELECT * is generally something to avoid. You end up shoving around more
data than needed. Granted, in this case it's only getting shoved down
into plpgsql, but it's still extra work for the server.

Also, the commentary about how MySQL is faster isn't very clear. Are you
using MySQL as some kind of result cache? When you get to running actual
concurrent access on the website, you could well find yourself very
disappointed with the performance of MyISAM and it's table-level
locking. There's probably also some gains to be had on the PostgreSQL
performance.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: [pgsql-advocacy] Me And My Database

From

"Leif B. Kristensen"

Date:

06 June 2006, 19:44:31

On Wednesday 7. June 2006 00:10, Jim C. Nasby wrote:
>Moving to -general.
From the bottom of that page:
>
>    SELECT * FROM sources INTO src WHERE source_id = $1;
>
>SELECT * is generally something to avoid. You end up shoving around
> more data than needed. Granted, in this case it's only getting shoved
> down into plpgsql, but it's still extra work for the server.

I know that. But the table is only four columns wide, and all the
columns matter in this query. Eventually I'll remove such things
as "SELECT * FROM ..." which really is a bad habit.

>Also, the commentary about how MySQL is faster isn't very clear. Are
> you using MySQL as some kind of result cache? When you get to running
> actual concurrent access on the website, you could well find yourself
> very disappointed with the performance of MyISAM and it's table-level
> locking. There's probably also some gains to be had on the PostgreSQL
> performance.

I may have been a little unclear here. My production database is
PostgreSQL, as it quite clearly is the better choice of the two, in
particular wrt data integrity. My Web presentation software is quite a
different matter. It's running at a web hotel that's only offering
MySQL for a database. I find MySQL with MyISAM quite sufficient for
that use, as its only purpose is to serve up simple selects quickly.

The reason why the generation of eg. the family sheet is faster in the
MySQL web context than in my production environment, is that I'm really
comparing apples and potatoes here. The Web database has a much flatter
and denormalized structure, due to the fact that there's no editing.
The entire Web database is repopulated from scratch every time I do an
update.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE

Re: [pgsql-advocacy] Me And My Database

From

Robert Treat

Date:

07 June 2006, 01:26:43

On Tuesday 06 June 2006 18:44, Leif B. Kristensen wrote:
> On Wednesday 7. June 2006 00:10, Jim C. Nasby wrote:
> >Moving to -general.
> >From the bottom of that page:
> >
> >    SELECT * FROM sources INTO src WHERE source_id = $1;
> >
> >SELECT * is generally something to avoid. You end up shoving around
> > more data than needed. Granted, in this case it's only getting shoved
> > down into plpgsql, but it's still extra work for the server.
>
> I know that. But the table is only four columns wide, and all the
> columns matter in this query. Eventually I'll remove such things
> as "SELECT * FROM ..." which really is a bad habit.
>
> >Also, the commentary about how MySQL is faster isn't very clear. Are
> > you using MySQL as some kind of result cache? When you get to running
> > actual concurrent access on the website, you could well find yourself
> > very disappointed with the performance of MyISAM and it's table-level
> > locking. There's probably also some gains to be had on the PostgreSQL
> > performance.
>
> I may have been a little unclear here. My production database is
> PostgreSQL, as it quite clearly is the better choice of the two, in
> particular wrt data integrity. My Web presentation software is quite a
> different matter. It's running at a web hotel that's only offering
> MySQL for a database. I find MySQL with MyISAM quite sufficient for
> that use, as its only purpose is to serve up simple selects quickly.
>

I'd think sqlite would be even faster, though it sounds like that might not be
an option for you.

> The reason why the generation of eg. the family sheet is faster in the
> MySQL web context than in my production environment, is that I'm really
> comparing apples and potatoes here. The Web database has a much flatter
> and denormalized structure, due to the fact that there's no editing.
> The entire Web database is repopulated from scratch every time I do an
> update.

If you going through this kind of step now, why not just generate the whole
site from the pg database as html pages and then push those out to the
client?  That way you eliminate any dbms overhead and reduce load on your
webservers (and eliminate the need for a 2nd db schema)

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: [pgsql-advocacy] Me And My Database

From

"Leif B. Kristensen"

Date:

07 June 2006, 10:31:24

On Wednesday 7. June 2006 06:26, Robert Treat wrote:

>On Tuesday 06 June 2006 18:44, Leif B. Kristensen wrote:

>> The reason why the generation of eg. the family sheet is faster in
>> the MySQL web context than in my production environment, is that I'm
>> really comparing apples and potatoes here. The Web database has a
>> much flatter and denormalized structure, due to the fact that
>> there's no editing. The entire Web database is repopulated from
>> scratch every time I do an update.
>
>If you going through this kind of step now, why not just generate the
> whole site from the pg database as html pages and then push those out
> to the client?  That way you eliminate any dbms overhead and reduce
> load on your webservers (and eliminate the need for a 2nd db schema)

Ouch. The method I'm using today, is quick, easy, and works like a
charm. It's one local script that runs in a few seconds, generating SQL
command files which are tarred and gzipped to a 1.5MB file, and scp'ed
to the server, and then a serverside load script which takes a couple
of minutes. Generating 40000+ static HTML pages, each of up to 10K,
would fill up my disk quota faster than I can spell postgresql.

And how would you write a name search for static pages?

It ain't broken, and I ain't gonna fix it.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE

Re: Me And My Database

From

"Leif B. Kristensen"

Date:

14 June 2006, 06:04:32

On Wednesday 7. June 2006 00:10, Jim C. Nasby wrote:

>Also, the commentary about how MySQL is faster isn't very clear. Are
> you using MySQL as some kind of result cache? When you get to running
> actual concurrent access on the website, you could well find yourself
> very disappointed with the performance of MyISAM and it's table-level
> locking. There's probably also some gains to be had on the PostgreSQL
> performance.

I've rewritten that passage to make it clearer what it's about. I've
also included the complete table definitions, along with my views and
functions. The article is still at
<http://solumslekt.org/forays/exodus.php>.

One question: When I have a function like this:

CREATE OR REPLACE FUNCTION get_source_text(INTEGER) RETURNS TEXT AS $$
DECLARE
    src sources%ROWTYPE;
    mystring TEXT;
BEGIN
    SELECT * FROM sources INTO src WHERE source_id = $1;
    mystring := src.large_text;
    IF src.parent_id <> 0 THEN
        mystring := get_source_text(src.parent_id) || ' ' || mystring;
    END IF;
    RETURN mystring;
END;
$$ LANGUAGE plpgsql;

What do you suggest that I write instead of "SELECT * FROM sources INTO
src", when src is defined as sources%ROWTYPE? The table sources is
defined as:

CREATE TABLE sources (
    source_id           INTEGER PRIMARY KEY,
    parent_id           INTEGER NOT NULL REFERENCES sources (source_id),
    small_text          VARCHAR(50) NOT NULL DEFAULT '',
    large_text          TEXT NOT NULL DEFAULT ''
);

I only need (source_id, parent_id, large_text) in the query. The
small_text column is largely unused, but holds at most 50 chars.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE

Re: Me And My Database

From

Martijn van Oosterhout

Date:

14 June 2006, 06:09:12

On Wed, Jun 14, 2006 at 11:04:19AM +0200, Leif B. Kristensen wrote:
> One question: When I have a function like this:
>
> CREATE OR REPLACE FUNCTION get_source_text(INTEGER) RETURNS TEXT AS $$
> DECLARE
>     src sources%ROWTYPE;

<snip>

> What do you suggest that I write instead of "SELECT * FROM sources INTO
> src", when src is defined as sources%ROWTYPE? The table sources is
> defined as:

IIRC, if you just declare src as type "record" you can select any
fields you like. AIUI, declaring a row to be of a specific type is only
really important if you plan to return it or pass it to another
function.

Have a ncie day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

signature.asc

Re: Me And My Database

From

"Leif B. Kristensen"

Date:

14 June 2006, 06:35:22

On Wednesday 14. June 2006 11:09, Martijn van Oosterhout wrote:
>IIRC, if you just declare src as type "record" you can select any
>fields you like. AIUI, declaring a row to be of a specific type is
> only really important if you plan to return it or pass it to another
> function.

I tried:

CREATE OR REPLACE FUNCTION get_source_text(integer) RETURNS TEXT AS $$
DECLARE
    src RECORD;
    mystring TEXT;
BEGIN
    SELECT (source_id, parent_id, large_text)
        FROM sources INTO src WHERE source_id = $1;
    mystring := src.large_text;
    IF src.parent_id <> 0 THEN
        mystring := get_source_text(src.parent_id) || ' ' || mystring;
    END IF;
    RETURN mystring;
END;
$$ LANGUAGE plpgsql;

But now I get this error message:

Query failed: ERROR: record "src" has no field "large_text" CONTEXT:
PL/pgSQL function "get_source_text" line 7 at assignment

PostgreSQL version 8.0.8.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE

Re: Me And My Database

From

Martijn van Oosterhout

Date:

14 June 2006, 06:38:42

On Wed, Jun 14, 2006 at 11:35:12AM +0200, Leif B. Kristensen wrote:
> On Wednesday 14. June 2006 11:09, Martijn van Oosterhout wrote:
> >IIRC, if you just declare src as type "record" you can select any
> >fields you like. AIUI, declaring a row to be of a specific type is
> > only really important if you plan to return it or pass it to another
> > function.
>
> I tried:
>
> CREATE OR REPLACE FUNCTION get_source_text(integer) RETURNS TEXT AS $$
> DECLARE
>     src RECORD;
>     mystring TEXT;
> BEGIN
>     SELECT (source_id, parent_id, large_text)
>         FROM sources INTO src WHERE source_id = $1;

Why did you put parenthesis there? It looks like you're making a record
within a record. You wouldn't have parenthesis there for a normal
select statement, would you?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

signature.asc

Re: Me And My Database

From

"Leif B. Kristensen"

Date:

14 June 2006, 06:45:16

On Wednesday 14. June 2006 11:38, Martijn van Oosterhout wrote:
>Why did you put parenthesis there? It looks like you're making a
> record within a record. You wouldn't have parenthesis there for a
> normal select statement, would you?

s**t. When I remove the parentheses, it runs fine.

This is a little contrary to common programmer philosophy, where putting
in extra parentheses for clarity is considered Good Practice[TM].
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE

Re: Me And My Database

From

"Leif B. Kristensen"

Date:

14 June 2006, 07:34:36

On Wednesday 14. June 2006 11:38, Martijn van Oosterhout wrote:
>Have a nice day,

I forgot to say thank you. And a nice day to you too.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE

Re: Me And My Database

From

Martijn van Oosterhout

Date:

14 June 2006, 07:39:36

On Wed, Jun 14, 2006 at 11:45:03AM +0200, Leif B. Kristensen wrote:
> On Wednesday 14. June 2006 11:38, Martijn van Oosterhout wrote:
> >Why did you put parenthesis there? It looks like you're making a
> > record within a record. You wouldn't have parenthesis there for a
> > normal select statement, would you?
>
> s**t. When I remove the parentheses, it runs fine.
>
> This is a little contrary to common programmer philosophy, where putting
> in extra parentheses for clarity is considered Good Practice[TM].

Except in cases where it changes the meaning, obviously.

I suppose if the SQL standard had chosen something else for row
constructors (like [] or {}) what you typed would've been a syntax
error rather than being valid.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

signature.asc