Thread: Re: PostgreSQL Advocacy, Thoughts and Comments

Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Jason Tesser"
Date:
hi,

<huge snip>

> Maybe there's not such a need for the advanced features of PostgreSQL
> amongst PHP programmers as you seem to believe. Most of the PHP stuff I've
> seen is read-only content display stuff and that doesn't really require a
> top-notch RDBMS; a more limited database should also be up to the job. For
> complex transactional web applications, J2EE/Model II is a far superior
> technology to scripts/Model I and that means a different target audience
> for the apps where PostgreSQL can offer those essential extra features.
> Whilst most J2EE developers will be using Oracle/DB2/MSSQL as their
> back-end, the awareness of PostgreSQL seems quite high and, in the few
>  usenet groups I monitor, I don't recall anyone being flamed for
> recommending PostgreSQL over MySQL. Maybe seasoned, professional
> developers don't like being told that they're crap programmers just
> because they ask for something as fundamental as referential integrity!

I completely disagree.  I do a lot of programming with PHP and the features
of Postgres come in handy.  Let me give you an example of just some
basic things.  Triggers!  Why should I have to write insert and update
triggers in the logic (PHP) if I can handle it at the database level.  Sql
is 10x as fast as the language.  Better to handle what you can at the database
level.  Same with views and stored procedures.  MySQL cannot even handle
sub-queries yet. I also use Python for standalone interfaces to the data.
Why should I not be able to use the same views and triggers etc  in there
that I use for my web apps.  PHP is quite powerful if used correctly.
Java has its own issues and I am not sure it is as far supiour as you
are claming it is.  But that is not for this dscussion.  MySQL may be more
popular with (cheap) web hosting places but that doesn't mean it is the best
or that Postgres wouldn't serve better even in this area.  I am glad
to see the article written for PHP mag as Postgres would help alot of PHP guys that are using MySQL.

<another snip>




Re: PostgreSQL Advocacy, Thoughts and Comments

From
"cnliou"
Date:
"Jason Tesser" <JTesser@nbbc.edu>

> MySQL cannot even handle sub-queries yet.

Ohh! Really?
Allow me to pay my highest respect to the genius mySQL
programmers!
I completely have no clue on how to construct any single
tiny database on a DBMS having no sub-query capability.

Being too dumb, I solicit mySQL programmers' help by showing
me employee FOO's birthday and his/her latest job title
effective on or before 2003-1-1 from the following tables:

CREATE TABLE t1 (employee TEXT,BirthDay DATE);
CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle
TEXT);

And make the result like this:

FOO  1980-1-1   programmer

Please do not give me the answer that you will merge these
two tables to form one like this:

CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate
DATE,JobTitle TEXT);

Regards,
CN

Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Chris Travers"
Date:
"Jason Tesser" <JTesser@nbbc.edu> wrote:
> I completely disagree.  I do a lot of programming with PHP and the
features
> of Postgres come in handy.  Let me give you an example of just some
> basic things.  Triggers!  Why should I have to write insert and update
> triggers in the logic (PHP) if I can handle it at the database level.  Sql
> is 10x as fast as the language.  Better to handle what you can at the
database
> level.  Same with views and stored procedures.  MySQL cannot even handle
> sub-queries yet. I also use Python for standalone interfaces to the data.
> Why should I not be able to use the same views and triggers etc  in there
> that I use for my web apps.  PHP is quite powerful if used correctly.

I guess I am coming at this from the other direction:  MySQL is popular and
many people use it for lightweight stuff.  Partly this may be because better
tools exist for providing hosted solutions, and this is an area we could
improve (automatically adding entries to the pg_hba.conf, etc.-- may have to
look into doing this).

THe real problem I see is that this keeps PHP from being an ideal skill for
developing enterprise applications.  The features you are mentioning are
extremely helpful, even necessary, when you have many applications working
against the same database.  The triggers, etc. can give you some consistant
business logic, and you can use views to present information to the
applications in a way that is natural for them.

In essence, my point is that for single-use databases, MySQL isn't all that
bad (aside from consistancy issues).  However, the popularity of the LAMP
development environment holds PHP back from being a serious corporate
development environment, IMO.

> Java has its own issues and I am not sure it is as far supiour as you
> are claming it is.  But that is not for this dscussion.  MySQL may be more
> popular with (cheap) web hosting places but that doesn't mean it is the
best
> or that Postgres wouldn't serve better even in this area.  I am glad
> to see the article written for PHP mag as Postgres would help alot of PHP
> guys that are using MySQL.

Again, I think that the most important benefit would be lowering the barrier
to entry of serious development.  You can start with
Linux/Apache/PostgreSQL/PHP for a simple site, and then use your knowledge
better to develop more serious applications.  But the critical issue to
resolve is to make available a tool or set of tools to manage shared hosting
environments in an easier way.  I would be happy to try to generate such a
set of tools.

Best Wishes,
Chris Travers



Re: PostgreSQL Advocacy, Thoughts and Comments

From
Oliver Elphick
Date:
On Sat, 2003-11-29 at 04:37, cnliou wrote:
> "Jason Tesser" <JTesser@nbbc.edu>
>
> > MySQL cannot even handle sub-queries yet.
>
> Ohh! Really?
> Allow me to pay my highest respect to the genius mySQL
> programmers!
> I completely have no clue on how to construct any single
> tiny database on a DBMS having no sub-query capability.
>
> Being too dumb, I solicit mySQL programmers' help by showing
> me employee FOO's birthday and his/her latest job title
> effective on or before 2003-1-1 from the following tables:
>
> CREATE TABLE t1 (employee TEXT,BirthDay DATE);
> CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle
> TEXT);
>
> And make the result like this:
>
> FOO  1980-1-1   programmer
>
> Please do not give me the answer that you will merge these
> two tables to form one like this:
>
> CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate
> DATE,JobTitle TEXT);

I have great trouble following your meaning, but I think you are talking
about joining two tables in a query:

   SELECT t1.employee, t1.birthday, t2.jobtitle
     FROM t1, t2
    WHERE t1.employee = t2.employee;

That is not the same as using a sub-query:

   SELECT employee
     FROM t1
    WHERE birthday > (
          SELECT MIN(effectivedate)
            FROM t2
          );

  (select employees who were born after the longest-serving employee
   started work.)
--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "Who shall ascend into the hill of the LORD? or who
      shall stand in his holy place? He that hath clean
      hands, and a pure heart..."            Psalms 24:3,4


Re: PostgreSQL Advocacy, Thoughts and Comments

From
Tony
Date:
HI All,

I'm glad that this thread prompted some thoughtful response.   I think
one of my main points I was trying to make, Jason hit the nail on the
head.  The article to which I was referring uses a great example which I
have experienced many times before, but in order to grasp this, PHP et
al, must be thought of as a scripting language which crosses many
corporate boundries, and it is easy to assume that it's primary use
(simple web site back ends) are the only thing to discuss.  But the
situation has changed enourmously since the release of PHP v4.  Now many
consultant/developer/sys-admins like myself are going to client site on
a contract (this is especially true in the UK, I can't speak for
anywhere else) and finding complex stocktrading systems, inventory
systems, CRM systems, and others, all written in PHP backed by MySQL.
Whether this is right or wrong, good choice or bad choice is not what
I'm interested in debating.   The point is that when these systems where
architected, the developers used MySQL not because they were dumb, but
because many of them develop awesome code and can get around most
problems in the code, with a little ingenuity.  Many simply do not have
the insight into the potential benefits of *proper* RDBMS can offer.
Had they had the benefit of such knowledge the code they have written
would be faster (in DB) and more legible. Sadly often the developers are
the only source of DBA for some of these companies.

The second scenario, is with admin systems, written by people like
myself for companies, whether they be simple or complex systems, that
are intended as a temporary work around to an immediate problem.
In a very short space of time the stop-gap application you had written
to sort out the immediate problem quickly becomes a core business
application (I recently returned to a site after not being there for two
years and the temporary address book/ email system that I knocked up in
an afternoon was not only still being used, but now relied upon heavily).

So on to my point, MySQL guys will happily say "Hey, we're not saying
that the features MySQL is missing aren't important, and we're working
towards them, but in the meantime these issues can be worked around like
this....."  and happily play the whole thing down.  Many LAMP developers
aren't aware of the benefits of stored procedures, of triggers and other
good stuff. Like myself, if they were aware how much easier life could
be if these things were accessible to them, they'd probably be converts too.

There is not enough emphasis put on the basic importance of these
functions in PG.  Someone needs to standup and say "Hey, look how this
can simplify your programming lives"  until I started using
Druid/Postgres, I had no idea why I needed triggers or what a cascade
effect did, or why I might want one.

The Linux  community has grown  at least in part because it has
educated  potential users and journo's to its benefits.  I believe if
the PG advocacy team did the same, then it would attract many more
serious LAMP developers.

Like Linux vs. Windows, PG has an awful lot going for it in respect to
MySQL, so why not crow about it.  It needs to be pointed at a crowd that
are DB novices, they need to be told why PG is worth the time/knowledge
investment, because anyone who reads the MySQL site, will come away with
the impression that the Trigger, Stored Procs, and other things are a
luxurious overhead not necessary for getting the job done.

I'd gladly help out with such a paper, but find myself in the sad
position of my prose being open to attack due to my newbieness in the DB
world and not able to speak authoratatively on the subject.

Have a think, I'd like to know if others agree.

Cheers

T.

Re: PostgreSQL Advocacy, Thoughts and Comments

From
Unihost Web Hosting
Date:
Further to this post, what might actually work is to convince O' Reilly
(since they have PostgreSQL book/s) to do some articles like they have
for PG, but making full use of the PG database.  For instance, building
a simple data-warehouse using PG.  Articles that show off an OSS
product/project in a clearly enterprise light in a step-by-step
fashion.  There have been so many articles on DB design using MySQL.
How about an article on DB design using all the functionality of a real
ORDBMS.

Just a few thoughts.

Cheers

T.

Tony wrote:

> HI All,
>
> I'm glad that this thread prompted some thoughtful response.   I think
> one of my main points I was trying to make, Jason hit the nail on the
> head.  The article to which I was referring uses a great example which
> I have experienced many times before, but in order to grasp this, PHP
> et al, must be thought of as a scripting language which crosses many
> corporate boundries, and it is easy to assume that it's primary use
> (simple web site back ends) are the only thing to discuss.  But the
> situation has changed enourmously since the release of PHP v4.  Now
> many consultant/developer/sys-admins like myself are going to client
> site on a contract (this is especially true in the UK, I can't speak
> for anywhere else) and finding complex stocktrading systems, inventory
> systems, CRM systems, and others, all written in PHP backed by MySQL.
> Whether this is right or wrong, good choice or bad choice is not what
> I'm interested in debating.   The point is that when these systems
> where architected, the developers used MySQL not because they were
> dumb, but because many of them develop awesome code and can get around
> most problems in the code, with a little ingenuity.  Many simply do
> not have the insight into the potential benefits of *proper* RDBMS can
> offer.   Had they had the benefit of such knowledge the code they have
> written would be faster (in DB) and more legible. Sadly often the
> developers are the only source of DBA for some of these companies.
>
> The second scenario, is with admin systems, written by people like
> myself for companies, whether they be simple or complex systems, that
> are intended as a temporary work around to an immediate problem.  In a
> very short space of time the stop-gap application you had written to
> sort out the immediate problem quickly becomes a core business
> application (I recently returned to a site after not being there for
> two years and the temporary address book/ email system that I knocked
> up in an afternoon was not only still being used, but now relied upon
> heavily).
>
> So on to my point, MySQL guys will happily say "Hey, we're not saying
> that the features MySQL is missing aren't important, and we're working
> towards them, but in the meantime these issues can be worked around
> like this....."  and happily play the whole thing down.  Many LAMP
> developers aren't aware of the benefits of stored procedures, of
> triggers and other good stuff. Like myself, if they were aware how
> much easier life could be if these things were accessible to them,
> they'd probably be converts too.
>
> There is not enough emphasis put on the basic importance of these
> functions in PG.  Someone needs to standup and say "Hey, look how this
> can simplify your programming lives"  until I started using
> Druid/Postgres, I had no idea why I needed triggers or what a cascade
> effect did, or why I might want one.
> The Linux  community has grown  at least in part because it has
> educated  potential users and journo's to its benefits.  I believe if
> the PG advocacy team did the same, then it would attract many more
> serious LAMP developers.
>
> Like Linux vs. Windows, PG has an awful lot going for it in respect to
> MySQL, so why not crow about it.  It needs to be pointed at a crowd
> that are DB novices, they need to be told why PG is worth the
> time/knowledge investment, because anyone who reads the MySQL site,
> will come away with the impression that the Trigger, Stored Procs, and
> other things are a luxurious overhead not necessary for getting the
> job done.
>
> I'd gladly help out with such a paper, but find myself in the sad
> position of my prose being open to attack due to my newbieness in the
> DB world and not able to speak authoratatively on the subject.
>
> Have a think, I'd like to know if others agree.
>
> Cheers
>
> T.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>               http://archives.postgresql.org


Re: PostgreSQL Advocacy, Thoughts and Comments

From
merlyn@stonehenge.com (Randal L. Schwartz)
Date:
>>>>> "Unihost" == Unihost Web Hosting <tony@unihost.net> writes:

Unihost> Further to this post, what might actually work is to convince O'
Unihost> Reilly (since they have PostgreSQL book/s) to do some articles like
Unihost> they have for PG, but making full use of the PG database.  For
Unihost> instance, building a simple data-warehouse using PG.  Articles that
Unihost> show off an OSS product/project in a clearly enterprise light in a
Unihost> step-by-step fashion.  There have been so many articles on DB design
Unihost> using MySQL.  How about an article on DB design using all the
Unihost> functionality of a real ORDBMS.

Well, since I need 2.5 ideas per month for the three columns I'm still
writing, I'm certainly in a position to write nice things about PG,
although I always have to work it in from a Perl slant.

Actually, I'm sure that any of the magazines I'm in would appreciate
an additional article or two from me.

If you can think of something that fits in 2000 words or so (or 4000
if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate
some inspiration.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Nigel J. Andrews"
Date:
On Sat, 29 Nov 2003, Oliver Elphick wrote:

> On Sat, 2003-11-29 at 04:37, cnliou wrote:
> > "Jason Tesser" <JTesser@nbbc.edu>
> >
> > > MySQL cannot even handle sub-queries yet.
> >
> > Ohh! Really?
> > Allow me to pay my highest respect to the genius mySQL
> > programmers!
> > I completely have no clue on how to construct any single
> > tiny database on a DBMS having no sub-query capability.
> >
> > Being too dumb, I solicit mySQL programmers' help by showing
> > me employee FOO's birthday and his/her latest job title
> > effective on or before 2003-1-1 from the following tables:
> >
> > CREATE TABLE t1 (employee TEXT,BirthDay DATE);
> > CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle
> > TEXT);
> >
> > And make the result like this:
> >
> > FOO  1980-1-1   programmer
> >
> > Please do not give me the answer that you will merge these
> > two tables to form one like this:
> >
> > CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate
> > DATE,JobTitle TEXT);
>
> I have great trouble following your meaning, but I think you are talking
> about joining two tables in a query:
>
>    SELECT t1.employee, t1.birthday, t2.jobtitle
>      FROM t1, t2
>     WHERE t1.employee = t2.employee;
>
> That is not the same as using a sub-query:
>
>    SELECT employee
>      FROM t1
>     WHERE birthday > (
>           SELECT MIN(effectivedate)
>             FROM t2
>           );
>
>   (select employees who were born after the longest-serving employee
>    started work.)
>

I think he means for the employee FOO show only the latest job title. Or in
other words:

SELECT t1.employee, t1.birthday, t2.jobtitle
 FROM t1, t2
 WHERE
   t1.employee = t2.employee
  AND
   t1.employee = 'FOO'
  AND
   t2.effectivedate > CAST('2003-1-1' TO DATE)
 ORDER BY t2.effectivedate DESC
 LIMIT 1


which of course uses a PostgreSQL customisation.

I've got a feeling it's possible doing self joins and the like but I'll leave
it at that I think.


--
Nigel


Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Chris Travers"
Date:
Tony <tony@unihost.net> Wrote:
>  Now many
> consultant/developer/sys-admins like myself are going to client site on
> a contract (this is especially true in the UK, I can't speak for
> anywhere else) and finding complex stocktrading systems, inventory
> systems, CRM systems, and others, all written in PHP backed by MySQL.

I started the CRM system I am developing on MySQL before realizing it was
the wrong choice.  Part of it is simple because people have heard of the
software and don't have the time/stamina/patience to do proper research into
the benefits of alternatives.  There is also a learning curve when going
from MySQL to a more standards-compliant RDBMS like PostgreSQL.  Heck, I
found that going from PostgreSQL to Firebird give me headaches :-P  And
these RDBMS's have most of the same features!

> Whether this is right or wrong, good choice or bad choice is not what
> I'm interested in debating.   The point is that when these systems where
> architected, the developers used MySQL not because they were dumb, but
> because many of them develop awesome code and can get around most
> problems in the code, with a little ingenuity.  Many simply do not have
> the insight into the potential benefits of *proper* RDBMS can offer.

I would actually venture to say that many of them are using the RDBMS as a
sort of object-persistance store, and not really trying to use the
*relational* features of the software.  They might as well be using Berkeley
DB 4.  I know that is how I started with MySQL.

What most of these programmers do not understand is that an RDBMS is not
simply a search-engine for stored persistant objects, but is actually a
fully-featured information storage management system.  With the right
features, this information can be stored, queried, presented in another
form, etc. all while ensuring that the stored information is EXACTLY what
was intended. The tasks that the RDBMS handles include data storage,
integrity enforcement, and data presentation.  Most MySQL programmers only
use it for data storage.  Sadly, this is about all MySQL is good for, and
hence the barrier to learning how to USE a REAL RDBMS are a bit higher
because of the prevalence of the likes of MySQL and MS Access.

> The second scenario, is with admin systems, written by people like
> myself for companies, whether they be simple or complex systems, that
> are intended as a temporary work around to an immediate problem.
> In a very short space of time the stop-gap application you had written
> to sort out the immediate problem quickly becomes a core business
> application (I recently returned to a site after not being there for two
> years and the temporary address book/ email system that I knocked up in
> an afternoon was not only still being used, but now relied upon heavily).

But again, if you start with the right tools, it is easier to modify later
to adapt to changing needs.  I think that this is one of the messages we
should be presenting.  With updateable views, different applications can
even have access to different presentations of the data.

> So on to my point, MySQL guys will happily say "Hey, we're not saying
> that the features MySQL is missing aren't important, and we're working
> towards them, but in the meantime these issues can be worked around like
> this....."  and happily play the whole thing down.  Many LAMP developers
> aren't aware of the benefits of stored procedures, of triggers and other
> good stuff. Like myself, if they were aware how much easier life could
> be if these things were accessible to them, they'd probably be converts
too.

Agreed completely.  Now we just have to sell the PostgreSQL solution.  Here
is what the MySQL people will say (and we need good evidence to counter):
1:  MySQL is faster.
2:  MySQL has more community support.
3:  MySQL has replication as part of its core distribution.  MySQL's
replication is better tested...

> There is not enough emphasis put on the basic importance of these
> functions in PG.  Someone needs to standup and say "Hey, look how this
> can simplify your programming lives"  until I started using
> Druid/Postgres, I had no idea why I needed triggers or what a cascade
> effect did, or why I might want one.

The basic issue is that many programmers are not taught to value information
management systems, such as RDBMS's.  These programmers are interested only
in the data storage issues of the database, and not on how to use it to
manage the information stored therein. Changing this may take a lot of
effort.  Also, using an RDBMS to its full extent rubs some OO programmers
the wrong way because it strikes them as violating rules of OO design.  Of
course, then why not use an OO database? ;-)

> The Linux  community has grown  at least in part because it has
> educated  potential users and journo's to its benefits.  I believe if
> the PG advocacy team did the same, then it would attract many more
> serious LAMP developers.

I agree.  But it will take some time to sell, and will require some
extremely basic work to get going.

> I'd gladly help out with such a paper, but find myself in the sad
> position of my prose being open to attack due to my newbieness in the DB
> world and not able to speak authoratatively on the subject.

I would be happy to help out as well.  How about a paper entitled:
Why PostgreSQL?
It would cover why PostgreSQL is ideal for serious application development.

Note, I have only about 2 years experience with PostgreSQL, and so I would
feel more comfortable with lengthy peer review of whatever I write, but I am
reasonably familiar with both worlds, and can make a strong case for
PostgreSQL, I think.

> Have a think, I'd like to know if others agree.
>
> Cheers
>
> T.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>
>


Re: PostgreSQL Advocacy, Thoughts and Comments

From
Tony
Date:
Comments within:

Chris Travers wrote:
Tony <tony@unihost.net> Wrote: 
 Now many
consultant/developer/sys-admins like myself are going to client site on
a contract (this is especially true in the UK, I can't speak for
anywhere else) and finding complex stocktrading systems, inventory
systems, CRM systems, and others, all written in PHP backed by MySQL.   
I started the CRM system I am developing on MySQL before realizing it was
the wrong choice.  Part of it is simple because people have heard of the
software and don't have the time/stamina/patience to do proper research into
the benefits of alternatives.  There is also a learning curve when going
from MySQL to a more standards-compliant RDBMS like PostgreSQL.  Heck, I
found that going from PostgreSQL to Firebird give me headaches :-P  And
these RDBMS's have most of the same features!
 
It's the learning curve part that I'm finding difficult, not because it's a too complicated, but because I can't find a good source of information to learn from.  I'm sure I'll get flamed for this, but I seem to be unable to find information on proper design principle, including where and when to use triggers, stored procs, etc, etc, that isn't 20 years old already.  In the liquid world of IT, I find it worrying (perhaps incorrectly) learning from a book written 14 years ago. 

Whether this is right or wrong, good choice or bad choice is not what
I'm interested in debating.   The point is that when these systems where
architected, the developers used MySQL not because they were dumb, but
because many of them develop awesome code and can get around most
problems in the code, with a little ingenuity.  Many simply do not have
the insight into the potential benefits of *proper* RDBMS can offer.   
I would actually venture to say that many of them are using the RDBMS as a
sort of object-persistance store, and not really trying to use the
*relational* features of the software.  They might as well be using Berkeley
DB 4.  I know that is how I started with MySQL.
 

Agreed...
What most of these programmers do not understand is that an RDBMS is not
simply a search-engine for stored persistant objects, but is actually a
fully-featured information storage management system.  With the right
features, this information can be stored, queried, presented in another
form, etc. all while ensuring that the stored information is EXACTLY what
was intended. The tasks that the RDBMS handles include data storage,
integrity enforcement, and data presentation.  Most MySQL programmers only
use it for data storage.  Sadly, this is about all MySQL is good for, and
hence the barrier to learning how to USE a REAL RDBMS are a bit higher
because of the prevalence of the likes of MySQL and MS Access. 

Indeed, and I believe that the lack of education with regards to this (or even available information presented at the right level) perpetuates the issue, along with the mis-information put forward by MySQL that these aspects aren't really all that important anyway.
 
The second scenario, is with admin systems, written by people like
myself for companies, whether they be simple or complex systems, that
are intended as a temporary work around to an immediate problem.
In a very short space of time the stop-gap application you had written
to sort out the immediate problem quickly becomes a core business
application (I recently returned to a site after not being there for two
years and the temporary address book/ email system that I knocked up in
an afternoon was not only still being used, but now relied upon heavily).   
But again, if you start with the right tools, it is easier to modify later
to adapt to changing needs.  I think that this is one of the messages we
should be presenting.  With updateable views, different applications can
even have access to different presentations of the data.
 
So on to my point, MySQL guys will happily say "Hey, we're not saying
that the features MySQL is missing aren't important, and we're working
towards them, but in the meantime these issues can be worked around like
this....."  and happily play the whole thing down.  Many LAMP developers
aren't aware of the benefits of stored procedures, of triggers and other
good stuff. Like myself, if they were aware how much easier life could
be if these things were accessible to them, they'd probably be converts   
too.

Agreed completely.  Now we just have to sell the PostgreSQL solution.  Here
is what the MySQL people will say (and we need good evidence to counter):
1:  MySQL is faster.
2:  MySQL has more community support.
3:  MySQL has replication as part of its core distribution.  MySQL's
replication is better tested... 

1.  Let's do apples to apples, NOT apples to Oranges as has been done many times in the past.   It would be far more useful IMHO to put forward a "real world" scenario (rather than purely transactions per second) which shows 2 examples of the same application, the first MySQL written with much of the logic in the code, PHP, Perl, whatever. the second would be written in the same language, with the appropriate use of the true RDBMS features in full effect.  Then let's comapare results.  I'm sure that the more we can usefully abstract to the DB the more performace you'll see from PG, especially as load increases.  The big issue I've found is not raw speed, if half of the logic is in regexps and algorithms in the scripting, this will be many times slower to run than the same en DB.  Although correct me if I'm way off the mark here.  Also the point needs to be made that, it is far more trivial to write other interfaces to the program/application.  Who wouldn't enjoy the ability to work on their CRM with a Python or Java frontend, without the need to write 75% of the logic in a different language.
 
There is not enough emphasis put on the basic importance of these
functions in PG.  Someone needs to standup and say "Hey, look how this
can simplify your programming lives"  until I started using
Druid/Postgres, I had no idea why I needed triggers or what a cascade
effect did, or why I might want one.   
The basic issue is that many programmers are not taught to value information
management systems, such as RDBMS's.  These programmers are interested only
in the data storage issues of the database, and not on how to use it to
manage the information stored therein. Changing this may take a lot of
effort.  Also, using an RDBMS to its full extent rubs some OO programmers
the wrong way because it strikes them as violating rules of OO design.  Of
course, then why not use an OO database? ;-) 

In some respects adding much of the logic to the DB doesn't fly in the face of OO, since you are providing an encapsulated data interface, only needing to change the programatic code in one place
(the DB) for potentially high returns in many places.  Hence taking OO one step further than the language and abstracting one level further.
 
The Linux  community has grown  at least in part because it has
educated  potential users and journo's to its benefits.  I believe if
the PG advocacy team did the same, then it would attract many more
serious LAMP developers.   
I agree.  But it will take some time to sell, and will require some
extremely basic work to get going.
 
I'd gladly help out with such a paper, but find myself in the sad
position of my prose being open to attack due to my newbieness in the DB
world and not able to speak authoratatively on the subject.   
I would be happy to help out as well.  How about a paper entitled:
Why PostgreSQL?
It would cover why PostgreSQL is ideal for serious application development. 

What about "PostgreSQL - Ideal for any application development"
Note, I have only about 2 years experience with PostgreSQL, and so I would
feel more comfortable with lengthy peer review of whatever I write, but I am
reasonably familiar with both worlds, and can make a strong case for
PostgreSQL, I think. 

I have only 2 months and have oodles of enthusiasm, but short on PG knowledge, and even shorter on real RDBMS basics and experience.  But I'll be happy to proof read as a novice. 
 
 

Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Chris Travers"
Date:
Re: [GENERAL] PostgreSQL Advocacy, Thoughts and CommentsRegarding the
learning curve issue, maybe people can recommend their favorite books.  I
recommend "SQL Unleashed" (I forget the author), pub. Samms.  "SQL For
Smarties" also gets recommended often around here, but again, I don't know
the author (or in this case, even the publisher).  Maybe there are others
too that people can recommend.

As for the paper-- I think your title is good ("PostgreSQL - Ideal for any
application development"), and I will have to look at how to organize it.  I
wanted to cover the following topics:
1)  Different types of databases, and what is meant by Object Relational
2)  Enterprise-ready features (Views, Stored Proceedures, Subselects, etc.)
3)  RDBMS tasks and how these features fit in.

Perhaps a followup paper could be written as a basic treatise in database
design.

The other issues I am seing here involve finding a suitable venue for
publication.  Any suggestions here are welcome also.

Best Wishes,
Chris Travers


Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Chris Travers"
Date:
Hi;

The MySQL manual states that Joins perform better than subselects.  The
stated reason is that the planner can better optimize a join.

I am, however, very skeptical of this, and would generally assume things to
be the other way around.

How do subselects perform in relation to hash joins, etc?

Best Wishes,
Chris Travers


Re: PostgreSQL Advocacy, Thoughts and Comments

From
Martijn van Oosterhout
Date:
Hehe, it's kinda funny when people say things like that.

It depends on what you call a "join". From a certain point of view, an IN
expression is a join too, making subselects a slightly unusual join.

In some ways I think a subselect is the SQL way way of expressing a type of
join that cannot be expressed in the normal way and that with a different
query language, the concept may go away entirely.

I do know that in recent versions of postgres, certain types of subselects
are actually optimised into specific types of joins. It's all just
relational algebra people!

So the moral is, MySQL needs a better query optimiser?

Hope this helps,

On Sun, Nov 30, 2003 at 07:59:09PM +0700, Chris Travers wrote:
> Hi;
>
> The MySQL manual states that Joins perform better than subselects.  The
> stated reason is that the planner can better optimize a join.
>
> I am, however, very skeptical of this, and would generally assume things to
> be the other way around.
>
> How do subselects perform in relation to hash joins, etc?
>
> Best Wishes,
> Chris Travers
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Attachment

Re: PostgreSQL Advocacy, Thoughts and Comments

From
Tom Lane
Date:
"Chris Travers" <chris@travelamericas.com> writes:
> The MySQL manual states that Joins perform better than subselects.

Very possibly true ... in MySQL.  Since they have such an immature
subselect implementation (not even out of alpha apparently), it'd
not be surprising if they can't optimize subselects worth a damn yet.
Our planner has been hacked on repeatedly to do a good job with
subselects --- and I wouldn't want to imply that the process is done
yet.

One of the more amusing aspects of watching MySQL's response to the
"feature race" is how they invariably gloss over the difference between
having a minimal implementation of a feature, and having a feature that
is mature, complete, and efficient.  Subselects are one example where
there's a lot of mileage yet to cover after you get to the point where
you can say "it works".

            regards, tom lane

Re: PostgreSQL Advocacy, Thoughts and Comments

From
Oleg Bartunov
Date:
Randal,

we have contrib/tsearch2 which could be used for full text searching
and perl wrapper illustrating simple search engine would be quite
useful.

    Oleg

On Sat, 29 Nov 2003, Randal L. Schwartz wrote:

> >>>>> "Unihost" == Unihost Web Hosting <tony@unihost.net> writes:
>
> Unihost> Further to this post, what might actually work is to convince O'
> Unihost> Reilly (since they have PostgreSQL book/s) to do some articles like
> Unihost> they have for PG, but making full use of the PG database.  For
> Unihost> instance, building a simple data-warehouse using PG.  Articles that
> Unihost> show off an OSS product/project in a clearly enterprise light in a
> Unihost> step-by-step fashion.  There have been so many articles on DB design
> Unihost> using MySQL.  How about an article on DB design using all the
> Unihost> functionality of a real ORDBMS.
>
> Well, since I need 2.5 ideas per month for the three columns I'm still
> writing, I'm certainly in a position to write nice things about PG,
> although I always have to work it in from a Perl slant.
>
> Actually, I'm sure that any of the magazines I'm in would appreciate
> an additional article or two from me.
>
> If you can think of something that fits in 2000 words or so (or 4000
> if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate
> some inspiration.
>
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: PostgreSQL Advocacy, Thoughts and Comments

From
Rick Morris
Date:
Tony wrote:

> HI All,
>
> I'm glad that this thread prompted some thoughtful response.   I think
> one of my main points I was trying to make, Jason hit the nail on the
> head.  The article to which I was referring uses a great example which
> I have experienced many times before, but in order to grasp this, PHP
> et al, must be thought of as a scripting language which crosses many
> corporate boundries, and it is easy to assume that it's primary use
> (simple web site back ends) are the only thing to discuss.  But the
> situation has changed enourmously since the release of PHP v4.  Now
> many consultant/developer/sys-admins like myself are going to client
> site on a contract (this is especially true in the UK, I can't speak
> for anywhere else) and finding complex stocktrading systems, inventory
> systems, CRM systems, and others, all written in PHP backed by MySQL.

So true! I am in the U.S (Florida), and I am seeing the same thing here.
Starting around 2000, many fairly complex, mission-critical PHP/MySQL
apps were developed, which are just beginning to surface. We all know
how prevalent PHP and MySQL became overnight, but how many of us realize
that it was not just used for 'lightweight' applications?. Imagine how
big a problem all these PHP/MySQL applications are going to become over
the next few years. I have had the dubious pleasure of moving a few of
these from MySQL to PostgreSQL already (Yes, financial systems using
MySQL's unconstrained numeric types!!), and I shudder to think about all
the companies that might end up with *years* of poorly-constrained data.

> Whether this is right or wrong, good choice or bad choice is not what
> I'm interested in debating.   The point is that when these systems
> where architected, the developers used MySQL not because they were
> dumb, but because many of them develop awesome code and can get around
> most problems in the code, with a little ingenuity.  Many simply do
> not have the insight into the potential benefits of *proper* RDBMS can
> offer.   Had they had the benefit of such knowledge the code they have
> written would be faster (in DB) and more legible. Sadly often the
> developers are the only source of DBA for some of these companies.

Most medium/small business managers don't even know there is a
difference between the two.

<snip>

> Like Linux vs. Windows, PG has an awful lot going for it in respect to
> MySQL, so why not crow about it.  It needs to be pointed at a crowd
> that are DB novices, they need to be told why PG is worth the
> time/knowledge investment, because anyone who reads the MySQL site,
> will come away with the impression that the Trigger, Stored Procs, and
> other things are a luxurious overhead not necessary for getting the
> job done.
>
> I'd gladly help out with such a paper, but find myself in the sad
> position of my prose being open to attack due to my newbieness in the
> DB world and not able to speak authoratatively on the subject.

You're not doing too badly, really. Your writing is good and clear, and
your knowlege is well above the typical corporate IT magazine hack ;-).

Regards,

Rick Morris


Re: PostgreSQL Advocacy, Thoughts and Comments

From
Brendan Jurd
Date:
Hello,

I'd just like to add that this thread is perfectly aligned with my own
experiences on the DB front.

When I started out as a developer, I was familiar with the concepts of
good database design, but *not* with the range of DBMSs available, and
their respective advantages and disadvantages.  I started using MySQL
because it was popular, fast, free and (sort of) easy to set up.  Then
my major project, a payroll management system, previously PHP/MSSQL,
needed a new environment established very quickly.  I used MySQL because
time was short and it was what I knew.  To get around the lack of proper
foreign key constraints I did crazy things like manually maintaining a
table of "foreign keys" and enforcing them from PHP.

When the client on that project hired a new sysadmin, the sa took one
look at the system and said "Uhh, dude.  Do you know anything about
postgres?"

I'd heard of postgres but didn't really know anything about it.  After
having the sysadmin tell me about postgres' capabilities, and checking
out the manual for myself, I realised that I'd been barking up the wrong
proverbial tree for months.  So we migrated the system to postgres
earlier this year, and I couldn't be more pleased with it.  The
immediate payoff of having referential integrity enforced properly was
wonderful, but the advantages just kept on coming.  I started using
postgres functions and views to shift more of the work of "organising
data" over to the piece of software that *should* be handling it.  Some
of those views enabled me to chop vast tracts of superfluous PHP code
out of the system.  The more I used postgres, the more I came to
understand -- this is the way it ought to be.

BJ

Rick Morris wrote:

> Tony wrote:
>
>> HI All,
>>
>> I'm glad that this thread prompted some thoughtful response.   I
>> think one of my main points I was trying to make, Jason hit the nail
>> on the head.  The article to which I was referring uses a great
>> example which I have experienced many times before, but in order to
>> grasp this, PHP et al, must be thought of as a scripting language
>> which crosses many corporate boundries, and it is easy to assume that
>> it's primary use (simple web site back ends) are the only thing to
>> discuss.  But the situation has changed enourmously since the release
>> of PHP v4.  Now many consultant/developer/sys-admins like myself are
>> going to client site on a contract (this is especially true in the
>> UK, I can't speak for anywhere else) and finding complex stocktrading
>> systems, inventory systems, CRM systems, and others, all written in
>> PHP backed by MySQL.
>
>
> So true! I am in the U.S (Florida), and I am seeing the same thing
> here. Starting around 2000, many fairly complex, mission-critical
> PHP/MySQL apps were developed, which are just beginning to surface. We
> all know how prevalent PHP and MySQL became overnight, but how many of
> us realize that it was not just used for 'lightweight' applications?.
> Imagine how big a problem all these PHP/MySQL applications are going
> to become over the next few years. I have had the dubious pleasure of
> moving a few of these from MySQL to PostgreSQL already (Yes, financial
> systems using MySQL's unconstrained numeric types!!), and I shudder to
> think about all the companies that might end up with *years* of
> poorly-constrained data.
>
>> Whether this is right or wrong, good choice or bad choice is not what
>> I'm interested in debating.   The point is that when these systems
>> where architected, the developers used MySQL not because they were
>> dumb, but because many of them develop awesome code and can get
>> around most problems in the code, with a little ingenuity.  Many
>> simply do not have the insight into the potential benefits of
>> *proper* RDBMS can offer.   Had they had the benefit of such
>> knowledge the code they have written would be faster (in DB) and more
>> legible. Sadly often the developers are the only source of DBA for
>> some of these companies.
>
>
> Most medium/small business managers don't even know there is a
> difference between the two.
>
> <snip>
>
>> Like Linux vs. Windows, PG has an awful lot going for it in respect
>> to MySQL, so why not crow about it.  It needs to be pointed at a
>> crowd that are DB novices, they need to be told why PG is worth the
>> time/knowledge investment, because anyone who reads the MySQL site,
>> will come away with the impression that the Trigger, Stored Procs,
>> and other things are a luxurious overhead not necessary for getting
>> the job done.
>>
>> I'd gladly help out with such a paper, but find myself in the sad
>> position of my prose being open to attack due to my newbieness in the
>> DB world and not able to speak authoratatively on the subject.
>
>
> You're not doing too badly, really. Your writing is good and clear,
> and your knowlege is well above the typical corporate IT magazine hack
> ;-).
>
> Regards,
>
> Rick Morris
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>               http://archives.postgresql.org



Re: PostgreSQL Advocacy, Thoughts and Comments

From
Robert Treat
Date:
On Sunday 30 November 2003 05:38, Tony wrote:
> Comments within:
> Chris Travers wrote:
> >Tony <tony@unihost.net> Wrote:
> >Agreed completely.  Now we just have to sell the PostgreSQL solution.
> > Here is what the MySQL people will say (and we need good evidence to
> > counter): 1:  MySQL is faster.
> >2:  MySQL has more community support.
> >3:  MySQL has replication as part of its core distribution.  MySQL's
> >replication is better tested...
>
> 1.  Let's do apples to apples, NOT apples to Oranges as has been done
> many times in the past.   It would be far more useful IMHO to put
> forward a "real world" scenario (rather than purely transactions per
> second) which shows 2 examples of the same application, the first MySQL
> written with much of the logic in the code, PHP, Perl, whatever. the
> second would be written in the same language, with the appropriate use
> of the true RDBMS features in full effect.  Then let's comapare
> results.  I'm sure that the more we can usefully abstract to the DB the
> more performace you'll see from PG, especially as load increases.  The
> big issue I've found is not raw speed, if half of the logic is in
> regexps and algorithms in the scripting, this will be many times slower
> to run than the same en DB.  Although correct me if I'm way off the mark
> here.  Also the point needs to be made that, it is far more trivial to
> write other interfaces to the program/application.  Who wouldn't enjoy
> the ability to work on their CRM with a Python or Java frontend, without
> the need to write 75% of the logic in a different language.
>

Here's one of the more famous articles of this type which documents the switch
from mysql to postgresql by the folks at sourcforge. (and we didnt even have
to give them thousands of $$$ in development and product support like IBM...
:-)

http://www.phpbuilder.com/columns/tim20001112.php3

You'll note in one test postgresql performs at 6x mysql... and this was with
version 7.1.


Another interesting note is the mention that interbase was looked at and then
passed over due, in part, to "serious community and corporate issues around
it's Open Sourceness".

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: PostgreSQL Advocacy, Thoughts and Comments

From
Scott Ribe
Date:
> I would be happy to help out as well.  How about a paper entitled:
> Why PostgreSQL?
> It would cover why PostgreSQL is ideal for serious application development.

Good idea.

> Note, I have only about 2 years experience with PostgreSQL, and so I would
> feel more comfortable with lengthy peer review of whatever I write, but I am
> reasonably familiar with both worlds, and can make a strong case for
> PostgreSQL, I think.

Same here...


--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 665-7007 voice


Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Rick Gigger"
Date:
Here is a link to the sql for smarties book:

http://www.amazon.com/exec/obidos/tg/detail/-/1558603239/102-3995931-7263333?v=glance

by Joe Celko

Has some cool ways of handling trees in sql

----- Original Message -----
From: "Chris Travers" <chris@travelamericas.com>
To: "Tony" <tony@unihost.net>
Cc: <pgsql-general@postgresql.org>
Sent: Sunday, November 30, 2003 5:49 AM
Subject: Re: [GENERAL] PostgreSQL Advocacy, Thoughts and Comments


> Re: [GENERAL] PostgreSQL Advocacy, Thoughts and CommentsRegarding the
> learning curve issue, maybe people can recommend their favorite books.  I
> recommend "SQL Unleashed" (I forget the author), pub. Samms.  "SQL For
> Smarties" also gets recommended often around here, but again, I don't know
> the author (or in this case, even the publisher).  Maybe there are others
> too that people can recommend.
>
> As for the paper-- I think your title is good ("PostgreSQL - Ideal for any
> application development"), and I will have to look at how to organize it.
I
> wanted to cover the following topics:
> 1)  Different types of databases, and what is meant by Object Relational
> 2)  Enterprise-ready features (Views, Stored Proceedures, Subselects,
etc.)
> 3)  RDBMS tasks and how these features fit in.
>
> Perhaps a followup paper could be written as a basic treatise in database
> design.
>
> The other issues I am seing here involve finding a suitable venue for
> publication.  Any suggestions here are welcome also.
>
> Best Wishes,
> Chris Travers
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>


Re: PostgreSQL Advocacy, Thoughts and Comments

From
"Chris Travers"
Date:

Here is a paper I have written for the purposes of providing some additional educational material for the MySQL crowd.  Permission is granted for it to be redistributed via the mailing list and archives, but not to be posted elsewhere until I have a final version completed.

I am interested in peer review regarding the concepts in this paper, etc.  Any feedback?

 

Choosing a Database Manager

A Study of Open Source Relational Database Management Systems

By Chris Travers

 

Introduction

There sometimes comes a point of epiphany, dreaded by programmers, when it becomes clear that at least one of the components of an application is completely inadequate for the task at hand, and thus limits progress or usefulness of the program in some fundamental way.  All too often this is the relational database management system (RDBMS).

 

Choosing the proper relational database management system can make or break an application.  The choice of a primary RDBMS is thus very important for the long-term viability of an application.  While this paper focuses on open source RDBMS’s, the same methodology can be applied towards commercial systems as well, such as Oracle, MS SQL, and DB2. Also, the concept of application specific database managers will be discussed briefly in order to provide some degree of contrast to the relational paradigm.

Types of Database Managers

Application Specific Database Managers

Application specific database managers include the GDBM (GNU Database Manager), and the Berkeley Database Manager (available, open source or commercially licensed from Sleepycat software).  Basically, such a database provides a simple persistence layer for database objects.  They do not allow for complex querying, and are often used in single-purpose applications where the data does not need to be accessed by any outside application.

 

If two applications share the same database using these database managers, they must be intimately familiar with the internal data structures that are stored in the database manager.  No presentation is given to the data aside from how it is represented in the database.

Relational Database Managers

In contrast to the application specific database managers, relational database managers present an abstraction layer between the information in the database and the application using that information.  The information can then be subject to much more complex queries, and the information can be represented to the application in whatever form makes sense to the application.  Usually, SQL is the language used for such queries.

 

In a relational database manager, data is stored in “tables” and tables can be joined together based on common criteria.  In this way, several different applications can access the same information in different ways.

What does “Object Relational” mean?

PostgreSQL is “Object Relational” which means that although it is a relational database management system, the database “objects” have at least some of the traits of objects in object-oriented programming.  This tables can be inherited, though this does not apply to user defined data types, etc.  With an Object Relational system, purely relational databases can be designed and used, but also inheritance can be used in various ways.  Use of such features is beyond the scope of this paper, however.

Features Important to Enterprise Applications

Subselects

Subselects are important in large applications for two reasons.  First they allow performance tuning for certain types of queries above and beyond what a planner can normally do.  Very often subselects will outperform inner joins because the planner can focus on a simpler pattern with a maximum of one table scan per subselect and a smaller memory footprint than other forms of single-scan joins.

 

Secondly, when compared to inner joins, subselects often make for more readable queries, especially where several tables are involved.  Greater readability makes for easier maintenance of the application.

Views

A view is a stored query that appears to be a table from the application’s perspective.  Views can be used to create alternative representations of the data for other applications or for reporting purposes.  In many database managers, views can be updateable, and sometimes data can even be inserted or deleted from the view.  Thus views can be used to create logical tables that present data in a way that is less normalized than the actual structure of the information in the tables themselves.

 

Views can also be used for partitioning tables for permissions purposes-- vertically (groups of rows), horizontally (groups of columns), or both.  They can also be used to create complex virtual tables for reporting or online analytic processing, or OLAP.  In this way, views can help manage information and also help make sense of complex patterns as well.

Stored Procedures

Stored Procedures, also known as User Defined Functions, allow for a greater quantity of business logic to be stored in the database.  These functions can allow for the development of more complex views than would otherwise be possible, and can also allow for the development of a database-level API that can then be accessed by any application that has access to the database.  In this way, they can reduce development time by allowing the development of “modules” which are common to all clients of the database.

 

Stored procedures also reduce network latency issues by reducing the number of communications between the client and server.  Whole sets of queries can be run before any result set needs to be sent to the client.  In this way, performance can be improved, given sufficient hardware on the side of the database server.

Triggers

Triggers are a mechanism for automatic maintenance and management of information in the database.  Triggers can be activated when information is modified in any table. Triggers are often used to enforce referential integrity (ensuring that records in one table reference records in another table that actually exist).

 

One example of another use of a trigger is in maintaining some other record of the change made to the information of a database.  This information can later be used for reporting purposes to ensure that a consistent picture of the database can be constructed for any arbitrary point in time.  For example, I could store historical information about prices for items from my online store, and then do historical processing to see how sales were affected by price changes at a later date since all historical information would be maintained in the database.

Database Tasks: How These Features Help

Data Integrity Enforcement

For data to be useful in a large business program, it must be meaningful.  In many cases, the largest threat to the ability to make the data meaningful is the so-called “orphaned record” problem.  Orphaned records refer to records in other tables that, from the point of view of the application, no longer exist.  This can happen when records are deleted or their primary keys are updated.  Triggers can be used to ensure that this problem never exists by doing one of several things:  cascading updates/deletes, restricting updates/deletes when records reference the column being updated or row being deleted, or setting the foreign key in the referencing table to NULL (which means, in database terms, “unknown”).

 

Triggers can also restrict the types of information being inserted or updated in the database, ensuring that only meaningful input is stored.  They can also be used to maintain historical information, allowing a consistent picture of all tables to be built for any point in time.

Data Presentation

Assume for a moment that your database has several front-ends that access the information in different ways.  Perhaps this is not the case now, but it could be in the future.  It is extremely helpful if the database manager can present the information to the programs in ways that would be natural for them.  This effectively negates the requirement that every application accessing the database must have in-depth knowledge of the database schema.  Instead, the application’s expected database schema can be presented to it.

 

A very useful side effect of this capability is in integrating applications that may or may not have been designed to be integrated.  In this way, larger, more unified databases can be built for use of internal applications. 

 

Views and, depending on complexity, stored procedures would be needed to implement this sort of system, which could also provide advanced reporting capabilities, also based on views and stored procedures.

Common Modules As Stored Procedures

When several programs access the same database, there may be a reasonable amount of common functionality in the functions used by the programs.  These can be simplified using views to some extent, but often this is simply not enough.  Using stored procedures, the functions that two or more front-ends have in common can be moved to the database backend, so that there is one single point of maintenance and reference.  This technique can reduce development and maintenance time and is essential for any database being used by multiple applications.

Questions to Ask

Where will my application be deployed?

The choice of a database manager must, no doubt, take into account availability of the solutions at the point of deployment.  Of course, applications run at the business’s site can use whatever solutions are found to be affordable and/or best suited to the task at hand.  Hosted web applications are more at the mercy of what is available, however.

How necessary is integration with other programs?

Integration capability is always a selling point, as it allows for more powerful solutions to be built in pieces and lowers overall system maintenance.  However, the ability to integrate with other solutions may come at a cost, especially regarding availability of the chosen database manager in a shared hosting environment.

Major Open Source Relational Database Managers

MySQL

MySQL is the most common open source RDBMS among low-cost web hosting providers.  As of version 4.0, it does not support subselects, views, stored procedures, or triggers.  Additionally, its enforcement of data integrity is questionable at best, and it is sometimes known to truncate numeric or text values when they are too large to fit.  It is also sometimes possible to insert invalid timestamps into the database.  This can cause extreme problems for any application that relies on the information in the database being exactly what was entered (for example, accounting applications).

 

Even when views are supported in MySQL, it may be some time before these views can be used for insert, update, or delete queries.  A similar period of maturation should be expected for other features such as stored procedures and triggers.

 

Despite its limitations, MySQL’s ubiquity and the availability of a Windows port makes it an attractive database platform.  For internal business tools, however, there are better choices.

Firebird

Firebird is based on a source-code release of Borland’s Interbase RDBMS.  It is mature, and supports views, subselects, triggers, and stored procedures. As of version 1.0, it has a native Windows port available, making it ideal for in-house applications that must support both Windows and UNIX operating systems.  However, I found the learning curve to be steeper than other RDBMS’s particularly regarding the development of stored procedures.

PostgreSQL

PostgreSQL is an object-relational database management system that features every feature listed here, and extensible type system, and the ability to write stored procedures in a variety of different languages.  Indeed, there are four stored procedure languages (Perl, Tcl, SQL, and PL/PGSQL) distributed with PostgreSQL and many more available from third-party sites.

 

Although the learning curve is steeper than that of MySQL, I found it easier than Firebird.

 

As of version 7.4, PostgreSQL does not yet have a native Windows port available, making it unsuitable at the moment for applications that must be able to run on Windows database servers.

 

Although PostgreSQL has not achieved the ubiquity among shared hosting providers that MySQL has, there are many providers offering hosting at a variety of costs.  You can find a list of them at http://techdocs.postgresql.org/hosting.php.

Conclusion

Selecting the right database manager is crucial to the long-term viability of any application.  Care should be taken to ensure that the database manager chosen will continue to meet the needs of the application in the future as well as in the present.

Re: PostgreSQL Advocacy, Thoughts and Comments

From
Alex Satrapa
Date:
Chris Travers wrote:
>
>   Here is a paper I have written for the purposes of providing some
>   additional educational material for the MySQL crowd.

Here's my contribution:


Why I choose PostgreSQL (PostgreSQL in 21 Seconds)

I choose referential integrity, meaning my lookups always work.

I choose stored procedures, meaning all my developers - Windows or Unix,
Perl, C++ or Java - can access the database in the same way, using the
same locking, with the same checking and cleaning

I choose subselects and outer joins, which allow me to build complex
queries to get exactly the information I want from the database, rather
than wasting my time munging data in my code. Even better, I can put
those common queries into stored procedures, so other developers can get
the same results as I do!

I choose partial indexes, so lookups on NULL fields are just as fast if
not faster.

I choose a user community that believes getting the results right is
more important than getting them quickly.

I choose getting the right results, right now!

I choose funny capitalisation, and a name that can't be pronounced!

I choose PostgreSQL.


Re: PostgreSQL Advocacy, Thoughts and Comments

From
Rory Campbell-Lange
Date:
On 29/11/03, Randal L. Schwartz (merlyn@stonehenge.com) wrote:
> Well, since I need 2.5 ideas per month for the three columns I'm still
> writing, I'm certainly in a position to write nice things about PG,
> although I always have to work it in from a Perl slant.
>
> Actually, I'm sure that any of the magazines I'm in would appreciate
> an additional article or two from me.
>
> If you can think of something that fits in 2000 words or so (or 4000
> if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate
> some inspiration.

Hi Randal

I think I may have an idea for an article which would address a common
problem for people writing database client interfaces:

The problem is simply explained.

Problem title:

    The "page of pages" problem (!)

The problem:

    You want to return a subset of a large number items using some
    fairly complex search criteria. You want to make only one database
    call, benefit from a cached query, and don't want to have all the
    rows in memory. How do you get the total count of pages for the
    relevant search criteria?

Why is this relevant?

    Moving logic that is inherent to the database to the database
    provides a potentially rich yet simple interface to database
    queries that can benefit a number of client applications.

    Typically this sort of query would be written as at least two
    dynamically generated queries in the client program that has to be
    parsed by the backend before it is executed. By using functions we
    can hide complex joins behind simple field names, and provide
    flexible (if limited) search capabilites, as well as caching and
    sensible error messages.

Approach:

    Using Postgres one can construct a function and then do either

       SELECT * from function fn_explore($searchstring, $limit, $offset);
    OR
       SELECT
        *
       FROM
        function fn_explore()
       WHERE
        searchterm ~* 'test'
       LIMIT
        5
       OFFSET
        10;

    What is cool about the second format is that (if the function
    returned a type 'explore_result' as below), your PHP/Perl programmer
    can at their interface do something like

        '... where id <> 1 AND author IN ('james', 'bill')...'

    However I don't know how you get back the total rows in this case,
    also maybe the caching effects are minimised?

    --------------------------------------------------------------------
    Type definition:

    CREATE TYPE explore_result as (
        id                 INTEGER,   -- some sort of row id
        total              INTEGER,   -- total rows for query
        author             VARCHAR,
        image              BYTEA
        /*
        Not needed unless search is done outside db.
        , searchterm         VARCHAR
        */
    );

    --------------------------------------------------------------------
    Sketch function definition:

    CREATE OR REPLACE FUNCTION
        fn_explore (integer, integer, integer) RETURNS setof explore_result
    AS '
    DECLARE
        searchstring  ALIAS for $1;
        offsetter     ALIAS for $2;
        limiter       ALIAS for $3;
        resulter      explore_page%rowtype;
    BEGIN

        /*
          variable verifation section chopped
        */

        FOR resulter IN
            SELECT
                n_id                                          as id,
                LOJ.pagetotal                                 as total
                pers.t_name                                   as author,
                image.b_contents                              as image
                /*
                need searchterm returned if we are doing search outside
                the database
                , COALESCE(t_title || '' '' || t_text,  '''') as searchterm

            FROM
                db
                /*
                -> self join on db LOJ for unoffset, unlimited row count
                   refer to searchterm stuff below
                */
            WHERE
                /* note, if we are doing a search outside of the
                 * function and t_title or t_text could be empty then we
                 * need to coalesce to an empty string
                 * COALESCE(t_title || '' '' || t_text,  '''') as searchterm
                 */
                 searchstring ~ t_title || '' '' || t_text
            ORDER BY
                dt_modified DESC
            LIMIT
                limiter
            OFFSET
                offsetter
                ,

             LOOP

            RETURN NEXT
                resulter;

        END LOOP;

        RETURN;
    END;'
        LANGUAGE plpgsql;

--
Rory Campbell-Lange
<rory@campbell-lange.net>
<www.campbell-lange.net>