Thread: Re: PostgreSQL Advocacy, Thoughts and Comments
hi, <huge snip> > Maybe there's not such a need for the advanced features of PostgreSQL > amongst PHP programmers as you seem to believe. Most of the PHP stuff I've > seen is read-only content display stuff and that doesn't really require a > top-notch RDBMS; a more limited database should also be up to the job. For > complex transactional web applications, J2EE/Model II is a far superior > technology to scripts/Model I and that means a different target audience > for the apps where PostgreSQL can offer those essential extra features. > Whilst most J2EE developers will be using Oracle/DB2/MSSQL as their > back-end, the awareness of PostgreSQL seems quite high and, in the few > usenet groups I monitor, I don't recall anyone being flamed for > recommending PostgreSQL over MySQL. Maybe seasoned, professional > developers don't like being told that they're crap programmers just > because they ask for something as fundamental as referential integrity! I completely disagree. I do a lot of programming with PHP and the features of Postgres come in handy. Let me give you an example of just some basic things. Triggers! Why should I have to write insert and update triggers in the logic (PHP) if I can handle it at the database level. Sql is 10x as fast as the language. Better to handle what you can at the database level. Same with views and stored procedures. MySQL cannot even handle sub-queries yet. I also use Python for standalone interfaces to the data. Why should I not be able to use the same views and triggers etc in there that I use for my web apps. PHP is quite powerful if used correctly. Java has its own issues and I am not sure it is as far supiour as you are claming it is. But that is not for this dscussion. MySQL may be more popular with (cheap) web hosting places but that doesn't mean it is the best or that Postgres wouldn't serve better even in this area. I am glad to see the article written for PHP mag as Postgres would help alot of PHP guys that are using MySQL. <another snip>
"Jason Tesser" <JTesser@nbbc.edu> > MySQL cannot even handle sub-queries yet. Ohh! Really? Allow me to pay my highest respect to the genius mySQL programmers! I completely have no clue on how to construct any single tiny database on a DBMS having no sub-query capability. Being too dumb, I solicit mySQL programmers' help by showing me employee FOO's birthday and his/her latest job title effective on or before 2003-1-1 from the following tables: CREATE TABLE t1 (employee TEXT,BirthDay DATE); CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle TEXT); And make the result like this: FOO 1980-1-1 programmer Please do not give me the answer that you will merge these two tables to form one like this: CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate DATE,JobTitle TEXT); Regards, CN
"Jason Tesser" <JTesser@nbbc.edu> wrote: > I completely disagree. I do a lot of programming with PHP and the features > of Postgres come in handy. Let me give you an example of just some > basic things. Triggers! Why should I have to write insert and update > triggers in the logic (PHP) if I can handle it at the database level. Sql > is 10x as fast as the language. Better to handle what you can at the database > level. Same with views and stored procedures. MySQL cannot even handle > sub-queries yet. I also use Python for standalone interfaces to the data. > Why should I not be able to use the same views and triggers etc in there > that I use for my web apps. PHP is quite powerful if used correctly. I guess I am coming at this from the other direction: MySQL is popular and many people use it for lightweight stuff. Partly this may be because better tools exist for providing hosted solutions, and this is an area we could improve (automatically adding entries to the pg_hba.conf, etc.-- may have to look into doing this). THe real problem I see is that this keeps PHP from being an ideal skill for developing enterprise applications. The features you are mentioning are extremely helpful, even necessary, when you have many applications working against the same database. The triggers, etc. can give you some consistant business logic, and you can use views to present information to the applications in a way that is natural for them. In essence, my point is that for single-use databases, MySQL isn't all that bad (aside from consistancy issues). However, the popularity of the LAMP development environment holds PHP back from being a serious corporate development environment, IMO. > Java has its own issues and I am not sure it is as far supiour as you > are claming it is. But that is not for this dscussion. MySQL may be more > popular with (cheap) web hosting places but that doesn't mean it is the best > or that Postgres wouldn't serve better even in this area. I am glad > to see the article written for PHP mag as Postgres would help alot of PHP > guys that are using MySQL. Again, I think that the most important benefit would be lowering the barrier to entry of serious development. You can start with Linux/Apache/PostgreSQL/PHP for a simple site, and then use your knowledge better to develop more serious applications. But the critical issue to resolve is to make available a tool or set of tools to manage shared hosting environments in an easier way. I would be happy to try to generate such a set of tools. Best Wishes, Chris Travers
On Sat, 2003-11-29 at 04:37, cnliou wrote: > "Jason Tesser" <JTesser@nbbc.edu> > > > MySQL cannot even handle sub-queries yet. > > Ohh! Really? > Allow me to pay my highest respect to the genius mySQL > programmers! > I completely have no clue on how to construct any single > tiny database on a DBMS having no sub-query capability. > > Being too dumb, I solicit mySQL programmers' help by showing > me employee FOO's birthday and his/her latest job title > effective on or before 2003-1-1 from the following tables: > > CREATE TABLE t1 (employee TEXT,BirthDay DATE); > CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle > TEXT); > > And make the result like this: > > FOO 1980-1-1 programmer > > Please do not give me the answer that you will merge these > two tables to form one like this: > > CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate > DATE,JobTitle TEXT); I have great trouble following your meaning, but I think you are talking about joining two tables in a query: SELECT t1.employee, t1.birthday, t2.jobtitle FROM t1, t2 WHERE t1.employee = t2.employee; That is not the same as using a sub-query: SELECT employee FROM t1 WHERE birthday > ( SELECT MIN(effectivedate) FROM t2 ); (select employees who were born after the longest-serving employee started work.) -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "Who shall ascend into the hill of the LORD? or who shall stand in his holy place? He that hath clean hands, and a pure heart..." Psalms 24:3,4
HI All, I'm glad that this thread prompted some thoughtful response. I think one of my main points I was trying to make, Jason hit the nail on the head. The article to which I was referring uses a great example which I have experienced many times before, but in order to grasp this, PHP et al, must be thought of as a scripting language which crosses many corporate boundries, and it is easy to assume that it's primary use (simple web site back ends) are the only thing to discuss. But the situation has changed enourmously since the release of PHP v4. Now many consultant/developer/sys-admins like myself are going to client site on a contract (this is especially true in the UK, I can't speak for anywhere else) and finding complex stocktrading systems, inventory systems, CRM systems, and others, all written in PHP backed by MySQL. Whether this is right or wrong, good choice or bad choice is not what I'm interested in debating. The point is that when these systems where architected, the developers used MySQL not because they were dumb, but because many of them develop awesome code and can get around most problems in the code, with a little ingenuity. Many simply do not have the insight into the potential benefits of *proper* RDBMS can offer. Had they had the benefit of such knowledge the code they have written would be faster (in DB) and more legible. Sadly often the developers are the only source of DBA for some of these companies. The second scenario, is with admin systems, written by people like myself for companies, whether they be simple or complex systems, that are intended as a temporary work around to an immediate problem. In a very short space of time the stop-gap application you had written to sort out the immediate problem quickly becomes a core business application (I recently returned to a site after not being there for two years and the temporary address book/ email system that I knocked up in an afternoon was not only still being used, but now relied upon heavily). So on to my point, MySQL guys will happily say "Hey, we're not saying that the features MySQL is missing aren't important, and we're working towards them, but in the meantime these issues can be worked around like this....." and happily play the whole thing down. Many LAMP developers aren't aware of the benefits of stored procedures, of triggers and other good stuff. Like myself, if they were aware how much easier life could be if these things were accessible to them, they'd probably be converts too. There is not enough emphasis put on the basic importance of these functions in PG. Someone needs to standup and say "Hey, look how this can simplify your programming lives" until I started using Druid/Postgres, I had no idea why I needed triggers or what a cascade effect did, or why I might want one. The Linux community has grown at least in part because it has educated potential users and journo's to its benefits. I believe if the PG advocacy team did the same, then it would attract many more serious LAMP developers. Like Linux vs. Windows, PG has an awful lot going for it in respect to MySQL, so why not crow about it. It needs to be pointed at a crowd that are DB novices, they need to be told why PG is worth the time/knowledge investment, because anyone who reads the MySQL site, will come away with the impression that the Trigger, Stored Procs, and other things are a luxurious overhead not necessary for getting the job done. I'd gladly help out with such a paper, but find myself in the sad position of my prose being open to attack due to my newbieness in the DB world and not able to speak authoratatively on the subject. Have a think, I'd like to know if others agree. Cheers T.
Further to this post, what might actually work is to convince O' Reilly (since they have PostgreSQL book/s) to do some articles like they have for PG, but making full use of the PG database. For instance, building a simple data-warehouse using PG. Articles that show off an OSS product/project in a clearly enterprise light in a step-by-step fashion. There have been so many articles on DB design using MySQL. How about an article on DB design using all the functionality of a real ORDBMS. Just a few thoughts. Cheers T. Tony wrote: > HI All, > > I'm glad that this thread prompted some thoughtful response. I think > one of my main points I was trying to make, Jason hit the nail on the > head. The article to which I was referring uses a great example which > I have experienced many times before, but in order to grasp this, PHP > et al, must be thought of as a scripting language which crosses many > corporate boundries, and it is easy to assume that it's primary use > (simple web site back ends) are the only thing to discuss. But the > situation has changed enourmously since the release of PHP v4. Now > many consultant/developer/sys-admins like myself are going to client > site on a contract (this is especially true in the UK, I can't speak > for anywhere else) and finding complex stocktrading systems, inventory > systems, CRM systems, and others, all written in PHP backed by MySQL. > Whether this is right or wrong, good choice or bad choice is not what > I'm interested in debating. The point is that when these systems > where architected, the developers used MySQL not because they were > dumb, but because many of them develop awesome code and can get around > most problems in the code, with a little ingenuity. Many simply do > not have the insight into the potential benefits of *proper* RDBMS can > offer. Had they had the benefit of such knowledge the code they have > written would be faster (in DB) and more legible. Sadly often the > developers are the only source of DBA for some of these companies. > > The second scenario, is with admin systems, written by people like > myself for companies, whether they be simple or complex systems, that > are intended as a temporary work around to an immediate problem. In a > very short space of time the stop-gap application you had written to > sort out the immediate problem quickly becomes a core business > application (I recently returned to a site after not being there for > two years and the temporary address book/ email system that I knocked > up in an afternoon was not only still being used, but now relied upon > heavily). > > So on to my point, MySQL guys will happily say "Hey, we're not saying > that the features MySQL is missing aren't important, and we're working > towards them, but in the meantime these issues can be worked around > like this....." and happily play the whole thing down. Many LAMP > developers aren't aware of the benefits of stored procedures, of > triggers and other good stuff. Like myself, if they were aware how > much easier life could be if these things were accessible to them, > they'd probably be converts too. > > There is not enough emphasis put on the basic importance of these > functions in PG. Someone needs to standup and say "Hey, look how this > can simplify your programming lives" until I started using > Druid/Postgres, I had no idea why I needed triggers or what a cascade > effect did, or why I might want one. > The Linux community has grown at least in part because it has > educated potential users and journo's to its benefits. I believe if > the PG advocacy team did the same, then it would attract many more > serious LAMP developers. > > Like Linux vs. Windows, PG has an awful lot going for it in respect to > MySQL, so why not crow about it. It needs to be pointed at a crowd > that are DB novices, they need to be told why PG is worth the > time/knowledge investment, because anyone who reads the MySQL site, > will come away with the impression that the Trigger, Stored Procs, and > other things are a luxurious overhead not necessary for getting the > job done. > > I'd gladly help out with such a paper, but find myself in the sad > position of my prose being open to attack due to my newbieness in the > DB world and not able to speak authoratatively on the subject. > > Have a think, I'd like to know if others agree. > > Cheers > > T. > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
Re: PostgreSQL Advocacy, Thoughts and Comments
>>>>> "Unihost" == Unihost Web Hosting <tony@unihost.net> writes: Unihost> Further to this post, what might actually work is to convince O' Unihost> Reilly (since they have PostgreSQL book/s) to do some articles like Unihost> they have for PG, but making full use of the PG database. For Unihost> instance, building a simple data-warehouse using PG. Articles that Unihost> show off an OSS product/project in a clearly enterprise light in a Unihost> step-by-step fashion. There have been so many articles on DB design Unihost> using MySQL. How about an article on DB design using all the Unihost> functionality of a real ORDBMS. Well, since I need 2.5 ideas per month for the three columns I'm still writing, I'm certainly in a position to write nice things about PG, although I always have to work it in from a Perl slant. Actually, I'm sure that any of the magazines I'm in would appreciate an additional article or two from me. If you can think of something that fits in 2000 words or so (or 4000 if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate some inspiration. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
On Sat, 29 Nov 2003, Oliver Elphick wrote: > On Sat, 2003-11-29 at 04:37, cnliou wrote: > > "Jason Tesser" <JTesser@nbbc.edu> > > > > > MySQL cannot even handle sub-queries yet. > > > > Ohh! Really? > > Allow me to pay my highest respect to the genius mySQL > > programmers! > > I completely have no clue on how to construct any single > > tiny database on a DBMS having no sub-query capability. > > > > Being too dumb, I solicit mySQL programmers' help by showing > > me employee FOO's birthday and his/her latest job title > > effective on or before 2003-1-1 from the following tables: > > > > CREATE TABLE t1 (employee TEXT,BirthDay DATE); > > CREATE TABLE t2 (employee TEXT,EffectiveDate DATE,JobTitle > > TEXT); > > > > And make the result like this: > > > > FOO 1980-1-1 programmer > > > > Please do not give me the answer that you will merge these > > two tables to form one like this: > > > > CREATE TABLE t1 (employee TEXT,BirthDay DATE,EffectiveDate > > DATE,JobTitle TEXT); > > I have great trouble following your meaning, but I think you are talking > about joining two tables in a query: > > SELECT t1.employee, t1.birthday, t2.jobtitle > FROM t1, t2 > WHERE t1.employee = t2.employee; > > That is not the same as using a sub-query: > > SELECT employee > FROM t1 > WHERE birthday > ( > SELECT MIN(effectivedate) > FROM t2 > ); > > (select employees who were born after the longest-serving employee > started work.) > I think he means for the employee FOO show only the latest job title. Or in other words: SELECT t1.employee, t1.birthday, t2.jobtitle FROM t1, t2 WHERE t1.employee = t2.employee AND t1.employee = 'FOO' AND t2.effectivedate > CAST('2003-1-1' TO DATE) ORDER BY t2.effectivedate DESC LIMIT 1 which of course uses a PostgreSQL customisation. I've got a feeling it's possible doing self joins and the like but I'll leave it at that I think. -- Nigel
Tony <tony@unihost.net> Wrote: > Now many > consultant/developer/sys-admins like myself are going to client site on > a contract (this is especially true in the UK, I can't speak for > anywhere else) and finding complex stocktrading systems, inventory > systems, CRM systems, and others, all written in PHP backed by MySQL. I started the CRM system I am developing on MySQL before realizing it was the wrong choice. Part of it is simple because people have heard of the software and don't have the time/stamina/patience to do proper research into the benefits of alternatives. There is also a learning curve when going from MySQL to a more standards-compliant RDBMS like PostgreSQL. Heck, I found that going from PostgreSQL to Firebird give me headaches :-P And these RDBMS's have most of the same features! > Whether this is right or wrong, good choice or bad choice is not what > I'm interested in debating. The point is that when these systems where > architected, the developers used MySQL not because they were dumb, but > because many of them develop awesome code and can get around most > problems in the code, with a little ingenuity. Many simply do not have > the insight into the potential benefits of *proper* RDBMS can offer. I would actually venture to say that many of them are using the RDBMS as a sort of object-persistance store, and not really trying to use the *relational* features of the software. They might as well be using Berkeley DB 4. I know that is how I started with MySQL. What most of these programmers do not understand is that an RDBMS is not simply a search-engine for stored persistant objects, but is actually a fully-featured information storage management system. With the right features, this information can be stored, queried, presented in another form, etc. all while ensuring that the stored information is EXACTLY what was intended. The tasks that the RDBMS handles include data storage, integrity enforcement, and data presentation. Most MySQL programmers only use it for data storage. Sadly, this is about all MySQL is good for, and hence the barrier to learning how to USE a REAL RDBMS are a bit higher because of the prevalence of the likes of MySQL and MS Access. > The second scenario, is with admin systems, written by people like > myself for companies, whether they be simple or complex systems, that > are intended as a temporary work around to an immediate problem. > In a very short space of time the stop-gap application you had written > to sort out the immediate problem quickly becomes a core business > application (I recently returned to a site after not being there for two > years and the temporary address book/ email system that I knocked up in > an afternoon was not only still being used, but now relied upon heavily). But again, if you start with the right tools, it is easier to modify later to adapt to changing needs. I think that this is one of the messages we should be presenting. With updateable views, different applications can even have access to different presentations of the data. > So on to my point, MySQL guys will happily say "Hey, we're not saying > that the features MySQL is missing aren't important, and we're working > towards them, but in the meantime these issues can be worked around like > this....." and happily play the whole thing down. Many LAMP developers > aren't aware of the benefits of stored procedures, of triggers and other > good stuff. Like myself, if they were aware how much easier life could > be if these things were accessible to them, they'd probably be converts too. Agreed completely. Now we just have to sell the PostgreSQL solution. Here is what the MySQL people will say (and we need good evidence to counter): 1: MySQL is faster. 2: MySQL has more community support. 3: MySQL has replication as part of its core distribution. MySQL's replication is better tested... > There is not enough emphasis put on the basic importance of these > functions in PG. Someone needs to standup and say "Hey, look how this > can simplify your programming lives" until I started using > Druid/Postgres, I had no idea why I needed triggers or what a cascade > effect did, or why I might want one. The basic issue is that many programmers are not taught to value information management systems, such as RDBMS's. These programmers are interested only in the data storage issues of the database, and not on how to use it to manage the information stored therein. Changing this may take a lot of effort. Also, using an RDBMS to its full extent rubs some OO programmers the wrong way because it strikes them as violating rules of OO design. Of course, then why not use an OO database? ;-) > The Linux community has grown at least in part because it has > educated potential users and journo's to its benefits. I believe if > the PG advocacy team did the same, then it would attract many more > serious LAMP developers. I agree. But it will take some time to sell, and will require some extremely basic work to get going. > I'd gladly help out with such a paper, but find myself in the sad > position of my prose being open to attack due to my newbieness in the DB > world and not able to speak authoratatively on the subject. I would be happy to help out as well. How about a paper entitled: Why PostgreSQL? It would cover why PostgreSQL is ideal for serious application development. Note, I have only about 2 years experience with PostgreSQL, and so I would feel more comfortable with lengthy peer review of whatever I write, but I am reasonably familiar with both worlds, and can make a strong case for PostgreSQL, I think. > Have a think, I'd like to know if others agree. > > Cheers > > T. > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > >
Chris Travers wrote:
It's the learning curve part that I'm finding difficult, not because it's a too complicated, but because I can't find a good source of information to learn from. I'm sure I'll get flamed for this, but I seem to be unable to find information on proper design principle, including where and when to use triggers, stored procs, etc, etc, that isn't 20 years old already. In the liquid world of IT, I find it worrying (perhaps incorrectly) learning from a book written 14 years ago.Tony <tony@unihost.net> Wrote:Now many consultant/developer/sys-admins like myself are going to client site on a contract (this is especially true in the UK, I can't speak for anywhere else) and finding complex stocktrading systems, inventory systems, CRM systems, and others, all written in PHP backed by MySQL.I started the CRM system I am developing on MySQL before realizing it was the wrong choice. Part of it is simple because people have heard of the software and don't have the time/stamina/patience to do proper research into the benefits of alternatives. There is also a learning curve when going from MySQL to a more standards-compliant RDBMS like PostgreSQL. Heck, I found that going from PostgreSQL to Firebird give me headaches :-P And these RDBMS's have most of the same features!
Whether this is right or wrong, good choice or bad choice is not what I'm interested in debating. The point is that when these systems where architected, the developers used MySQL not because they were dumb, but because many of them develop awesome code and can get around most problems in the code, with a little ingenuity. Many simply do not have the insight into the potential benefits of *proper* RDBMS can offer.I would actually venture to say that many of them are using the RDBMS as a sort of object-persistance store, and not really trying to use the *relational* features of the software. They might as well be using Berkeley DB 4. I know that is how I started with MySQL.
Agreed...
What most of these programmers do not understand is that an RDBMS is not simply a search-engine for stored persistant objects, but is actually a fully-featured information storage management system. With the right features, this information can be stored, queried, presented in another form, etc. all while ensuring that the stored information is EXACTLY what was intended. The tasks that the RDBMS handles include data storage, integrity enforcement, and data presentation. Most MySQL programmers only use it for data storage. Sadly, this is about all MySQL is good for, and hence the barrier to learning how to USE a REAL RDBMS are a bit higher because of the prevalence of the likes of MySQL and MS Access.
Indeed, and I believe that the lack of education with regards to this (or even available information presented at the right level) perpetuates the issue, along with the mis-information put forward by MySQL that these aspects aren't really all that important anyway.
The second scenario, is with admin systems, written by people like myself for companies, whether they be simple or complex systems, that are intended as a temporary work around to an immediate problem. In a very short space of time the stop-gap application you had written to sort out the immediate problem quickly becomes a core business application (I recently returned to a site after not being there for two years and the temporary address book/ email system that I knocked up in an afternoon was not only still being used, but now relied upon heavily).But again, if you start with the right tools, it is easier to modify later to adapt to changing needs. I think that this is one of the messages we should be presenting. With updateable views, different applications can even have access to different presentations of the data.So on to my point, MySQL guys will happily say "Hey, we're not saying that the features MySQL is missing aren't important, and we're working towards them, but in the meantime these issues can be worked around like this....." and happily play the whole thing down. Many LAMP developers aren't aware of the benefits of stored procedures, of triggers and other good stuff. Like myself, if they were aware how much easier life could be if these things were accessible to them, they'd probably be convertstoo. Agreed completely. Now we just have to sell the PostgreSQL solution. Here is what the MySQL people will say (and we need good evidence to counter): 1: MySQL is faster. 2: MySQL has more community support. 3: MySQL has replication as part of its core distribution. MySQL's replication is better tested...
1. Let's do apples to apples, NOT apples to Oranges as has been done many times in the past. It would be far more useful IMHO to put forward a "real world" scenario (rather than purely transactions per second) which shows 2 examples of the same application, the first MySQL written with much of the logic in the code, PHP, Perl, whatever. the second would be written in the same language, with the appropriate use of the true RDBMS features in full effect. Then let's comapare results. I'm sure that the more we can usefully abstract to the DB the more performace you'll see from PG, especially as load increases. The big issue I've found is not raw speed, if half of the logic is in regexps and algorithms in the scripting, this will be many times slower to run than the same en DB. Although correct me if I'm way off the mark here. Also the point needs to be made that, it is far more trivial to write other interfaces to the program/application. Who wouldn't enjoy the ability to work on their CRM with a Python or Java frontend, without the need to write 75% of the logic in a different language.
There is not enough emphasis put on the basic importance of these functions in PG. Someone needs to standup and say "Hey, look how this can simplify your programming lives" until I started using Druid/Postgres, I had no idea why I needed triggers or what a cascade effect did, or why I might want one.The basic issue is that many programmers are not taught to value information management systems, such as RDBMS's. These programmers are interested only in the data storage issues of the database, and not on how to use it to manage the information stored therein. Changing this may take a lot of effort. Also, using an RDBMS to its full extent rubs some OO programmers the wrong way because it strikes them as violating rules of OO design. Of course, then why not use an OO database? ;-)
In some respects adding much of the logic to the DB doesn't fly in the face of OO, since you are providing an encapsulated data interface, only needing to change the programatic code in one place
(the DB) for potentially high returns in many places. Hence taking OO one step further than the language and abstracting one level further.
The Linux community has grown at least in part because it has educated potential users and journo's to its benefits. I believe if the PG advocacy team did the same, then it would attract many more serious LAMP developers.I agree. But it will take some time to sell, and will require some extremely basic work to get going.I'd gladly help out with such a paper, but find myself in the sad position of my prose being open to attack due to my newbieness in the DB world and not able to speak authoratatively on the subject.I would be happy to help out as well. How about a paper entitled: Why PostgreSQL? It would cover why PostgreSQL is ideal for serious application development.
What about "PostgreSQL - Ideal for any application development"
Note, I have only about 2 years experience with PostgreSQL, and so I would feel more comfortable with lengthy peer review of whatever I write, but I am reasonably familiar with both worlds, and can make a strong case for PostgreSQL, I think.
I have only 2 months and have oodles of enthusiasm, but short on PG knowledge, and even shorter on real RDBMS basics and experience. But I'll be happy to proof read as a novice.
Re: [GENERAL] PostgreSQL Advocacy, Thoughts and CommentsRegarding the learning curve issue, maybe people can recommend their favorite books. I recommend "SQL Unleashed" (I forget the author), pub. Samms. "SQL For Smarties" also gets recommended often around here, but again, I don't know the author (or in this case, even the publisher). Maybe there are others too that people can recommend. As for the paper-- I think your title is good ("PostgreSQL - Ideal for any application development"), and I will have to look at how to organize it. I wanted to cover the following topics: 1) Different types of databases, and what is meant by Object Relational 2) Enterprise-ready features (Views, Stored Proceedures, Subselects, etc.) 3) RDBMS tasks and how these features fit in. Perhaps a followup paper could be written as a basic treatise in database design. The other issues I am seing here involve finding a suitable venue for publication. Any suggestions here are welcome also. Best Wishes, Chris Travers
Hi; The MySQL manual states that Joins perform better than subselects. The stated reason is that the planner can better optimize a join. I am, however, very skeptical of this, and would generally assume things to be the other way around. How do subselects perform in relation to hash joins, etc? Best Wishes, Chris Travers
Hehe, it's kinda funny when people say things like that. It depends on what you call a "join". From a certain point of view, an IN expression is a join too, making subselects a slightly unusual join. In some ways I think a subselect is the SQL way way of expressing a type of join that cannot be expressed in the normal way and that with a different query language, the concept may go away entirely. I do know that in recent versions of postgres, certain types of subselects are actually optimised into specific types of joins. It's all just relational algebra people! So the moral is, MySQL needs a better query optimiser? Hope this helps, On Sun, Nov 30, 2003 at 07:59:09PM +0700, Chris Travers wrote: > Hi; > > The MySQL manual states that Joins perform better than subselects. The > stated reason is that the planner can better optimize a join. > > I am, however, very skeptical of this, and would generally assume things to > be the other way around. > > How do subselects perform in relation to hash joins, etc? > > Best Wishes, > Chris Travers > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > "All that is needed for the forces of evil to triumph is for enough good > men to do nothing." - Edmond Burke > "The penalty good people pay for not being interested in politics is to be > governed by people worse than themselves." - Plato
Attachment
"Chris Travers" <chris@travelamericas.com> writes: > The MySQL manual states that Joins perform better than subselects. Very possibly true ... in MySQL. Since they have such an immature subselect implementation (not even out of alpha apparently), it'd not be surprising if they can't optimize subselects worth a damn yet. Our planner has been hacked on repeatedly to do a good job with subselects --- and I wouldn't want to imply that the process is done yet. One of the more amusing aspects of watching MySQL's response to the "feature race" is how they invariably gloss over the difference between having a minimal implementation of a feature, and having a feature that is mature, complete, and efficient. Subselects are one example where there's a lot of mileage yet to cover after you get to the point where you can say "it works". regards, tom lane
Randal, we have contrib/tsearch2 which could be used for full text searching and perl wrapper illustrating simple search engine would be quite useful. Oleg On Sat, 29 Nov 2003, Randal L. Schwartz wrote: > >>>>> "Unihost" == Unihost Web Hosting <tony@unihost.net> writes: > > Unihost> Further to this post, what might actually work is to convince O' > Unihost> Reilly (since they have PostgreSQL book/s) to do some articles like > Unihost> they have for PG, but making full use of the PG database. For > Unihost> instance, building a simple data-warehouse using PG. Articles that > Unihost> show off an OSS product/project in a clearly enterprise light in a > Unihost> step-by-step fashion. There have been so many articles on DB design > Unihost> using MySQL. How about an article on DB design using all the > Unihost> functionality of a real ORDBMS. > > Well, since I need 2.5 ideas per month for the three columns I'm still > writing, I'm certainly in a position to write nice things about PG, > although I always have to work it in from a Perl slant. > > Actually, I'm sure that any of the magazines I'm in would appreciate > an additional article or two from me. > > If you can think of something that fits in 2000 words or so (or 4000 > if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate > some inspiration. > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Tony wrote: > HI All, > > I'm glad that this thread prompted some thoughtful response. I think > one of my main points I was trying to make, Jason hit the nail on the > head. The article to which I was referring uses a great example which > I have experienced many times before, but in order to grasp this, PHP > et al, must be thought of as a scripting language which crosses many > corporate boundries, and it is easy to assume that it's primary use > (simple web site back ends) are the only thing to discuss. But the > situation has changed enourmously since the release of PHP v4. Now > many consultant/developer/sys-admins like myself are going to client > site on a contract (this is especially true in the UK, I can't speak > for anywhere else) and finding complex stocktrading systems, inventory > systems, CRM systems, and others, all written in PHP backed by MySQL. So true! I am in the U.S (Florida), and I am seeing the same thing here. Starting around 2000, many fairly complex, mission-critical PHP/MySQL apps were developed, which are just beginning to surface. We all know how prevalent PHP and MySQL became overnight, but how many of us realize that it was not just used for 'lightweight' applications?. Imagine how big a problem all these PHP/MySQL applications are going to become over the next few years. I have had the dubious pleasure of moving a few of these from MySQL to PostgreSQL already (Yes, financial systems using MySQL's unconstrained numeric types!!), and I shudder to think about all the companies that might end up with *years* of poorly-constrained data. > Whether this is right or wrong, good choice or bad choice is not what > I'm interested in debating. The point is that when these systems > where architected, the developers used MySQL not because they were > dumb, but because many of them develop awesome code and can get around > most problems in the code, with a little ingenuity. Many simply do > not have the insight into the potential benefits of *proper* RDBMS can > offer. Had they had the benefit of such knowledge the code they have > written would be faster (in DB) and more legible. Sadly often the > developers are the only source of DBA for some of these companies. Most medium/small business managers don't even know there is a difference between the two. <snip> > Like Linux vs. Windows, PG has an awful lot going for it in respect to > MySQL, so why not crow about it. It needs to be pointed at a crowd > that are DB novices, they need to be told why PG is worth the > time/knowledge investment, because anyone who reads the MySQL site, > will come away with the impression that the Trigger, Stored Procs, and > other things are a luxurious overhead not necessary for getting the > job done. > > I'd gladly help out with such a paper, but find myself in the sad > position of my prose being open to attack due to my newbieness in the > DB world and not able to speak authoratatively on the subject. You're not doing too badly, really. Your writing is good and clear, and your knowlege is well above the typical corporate IT magazine hack ;-). Regards, Rick Morris
Hello, I'd just like to add that this thread is perfectly aligned with my own experiences on the DB front. When I started out as a developer, I was familiar with the concepts of good database design, but *not* with the range of DBMSs available, and their respective advantages and disadvantages. I started using MySQL because it was popular, fast, free and (sort of) easy to set up. Then my major project, a payroll management system, previously PHP/MSSQL, needed a new environment established very quickly. I used MySQL because time was short and it was what I knew. To get around the lack of proper foreign key constraints I did crazy things like manually maintaining a table of "foreign keys" and enforcing them from PHP. When the client on that project hired a new sysadmin, the sa took one look at the system and said "Uhh, dude. Do you know anything about postgres?" I'd heard of postgres but didn't really know anything about it. After having the sysadmin tell me about postgres' capabilities, and checking out the manual for myself, I realised that I'd been barking up the wrong proverbial tree for months. So we migrated the system to postgres earlier this year, and I couldn't be more pleased with it. The immediate payoff of having referential integrity enforced properly was wonderful, but the advantages just kept on coming. I started using postgres functions and views to shift more of the work of "organising data" over to the piece of software that *should* be handling it. Some of those views enabled me to chop vast tracts of superfluous PHP code out of the system. The more I used postgres, the more I came to understand -- this is the way it ought to be. BJ Rick Morris wrote: > Tony wrote: > >> HI All, >> >> I'm glad that this thread prompted some thoughtful response. I >> think one of my main points I was trying to make, Jason hit the nail >> on the head. The article to which I was referring uses a great >> example which I have experienced many times before, but in order to >> grasp this, PHP et al, must be thought of as a scripting language >> which crosses many corporate boundries, and it is easy to assume that >> it's primary use (simple web site back ends) are the only thing to >> discuss. But the situation has changed enourmously since the release >> of PHP v4. Now many consultant/developer/sys-admins like myself are >> going to client site on a contract (this is especially true in the >> UK, I can't speak for anywhere else) and finding complex stocktrading >> systems, inventory systems, CRM systems, and others, all written in >> PHP backed by MySQL. > > > So true! I am in the U.S (Florida), and I am seeing the same thing > here. Starting around 2000, many fairly complex, mission-critical > PHP/MySQL apps were developed, which are just beginning to surface. We > all know how prevalent PHP and MySQL became overnight, but how many of > us realize that it was not just used for 'lightweight' applications?. > Imagine how big a problem all these PHP/MySQL applications are going > to become over the next few years. I have had the dubious pleasure of > moving a few of these from MySQL to PostgreSQL already (Yes, financial > systems using MySQL's unconstrained numeric types!!), and I shudder to > think about all the companies that might end up with *years* of > poorly-constrained data. > >> Whether this is right or wrong, good choice or bad choice is not what >> I'm interested in debating. The point is that when these systems >> where architected, the developers used MySQL not because they were >> dumb, but because many of them develop awesome code and can get >> around most problems in the code, with a little ingenuity. Many >> simply do not have the insight into the potential benefits of >> *proper* RDBMS can offer. Had they had the benefit of such >> knowledge the code they have written would be faster (in DB) and more >> legible. Sadly often the developers are the only source of DBA for >> some of these companies. > > > Most medium/small business managers don't even know there is a > difference between the two. > > <snip> > >> Like Linux vs. Windows, PG has an awful lot going for it in respect >> to MySQL, so why not crow about it. It needs to be pointed at a >> crowd that are DB novices, they need to be told why PG is worth the >> time/knowledge investment, because anyone who reads the MySQL site, >> will come away with the impression that the Trigger, Stored Procs, >> and other things are a luxurious overhead not necessary for getting >> the job done. >> >> I'd gladly help out with such a paper, but find myself in the sad >> position of my prose being open to attack due to my newbieness in the >> DB world and not able to speak authoratatively on the subject. > > > You're not doing too badly, really. Your writing is good and clear, > and your knowlege is well above the typical corporate IT magazine hack > ;-). > > Regards, > > Rick Morris > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
On Sunday 30 November 2003 05:38, Tony wrote: > Comments within: > Chris Travers wrote: > >Tony <tony@unihost.net> Wrote: > >Agreed completely. Now we just have to sell the PostgreSQL solution. > > Here is what the MySQL people will say (and we need good evidence to > > counter): 1: MySQL is faster. > >2: MySQL has more community support. > >3: MySQL has replication as part of its core distribution. MySQL's > >replication is better tested... > > 1. Let's do apples to apples, NOT apples to Oranges as has been done > many times in the past. It would be far more useful IMHO to put > forward a "real world" scenario (rather than purely transactions per > second) which shows 2 examples of the same application, the first MySQL > written with much of the logic in the code, PHP, Perl, whatever. the > second would be written in the same language, with the appropriate use > of the true RDBMS features in full effect. Then let's comapare > results. I'm sure that the more we can usefully abstract to the DB the > more performace you'll see from PG, especially as load increases. The > big issue I've found is not raw speed, if half of the logic is in > regexps and algorithms in the scripting, this will be many times slower > to run than the same en DB. Although correct me if I'm way off the mark > here. Also the point needs to be made that, it is far more trivial to > write other interfaces to the program/application. Who wouldn't enjoy > the ability to work on their CRM with a Python or Java frontend, without > the need to write 75% of the logic in a different language. > Here's one of the more famous articles of this type which documents the switch from mysql to postgresql by the folks at sourcforge. (and we didnt even have to give them thousands of $$$ in development and product support like IBM... :-) http://www.phpbuilder.com/columns/tim20001112.php3 You'll note in one test postgresql performs at 6x mysql... and this was with version 7.1. Another interesting note is the mention that interbase was looked at and then passed over due, in part, to "serious community and corporate issues around it's Open Sourceness". Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
> I would be happy to help out as well. How about a paper entitled: > Why PostgreSQL? > It would cover why PostgreSQL is ideal for serious application development. Good idea. > Note, I have only about 2 years experience with PostgreSQL, and so I would > feel more comfortable with lengthy peer review of whatever I write, but I am > reasonably familiar with both worlds, and can make a strong case for > PostgreSQL, I think. Same here... -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 665-7007 voice
Here is a link to the sql for smarties book: http://www.amazon.com/exec/obidos/tg/detail/-/1558603239/102-3995931-7263333?v=glance by Joe Celko Has some cool ways of handling trees in sql ----- Original Message ----- From: "Chris Travers" <chris@travelamericas.com> To: "Tony" <tony@unihost.net> Cc: <pgsql-general@postgresql.org> Sent: Sunday, November 30, 2003 5:49 AM Subject: Re: [GENERAL] PostgreSQL Advocacy, Thoughts and Comments > Re: [GENERAL] PostgreSQL Advocacy, Thoughts and CommentsRegarding the > learning curve issue, maybe people can recommend their favorite books. I > recommend "SQL Unleashed" (I forget the author), pub. Samms. "SQL For > Smarties" also gets recommended often around here, but again, I don't know > the author (or in this case, even the publisher). Maybe there are others > too that people can recommend. > > As for the paper-- I think your title is good ("PostgreSQL - Ideal for any > application development"), and I will have to look at how to organize it. I > wanted to cover the following topics: > 1) Different types of databases, and what is meant by Object Relational > 2) Enterprise-ready features (Views, Stored Proceedures, Subselects, etc.) > 3) RDBMS tasks and how these features fit in. > > Perhaps a followup paper could be written as a basic treatise in database > design. > > The other issues I am seing here involve finding a suitable venue for > publication. Any suggestions here are welcome also. > > Best Wishes, > Chris Travers > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
Here is a paper I have written for the purposes of providing some additional educational material for the MySQL crowd. Permission is granted for it to be redistributed via the mailing list and archives, but not to be posted elsewhere until I have a final version completed.
Choosing a Database Manager
A Study of Open Source Relational Database Management Systems
By Chris Travers
Introduction
There sometimes comes a point of epiphany, dreaded by programmers, when it becomes clear that at least one of the components of an application is completely inadequate for the task at hand, and thus limits progress or usefulness of the program in some fundamental way. All too often this is the relational database management system (RDBMS).
Choosing the proper relational database management system can make or break an application. The choice of a primary RDBMS is thus very important for the long-term viability of an application. While this paper focuses on open source RDBMSs, the same methodology can be applied towards commercial systems as well, such as Oracle, MS SQL, and DB2. Also, the concept of application specific database managers will be discussed briefly in order to provide some degree of contrast to the relational paradigm.
Types of Database Managers
Application Specific Database Managers
Application specific database managers include the GDBM (GNU Database Manager), and the Berkeley Database Manager (available, open source or commercially licensed from Sleepycat software). Basically, such a database provides a simple persistence layer for database objects. They do not allow for complex querying, and are often used in single-purpose applications where the data does not need to be accessed by any outside application.
If two applications share the same database using these database managers, they must be intimately familiar with the internal data structures that are stored in the database manager. No presentation is given to the data aside from how it is represented in the database.
Relational Database Managers
In contrast to the application specific database managers, relational database managers present an abstraction layer between the information in the database and the application using that information. The information can then be subject to much more complex queries, and the information can be represented to the application in whatever form makes sense to the application. Usually, SQL is the language used for such queries.
In a relational database manager, data is stored in tables and tables can be joined together based on common criteria. In this way, several different applications can access the same information in different ways.
What does Object Relational mean?
PostgreSQL is Object Relational which means that although it is a relational database management system, the database objects have at least some of the traits of objects in object-oriented programming. This tables can be inherited, though this does not apply to user defined data types, etc. With an Object Relational system, purely relational databases can be designed and used, but also inheritance can be used in various ways. Use of such features is beyond the scope of this paper, however.
Features Important to Enterprise Applications
Subselects
Subselects are important in large applications for two reasons. First they allow performance tuning for certain types of queries above and beyond what a planner can normally do. Very often subselects will outperform inner joins because the planner can focus on a simpler pattern with a maximum of one table scan per subselect and a smaller memory footprint than other forms of single-scan joins.
Secondly, when compared to inner joins, subselects often make for more readable queries, especially where several tables are involved. Greater readability makes for easier maintenance of the application.
Views
A view is a stored query that appears to be a table from the applications perspective. Views can be used to create alternative representations of the data for other applications or for reporting purposes. In many database managers, views can be updateable, and sometimes data can even be inserted or deleted from the view. Thus views can be used to create logical tables that present data in a way that is less normalized than the actual structure of the information in the tables themselves.
Views can also be used for partitioning tables for permissions purposes-- vertically (groups of rows), horizontally (groups of columns), or both. They can also be used to create complex virtual tables for reporting or online analytic processing, or OLAP. In this way, views can help manage information and also help make sense of complex patterns as well.
Stored Procedures
Stored Procedures, also known as User Defined Functions, allow for a greater quantity of business logic to be stored in the database. These functions can allow for the development of more complex views than would otherwise be possible, and can also allow for the development of a database-level API that can then be accessed by any application that has access to the database. In this way, they can reduce development time by allowing the development of modules which are common to all clients of the database.
Stored procedures also reduce network latency issues by reducing the number of communications between the client and server. Whole sets of queries can be run before any result set needs to be sent to the client. In this way, performance can be improved, given sufficient hardware on the side of the database server.
Triggers
Triggers are a mechanism for automatic maintenance and management of information in the database. Triggers can be activated when information is modified in any table. Triggers are often used to enforce referential integrity (ensuring that records in one table reference records in another table that actually exist).
One example of another use of a trigger is in maintaining some other record of the change made to the information of a database. This information can later be used for reporting purposes to ensure that a consistent picture of the database can be constructed for any arbitrary point in time. For example, I could store historical information about prices for items from my online store, and then do historical processing to see how sales were affected by price changes at a later date since all historical information would be maintained in the database.
Database Tasks: How These Features Help
Data Integrity Enforcement
For data to be useful in a large business program, it must be meaningful. In many cases, the largest threat to the ability to make the data meaningful is the so-called orphaned record problem. Orphaned records refer to records in other tables that, from the point of view of the application, no longer exist. This can happen when records are deleted or their primary keys are updated. Triggers can be used to ensure that this problem never exists by doing one of several things: cascading updates/deletes, restricting updates/deletes when records reference the column being updated or row being deleted, or setting the foreign key in the referencing table to NULL (which means, in database terms, unknown).
Triggers can also restrict the types of information being inserted or updated in the database, ensuring that only meaningful input is stored. They can also be used to maintain historical information, allowing a consistent picture of all tables to be built for any point in time.
Data Presentation
Assume for a moment that your database has several front-ends that access the information in different ways. Perhaps this is not the case now, but it could be in the future. It is extremely helpful if the database manager can present the information to the programs in ways that would be natural for them. This effectively negates the requirement that every application accessing the database must have in-depth knowledge of the database schema. Instead, the applications expected database schema can be presented to it.
A very useful side effect of this capability is in integrating applications that may or may not have been designed to be integrated. In this way, larger, more unified databases can be built for use of internal applications.
Views and, depending on complexity, stored procedures would be needed to implement this sort of system, which could also provide advanced reporting capabilities, also based on views and stored procedures.
Common Modules As Stored Procedures
When several programs access the same database, there may be a reasonable amount of common functionality in the functions used by the programs. These can be simplified using views to some extent, but often this is simply not enough. Using stored procedures, the functions that two or more front-ends have in common can be moved to the database backend, so that there is one single point of maintenance and reference. This technique can reduce development and maintenance time and is essential for any database being used by multiple applications.
Questions to Ask
Where will my application be deployed?
The choice of a database manager must, no doubt, take into account availability of the solutions at the point of deployment. Of course, applications run at the businesss site can use whatever solutions are found to be affordable and/or best suited to the task at hand. Hosted web applications are more at the mercy of what is available, however.
How necessary is integration with other programs?
Integration capability is always a selling point, as it allows for more powerful solutions to be built in pieces and lowers overall system maintenance. However, the ability to integrate with other solutions may come at a cost, especially regarding availability of the chosen database manager in a shared hosting environment.
Major Open Source Relational Database Managers
MySQL
MySQL is the most common open source RDBMS among low-cost web hosting providers. As of version 4.0, it does not support subselects, views, stored procedures, or triggers. Additionally, its enforcement of data integrity is questionable at best, and it is sometimes known to truncate numeric or text values when they are too large to fit. It is also sometimes possible to insert invalid timestamps into the database. This can cause extreme problems for any application that relies on the information in the database being exactly what was entered (for example, accounting applications).
Even when views are supported in MySQL, it may be some time before these views can be used for insert, update, or delete queries. A similar period of maturation should be expected for other features such as stored procedures and triggers.
Despite its limitations, MySQLs ubiquity and the availability of a Windows port makes it an attractive database platform. For internal business tools, however, there are better choices.
Firebird
Firebird is based on a source-code release of Borlands Interbase RDBMS. It is mature, and supports views, subselects, triggers, and stored procedures. As of version 1.0, it has a native Windows port available, making it ideal for in-house applications that must support both Windows and UNIX operating systems. However, I found the learning curve to be steeper than other RDBMSs particularly regarding the development of stored procedures.
PostgreSQL
PostgreSQL is an object-relational database management system that features every feature listed here, and extensible type system, and the ability to write stored procedures in a variety of different languages. Indeed, there are four stored procedure languages (Perl, Tcl, SQL, and PL/PGSQL) distributed with PostgreSQL and many more available from third-party sites.
Although the learning curve is steeper than that of MySQL, I found it easier than Firebird.
As of version 7.4, PostgreSQL does not yet have a native Windows port available, making it unsuitable at the moment for applications that must be able to run on Windows database servers.
Although PostgreSQL has not achieved the ubiquity among shared hosting providers that MySQL has, there are many providers offering hosting at a variety of costs. You can find a list of them at http://techdocs.postgresql.org/hosting.php.
Conclusion
Selecting the right database manager is crucial to the long-term viability of any application. Care should be taken to ensure that the database manager chosen will continue to meet the needs of the application in the future as well as in the present.
Chris Travers wrote: > > Here is a paper I have written for the purposes of providing some > additional educational material for the MySQL crowd. Here's my contribution: Why I choose PostgreSQL (PostgreSQL in 21 Seconds) I choose referential integrity, meaning my lookups always work. I choose stored procedures, meaning all my developers - Windows or Unix, Perl, C++ or Java - can access the database in the same way, using the same locking, with the same checking and cleaning I choose subselects and outer joins, which allow me to build complex queries to get exactly the information I want from the database, rather than wasting my time munging data in my code. Even better, I can put those common queries into stored procedures, so other developers can get the same results as I do! I choose partial indexes, so lookups on NULL fields are just as fast if not faster. I choose a user community that believes getting the results right is more important than getting them quickly. I choose getting the right results, right now! I choose funny capitalisation, and a name that can't be pronounced! I choose PostgreSQL.
On 29/11/03, Randal L. Schwartz (merlyn@stonehenge.com) wrote: > Well, since I need 2.5 ideas per month for the three columns I'm still > writing, I'm certainly in a position to write nice things about PG, > although I always have to work it in from a Perl slant. > > Actually, I'm sure that any of the magazines I'm in would appreciate > an additional article or two from me. > > If you can think of something that fits in 2000 words or so (or 4000 > if it needs part 1 and 2), and can have a Perl wrapper, I'd appreciate > some inspiration. Hi Randal I think I may have an idea for an article which would address a common problem for people writing database client interfaces: The problem is simply explained. Problem title: The "page of pages" problem (!) The problem: You want to return a subset of a large number items using some fairly complex search criteria. You want to make only one database call, benefit from a cached query, and don't want to have all the rows in memory. How do you get the total count of pages for the relevant search criteria? Why is this relevant? Moving logic that is inherent to the database to the database provides a potentially rich yet simple interface to database queries that can benefit a number of client applications. Typically this sort of query would be written as at least two dynamically generated queries in the client program that has to be parsed by the backend before it is executed. By using functions we can hide complex joins behind simple field names, and provide flexible (if limited) search capabilites, as well as caching and sensible error messages. Approach: Using Postgres one can construct a function and then do either SELECT * from function fn_explore($searchstring, $limit, $offset); OR SELECT * FROM function fn_explore() WHERE searchterm ~* 'test' LIMIT 5 OFFSET 10; What is cool about the second format is that (if the function returned a type 'explore_result' as below), your PHP/Perl programmer can at their interface do something like '... where id <> 1 AND author IN ('james', 'bill')...' However I don't know how you get back the total rows in this case, also maybe the caching effects are minimised? -------------------------------------------------------------------- Type definition: CREATE TYPE explore_result as ( id INTEGER, -- some sort of row id total INTEGER, -- total rows for query author VARCHAR, image BYTEA /* Not needed unless search is done outside db. , searchterm VARCHAR */ ); -------------------------------------------------------------------- Sketch function definition: CREATE OR REPLACE FUNCTION fn_explore (integer, integer, integer) RETURNS setof explore_result AS ' DECLARE searchstring ALIAS for $1; offsetter ALIAS for $2; limiter ALIAS for $3; resulter explore_page%rowtype; BEGIN /* variable verifation section chopped */ FOR resulter IN SELECT n_id as id, LOJ.pagetotal as total pers.t_name as author, image.b_contents as image /* need searchterm returned if we are doing search outside the database , COALESCE(t_title || '' '' || t_text, '''') as searchterm FROM db /* -> self join on db LOJ for unoffset, unlimited row count refer to searchterm stuff below */ WHERE /* note, if we are doing a search outside of the * function and t_title or t_text could be empty then we * need to coalesce to an empty string * COALESCE(t_title || '' '' || t_text, '''') as searchterm */ searchstring ~ t_title || '' '' || t_text ORDER BY dt_modified DESC LIMIT limiter OFFSET offsetter , LOOP RETURN NEXT resulter; END LOOP; RETURN; END;' LANGUAGE plpgsql; -- Rory Campbell-Lange <rory@campbell-lange.net> <www.campbell-lange.net>