Thread: Wishlist for 7.4: Plan stability
Ok, someone else posted their data warehousing wishlist, I want to post my single item OLP wishlist: Plan stability. This is the natural conclusion of having prepared queries. This is one area where it would be possible to totally beat Oracle's implementation by a huge margin. And it's something that's critical to web sites and other applications that handle many short queries and need to reliably provide rapid response. What I want to do would actually go way beyond what Oracle provides. I have a clear Idea what I would want to make the database truly manageable under heavy OLP load. I would want a queries to be a first class object in the postgres world. There could be a system table that lists all the prepared queries the database knows about. Every query's current plan and performance statistics could be linked from there. This would allow a sysadmin to at least have a clue what queries are running and how without having to open up the application source. Then I would want an acl system to restrict under what circumstances postgres allows new queries to be added, new plans to be analyzed, and old queries to be removed. While in a development server or a DSS server you want any user to be able to create new queries and you want the database to optimize every query using the most up to date information, in an OLP server that's not what you want. It's more important to be consistently fast than it is to be as fast as possible. Ideally I would expect every query to be manually loaded when a new version of the application is loaded. A DBA could check at that point every plan and ensure that they're all reasonable. Then the user that the web server connects as would be prohibited from running any new queries or generating any new plans for existing queries. The DBA could go to sleep at night confident that the database isn't suddenly going to hit some formerly unreached section of code or decide to "optimize" a query differently and suddenly come to a crashing halt. It would also mean that the query written by the new junior programmer you just hired can't slip by QA and thrash the query cache of your database by inserting constants directly into the query. And finally, it would also mean that the query written by your other junior programmer that inserts unchecked data can't become a gaping security hazard because when the hacker submits subqueries in the form the resulting query is rejected. -- greg
Is someone keeping a "hopeful" todo list? - Brandon ----------------------------------------------------------------------------c: 917-697-8665 h: 201-798-4983b. palmer, bpalmer@crimelabs.net pgp:crimelabs.net/bpalmer.pgp5
On Sat, 2002-11-30 at 21:22, bpalmer wrote: > Is someone keeping a "hopeful" todo list? Nearly every one of the items brought up could / should be on the standard todo list. http://developer.postgresql.org/todo.php -- Rod Taylor <rbt@rbt.ca>
Rod Taylor wrote: -- Start of PGP signed section. > On Sat, 2002-11-30 at 21:22, bpalmer wrote: > > Is someone keeping a "hopeful" todo list? > > Nearly every one of the items brought up could / should be on the > standard todo list. > > http://developer.postgresql.org/todo.php Does anyone have additions for the list. I didn't see anything discussed that jumped out at me, though I didn't see "Conquer the world" on there. ;-) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Sat, 2002-11-30 at 23:02, Bruce Momjian wrote: > Rod Taylor wrote: > -- Start of PGP signed section. > > On Sat, 2002-11-30 at 21:22, bpalmer wrote: > > > Is someone keeping a "hopeful" todo list? > > > > Nearly every one of the items brought up could / should be on the > > standard todo list. > > > > http://developer.postgresql.org/todo.php > > Does anyone have additions for the list. I didn't see anything > discussed that jumped out at me, though I didn't see "Conquer the world" > on there. ;-) I didn't see anything that wasn't on either the TODO list or in the 'Unsupported Features' page in the docs aside from the ltree/XML items and MODIFY COLUMN support. -- Rod Taylor <rbt@rbt.ca>
Greg Stark <gsstark@mit.edu> writes: > Ok, someone else posted their data warehousing wishlist, I want to post my > single item OLP wishlist: Plan stability. That seems to me to translate to "I want the system to fail to react to changes in data statistics and all other variables relevant to query planning". You can pretty much get that by never doing [VACUUM] ANALYZE, but I'm quite lost as to why it's really a good idea. regards, tom lane
We recently finished some evaluations of Oracle, and one of the things which jumped out at me, PostgreSQL booster that I am, was that in 9i Oracle has finally surpassed PostgreSQL in some elements of object-relational technology. Among the things you can do are: - Create new compound object types on the SQL command line. So for example, the "imaginary numbers" example in the PostgreSLQ manual could all be done on the command line. - Address components of objects using dot-notation. (select employee.salary from employees) - Index objects based on their components (create index blah_idx on employees (employee.last_name)) In combination with ARRAY data types, and references, you can do pretty fancy things without ever creating a C library. PostgreSQL seems to have most of the underpinning already. I even did some experimenting with the "tables as a datatype" stuff. However, it's definately incomplete. In the tables-as-datatype example, you cannot get the objects to reconstruct themselves, you have to do it by hand. And while oid's make handy reference holders, there does not seem to be any particular performance benefit to doing object-oriented modelling over relational modelling in PostgreSQL at this point. By contrast, OORDBMS like Informix can be blazingly fast when used with the proper OO model. (Oracle performance enhancement for OO models is mixed at best right now.) Bruce Momjian wrote: > Does anyone have additions for the list. I didn't see anything > discussed that jumped out at me, though I didn't see "Conquer the world" > on there. ;-)
Paul Ramsey <pramsey@refractions.net> writes: > Oracle has finally surpassed PostgreSQL in some elements of > object-relational technology. Among the things you can do are: > - Address components of objects using dot-notation. (select > employee.salary from employees) Cool. How do they resolve the conflict against schema notation (ie, is employee a table reference or a schema name here)? regards, tom lane
On Sunday 01 December 2002 05:03, Tom Lane wrote: > Paul Ramsey <pramsey@refractions.net> writes: > > Oracle has finally surpassed PostgreSQL in some elements of > > object-relational technology. Among the things you can do are: > > > > - Address components of objects using dot-notation. (select > > employee.salary from employees) > > Cool. How do they resolve the conflict against schema notation > (ie, is employee a table reference or a schema name here)? > There are simple name resolution rules regards Haris Peco
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > Ok, someone else posted their data warehousing wishlist, I want to post my > > single item OLP wishlist: Plan stability. > > That seems to me to translate to "I want the system to fail to react to > changes in data statistics and all other variables relevant to query > planning". > > You can pretty much get that by never doing [VACUUM] ANALYZE, but I'm > quite lost as to why it's really a good idea. Well, first of all there's no guarantee that the genetic algorithm will actually produce the same plan twice, but that's a side issue. The main issue is that you *do* want to vacuum and analyze the database regularly to get good performance, but you don't want the database spontaneously changing its behaviour without testing and verifying the new behaviour personally. Not if it's a high availability production server. I'm thinking it should require a specific privilege that can be separately access controlled to parse a new query that didn't already exist in the query table. Then for a production server I would expect the DBA to arrange for vacuum analyze to run regularly during off-peak hours. Have a job test all the queries and report any changed optimiser behaviour. Then have a DBA sanity check and test the performance of any new query plans before allowing them to go into production. But the threat of the optimiser changing behaviour from running analyze isn't even the main threat I see this addressing. The threat of untested queries entering production from new code being pushed live is far greater. I've seen web sites go down more often from new queries with bad performance that were missed in testing more often than any other source. And I've seen security holes caused by applications that allow untrusted users to slip unexpected sql syntax into queries more often than any other reason. Really it boils down to one point: there's really no reason to assume a user should be able to execute any new query he feels like. Creating a new query should be privileged operation just like creating a new table or new database. For some systems such as development systems it of course makes sense for users to be able to create new queries on the fly. For DSS systems too it's pretty much assumed. But for OLTP systems it's very unlikely that a new query should suddenly be necessary. These systems spend their days running the same queries millions of times per day. They need to return results within milliseconds. Any new query should be assumed to be a bug or a security breach and reported as an immediate error. not cause the database to valiantly attempt to figure out how best to handle the unexpected query. -- greg
Greg Stark <gsstark@mit.edu> writes: > Really it boils down to one point: there's really no reason to assume a user > should be able to execute any new query he feels like. Creating a new query > should be privileged operation just like creating a new table or new database. This is an interesting view of what a database should be like, but it has very little to do with my idea of a database ;-). I think you want some sort of middleware layer to keep your users away from the database. I do not agree that the DB itself ought to contain such draconian restrictions. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > Really it boils down to one point: there's really no reason to assume a user > > should be able to execute any new query he feels like. Creating a new query > > should be privileged operation just like creating a new table or new database. > > This is an interesting view of what a database should be like, but it > has very little to do with my idea of a database ;-). I think you want > some sort of middleware layer to keep your users away from the database. This is how people attempt to tackle this problem with Oracle or other databases, but it's an assbackwards design. It leads to a lot of pain and awkwardness and only partially solves the problems. A good design would be elegant and result in a much more manageable system. What I'm really asking for is lower level control of when the query parser and the optimizer runs. That would allow an admin or middleware to control when new queries are parsed and optimized. It could also allow an admin to peek at the existing queries and see what plans are currently in the system, rather than run explain himself and say "well this is what it would do if i ran it now, which might be the same thing that ran earlier and caused this performance problem but there's no way to know for sure" > I do not agree that the DB itself ought to contain such draconian > restrictions. Note that the restriction I'm proposing be available is of the form "stop the system from doing something for me". This is the kind of feature that's impossible to graft on cleanly in a higher layer. -- greg