Re: eWeek Poll: Which database is most critical to - Mailing list pgsql-hackers
From | F Harvell |
---|---|
Subject | Re: eWeek Poll: Which database is most critical to |
Date | |
Msg-id | 200202271451.g1REp4d04916@odin.fts.net Whole thread Raw |
In response to | Re: eWeek Poll: Which database is most critical to your ("Dann Corbit" <DCorbit@connx.com>) |
Responses |
Re: eWeek Poll: Which database is most critical to
|
List | pgsql-hackers |
On Tue, 26 Feb 2002 15:20:17 PST, "Dann Corbit" wrote: > -----Original Message----- > From: Neil Conway [mailto:nconway@klamath.dyndns.org] > Sent: Tuesday, February 26, 2002 3:04 PM > To: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] eWeek Poll: Which database is most critical to > your > > > On Tue, 2002-02-26 at 15:30, Zak Greant wrote: > > Good Day All, > > > > eWeek has posted a poll that asks which database server is most > critical > > to your organization. > > The article mentions a MySQL feature which apparently improved > performance considerably: > > // > MySQL 4.0.1's new, extremely fast query cache is also quite notable, as > no other database we tested had this feature. If the text of an incoming > query has a byte-for-byte match with a cached query, MySQL can retrieve > the results directly from the cache without compiling the query, getting > locks or doing index accesses. This query caching will be effective only > for tables with few updates because any table updates that clear the > cache to guarantee correct results are always returned. > // > > My guess is that it would be relatively simple to implement. Any > comments on this? > > If I implemented this, any chance this would make it into the tree? Of > course, it would be: > > - disabled by default > - enabled on a table-by-table basis (maybe an ALTER TABLE command) > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >> > I don't see how it will do any good. There is no "prepare" in > Postgresql > and therefore you will simply be reexecuting the queries every time any > way. Also, parameter markers only work in embedded SQL and that is a > single tasking system. > > I think it would be a major piece of work to do anything useful along > those lines. > > If you look at how DB/2 works, you will see that they store prepared > statements. Another alternative would be to keep some point in the > parser marked and somehow jump to that point, but you would have to > be able to save a parse tree somewhere and also recognize the query. > > Here is where problems come in... > -- Someone wants blue and blue-green, etc shirts that are backordered > SELECT shirt, color, backorder_qty FROM garments WHERE color like > "BLUE%" > > Now, another query comes along: > > -- Someone else wants reddish, etc shirts that are backordered: > SELECT shirt, color, backorder_qty FROM garments WHERE color like "RED%" > > It's the same query with different data. Without parameter markers you > will never know it. And yet this is exactly the sort of caching that is > useful. > > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< > << While the cache that MySQL implemented may be of limited value for dynamic queries, it would be very useful for many, many queries. For example, many applications (especially the ones that I write), there are literally hundreds of selects against static "lookup" tables that maintain acceptable values for new data inputs (i.e., validated using referential integrity). Each and every one of those selects would receive an improvement. Also, it is relatively easy to "parameterize" the select. After all, it is a structured query. Just by parsing it, you can identify the "parameters". As a matter of fact, I'm pretty certain that this is already done by the optimizer when breaking the SQL down into the code to perform the query. The trick is to cache the "parameterized" version of the query. This is likely at or near the end of the optimizer processing. Also, you would want to capture the query plan in the cache. The query plan is not going to be interested at all in the literal value of the parameters and therefore will be the same for any query of the same form. For example, from above: SELECT shirt, color, backorder_qty FROM garments WHERE color like 'BLUE%' should become something on the order of: SELECT shirt, color, backorder_qty FROM garments WHERE color like '{param0}%' The next query: SELECT shirt, color, backorder_qty FROM garments WHERE color like 'RED%' should also be "parameterized" on the order of: SELECT shirt, color, backorder_qty FROM garments WHERE color like '{param0}%' A lookup into the query hash would match and therefore _at least_ the same query plan can be used. The commercial database "Cache" by Intersystems uses an approach similar to this. The performance of that database is phenomenal. (Of course, there is more going on in the internals of that system than just the query cache.)
pgsql-hackers by date: