Re: 7.3 schedule - Mailing list pgsql-hackers
From | Dann Corbit
Subject | Re: 7.3 schedule
Date |
Msg-id | D90A5A6C612A39408103E6ECDD77B82906F42C@voyager.corporate.connx.com
In response to | 7.3 schedule (Bruce Momjian <pgman@candle.pha.pa.us>)
List | pgsql-hackers
-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Friday, April 12, 2002 2:38 PM
To: Tom Lane
Cc: Neil Conway; zakkr@zf.jcu.cz; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] 7.3 schedule

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Oh, are you thinking that one backend would do the PREPARE and another
> > one the EXECUTE?  I can't see that working at all.
>
> Uh, why exactly were you advocating a shared cache then?  Wouldn't that
> be exactly the *point* of a shared cache?

I thought it would somehow compare the SQL query string to the cached plans and, if it matched, use that plan rather than make a new one. Any DDL statement would flush the cache.

>>-------------------------------------------------------------------
Many applications will have similar queries coming from lots of different end-users. Imagine an order-entry program where people are ordering parts. Many of the queries might look like this:

SELECT part_number FROM parts WHERE part_id = 12324 AND part_cost < 12.95

In order to cache this query, we first parse it to replace the data fields with parameter markers. Then it looks like this:

SELECT part_number FROM parts WHERE part_id = ? AND part_cost < ?

(In the case of a 'LIKE' query, or some other query where you can use key information, you might have a symbolic replacement like WHERE field LIKE '{D}%' to indicate that the key can be used.)

Then we make the case consistent, by either capitalizing the whole query or changing it all into lower case:

select part_number from parts where part_id = ? and part_cost < ?

Then we run a checksum on the parameterized string. The checksum might be used as a hash table key, where we keep some additional information, such as how stale the entry is, along with a pointer to the actual parameterized SQL (in case the hash key has a collision, it would simply be wrong to run an incorrect query, for obvious enough reasons).
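The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not PostgreSQL code: the names (normalize, PlanCache) and the literal-stripping regexes are assumptions, and a real implementation would parameterize from the parse tree rather than with regular expressions. The stored entry keeps the full normalized SQL so a checksum collision is detected instead of silently running the wrong plan, exactly as the text requires.

```python
# Sketch of the described scheme: strip literals to '?', fold case,
# checksum the normalized text, and key a cache on that checksum.
import re
import zlib

def normalize(sql: str) -> str:
    """Replace string/numeric literals with '?' and fold to lower case."""
    sql = re.sub(r"'[^']*'", "?", sql)          # string literals
    sql = re.sub(r"\b\d+(\.\d+)?\b", "?", sql)  # numeric literals
    return " ".join(sql.lower().split())        # case-fold, squeeze whitespace

class PlanCache:
    def __init__(self):
        self._table = {}  # checksum -> (normalized_sql, plan)

    def lookup(self, sql: str):
        norm = normalize(sql)
        entry = self._table.get(zlib.crc32(norm.encode()))
        if entry and entry[0] == norm:  # compare full SQL: guard collisions
            return entry[1]
        return None

    def store(self, sql: str, plan) -> None:
        norm = normalize(sql)
        self._table[zlib.crc32(norm.encode())] = (norm, plan)

cache = PlanCache()
q1 = "SELECT part_number FROM parts WHERE part_id = 12324 AND part_cost < 12.95"
q2 = "SELECT part_number FROM parts WHERE part_id = 999 AND part_cost < 1.50"
cache.store(q1, "cached-plan")
print(cache.lookup(q2))  # both queries normalize to the same key
```

Because q1 and q2 differ only in their literals, they normalize to the same string and the second query hits the plan cached for the first.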
Now, if there are a huge number of users of the same application, it makes sense that the probability of reusing queries goes up with the number of users. Therefore, I would advocate that the cache be kept in shared memory. Consider a single application with 100 different queries. Now add one user, ten users, 100 users, ... 10,000 users, and you can see that the benefit grows greater and greater as we add users.
<<-------------------------------------------------------------------