Re: 7.3 schedule - Mailing list pgsql-hackers

From Dann Corbit
Subject Re: 7.3 schedule
Date
Msg-id D90A5A6C612A39408103E6ECDD77B82906F42C@voyager.corporate.connx.com
Whole thread Raw
In response to 7.3 schedule  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Friday, April 12, 2002 2:38 PM
To: Tom Lane
Cc: Neil Conway; zakkr@zf.jcu.cz; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] 7.3 schedule


Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Oh, are you thinking that one backend would do the PREPARE and
another
> > one the EXECUTE?  I can't see that working at all.
>
> Uh, why exactly were you advocating a shared cache then?  Wouldn't
that
> be exactly the *point* of a shared cache?

I thought it would somehow compare the SQL query string to the cached
plans and if it matched, it would use that plan rather than make a new
one.  Any DDL statement would flush the cache.
>>-------------------------------------------------------------------
Many applications will have similar queries coming from lots of
different end-users.  Imagine an order-entry program where people are
ordering parts.  Many of the queries might look like this:

SELECT part_number FROM parts WHERE part_id = 12324 AND part_cost
< 12.95

In order to cache this query, we first parse it to replace the data
fields with paramter markers.
Then it looks like this:
SELECT part_number FROM parts WHERE part_id = ? AND part_cost < ?
{in the case of a 'LIKE' query or some other query where you can use
key information, you might have a symbolic replacement like this:
WHERE field LIKE '{D}%' to indicate that the key can be used}
Then, we make sure that the case is consistent by either capitalizing
the whole query or changing it all into lower case:
select part_number from parts where part_id = ? and part_cost < ?
Then, we run a checksum on the parameterized string.
The checksum might be used as a hash table key, where we keep some
additional information like how stale the entry is, and a pointer to
the actual parameterized SQL (in case the hash key has a collision
it would be simply wrong to run an incorrect query for obvious enough
reasons).
Now, if there are a huge number of users of the same application, it
makes sense that the probabilities of reusing queries goes up with
the number of users of the same application.  Therefore, I would
advocate that the cache be kept in shared memory.

Consider a single application with 100 different queries.  Now, add
one user, ten users, 100 users, ... 10,000 users and you can see
that the benefit would be greater and greater as we add users.
<<-------------------------------------------------------------------


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: 7.3 schedule
Next
From: Jan Wieck
Date:
Subject: Re: numeric/decimal docs bug?