Re: Postgresql Caching - Mailing list pgsql-hackers

From Anon Mous
Subject Re: Postgresql Caching
Date
Msg-id 20061017235158.11524.qmail@web90305.mail.mud.yahoo.com
Whole thread Raw
In response to Postgresql Caching  (Anon Mous <soundami@yahoo.com>)
Responses Re: Postgresql Caching  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
Hi<br /><br /> I've loved reading all of your thoughts and comments.<br /><br /> Yet, I'm left with the question:<br
/><br/>    Can we can brainstorm a caching solution that is workable...<br /><br /> I've seen some posts talking about
someof the challenges.<br /><br /> 1.) Only good for static data<br /><br /> As it was proposed that is largely true.
Thisdoesn't mean, however that the idea as stated isn't worth persuing because I think you'll find most applications
havea great deal of "static enough" data to benefit greatly from this type of caching.<br /><br /> However, I think
somesimple changes to the idea may make it useful for busy tables...<br /><br /> These changes, would probably require
directcommunication between the caching controller and the the postmaster.<br /><br /> a.) Rather than table locking,
trackchanges at the row level.<br /> b.) Rather than requiring a complete reseeding of a table after an update, just
invalidate,or repopulate the affected rows. <br /> c.) Rather than destroying popular query results, try to update them
ifpossible. For example, it's easy to remove one entry from the cache if that row was just deleted. (It's probably
cheaperto rerun the query on just the few changed rows than to rerun the whole query.)<br /> d.) Any other ideas?<br
/><br/> 2.) If any caching were possible, we'd already be doing it.<br /><br /> I don't think this statement will stand
thetest of time! Agreed, caching quickly becomes a head hurting complex topic, but the benefits are well worth the
effort!<br/><br /> 3.) Consistency and memcached, *are* mutually exclusive. Memcached provides no mechanisms for
consistency.<br /><br /> "You can never have perfect consistency across different systems (memcache / <br />
postgresql)and especially not when their visibility rules differ. What is <br /> visible to something via memcache is
alwayslatest uncommitted. What is <br /> visible in PostgreSQL is something less than that. Consistency is not <br />
possible.Correct caching is therefore also not possible unless you define <br /> correct as 'latest', and even then,
youhave problems if memcache expires <br /> the record, before the real record has been commited into PostgreSQL."<br
/><br/> I completely agree. I'm not talking about synchronizing memcached data to be consistent, I'm saying lets create
acaching layer that works something like memcache and preserves consistency!<br /><br /> and, very related to this, <br
/>4.) Memcached Caching is exactly opposite to Postgres consistency.<br /><br /> Specifically:<br /> Memcache is
serialized<br/> Memcache can loose data at any time<br /> Memcache has only 2 fields<br /> Memcache has no
synchronization<br/><br /> Postgres needs consistency.<br /><br /> Memcache doesn't do any synchronization, and that
meansconsistency is impossible. However, a special version of memcache that is embedded into the postgresql system or
apithat does talk with the postmaster could be able to provide guaranteed consistency?<br /><br /> 5.) This idea won't
saveany time with SQL parsing.<br /><br /> I believe it can...  Because, as memcache has illustrated, you can avoid any
sqlparsing by using the sql and user's table permissions (if different users are enabled) as the key to the cached
data.<br /><br /> 6.) Postgresql is consistency. If an application needs speed let the application figure out how to
cachethe data<br /><br /> I appreciate that Postgres is all about data consistency. Actually, that's why I'm here and
noton the Mysql board...  However, I believe that we can provide caching without losing consistency, and developers
willlove the extra speed.<br /><br /> If we do implement the caching once, everyone will be able to use it without
complicatingtheir end application!!!  (Read: It will help the world and make PostgreSQL very popular!)<br /><br />
---<br/><br /> So, would it work to create a caching pre-processor for Postgresql that would work serially on every
request,and pass all uncached queries to the database?<br /><br /> - If it's a cache hit, and the data is currently
availableand active, pass the data back. <br /><br /> - If it's a miss, pass the query along to the database, and
populatethe cache with the results. <br /><br /> - If the query changes data, invalidate the cached queries that touch
anytable rows that could be affected. After the update, re-enable the unaffected table rows, and repopulate the cache
withthe updates. <br /><br /> - Cached queries using an affected table would normally be deleted except in special
simplecases that could be updated. <br /><br /> A related interesting thought... It might help if the cache stored the
datain separate table rows rather than in whatever format the query requested the way memcached does. <br /><br /> -
Eachcached table row could be stored as a separate entity. <br /> - Complicated joins, rather than caching all of the
datain whatever organization the user specified, would instead store a matrix of pointers to the exact table fields in
mostcases.<br /><br /> Will it work?  Am I missing anything?<br /><br /> Thanks<br /><br /> Daniel<br /><p><hr size="1"
/>Yahoo!Messenger with Voice. <a
href="http://us.rd.yahoo.com/mail_us/taglines/postman1/*http://us.rd.yahoo.com/evt=39663/*http://voice.yahoo.com">Make
PC-to-PhoneCalls</a> to the US (and 30+ countries) for 2¢/min or less. 

pgsql-hackers by date:

Previous
From: Shane Ambler
Date:
Subject: Re: Syntax bug? Group by?
Next
From: Jeff Davis
Date:
Subject: Re: Postgresql Caching