Mark,
> Using memcache, I've had problems with consistency brought right to
> the front. Both of these have failed me:
>
> 1) When updating a PostgreSQL record, I invalidate the memcache record.
> If another process comes along in parallel before I commit, notices
> that the memcache record is invalidated, it queries the data from
> SQL, and updates the memcache record back to the old value. :-(
>
> 2) When updating a PostgreSQL record, I updated the memcache record
> to the new value. If another process comes along in parallel before
> I commit, that is still looking at an older view, cross-referencing
> may not work as expected.
The answer is that cached values are not expected to be consistent. If they
were, then they'd have to have all of the transaction overhead which
PostgreSQL has, and lose any gain in efficiency.
Generally, memcached is best used for values that don't get kept in the
database at all. Example (how I use it)
Using pgmemcache for user session information:
1) User logs in. Their session information is stored in the permanent
user_session table and the pgmemcache pseudotable, user_session_cache.
2) User browses the website. This requires updating their session every time
a page is loaded with their last activity timestamp, their page path (for
backtracking) and their ip information (for hijack prevention). This
informaiton is recorded in user_session_cache *only*, with the presumption
that it will be lost if the server goes down.
3) User logs out (or is auto-logged-out). Keys are deleted from
user_session_cache and their exit information is written to the permanent
table user_session.
The advantage of this is that it allows lowering the amount of write activity
to the user_session table by 95% with no loss of information we care about.
Of course, if you are using a Java or .NET application server, it probably
provides the above functionality itself.
--
Josh Berkus
PostgreSQL @ Sun
San Francisco