Clodoaldo wrote:
> But the main factor to push me in the file system direction is the
> HTTP cache management. I want the internet web clients and proxies to
> cache the images. The Apache web server has it ready and easy. If
> the images were to be stored in the DB I would have to handle the
> HTTP cache headers myself. Another code layer. Not too big a deal,
> but if Apache gives it to me for free...
There's a hybrid approach which has worked well for us.
You store the binary data in the database along with a signature.
On the Apache side, you write a 404 handler that, based on the request,
fetches the binary from the database and writes it locally to the
filesystem based on the signature (possibly using a multi-level hashing
scheme, as detailed in previous posts).
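To make the signature-to-path mapping concrete, here is a minimal sketch. It assumes the signature is a hex SHA-1 digest of the content and that two directory levels of two hex digits each are enough; the names `signature` and `cache_path`, the hash choice, and the layout are all illustrative assumptions, not necessarily the scheme from the earlier posts.

```python
import hashlib
from pathlib import PurePosixPath


def signature(data: bytes) -> str:
    """Hex SHA-1 digest of the content (the hash choice is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def cache_path(sig: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Spread files over nested directories named after the leading hex digits."""
    parts = [sig[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, sig)
```

With these defaults a signature starting with `ab` then `cd` lands under `ab/cd/`, which keeps any single directory from growing too large.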
When a request comes in to Apache, if the file exists it is served
directly without any db interaction. OTOH, if it's missing, your 404
handler kicks in to build it and you get a single trip to the db.
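The fault-in step of such a handler might look roughly like the sketch below. It is not tied to any particular Apache module: `fault_in` is a hypothetical helper, `fetch` stands in for whatever database query returns the binary (or `None` when there is no such row), and `rel_path` is assumed to have been validated and derived from the signature already. Writing to a temporary file and renaming it means a concurrent request never sees a half-written file.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional


def fault_in(root: Path, rel_path: str,
             fetch: Callable[[], Optional[bytes]]) -> Optional[Path]:
    """Create the missing file under root; return None if the db has no data."""
    target = root / rel_path
    if target.exists():
        return target                  # already cached, no db trip
    data = fetch()                     # the single trip to the db
    if data is None:
        return None                    # a genuine 404
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write next to the target, then rename atomically into place.
    fd, tmp = tempfile.mkstemp(dir=target.parent)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.chmod(tmp, 0o644)           # mkstemp creates the file owner-only
        os.replace(tmp, target)
    except BaseException:
        os.unlink(tmp)
        raise
    return target
```

After `fault_in` returns a path, the handler can send the file for this one request; every later request for the same URL is served by Apache straight from disk.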
You get the benefits of keeping the data in the db (transaction
semantics, etc.) but also get the scalability and caching benefits
of having the front-end webservers handle delivery.
If you lose the locally cached files it's not an issue: they'll be
faulted back into existence on demand.
With multiple webservers, you can just allow the data to be cached on
each machine, or if there's too much data for that, have your load
balancer divide the requests to different webserver pools based on the
signature.
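How a load balancer would pick a pool depends on the product, but the idea can be sketched as a pure function of the signature. The function name `pool_for` and the pool names are hypothetical, and the signature is assumed to be a hex string.

```python
from typing import Sequence


def pool_for(sig: str, pools: Sequence[str]) -> str:
    """Map a hex signature to one pool, always the same one for that signature."""
    return pools[int(sig[:8], 16) % len(pools)]
```

Because the choice depends only on the signature, each file ends up cached in exactly one pool. Note that a plain modulo reshuffles most signatures when the number of pools changes; that only costs some re-faulting from the db, since the cached files are disposable.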
As an extension, if you need different versions of the data (like
different sizes of an image, etc.), you can modify your URLs to indicate
the version wanted and have the 404 handler take that into account when
building them. You only store the original content in the database but
could have any number of transformed versions on the webservers. Again,
losing those versions is not an issue, and they do not require backup.
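As an illustration of version-aware URLs, the sketch below assumes a made-up layout of `/img/<version>/<signature>.jpg` with a fixed whitelist of versions. The layout, the version names, and the sizes are all assumptions; the actual resizing is left out.

```python
import re
from typing import Optional, Tuple

# Hypothetical versions: None means "serve the original as stored in the db".
VERSIONS = {"orig": None, "thumb": (128, 128), "medium": (640, 480)}

_URL = re.compile(r"/img/(?P<version>[a-z]+)/(?P<sig>[0-9a-f]{40})\.jpg")


def parse_request(path: str) -> Optional[Tuple[str, str]]:
    """Return (version, signature) for a well-formed URL, else None."""
    m = _URL.fullmatch(path)
    if m is None or m.group("version") not in VERSIONS:
        return None
    return m.group("version"), m.group("sig")
```

The 404 handler would fetch the original by signature, apply the transformation for the requested version, and write the result at the requested path. Rejecting anything that does not match the strict pattern also keeps odd paths from ever reaching the filesystem.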
Maurice