I'd like to share a simple trick I use to reduce roundtrips when pulling data
from a cache server (like Redis or Kyoto Tycoon). Both
Redis and Kyoto Tycoon support efficient bulk-get operations, so it makes sense
to read as many keys from the cache as we can when performing an operation that
may need to access multiple cached values. This is especially true in web
applications, as a typical web-page may pull multiple chunks of data and
rendered HTML from a cache (fragment-caching) to build the final page that is
sent as a response.
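For instance, the redis-py client exposes bulk reads as a single command (the
key names here are just placeholders):

```python
import redis

r = redis.Redis()

# One network roundtrip, no matter how many keys we request.
values = r.mget(['user:1:stats', 'user:1:feed', 'sidebar:html'])
```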
If we know ahead-of-time which cache-keys we need to fetch, we can grab the
cached data in a single Redis/KT request and hold onto it in memory for the
duration of the request. The problem is: within the data-generating code
itself, how do we differentiate between "pull from the Redis cache" and "pull
from an in-memory cache"?
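For example, here is a minimal sketch of the idiom; the `cache` object and the
`get_user_stats()` function are hypothetical stand-ins, not part of any
particular library:

```python
from functools import wraps

class SimpleCache:
    """Stand-in for a Redis/KT-backed cache exposing get/set."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value, timeout=None):
        self._data[key] = value

cache = SimpleCache()

def cached(key_fn, timeout=600):
    def decorator(fn):
        @wraps(fn)
        def inner(*args, **kwargs):
            key = key_fn(*args, **kwargs)
            value = cache.get(key)
            if value is None:
                # Cache miss: compute the value, store it, return it.
                value = fn(*args, **kwargs)
                cache.set(key, value, timeout)
            return value
        return inner
    return decorator

@cached(lambda user_id: 'user:%s:stats' % user_id)
def get_user_stats(user_id):
    # Imagine several expensive database queries here.
    return {'user_id': user_id}
```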
The above function uses a common idiom in many Python cache libraries. The
decorator will transparently attempt to retrieve the data from the cache. If
the data is not available, then it is computed, stored in the cache, and
returned to the caller.
Template fragment caching in Django and Jinja works the same way. The HTML is
retrieved from the cache; if it's not available, it is re-generated and stored
in the cache automatically:
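Django's version, for reference, is the `{% cache %}` template tag:

```
{% load cache %}
{% cache 500 sidebar request.user.pk %}
  ... expensive sidebar rendering here ...
{% endcache %}
```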
The way I solve this problem in ucache is
to use Python's context-managers to create a simple "scope" within the cache
itself. It looks like this:
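Here is a sketch of the usage (ucache's actual API may differ in the details;
`get_user_stats()` and `get_user_feed()` are hypothetical functions decorated
as shown earlier):

```python
def render_dashboard(user_id):
    keys = ['user:%s:stats' % user_id, 'user:%s:feed' % user_id]

    # A single bulk-get fills an in-memory scope for the duration of
    # the block; get() calls for these keys are then served locally
    # instead of going back to the server.
    with cache.preload(keys):
        stats = get_user_stats(user_id)  # resolved from local memory
        feed = get_user_feed(user_id)
        return build_html(stats, feed)   # hypothetical renderer
```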
This approach doesn't buy much when only a couple of keys are involved, but it
provides a significant speed-up when there are lots of cache-keys (or a few
cache-keys that are accessed multiple times).
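To make the mechanics concrete, here is one way such a scope might be
implemented. This is a simplified sketch built on redis-py, not ucache's
actual code:

```python
from contextlib import contextmanager

import redis

class ScopedCache:
    def __init__(self, client=None):
        self.client = client or redis.Redis()
        self._scope = None  # active in-memory scope, if any

    @contextmanager
    def preload(self, keys):
        outer = self._scope
        scope = dict(outer) if outer else {}
        # One bulk roundtrip fetches every requested key up-front.
        for key, value in zip(keys, self.client.mget(keys)):
            if value is not None:
                scope[key] = value
        self._scope = scope
        try:
            yield
        finally:
            self._scope = outer  # restore the enclosing scope

    def get(self, key):
        # Serve from the in-memory scope when possible; otherwise
        # fall back to a single-key request to the server.
        if self._scope is not None and key in self._scope:
            return self._scope[key]
        return self.client.get(key)
```

Because the enclosing scope is saved and restored, `preload()` blocks written
this way can nest safely.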
One thing I do to make this more manageable is to provide methods on my models
that generate cache-keys. I then use these methods to refer to the cache-keys
wherever possible.
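For example, with a hypothetical `User` model (and the `cache` object from the
sketch above):

```python
class User:
    def __init__(self, user_id):
        self.id = user_id

    # Cache-key formats live in one place, on the model itself, so
    # callers never hand-assemble key strings.
    def stats_cache_key(self):
        return 'user:%s:stats' % self.id

    def feed_cache_key(self):
        return 'user:%s:feed' % self.id

user = User(31337)
with cache.preload([user.stats_cache_key(), user.feed_cache_key()]):
    ...  # everything inside reads these keys from local memory
```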
If you're interested in a very simple implementation, check out the code for
ucache -
specifically the preload() method. Hope you found this helpful!