October 14, 2007 will

Timed Caching Decorator

Here's the caching decorator I mentioned in my previous blog entry about optimizing the front page of becontrary.com. It is pretty simple to use, you supply the same parameters to it as timedelata. So @timed_cache(minutes=30) would cache the result for half an hour. It is thread safe, so you can happily use it in the context of a web application, such as Turbogears. For many things it is a magical one-line speed-up, but there are some issues to be aware of. You can't use it to cache anything where the return value is context sensitive, such as SQLObject class instances -- because they are tied to a database context that wont exist at the second call. If you do want to cache such objects, then you should copy the information you need to a dictionary. Genshi templates make the difference completely transparent, so it is not much of a problem. Another issue is that parameters must be immutable, because they are used as keys in a dictionary.

from datetime import datetime, timedelta
from copy import deepcopy
from threading import RLock

def timed_cache(seconds=0, minutes=0, hours=0, days=0):

    time_delta = timedelta( seconds=seconds,
                            minutes=minutes,
                            hours=hours,
                            days=days )

    def decorate(f):

        f._lock = RLock()
        f._updates = {}
        f._results = {}

        def do_cache(*args, **kwargs):

            lock = f._lock
            lock.acquire()

            try:
                key = (args, tuple(sorted(kwargs.items(), key=lambda i:i[0])))

                updates = f._updates
                results = f._results

                t = datetime.now()
                updated = updates.get(key, t)

                if key not in results or t-updated > time_delta:
                    # Calculate
                    updates[key] = t
                    result = f(*args, **kwargs)
                    results[key] = deepcopy(result)
                    return result

                else:
                    # Cache
                    return deepcopy(results[key])

            finally:
                lock.release()

        return do_cache

    return decorate

if __name__ == "__main__":

    import time

    class T(object):

        @timed_cache(seconds=2)
        def expensive_func(self, c):
            time.sleep(.2)
            return c

    t = T ()

    for _ in xrange(30):
        time.sleep(.1)
        t1 = time.clock()
        print t.expensive_func('Calling expensive method')
        print "t - %i milliseconds"%int( (time.clock() - t1) * 1000. )

There are some other things to be aware of. Naturally it will use up more memory, because a copy of the result is stored for each combination of parameters. And the standard disclaimer applies, that you should check if there is actually a speed-up before using it in production code.

Use Markdown for formatting
*Italic* **Bold** `inline code` Links to [Google](http://www.google.com) > This is a quote > ```python import this ```
your comment will be previewed here
gravatar
Doug Napoleone

Question:

have you looked at django's cache backed system?

They have had this decorator, and more for some time now. While the 'view' decorator is django specific, the cache system is not and would integrate well with your decorator here. The advantage is the backend support. The only one which would not work in TG is the database backend (which relies on the django DB stack). The rest (localmem, file, memcached) and the extensible backend system it's self has great value. Having something similar in TG would be fantastic.

NOTE: I use Django, TG, pylons, and Plone. Each system has it's trade-offs, and none is a religion.

gravatar
Will

I haven't seen Django's cache backed system. I would like to look more in to Django -- I would probably use it if I attempt another web application project.

gravatar
Justin

You really should take a look at the Django cache system before reinventing it, since they already have great support for the various caching options you may want (most importantly, memcached). For a production environment you really want to use something like memcached anyways, so that you aren't duplicating your cache across however many processes your web server decides on creating.

It's also trivial in Django to switch out real caching with a dummy 'cache' that does nothing -- very useful for development.

gravatar
Ian Bicking

The equivalent thing in Pylons comes through Beaker; @beaker_cache specifically; some more info here:
http://wiki.pylonshq.com/display/pylonsdocs/Caching+in+Templates+and+Controllers

gravatar
Peter Crabtree

Wouldn't this break a recursive function by deadlocking?

gravatar
Justin

The python RLock object would only deadlock with a recursive function if you acquired the lock and waited for a result from another thread that was also trying to acquire the same RLock. Whichever thread 'owns' the lock may acquire it as many times as it wishes.