Spring cache annotations: some tips & tricks

2018/12/30 posted in  Spring

Original article: https://www.foreach.be/blog/spring-cache-annotations-some-tips-tricks

Since version 3.1, the Spring Framework comes with a decent caching abstraction layer that allows you to use annotations like @Cacheable to interact with your caches. In this post I'll assume you know how the caching annotations work and provide you with some tips & tricks for using them.

[This blog was updated on 27/11/2017 with synchronized caching and tips on caching and transactions.]

Beware of the default cache keys

Caching a method outcome is really easy to do. Simply adding @Cacheable with a cache name would work already:

@Cacheable(value = "reservationsCache")
public List<Reservation> getReservationsForRestaurant( Restaurant restaurant ) {
}

However, it is also easy to lose track of the actual cache key that is being used. The above call uses a default key generation strategy that creates a SimpleKey that consists of all the parameters with which the method was called. This requires the parameters to have a decent hashCode()/equals() implementation, which is not usually a problem in and of itself, except that the performance of hashCode() and equals() also impacts the performance of cache retrieval. Again, that's usually not much of an issue.
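To make the hashCode()/equals() requirement concrete, here is a minimal JDK-only sketch. It does not use Spring at all: a plain HashMap stands in for the cache, and PlainRestaurant and KeyedRestaurant are hypothetical classes invented for the illustration. The point is only that a parameter type without value-based equality produces a key that can never be hit again by an equal-looking instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical parameter type WITHOUT equals()/hashCode(): identity semantics only.
class PlainRestaurant {
    final long id;

    PlainRestaurant(long id) {
        this.id = id;
    }
}

// Hypothetical parameter type with value-based equals()/hashCode() on the id.
class KeyedRestaurant {
    final long id;

    KeyedRestaurant(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof KeyedRestaurant && ((KeyedRestaurant) other).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class KeyEqualityDemo {
    // Stores an entry under one instance, then looks it up with a second instance holding the same id.
    static boolean hitWithPlainKey() {
        Map<Object, String> cache = new HashMap<>();
        cache.put(new PlainRestaurant(1L), "reservations");
        return cache.containsKey(new PlainRestaurant(1L));
    }

    static boolean hitWithKeyedKey() {
        Map<Object, String> cache = new HashMap<>();
        cache.put(new KeyedRestaurant(1L), "reservations");
        return cache.containsKey(new KeyedRestaurant(1L));
    }
}
```

With the plain class every call with a freshly loaded entity is a cache miss, and each miss adds yet another entry that is never found again.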

A more important caveat here is that the parameters themselves become an integral part of the cache key, and that can have an unwanted impact on the actual heap size being used, as the cache key is kept on the heap as well. Consider our example: we use a Restaurant as the cache key. However, the restaurant is a complex domain entity holding lots of data and having several collections of related entities. All this data is kept alive as long as the cache entry exists and keeps taking up room on the heap, even if it is no longer relevant.

Suppose our restaurant has an id property that uniquely identifies that specific restaurant, which domain classes often do. We can easily adapt our code as follows:

@Cacheable(value = "reservationsCache", key = "#restaurant.id")
public List<Reservation> getReservationsForRestaurant( Restaurant restaurant ) {
}

This uses SpEL for declaring a custom key that will be the value of the id property. In our case this would be a simple long, and that would remain the case even if the Restaurant entity grows over time. How much memory is actually consumed depends on the VM, but it's not hard to imagine that a single Restaurant entity holds hundreds or thousands of longs worth of data (consider several strings). After we started tuning some of the cache keys in our projects, we saw heap consumption drop by hundreds of MB, which in turn allowed us to cache much more.

In short: you should not only pay attention to the uniqueness of your cache keys, but also to the size of the actual cache key being generated. Use the key property or a custom key generator to have more fine-grained control over your cache keys.
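As a rough illustration of what a custom key generator could do, the sketch below reduces every entity parameter to its id before building the key. It is plain Java, not an actual Spring KeyGenerator implementation: Identifiable, CompactKeys and keyFor are hypothetical names made up for this example, and in a real project the same logic would sit inside your own KeyGenerator bean.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker for domain classes that expose a unique id.
interface Identifiable {
    long getId();
}

// Hypothetical entity; the name field stands in for the bulk of data we do not want in a key.
class Restaurant implements Identifiable {
    private final long id;
    private final String name;

    Restaurant(long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public long getId() {
        return id;
    }
}

class CompactKeys {
    // Builds a small key: entities are replaced by their id, other parameters are kept as-is.
    static Object keyFor(Object... params) {
        List<Object> parts = new ArrayList<>(params.length);
        for (Object param : params) {
            if (param instanceof Identifiable) {
                parts.add(Long.valueOf(((Identifiable) param).getId()));
            } else {
                parts.add(param);
            }
        }
        // A single parameter becomes the key itself; several are combined into a list,
        // which already has value-based equals()/hashCode().
        return parts.size() == 1 ? parts.get(0) : Collections.unmodifiableList(parts);
    }
}
```

Calling CompactKeys.keyFor(restaurant) yields just the boxed id, so the Restaurant instance itself is no longer referenced by the cache key.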

Cacheable annotations and synchronization

For very expensive methods, you want to maximize the number of cache hits. When the method is accessed by multiple threads, you ideally want the first thread to do the actual calculation and all other threads to fetch the result from the cache. This is a classic case where you would synchronize access to the method. However, the following code does not do what we want:

@Cacheable(value = "reservationsCache", key = "#restaurant.id")
public synchronized List<Reservation> getReservationsForRestaurant( Restaurant restaurant ) {
}

When using @Cacheable on a method, the caching code is outside of the original method body (added through AOP).  This means that any form of synchronization inside or on the method itself will take place after the actual cache lookup.  When calling the method the cache lookup will happen first and if there was a cache miss, then the lock will be taken and the method executed.  All cache misses will effectively execute the method and then insert the result into the cache, even if the calls are identical.

You could solve this using manual caching with some form of double-checked locking or by moving the synchronization around the cacheable method. Take into account that the latter will always apply the synchronization and it will most likely require you to add an extra bean, depending on your AOP proxy strategy.
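The manual variant with double-checked locking could look roughly like the sketch below. It is JDK-only and deliberately simplified: DoubleCheckedCache and DoubleCheckedDemo are hypothetical names, a ConcurrentHashMap stands in for the real cache, a single lock guards all keys, and the loader is assumed never to return null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class DoubleCheckedCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    // Counts how often the expensive loader actually ran (for the demo only).
    final AtomicInteger loads = new AtomicInteger();

    DoubleCheckedCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V value = cache.get(key);           // first check, without locking
        if (value == null) {
            synchronized (this) {           // one global lock keeps the sketch short
                value = cache.get(key);     // second check, now under the lock
                if (value == null) {
                    loads.incrementAndGet();
                    value = loader.apply(key);   // assumption: never returns null
                    cache.put(key, value);
                }
            }
        }
        return value;
    }
}

class DoubleCheckedDemo {
    // Lets several threads request the same key and reports how often the loader ran.
    static int loadsAfterConcurrentAccess(int threads) {
        DoubleCheckedCache<Long, String> cache =
                new DoubleCheckedCache<>(id -> "reservations-" + id);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> cache.get(42L));
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.loads.get();
    }
}
```

Because the second lookup happens inside the lock, threads that lost the race find the value the winner just stored instead of recomputing it. The price is that all misses, even for unrelated keys, queue on the same lock.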

Update - Spring Framework 4.3

As of Spring Framework 4.3 there is some direct support for synchronized caching: @Cacheable allows you to specify the sync attribute to ensure only a single thread is building the cache value. To get the behavior we want, the above example could be updated to:

@Cacheable(value = "reservationsCache", key = "#restaurant.id", sync = true)
public List<Reservation> getReservationsForRestaurant( Restaurant restaurant ) {
}
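If you want a mental model for what synchronized caching buys you, the JDK's ConcurrentHashMap.computeIfAbsent is a reasonable analogy: lookup and computation form one atomic step per key. The sketch below is only that analogy, not Spring's internals; SyncLoadingCache and SyncLoadingDemo are hypothetical names and the loader is assumed to return non-null values.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class SyncLoadingCache<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();
    // Counts how often a value was actually computed (for the demo only).
    final AtomicInteger loads = new AtomicInteger();

    // Lookup and load happen atomically for a given key: concurrent callers
    // asking for the same key wait for the first computation to finish.
    V get(K key, Function<K, V> loader) {
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }
}

class SyncLoadingDemo {
    // Lets several threads request the same key and reports how often the loader ran.
    static int loadsAfterConcurrentAccess(int threads) {
        SyncLoadingCache<Long, String> cache = new SyncLoadingCache<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> cache.get(42L, id -> "reservations-" + id));
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.loads.get();
    }
}
```

Unlike a synchronized method, this does not serialize every miss behind one lock; callers mainly wait on each other when they ask for the same key.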

Combine @CachePut and @Cacheable to optimize cache use

Using @Cacheable combines both looking in the cache and storing the result. Using @CachePut and @CacheEvict annotations gives you more fine-grained control.  You can also use the @Caching annotation to combine multiple cache related annotations on a single method.  Avoid combining @Cacheable and @CachePut on the same method, as the behavior can be quite confusing.  But you can put them to some good use.  Consider the following traditional service/repository hierarchy:

class UserService {
    @Cacheable(value = "userCache", unless = "#result != null")
    public User getUserById( long id ) {
        return userRepository.getById( id );
    }
}

class UserRepository {
    @Caching(
      put = {
            @CachePut(value = "userCache", key = "'username:' + #result.username", condition = "#result != null"),
            @CachePut(value = "userCache", key = "#result.id", condition = "#result != null")
      }
    )
    @Transactional(readOnly = true)
    public User getById( long id ) {
       ...
    }
}

In our example, when calling userService.getUserById, a lookup is done in the user cache using the id as cache key.  If no value is found, the call is forwarded to userRepository.getById.  The latter method does not look into the cache but will store the result under two different cache keys in the cache.  Even if the item already existed in the cache, a call to the repository would update it.

In a situation like this it is vital to make good use of the conditional properties (condition and unless) on the cache annotations.  In the example the result from the repository is only added to the cache if it is not null.  Without the condition we would get an exception as the outcome of #result.username could not be determined.  So in our code the repository does not cache any null values. On the other hand, the service call only stores null results back in the cache.  In our case this is the required behavior, as we do want to cache missing results for performance reasons.  Non-null values are cached by the repository, nulls by the service, the disadvantage being that cache logic is dispersed over several beans.

Caveat: If we were to remove the condition altogether on the service method, two identical cache puts would occur for every call with a valid result.  First the @CachePut from the repository, followed by storing the result because of the @Cacheable on the service method.  Be aware that only the first would be an actual cache put (inserting a new entry), immediately followed by an update of the same item.  This might not really be a performance issue, but when using cache replication that behaves differently for puts and updates - for example using Ehcache - this can bring some unwanted side effects.

If we don't want the service method to cache anything at all but simply look in the cache, we could use unless = "true".
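To see the division of labor between the two beans without any Spring machinery, here is a hand-written simulation of the UserService/UserRepository example. Everything in it is an assumption made for illustration: TwoTierCachingSketch is a hypothetical class, plain HashMaps stand in for the cache and the database, and Optional.empty() plays the role of a cached null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class User {
    final long id;
    final String username;

    User(long id, String username) {
        this.id = id;
        this.username = username;
    }
}

class TwoTierCachingSketch {
    // Stand-in for "userCache"; Optional.empty() marks a cached "no such user".
    final Map<Object, Optional<User>> userCache = new HashMap<>();
    // Stand-in for the backing database.
    final Map<Long, User> database = new HashMap<>();
    int repositoryCalls = 0;

    // Mimics the repository: always executes, and stores a non-null result
    // under two keys (the two @CachePut entries with condition "#result != null").
    User repositoryGetById(long id) {
        repositoryCalls++;
        User user = database.get(id);
        if (user != null) {
            userCache.put("username:" + user.username, Optional.of(user));
            userCache.put(user.id, Optional.of(user));
        }
        return user;
    }

    // Mimics the service: looks in the cache first, delegates on a miss, and
    // only stores null results itself (unless = "#result != null").
    User serviceGetUserById(long id) {
        Optional<User> cached = userCache.get(id);
        if (cached != null) {
            return cached.orElse(null);
        }
        User user = repositoryGetById(id);
        if (user == null) {
            userCache.put(id, Optional.empty());
        }
        return user;
    }
}
```

Requesting an existing user twice reaches the repository once and leaves two cache entries behind; requesting a missing user twice also reaches the repository only once, because the service cached the miss.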

Caching and transactions

Caching using annotations is very easy and transparent, but combined with transactions you sometimes need to take some special care. Suppose we have the following repository:

class UserRepository {
    @CachePut(value = "userCache", key = "#result.username")
    @Transactional(readOnly = true)
    User getByUsername( String username );

    @CacheEvict(value = "userCache", key = "#p0.username")
    @Transactional
    void save( User user );
}

We now create a single outer transaction that first saves the user and then fetches it.

class UserService {
    @Transactional
    User updateAndRefresh( User user ) {
        userRepository.save(user);
        return userRepository.getByUsername( user.getUsername() );
    }
}

The user will first be removed from the cache, only to be stored again right after. However, suppose that our updateAndRefresh() method is not the end of the transaction, and further on an exception occurs. The exception will cause the transaction to roll back. The changes will not be persisted in the backing database, but your cache will have been updated. Your system is now in an inconsistent state.

You can avoid this problem by binding your cache operations to the running transaction, and only execute them when the transaction commits. Attaching operations to the current transaction can be done using the TransactionSynchronizationManager, but Spring already has a [TransactionAwareCacheDecorator](https://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/cache/transaction/TransactionAwareCacheDecorator.html) which does just that. It wraps around any Cache implementation and ensures that any put, evict or clear operations only execute after a successful commit of the current transaction (or immediately if there is no transaction).
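The idea of deferring cache writes until commit can be sketched in a few lines of plain Java. This is a toy model, not the Spring decorator: DeferredCache and its begin/commit/rollback methods are hypothetical, there is no thread binding, and a HashMap stands in for the cache. Reads are not deferred in this sketch; they go straight to the underlying store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DeferredCache {
    private final Map<Object, Object> store = new HashMap<>();
    // Operations queued by the running "transaction"; null means no transaction is active.
    private List<Runnable> pending;

    void begin() {
        pending = new ArrayList<>();
    }

    void put(Object key, Object value) {
        runOrDefer(() -> store.put(key, value));
    }

    void evict(Object key) {
        runOrDefer(() -> store.remove(key));
    }

    Object get(Object key) {
        return store.get(key);
    }

    // Applies all queued operations, in order, once the transaction succeeded.
    void commit() {
        List<Runnable> operations = pending;
        pending = null;
        if (operations != null) {
            operations.forEach(Runnable::run);
        }
    }

    // Drops all queued operations: the cache stays exactly as it was.
    void rollback() {
        pending = null;
    }

    // Executes immediately outside a transaction, otherwise queues until commit.
    private void runOrDefer(Runnable operation) {
        if (pending == null) {
            operation.run();
        } else {
            pending.add(operation);
        }
    }
}
```

With this in place, an evict followed by a put inside a transaction that later rolls back leaves the previously cached value untouched, which is exactly the consistency we were missing.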

If you do your cache operations manually, you can fetch the Cache from the CacheManager and wrap it yourself with a TransactionAwareCacheDecorator.

Cache transactionAwareUserCache( CacheManager cacheManager ) {
    return new TransactionAwareCacheDecorator( cacheManager.getCache( "userCache" ) );
}

This works just fine if you do not use the cache annotations. If you want to use cache annotations and have transparent transactional support, you should configure your CacheManager to hand out transaction-aware caches.

Some CacheManager implementations, like EhCacheCacheManager, extend AbstractTransactionSupportingCacheManager and support handing out transaction aware caches directly:

@Bean
public CacheManager cacheManager( net.sf.ehcache.CacheManager ehCacheCacheManager ) {
    EhCacheCacheManager cacheManager = new EhCacheCacheManager();
    cacheManager.setCacheManager( ehCacheCacheManager );
    cacheManager.setTransactionAware( true );
    return cacheManager;
}

For other CacheManager implementations, for example the SimpleCacheManager, you can use a TransactionAwareCacheManagerProxy.

@Bean
public CacheManager cacheManager() {
    SimpleCacheManager cacheManager = new SimpleCacheManager();
    cacheManager.setCaches( Collections.singletonList( new ConcurrentMapCache( "userCache" ) ) );

    // manually initialize the caches, as our SimpleCacheManager is not declared as a bean
    cacheManager.initializeCaches(); 

    return new TransactionAwareCacheManagerProxy( cacheManager );
}

The TransactionAwareCacheDecorator is perhaps a lesser known feature of the Spring cache abstraction infrastructure. But it is no less useful and can be very helpful in avoiding some very hard to debug issues that pop up when combining caching with transactions.

Using the cache annotations from Spring is fairly straightforward, but it can sometimes be a bit unclear exactly what's happening and that can lead to strange results. With this post, I hope to have given you some insights so certain pitfalls can easily be avoided. All feedback and questions are appreciated; that's what the comments are for.