Got Cache?

by Billy T. on April 2, 2010

Once upon a time, I had a servlet running expensive queries over plain JDBC, about 3 seconds per request. That is something to worry about when the servlet sits on a high-traffic website, where "high" means more than 10 million hits a day. Without a cache somewhere in the middle, the solution was far from acceptable for production.

To mitigate this, a servlet can sit behind a caching filter that processes every incoming request and decides whether to return an available cache entry or to let the request through so that a DB call is made.

So, here is an example implementation using ehcache-web on a fictional “myServlet”:

1. How to set up the cache:

If your project uses Maven, be sure to add the ehcache-web dependency:

<dependency>
<groupId>net.sf.ehcache</groupId>
<artifactId>ehcache-web</artifactId>
<version>2.0.0</version>
</dependency>

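You will probably need to add an slf4j binding as well if you want to log the cache manager's messages. For example (the binding and version shown here are assumptions; pick the binding for your logging framework and match the slf4j-api version that ehcache pulls in):

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-jdk14</artifactId>
<version>1.5.8</version>
</dependency>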

If you are not using Maven, be sure to add the corresponding jar files, and the jars they depend on, to your classpath.

Then create or edit your ehcache.xml file, adding a new cache entry, e.g.:

<!-- 300 seconds = 5 minutes -->
<cache name="CachePageCachingFilter"
maxElementsInMemory="500"
eternal="false"
timeToIdleSeconds="300"
timeToLiveSeconds="300"
overflowToDisk="true">
</cache>

The ehcache.xml file should be placed on your classpath, e.g. in the /src/main/resources folder.
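To make the interplay of timeToIdleSeconds and timeToLiveSeconds concrete, here is a simplified model in plain Java. It is only a sketch of the expiry rule as I understand it (an element dies when either limit is exceeded, and 0 means "no limit"), not Ehcache's actual code; the class name ExpiryModel and its methods are made up for illustration.

```java
/** Simplified, hypothetical model of TTL/TTI expiry; not Ehcache's implementation. */
final class ExpiryModel {
    private ExpiryModel() {}

    /**
     * Returns the instant (in milliseconds) after which an element counts as expired.
     * TTL is measured from creation, TTI from the most recent access.
     * A value of 0 for either setting means that limit is switched off.
     */
    static long expirationTimeMillis(long createdAt, long lastAccessedAt,
                                     long ttlSeconds, long ttiSeconds) {
        long ttlExpiry = ttlSeconds == 0 ? Long.MAX_VALUE : createdAt + ttlSeconds * 1000L;
        // An element that was never read is treated as last touched at creation time.
        long lastTouch = Math.max(createdAt, lastAccessedAt);
        long ttiExpiry = ttiSeconds == 0 ? Long.MAX_VALUE : lastTouch + ttiSeconds * 1000L;
        // Whichever limit is reached first wins.
        return Math.min(ttlExpiry, ttiExpiry);
    }

    static boolean isExpired(long now, long createdAt, long lastAccessedAt,
                             long ttlSeconds, long ttiSeconds) {
        return now > expirationTimeMillis(createdAt, lastAccessedAt, ttlSeconds, ttiSeconds);
    }
}
```

With both values set to 300, as in the ehcache.xml above, the time-to-live always wins: no matter how often a page is requested, it is regenerated at most once every five minutes.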

2. Create your cache servlet filter:

Basically, you need to create a filter that extends the net.sf.ehcache.constructs.web.filter.CachingFilter class; the provided class net.sf.ehcache.constructs.web.filter.SimplePageCachingFilter is a good example of how to do this. You will likely want to override the protected String calculateKey(HttpServletRequest httpRequest) method, in order to put your request parameters in a fixed order or to add more information to the cache key.
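As an illustration of what such a key calculation could look like, here is a minimal sketch in plain Java. The helper class CacheKeys and its build method are hypothetical names, not part of ehcache-web; the idea is simply to sort the parameter names so that ?a=1&b=2 and ?b=2&a=1 end up sharing one cache entry. It assumes the parameter map has the shape returned by HttpServletRequest.getParameterMap() (typed as Map<String, String[]> since Servlet 3.0).

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that builds a cache key independent of parameter order. */
final class CacheKeys {
    private CacheKeys() {}

    static String build(String method, String requestUri, Map<String, String[]> parameters) {
        StringBuilder key = new StringBuilder(method).append(' ').append(requestUri);
        // A TreeMap iterates its keys in sorted order, which fixes the parameter order.
        Map<String, String[]> sorted = new TreeMap<>(parameters);
        char separator = '?';
        for (Map.Entry<String, String[]> entry : sorted.entrySet()) {
            // Values of a repeated parameter keep the order in which they were sent.
            for (String value : entry.getValue()) {
                key.append(separator).append(entry.getKey()).append('=').append(value);
                separator = '&';
            }
        }
        return key.toString();
    }
}

// Possible use inside your own filter (assumes ehcache-web and the servlet API on the classpath):
//
//   @Override
//   protected String calculateKey(HttpServletRequest httpRequest) {
//       return CacheKeys.build(httpRequest.getMethod(),
//                              httpRequest.getRequestURI(),
//                              httpRequest.getParameterMap());
//   }
```

Keeping the key logic in a plain class like this also makes it easy to unit test without a servlet container.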

3. Set up the filter:

Like any other servlet filter, it must be configured in the web.xml file. First declare it:

<filter>
<display-name>CacheFilter</display-name>
<filter-name>CacheFilter</filter-name>
<filter-class>billyto.examples.MyCacheFilter</filter-class>
<!--    <filter-class>net.sf.ehcache.constructs.web.filter.SimplePageCachingFilter</filter-class>  -->
<init-param>
<param-name>suppressStackTraces</param-name>
<param-value>false</param-value>
</init-param>
<init-param>
<param-name>cacheName</param-name>
<param-value>CachePageCachingFilter</param-value>   <!-- same name as ehcache.xml -->
</init-param>
</filter>

Then map it BEFORE your servlet mapping, e.g.:

<filter-mapping>
<filter-name>CacheFilter</filter-name>
<url-pattern>/example/retrieve-data</url-pattern>
<dispatcher>REQUEST</dispatcher>
<dispatcher>INCLUDE</dispatcher>
<dispatcher>FORWARD</dispatcher>
</filter-mapping>

<servlet>
<display-name>MyServlet</display-name>
<servlet-name>MyServlet</servlet-name>
<servlet-class>billyto.example.MyServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>MyServlet</servlet-name>
<url-pattern>/example/retrieve-data</url-pattern>
</servlet-mapping>

4. Try it.

You can include a date in your servlet response and check whether the time changes between requests (no cache) or stays the same (cached response).

Additionally, if you need to make caching conditional, you can also override the method protected void doFilter(final HttpServletRequest request, final HttpServletResponse response, final FilterChain chain) and add some logic that either goes through the cache by invoking super.doFilter(…) or skips it by passing the request on to the next element in the chain with chain.doFilter(…).
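The decision itself can live in a small plain-Java class. The sketch below is only one possible policy, under my own assumptions: cache GET requests only, and let callers skip the cache with a request parameter. The class name CachePolicy and the nocache parameter are hypothetical and not something ehcache-web defines.

```java
import java.util.Map;

/** Hypothetical policy that decides whether a request should go through the cache. */
final class CachePolicy {
    /** Made-up request parameter that forces a fresh response when present. */
    static final String BYPASS_PARAMETER = "nocache";

    private CachePolicy() {}

    static boolean shouldUseCache(String method, Map<String, String[]> parameters) {
        // Only idempotent reads are worth caching in this sketch.
        if (!"GET".equalsIgnoreCase(method)) {
            return false;
        }
        return !parameters.containsKey(BYPASS_PARAMETER);
    }
}

// Possible use inside your own filter (assumes ehcache-web and the servlet API on the classpath):
//
//   @Override
//   protected void doFilter(final HttpServletRequest request, final HttpServletResponse response,
//                           final FilterChain chain) throws Exception {
//       if (CachePolicy.shouldUseCache(request.getMethod(), request.getParameterMap())) {
//           super.doFilter(request, response, chain);   // cached path
//       } else {
//           chain.doFilter(request, response);          // straight to the servlet
//       }
//   }
```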

Ehcache has pretty good documentation and active forums; for more information you can go here.

{ 5 comments }

Allard Buijze April 3, 2010 at 4:52 am

Hi Billy,

when reading your article, I was wondering why a query would take 3 seconds. That’s a long, long time! Could it be that your persistence model is not optimized for reads? A bit of denormalization here could help reduce query time drastically.

In my opinion, caches should be introduced with great care. You might be fighting symptoms of a problem, instead of the problem itself.

But in the end, if you do want to use caching, the way you describe here is a pretty viable way!

Cheers,

Allard

Billy T. April 3, 2010 at 8:29 am

Hi Allard,
I’m glad you asked about the query. In this particular case we cannot modify the database structure because it is part of a third-party solution, but you are totally right: our first thought about the problem was to denormalize the tables involved. Indeed, our DBA created multiple indexes and is working on a materialized view that should help us save some milliseconds.

cheers,

P.S. The original query, before the indexes and the SQL makeover, took about 8 seconds!!!

Rick April 3, 2010 at 10:09 am

Thanks for the post. Nice tip!

Den April 4, 2010 at 12:45 pm

Wille April 8, 2010 at 7:49 pm

I would suggest that if you can use the EhCache page-caching filter and still serve pages as you want, there might not even be a need for an RDBMS for most of your website, if what you are doing is mostly serving content (as opposed to dealing with transactional writes).

Another approach could be to simply take the DB out of the picture, cache fragments of pages in EhCache (on disk) and keep the final storage on something like Amazon S3. It’s a solution I detailed here: http://blog.recursivity.com/post/501659139/infinite-web-scalability-resilience-with-amazon-web
