Jul 12, 2018 2:08:37 PM Thomas Dumont

Cache management


Introduction

Cache management is an essential element to ensure optimal performance of Lutece.

Many page-building operations are CPU-intensive, in particular those involving XML/XSL transformation, parsing, or database access. Without a cache, Lutece could support only a few users, with very poor response times. Using the cache is absolutely necessary!

Cached data occupies memory, so its volume is a parameter to keep in mind at all times.

In addition, cached data is only a reflection of the actual data. In some cases it must always stay in sync with the real data: either the cache is "invalidated" on every modification of the data, or the data is not cached at all. In other cases, the lifetime of the cached objects sets the acceptable delay before an update becomes visible.

These considerations are discussed in this article in order to configure Lutece for maximum performance.

Fundamentals of cache

The basic algorithm of cache management is relatively simple. The cache is a large store of objects, each identified by a unique computed key. The collection type generally used to hold a cache is a Map.

The algorithm for retrieving an object is therefore the following:

  • we calculate its cache key (hash, identifier, or combination of keys)
  • we search for the object in the cache
  • if the object is found
    • we return it
  • if not
    • we retrieve or recompute the real object,
    • we put the object in the cache associated with its key
    • we return the object
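
The steps above can be sketched in a few lines of Java. This is only an illustration of the general algorithm, backed by a plain Map; the class name SimpleCache and the loader function are assumptions made for the example and are not part of Lutece or Ehcache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Minimal get-or-load cache; illustrative only, not Lutece or Ehcache code. */
class SimpleCache<V> {
    private final Map<String, V> store = new HashMap<>();
    private int loads = 0; // number of times the real object had to be loaded

    /** Returns the cached object for the key, loading and storing it on a miss. */
    synchronized V get(String key, Function<String, V> loader) {
        V value = store.get(key);      // search for the object in the cache
        if (value != null) {
            return value;              // found: return it
        }
        value = loader.apply(key);     // not found: retrieve or recompute the real object
        loads++;
        if (value != null) {
            store.put(key, value);     // put it in the cache under its key
        }
        return value;
    }

    /** Number of cache misses so far. */
    synchronized int loadCount() {
        return loads;
    }
}
```

Calling get twice with the same key invokes the loader only once; the second call is served from the Map.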

The construction of the cache key is therefore an important aspect of management.

In addition, Lutece's services are based on the Ehcache product, which allows for finer management of caches by introducing concepts such as:

  • maximum number of objects
  • lifetime of objects
  • disk storage
  • ...

A centralized cache management interface

Lutece offers a centralized interface, accessible to the site administrator, for managing all cache services.

(Figure: Cache Management Interface - Lutece v4)

This interface lists the installed cache services, their status, the main configuration options and dynamic usage elements (number of objects, memory occupied, ...).

The different levels of cache

Caching is applied at several levels, wherever it can speed up access to resources that are expensive to compute or load.

There are two main levels.

First-level cache services

These services are implemented as a servlet filter. That is, they intercept any HTTP call to a resource (jsp, js, images, ...).

These services are based on the Ehcache-Web component. Cache keys are built from the elements of the HTTP request: method, URI path, and query string.
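
As a rough illustration of such a key, the sketch below concatenates the three request elements. The class RequestCacheKey and the exact key layout are assumptions made for the example; the Ehcache-Web filter computes its own key internally and may format it differently.

```java
/** Illustrative key builder; not the actual Ehcache-Web implementation. */
final class RequestCacheKey {
    private RequestCacheKey() {
    }

    /** Combines HTTP method, URI path and query string (which may be null) into one key. */
    static String of(String method, String uriPath, String queryString) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(uriPath);
        if (queryString != null && !queryString.isEmpty()) {
            // Two URLs differing only by their query string get two distinct entries
            sb.append('?').append(queryString);
        }
        return sb.toString();
    }
}
```

Because the key contains nothing about the connected user, every visitor requesting the same URL receives the same cached response.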

These services provide other optimizations:

  • GZIP compression of the responses
  • HTTP headers matched to the lifetime of objects, to make the most of browser and proxy caches, in particular for static resources (images, css, scripts, ...).

Resource cache services

These services provide the cache for the different Lutece objects: page, portlet, menus, site tree, document.

Cache configuration

Two files located in the WEB-INF/conf/ directory are used to manage the configuration of the caches:

  • caches.properties
  • caches.dat

caches.properties

This file contains the default cache settings.

# Default cache configuration
   lutece.cache.default.maxElementsInMemory = 10000
   lutece.cache.default.eternal = false
   lutece.cache.default.timeToIdleSeconds = 10000
   lutece.cache.default.timeToLiveSeconds = 10000
   lutece.cache.default.overflowToDisk = true
   lutece.cache.default.diskPersistent = true
   lutece.cache.default.diskExpiryThreadIntervalSeconds = 120
   lutece.cache.default.maxElementsOnDisk = 10000

It also contains the options for cache monitoring by JMX.

# JMX monitoring properties
   lutece.cache.jmx.monitoring.enabled = false
   lutece.cache.jmx.monitorCacheManager = false
   lutece.cache.jmx.monitorCaches = false
   lutece.cache.jmx.monitorCacheConfiguration = false
   lutece.cache.jmx.monitorCacheStatistics = false

caches.dat

This file contains the status and the specific settings of the caches at the start of the Webapp.

#Caches status file
   #Sun Mar 27 03:06:48 CEST 2011
   SiteMapService.enabled = 1
   MyPortalWidgetService.enabled = 1
   PortalMenuService.enabled = 1
   DocumentResourceServletCache.enabled = 1
   PageCacheService.enabled = 1
   PortletCacheService.enabled = 1
   MyPortalWidgetContentService.enabled = 1
   PageCachingFilter.enabled = 1
   StaticFilesCachingFilter.enabled = 1
   StaticFilesCachingFilter.timeToLiveSeconds = 1000000

Each property name starts with the name of the cache (with any spaces in the original name removed). The enabled property indicates whether the cache is enabled (value = 1) or disabled (value = 0). If this property is absent, the cache is enabled by default.

The other properties are the same cache settings as those in the caches.properties file. The value specified in this file will override the default value set in caches.properties.
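
The naming and override rules described above can be sketched as follows. This is a hypothetical helper written for illustration, not the code Lutece uses to read these files; the class CacheSettings and its method names are assumptions, and treating any value other than 1 as "disabled" is also an assumption.

```java
import java.util.Properties;

/** Illustrative resolver; property names follow the two files shown above. */
final class CacheSettings {
    private static final String DEFAULT_PREFIX = "lutece.cache.default.";

    private CacheSettings() {
    }

    /** Property root for a cache: its name with spaces removed. */
    static String root(String cacheName) {
        return cacheName.replace(" ", "");
    }

    /** Enabled when the property is absent or equal to 1 (other values: assumed disabled). */
    static boolean isEnabled(Properties cachesDat, String cacheName) {
        String value = cachesDat.getProperty(root(cacheName) + ".enabled");
        return value == null || "1".equals(value.trim());
    }

    /** Cache-specific value from caches.dat if present, else the default from caches.properties. */
    static String setting(Properties defaults, Properties cachesDat, String cacheName, String name) {
        String specific = cachesDat.getProperty(root(cacheName) + "." + name);
        String value = specific != null ? specific : defaults.getProperty(DEFAULT_PREFIX + name);
        return value == null ? null : value.trim();
    }
}
```

With the sample files above, timeToLiveSeconds resolves to 1000000 for StaticFilesCachingFilter and to the default 10000 for every other cache.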

From version 4.1 onward, this information is saved in the database, in the Datastore. The caches.properties and caches.dat files are only used to initialize the values.

Configuration strategies


As mentioned in the introduction, there is no universal configuration. It's about finding the best tradeoffs between data freshness, performance and memory resource consumption.

Here are the main configurations.

Development configuration

In development, performance is not an issue and it is better to see data updates immediately, without any cache effect. The recommended configuration is to disable all caches.

StaticFilesCachingFilter.enabled = 0
   PageCachingFilter.enabled = 0
   PageCacheService.enabled = 0
   PortletCacheService.enabled = 0
   PortalMenuService.enabled = 0
   SiteMapService.enabled = 0

It is also possible to disable all caches by modifying a system property at the launch of the Java VM.

java -Dnet.sf.ehcache.disabled=true

Production configuration for site without customization

The first-level cache can be activated both on static resources (images, css, scripts) and on portal pages, including Portal.jsp.

The lifetime of the static resource cache can be set to one week, or 604800 seconds. The lifetime of the JSP page cache can be set to one hour, or 3600 seconds.

StaticFilesCachingFilter.enabled = 1
    StaticFilesCachingFilter.timeToLiveSeconds = 604800
    PageCachingFilter.enabled = 1
    PageCachingFilter.timeToLiveSeconds = 3600
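
The two lifetimes used in this configuration are plain unit conversions. As a small check, and only as an illustration (the class CacheLifetimes is a name made up for the example), they can be computed with the standard TimeUnit class:

```java
import java.util.concurrent.TimeUnit;

/** Lifetimes in seconds, as expected by timeToLiveSeconds. */
final class CacheLifetimes {
    /** One week: 7 days x 24 hours x 3600 seconds. */
    static final long ONE_WEEK_SECONDS = TimeUnit.DAYS.toSeconds(7);

    /** One hour: 60 minutes x 60 seconds. */
    static final long ONE_HOUR_SECONDS = TimeUnit.HOURS.toSeconds(1);

    private CacheLifetimes() {
    }
}
```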

Second-level caches (pages, portlets, menu, ...) are not required in this case. If they are enabled, the lifetime of their objects must be consistent with that of the first-level cache; here the value to use would also be one hour.

PageCacheService.enabled = 0
    PortletCacheService.enabled = 0
    PortalMenuService.enabled = 0
    SiteMapService.enabled = 0
    DocumentResourceServletCache.enabled = 1

Production configuration for site with customization

Customization prevents the use of the first-level cache for JSP pages. Since the same page is rendered differently depending on the connected user, a page cannot be served based on its address alone.

To ensure effective page caching, it is necessary to rely on the second-level caches, which can include the user's identifier in their keys.

StaticFilesCachingFilter.enabled = 1
    StaticFilesCachingFilter.timeToLiveSeconds = 604800
    PageCachingFilter.enabled = 0
    PageCacheService.enabled = 1
    PortletCacheService.enabled = 1
    PortalMenuService.enabled = 1
    SiteMapService.enabled = 1
    DocumentResourceServletCache.enabled = 1

In this type of configuration, user IDs are part of the cache key, so the number of users must be taken into account in every aspect of sizing. It may be possible to create specific cache keys based on the customization mode, to reduce the number of objects and improve performance.

Production configuration for a memory-constrained environment

In a context where the available memory is low and/or the resources are large, it is advisable to size the maxElementsInMemory parameter correctly. If the number of objects in the cache exceeds this value, the least recently used objects (LRU, Least Recently Used, eviction policy) will be moved to a disk cache, thus limiting the risk of running out of memory.
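
To make the LRU policy concrete, here is a minimal sketch built on the standard LinkedHashMap in access order. It is not how Ehcache implements its stores: the class LruOverflowCache is invented for the example, and the "disk" is simply a second in-memory map standing in for the disk cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative LRU memory store that overflows to a map standing in for the disk store. */
class LruOverflowCache<K, V> {
    private final Map<K, V> disk = new LinkedHashMap<>(); // stand-in for the disk cache
    private final LinkedHashMap<K, V> memory;

    LruOverflowCache(final int maxElementsInMemory) {
        // accessOrder = true keeps entries ordered from least to most recently used
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > maxElementsInMemory) {
                    disk.put(eldest.getKey(), eldest.getValue()); // move the LRU entry to "disk"
                    return true;
                }
                return false;
            }
        };
    }

    synchronized void put(K key, V value) {
        disk.remove(key);
        memory.put(key, value);
    }

    synchronized V get(K key) {
        V value = memory.get(key); // a hit also marks the entry as recently used
        if (value == null) {
            value = disk.remove(key); // fall back to "disk" and promote to memory
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    synchronized boolean inMemory(K key) {
        return memory.containsKey(key);
    }

    synchronized boolean onDisk(K key) {
        return disk.containsKey(key);
    }
}
```

With a limit of two elements, inserting a third one pushes out whichever of the first two was read or written least recently, and that entry remains retrievable from the overflow map.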

Development of a cache service

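To develop a cache service integrated into Lutece, simply extend the AbstractCacheableService abstract class.

import fr.paris.lutece.portal.service.cache.AbstractCacheableService;

public class MyCacheService extends AbstractCacheableService
{
    private static final String SERVICE_NAME = "My Cache Service";

    public MyCacheService()
    {
        // Initializes the cache backing this service
        initCache();
    }

    public String getName()
    {
        return SERVICE_NAME;
    }

    public MyResource getResource(String strId)
    {
        // Assumes getFromCache returns an untyped Object, hence the cast
        MyResource r = (MyResource) getFromCache(strId);
        if (r == null)
        {
            // Cache miss: load the real object, then store it for later calls.
            // getResourceFromSource stands for the actual loading code (not shown).
            r = getResourceFromSource(strId);
            putInCache(strId, r);
        }
        return r;
    }
}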

Building cache keys

To make the keys easy to read, the accepted convention is to concatenate the elements in the form [key1: value1] [key2: value2] ... [user: id].

Moreover, since keys are built very frequently, the concatenation should be done with a StringBuilder.

Here is a typical implementation meeting these standards:

private String getCacheKey(String strId, LuteceUser user)
{
     StringBuilder sbKey = new StringBuilder();
     sbKey.append("[res:").append(strId).append("] [user:").append(user.getName()).append("]");
     return sbKey.toString();
}

The ICacheKeyService interface and the DefaultCacheKeyService implementation

Some services such as PageService accept injection through the Spring context of a class implementing the ICacheKeyService interface to generate cache keys from a parameter map and the user object.

The DefaultCacheKeyService class provides a default implementation of this interface. For key generation, the interface allows two lists to be defined:

  • a list of parameters to use to generate the key. This is a security measure that prevents a request generator adding dummy parameters to URLs from creating so many cache keys that memory would be saturated.
  • a list of parameters to ignore. Some parameters may not be relevant to building the content and may cause the same content to be cached several times. They should be eliminated by declaring them in this list.

Here is the example of CacheKeyService injection in a Spring context for PageService:

<bean id="pageCacheKeyService" class="fr.paris.lutece.portal.service.cache.DefaultCacheKeyService" >
       <property name="allowedParametersList" >
           <list>
               <value>page_id</value>
           </list>
       </property>
   </bean>

   <bean id="portletCacheKeyService" class="fr.paris.lutece.portal.service.cache.DefaultCacheKeyService" >
       <property name="ignoredParametersList" >
           <list>
               <value>page-id</value>
               <value>site-path</value>
           </list>
       </property>
   </bean>

   <bean id="pageService" class="fr.paris.lutece.portal.service.page.PageService">
       <property name="pageCacheKeyService" ref="pageCacheKeyService" />
       <property name="portletCacheKeyService" ref="portletCacheKeyService" />
   </bean>
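
To picture how the two lists affect the generated keys, here is a self-contained sketch. It is not the DefaultCacheKeyService source: the class FilteredKeyBuilder, its constructor and the exact key layout are assumptions made for illustration, and sorting the parameter names is a design choice of the example so that the same parameters always produce the same key.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical key builder illustrating allowed/ignored parameter lists; not the Lutece class. */
final class FilteredKeyBuilder {
    private final List<String> allowed; // empty list means "no restriction"
    private final List<String> ignored;

    FilteredKeyBuilder(List<String> allowed, List<String> ignored) {
        this.allowed = allowed;
        this.ignored = ignored;
    }

    /** Builds a key from the retained parameters and, if present, the user name. */
    String getKey(Map<String, String> params, String userName) {
        StringBuilder sb = new StringBuilder();
        Map<String, String> sorted = new TreeMap<>(params); // stable order of parameters
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            String name = e.getKey();
            if (ignored.contains(name)) {
                continue; // irrelevant parameter: would only duplicate cache entries
            }
            if (!allowed.isEmpty() && !allowed.contains(name)) {
                continue; // unknown parameter: must not create a new cache entry
            }
            sb.append('[').append(name).append(':').append(e.getValue()).append(']');
        }
        if (userName != null) {
            sb.append("[user:").append(userName).append(']');
        }
        return sb.toString();
    }
}
```

With page_id as the only allowed parameter, two URLs that differ only by an extra dummy parameter map to the same key, so they share one cache entry instead of filling memory.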

Summary of what's new with Lutece v 3.0

  • New cache at portlet level (introduced with role management at the portlet level necessary for content customization)
  • New first-level cache, based on an Ehcache-Web filter, which handles expiry dates, gzip encoding of the response and ETag headers. Two caches are based on this filter: one for static resources and one for JSPs. The latter should only be activated for sites without customization.
  • The default settings for all caches are in a caches.properties file under WEB-INF/conf (no longer in ehcache.xml).
  • The activation state of the caches is centralized in a caches.dat file under WEB-INF/conf, in the same way as the plugins.dat file
  • Each cache can override the configuration parameters (lifetime, max items, ...) in the caches.dat file
  • The back office interface has been improved: it can activate or deactivate a cache and display its memory size and the list of its keys.
  • A new API makes it possible to externalize the construction of cache keys. This API is used in particular by PageService and allows a specific key builder to be injected via Spring. The API can filter a parameter map, either excluding some parameters or restricting it to a defined set.
  • Objects whose lifetime has expired are removed from the cache (more costly, but it frees memory)
  • The caches can be monitored by JMX (activation of the monitoring in caches.properties)