Sunday, March 15, 2015

Performance Pitfalls: commons-configuration

commons-configuration is a java library that offers a common API to access configuration data from a variety of sources like properties files, XML files, etc. It is a bit outdated but still a convenient way to make certain aspects of an application configurable. One of its great features is that it can be instructed (configured) to reload the configuration automatically. For file based configuration sources, it can check for changes in the underlying files and trigger a reload if necessary. This is pretty handy for test and development (maybe even production) environments as you can make configuration changes without restarting your application.

Unfortunately, all those nice features come at a great cost. Let's have a look at the base class for file-based configuration files - the AbstractFileConfiguration. Access to every configuration item goes through the getProperty method which looks like this:

public Object getProperty(String key) {
    synchronized (reloadLock) {
        reload();
        return super.getProperty(key);
    }
}

The implementation first acquires the reloadLock, then it calls the reload method to check whether a reload is necessary and perform it if it is. It then returns the actual property's value. This lock easily becomes a major bottleneck - each and every thread will try to acquire the same lock whenever any property (from the same configuration source) is being read. It is important to note that this behaviour does not depend on the actual reloading strategy and reloading intervals. See also this bug report describing the issue.

Apart from the performance aspects, this implementation might also result in inconsistent configuration being returned during a reload. Lets say your application calls an external service that requires authentication. Lets assume that you store the username and password in a configuration file and use commons-configuration to fetch them every time they are required. Lets say you have agreed on new credentials with the service provider and you are given a few hours to change them. In that period both the old and the new credentials are valid. Now imagine you change the credentials in the file. As we might potentially reload the configuration every time a property is read, you could end up reading the old username and just a moment later reading the new password within the same business operation. This would result in inconsistent credentials and in this case will cause the external service call to fail.

How could this have been done better? Well, to address both issues, one could have designed the library in such a way, that the user is forced to first get the current, read-only snapshot of the whole configuration and then query any number of configuration items from this snapshot. Obviously, reading from a read-only object does not require any synchronization and it would prevent inconsistent values being read as the snapshot is updated only as a whole. The user of this library would be advised to fetch a snapshot of the configuration when she starts processing a business operation and reuse it throughout the business operation. Additionally, one would reload the configuration in the background. If the background job decides that a new configuration is available, it would take all the time it needs to load the new configuration and store it in a new read-only snapshot. Then it would have to use some kind of synchronization (e.g. read-write locks, a volatile field at the very least) to switch the current snapshot with the new one. This operation would only happen if the configuration has actually changed and it would be a fast operation as it would only consist of changing a pointer.

Another issue arises when the same configuration is shared among multiple machines. It is not really possible to guarantee that the new configuration will be loaded at the same point in time. This could be addressed if one would store a validity period for each configuration source. The new configuration is then just appended (the old one has to stick around for a while longer in case servers are restarted before the new one kicks in) to the current configuration with a validity period in the future. This can be done safely in advance and when the time comes (provided that all machines have synchronized clocks), the changes will become visible at the same point in time.

To sum up, my advice is - do not use commons-configuration in any performance critical execution path - at least until version 2 is released and it has been verified that it does not have any major locking issues.