I wrote “Implementing Transparent Encryption with NHibernate Listeners” last year. If you haven’t read it, I recommend you do, even if nHibernate is not your cup of tea, you are guaranteed to learn something.
Michael Johnson from Sharpened Developer commented on the post asking “How do I report on encrypted data?”
Most reports are aggregations and encrypted data is typically personally identifiable information (PII), in many cases this data is not included in reports. In the case that we need encrypted data for reports we have options.
Reports within the Application
Generating reports with in the application is the simplest way to access the data. Mostly likely encryption and decryption mechanisms are already in place. There a number of third party (Telerik, DevExpress, etc..) libraries available for building reports. Building reports with in the application works well on small sets of data, but as the data grows this simply doesn’t scale.
Using a Reporting Database
The long term solution is to move the data into a reporting database. Once the data is in it’s own database an enterprise reporting solution can be put in place.
The leads me to the more interesting aspect of keeping the reporting data current.
Firstly, an external facing application should not have direct access to the decrypted reporting database. If the external application is compromised it is a good bet they’ll soon have access to the decrypted reporting database.
Ideally decrypted data is only available on the internal network.
With all the hacking (Sony) lately, I wonder if it’s a good idea to have any sensitive data decrypted at rest… but I digress. That’s a discussion for another time.
There are two types of reporting data: real time and everything else.
Real time data requires the data to be nearly insync with production data. The only way to report on real time data is to capture changes as they happen. This is fraught with issues, but a cleverly crafted asynchronous process using a message queue is a good solution. A service watching the message queue can process the messages near real-time. The service is decoupled, providing a layer of security and it’s asynchronous minimizing the performance impact.
When data freshness isn’t a concern, a daily or even weekly service is a good solution. It might be as simple as restoring a production database backup on the reporting server.
Lastly, avoid syncing directly from production databases, the last thing you’d want is to hinder production’s performance by inundating it with database requests.
Reporting is rarely thought of at the onset of an application. As applications mature, stakeholders want to mine data for insights. This inevitably turns into a growing collection of reporting requests that seem to never end. A solid reporting solution will help abate these requests. In the best scenario the stakeholders can create their own reports.
* image reference (http://www.ajboggs.com/consulting/software-development/reporting-solutions/)