Saturday, July 5, 2008

Yet Another EC2 Experience - Cybernet Slash Support Runs Payroll on EC2

Introduction
The objective of this exercise was to determine how effective utility computing is compared with conventional in-house enterprise computing. To test this on a real-world workload, we decided to try out a payroll system, as it involves complex and time-consuming processes.

We were amazed by the outcome: our payroll system thrived on EC2. In a nutshell, our initial finding is that this is a step in the right direction towards reducing cost and time and improving productivity. We are pleased with the outcome and would like to take this opportunity to congratulate Amazon on this great service. In fact, we are fairly confident that, based on our experience, many more enterprises will consider and eventually move their payroll systems to the Amazon Web Services environment.

Background
Our payroll system takes about 5 to 10 seconds to calculate a paycheck per employee, so a run for 4500 employees takes approximately 6 hours on a single server. Our current infrastructure has two dedicated servers to process the payroll, which lets us complete each payroll cycle in about 3 hours.
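
The figures above can be checked with a little arithmetic. The sketch below is only an illustration: the function name `payroll_hours` is hypothetical, and it assumes that paychecks are split evenly across servers with no coordination overhead, which a real payroll run will not achieve exactly.

```python
def payroll_hours(employees, seconds_per_check, servers=1):
    """Estimate wall-clock hours for one payroll run.

    Assumes (for illustration only) that the work divides evenly
    across servers and that there is no startup or I/O overhead.
    """
    return employees * seconds_per_check / servers / 3600.0


# Single server, at the low and high end of the 5 to 10 second range.
single_low = payroll_hours(4500, 5)
single_high = payroll_hours(4500, 10)

# Two dedicated servers, low end of the range.
two_servers = payroll_hours(4500, 5, servers=2)

print(single_low, single_high, two_servers)
```

At 5 seconds per paycheck this gives roughly 6 hours on one server and roughly 3 hours on two, which matches the figures quoted above; at 10 seconds per paycheck the single-server estimate doubles.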

Our challenges were to:

1. Reduce the processing time
2. Optimize the usage of available resources
3. Come up with an approach that can withstand increasing head count

Let’s take a look at how we approached these problems.

Our Approach
We looked at various cloud computing offerings and finally decided on AWS, mainly because of its cost-effectiveness and its complementary services such as EC2, S3 and Simple Queue Service (SQS). This infrastructure gives us the flexibility to handle storage and computation independently.

Our payroll software is architected for distributed computing, so we could easily detach the payroll engine and build our first AMI from it. The core components of this AMI are Fedora, Java 1.5 and the payroll engine. We also created a MySQL AMI to host our payroll database.

Transferring the entire database to S3, however, could pose problems because of its volume. So we decided to extract only the necessary information using Kettle (an amazing tool for data extraction needs) and upload that to S3.

We launched four instances of our payroll AMI and one instance of the MySQL AMI, importing the database from S3. The database server ran with binary logging switched on. We modified the payroll engine configuration to point to the MySQL database and then launched the payroll run. The entire payroll process completed in about 2 hours, including delivery of the pay slip PDFs to our employees' mailboxes.
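
As an illustration of how a run like this might be divided among four payroll instances, here is a minimal sketch. It is an assumption on our part, not a description of how the payroll engine actually distributes work: the `partition` function and the sequential employee IDs are hypothetical.

```python
def partition(employee_ids, workers):
    """Split a list of IDs into `workers` near-equal contiguous batches.

    Hypothetical helper: the first (len % workers) batches get one
    extra ID so that no employee is skipped or processed twice.
    """
    base, extra = divmod(len(employee_ids), workers)
    batches = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        batches.append(employee_ids[start:start + size])
        start += size
    return batches


# Hypothetical employee IDs 1..4500, split across four payroll instances.
ids = list(range(1, 4501))
batches = partition(ids, 4)

for n, batch in enumerate(batches, start=1):
    print("instance", n, "handles", len(batch), "employees")
```

With 4500 employees and four instances, each instance would handle 1125 employees. Contiguous batches are used here only because they are simple to reason about; any scheme that assigns each employee to exactly one instance would do.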

Upon completion of the payroll process, we took a differential backup of the payroll database and stored it in S3. The differential data was then pushed back to the original payroll database, again using Kettle.

Conclusion
It was a great experience running our payroll on AWS. Not only did it enable us to reduce the processing time considerably (from 5 hours to approximately 1 hour), but it also gave us a scalable solution that can withstand our head count, which increases day by day.

As discussed, AWS has a rich set of offerings for utility computing, such as EC2, S3 and Simple Queue Service (SQS). We believe this will take enterprise computing to a new dimension soon, and our payroll exercise is a classic beginning for that.

The biggest advantage of using EC2 is avoiding the power costs associated with air conditioning in India: a 200-square-foot data center consumes close to $1500 in power each month. We plan to embrace Amazon’s cloud computing for several applications that are not constrained by latencies of about 250 ms on the cloud.