We are working on determining an in-house hardware configuration using WAS (8-core AIX, 50 GB memory) and DB2 (4-core AIX, 32 GB memory) servers running a JEE application.
There would be corresponding failover servers.
What we foresee is that after the system has gone live, server capacity will need to be scaled up to accommodate bulk processing of data, but this increased capacity (in terms of cores/memory) will only be required for
a short duration, e.g. a week or two, two or three times a year, every year.
My question is: how can we scale the capacity up and down on demand with minimal impact to a running system?
One option on the table is to introduce a new AIX+WAS server working against the same database and use it exclusively for the bulk processing.
Any other options?
Do you mean you already have the hardware, and you want to start using the hardware when demand goes up? What do you do with that hardware when you are not doing your bulk processing? Use it for something else? If you don't have anything else for the hardware to do, why not just keep them running all the time?
Or do you mean that you don't want to buy that much idle hardware, and "borrow" it only when you need it?
Actually, this use case is exactly what cloud computing is designed for. The idea is that a cloud provider hosts several applications and decides how many resources to give each one. Depending on the load on the application, the cloud provider will deploy it on as much hardware as it needs. One caveat is that you have to design the application to be able to run on the cloud; you need to make it stateless, for example. One common pattern to use is map-reduce. Conceptually, you should be able to do what you need using the cloud, as long as you design your code the right way.
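To make the stateless/map-reduce point concrete, here is a minimal in-memory sketch (word count, the classic example). It is an illustration of the pattern, not code for any particular cloud product: because each map call touches no shared state, any number of workers on any number of borrowed machines could run the map step in parallel, with reduce merging their partial results.

```java
import java.util.*;
import java.util.stream.*;

// Minimal in-memory sketch of the map-reduce pattern (word count).
public class WordCount {

    // Map step: one input line -> partial (word, count) pairs. No shared state,
    // so this can run on any worker machine in parallel.
    static Map<String, Integer> map(String line) {
        Map<String, Integer> out = new HashMap<>();
        for (String w : line.toLowerCase().split("\\s+")) {
            if (!w.isEmpty()) out.merge(w, 1, Integer::sum);
        }
        return out;
    }

    // Reduce step: merge the partial counts produced by all workers.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> p : partials) {
            p.forEach((w, c) -> total.merge(w, c, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("bulk data processing",
                                     "bulk processing twice a year");
        List<Map<String, Integer>> partials =
                lines.stream().map(WordCount::map).collect(Collectors.toList());
        System.out.println(reduce(partials).get("bulk")); // prints 2
    }
}
```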
Practically, there are several options you can look at. You will want to research cloud providers in more depth, because I won't be able to do them justice here.
One option is to look at commercial cloud providers, like Amazon. At the most basic level, Amazon gives you access to raw hardware. Whenever you need it, you ask Amazon for machines using either their UI or API calls, and they give you some IP addresses. You install your software on the machines, process your data, and then tell Amazon to "shut down" the machines (the machines aren't really shut down; they just go back to a pool for someone else to borrow). You pay Amazon by the hour, so if you used the machines for 24 hours in a month, you pay them for 24 hours.

Another option is to use Amazon's EMR, which runs Hadoop hosted on Amazon's infrastructure. You submit "jobs" to the Hadoop cluster, and the jobs are executed by "workers" that you built. Amazon controls how many nodes in the cluster will execute your worker, depending on how you configure it, and you pay them by the number of jobs (or something like that).

The biggest advantage of using a commercial cloud provider is that you have zero upfront investment; you just pay for what you use. I think Amazon actually gives you some resources for free for a year, to give you time to build the application and start earning money before they charge you; they are hoping to get money from you when you are successful. The disadvantage is that your code is executed on Amazon's infrastructure, so you need to worry about things like security, network latency, and the reliability of Amazon. I don't want to go into too much detail here; the industry has solutions to deal with those issues, and you will find out more once you go down this route.
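The borrow-install-process-return lifecycle described above can be sketched as follows. Note that `CloudProvider` and its methods here are hypothetical stand-ins for a real provider's API (Amazon's actual calls and SDK signatures differ); the point is only the shape of the lifecycle, in particular that the machines are released in a `finally` block so billing stops even if the bulk job fails.

```java
import java.util.*;

// Sketch of the borrow -> install -> process -> return lifecycle.
// CloudProvider is a HYPOTHETICAL interface, not a real provider SDK.
public class BurstRun {

    interface CloudProvider {
        List<String> requestMachines(int count);  // returns IP addresses
        void releaseMachines(List<String> ips);   // machines go back to the pool
    }

    // Fake in-memory provider so the sketch is self-contained and runnable.
    static class FakeProvider implements CloudProvider {
        public List<String> requestMachines(int count) {
            List<String> ips = new ArrayList<>();
            for (int i = 0; i < count; i++) ips.add("10.0.0." + (i + 1));
            return ips;
        }
        public void releaseMachines(List<String> ips) { /* billing stops here */ }
    }

    static int runBulkJob(CloudProvider cloud, int machines) {
        List<String> ips = cloud.requestMachines(machines); // paying per hour from here
        try {
            // install your software on each IP, partition the bulk data, process...
            return ips.size();
        } finally {
            cloud.releaseMachines(ips); // ...and stop paying as soon as it's done
        }
    }

    public static void main(String[] args) {
        System.out.println(runBulkJob(new FakeProvider(), 4)); // prints 4
    }
}
```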
Another option is to host a cloud yourself. This option is costlier, but you are hosting everything on your own infrastructure, so you worry less about security, latency, and reliability. Also, if your hardware is in use most of the time, it's going to be cheaper to use your own hardware than to rent it from Amazon. I wouldn't recommend this unless you yourself have several applications that need to share hardware; if you don't, your hardware is going to sit idle anyway.

Basically, there are products like XenServer that you can deploy on your own hardware to make it act like a cloud. Conceptually, you use it the same way you would use Amazon above (although the APIs will be different): when you need hardware, "borrow" it from your own cloud, deploy your software, run it, and return it. You can even host a Hadoop cluster on your own hardware; Hadoop is open source and costs nothing. Again, I recommend this option only if you have several applications that want to share hardware. If you do go this route, using Hadoop is by far the easiest option.
The third option is to mix both, and this is what most companies grow into: an in-house cloud that can handle most of your load, "bursting" into the Amazon cloud only when you need it. Most companies start out using Amazon, and when they become successful, they realize that their Amazon cluster is up 100% of the time and they are paying Amazon through the nose. So they move to hosting a cloud in-house and send only the peaks to Amazon. For some reason, Amazon is completely happy with this business model, and even encourages you to do this.
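The "burst" routing rule from the hybrid option boils down to: fill in-house capacity first, and send only the overflow to the commercial cloud. A toy sketch of that decision (the numbers and the method are illustrative assumptions, not a real scheduler):

```java
// Sketch of the hybrid "burst" rule: in-house capacity absorbs the base load,
// and only the overflow is sent to the commercial cloud (paid per hour).
public class BurstRouter {

    // result[0] = jobs run in-house, result[1] = jobs burst to the cloud.
    static int[] route(int totalJobs, int inHouseCapacity) {
        int inHouse = Math.min(totalJobs, inHouseCapacity);
        int burst = totalJobs - inHouse; // only the peak costs you rental fees
        return new int[] { inHouse, burst };
    }

    public static void main(String[] args) {
        int[] normal = route(80, 100);  // normal load: nothing bursts
        int[] peak   = route(250, 100); // year-end bulk run: 150 jobs burst
        System.out.println(normal[1] + " " + peak[1]); // prints "0 150"
    }
}
```

With sizing like this, the in-house hardware stays busy year-round while the two or three short bulk-processing windows a year are absorbed by rented capacity, which is exactly the cost profile the question describes.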