Hi Daniel & Ganapthy,
The "pro" (if you can call it that) of a single point of failure is that it can be quicker and easier to set up and maintain.
To get around the single point of failure, you may decide to run multiple copies of your application. You then have to work out how to handle the fact that a user's transaction may be handled by any one of those instances - how do you maintain your sessions across different instances of the software? (Can your container handle that for you?) This can then get compounded if your computer now forms your single point of failure, and you set up multiple computers to run your multiple instances of your application (possibly at different sites).
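To make that session question concrete: if you happen to be in a Java servlet container, the container can replicate sessions between instances for you, but only if you play by its rules. A minimal sketch (the servlet and Cart class are invented for the example; the replication setup itself - the distributable element in web.xml plus the container's clustering configuration - varies by container):

import java.io.IOException;
import java.io.Serializable;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Sketch of a servlet whose session state can survive being moved to
// another instance. Anything stored in a replicated session must be
// Serializable, and setAttribute() should be called again after changing
// the object so the container knows to replicate the change.
public class BookingServlet extends HttpServlet {

    public static class Cart implements Serializable {
        private int items;
        public void addItem() { items++; }
        public int getItems() { return items; }
    }

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        HttpSession session = req.getSession(true);
        Cart cart = (Cart) session.getAttribute("cart");
        if (cart == null) {
            cart = new Cart();
        }
        cart.addItem();
        session.setAttribute("cart", cart); // flag the change for replication
        resp.getWriter().println("Items in cart: " + cart.getItems());
    }
}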
The "con" for a single point of failure is that you have a single point of failure.
If you have a power outage in the only location where you are running your application, then your application stops. If your company is relying on getting bookings through your application 24 hours a day, 7 days a week, you could lose thousands of dollars while you have this outage. If you have removed the single point of failure (in this case by having a second machine in a different location), then you continue to receive bookings while one machine is offline.
Is a single point of failure architecture related or application related?
It can be either.
The ideal situation is to have multiple instances of your application running on multiple machines at multiple sites. However, many companies are not willing to accept the time it would take to engineer such a solution, nor are they willing to meet the costs. So it usually comes down to a tradeoff on what sort of outage they are willing to accept, and what you can do to fix what they are unwilling to accept.
This may mean that when you design your standalone application, you may have to think about making it as robust as possible, so that it still continues to run even if it gets bad data (nothing worse than an application that crashes at 6pm on a Friday night, and doesn't get restarted until 9am the following Monday). And then possibly design it so that it can be automatically restarted if it crashes (so no GUI on your server requiring user interaction). Then design it to come back up in a suitable manner - how much of the state it had before the crash can be restored?
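As a rough illustration of the automatic restart idea (the class name and jar path below are made up), a small watchdog process can relaunch the application whenever it dies, instead of waiting for someone to notice on Monday morning:

import java.io.IOException;

// Minimal watchdog sketch: keeps relaunching the application whenever it
// exits. The jar name "booking-server.jar" is invented for the example -
// a real setup would also log failures, cap the restarts, and have a way
// to shut the watchdog itself down cleanly.
public class Watchdog {

    public static void main(String[] args) throws IOException, InterruptedException {
        while (true) {
            Process app = new ProcessBuilder("java", "-jar", "booking-server.jar")
                    .inheritIO()   // share the watchdog's stdout/stderr
                    .start();

            int exitCode = app.waitFor();
            System.err.println("Application exited with code " + exitCode
                    + ", restarting in 5 seconds...");

            // Small pause so a repeatedly-crashing application does not
            // spin the CPU in a tight restart loop.
            Thread.sleep(5000);
        }
    }
}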
Or management may not be happy with the 30 second - 1 minute outage that having an application restart may cause. So you may have to have multiple instances of your application. Then you have to decide whether one is a failover for the other (may reduce the outage to 5 seconds or less), or whether they both run simultaneously (so perhaps zero outage). But then you have to design your application so that such a configuration is possible.
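And as one simple illustration of the failover idea, seen from the client's side, you can just try a list of endpoints in order until one answers (the hostnames below are invented; the hard part is making both instances able to serve the same request and share the necessary state):

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch of client-side failover between two instances of the same
// application. The URLs are made up for the example.
public class FailoverClient {

    private static final String[] ENDPOINTS = {
            "http://bookings-primary.example.com/ping",
            "http://bookings-standby.example.com/ping"
    };

    public static void main(String[] args) {
        for (String endpoint : ENDPOINTS) {
            try {
                HttpURLConnection conn =
                        (HttpURLConnection) new URL(endpoint).openConnection();
                conn.setConnectTimeout(2000);
                conn.setReadTimeout(2000);
                int status = conn.getResponseCode();
                System.out.println(endpoint + " answered with HTTP " + status);
                return; // first instance that answers wins
            } catch (IOException e) {
                System.err.println(endpoint + " unreachable, trying next: " + e);
            }
        }
        System.err.println("All instances are down.");
    }
}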
And, as architects, we have to be aware of the platforms we will run on. (The best written application in the world will not behave very well if it is running on old equipment in a non-ventilated, dusty room.) So if management have said that your application is mission critical, and that no outage under any circumstances is acceptable, then you may have to look at the hardware - do you go for Stratus (or Tandem) equipment in multiple locations?
These are all just questions you might have to think about. Basically you want to identify any point where a single failure can stop your application from running. And then decide whether to fix it or not (or more accurately: identify the issue to management so that they can decide what to fix). Even in that multi-location scenario I just gave, you may still have a single point of failure if the second machine requires the internet connection at the first site to be working.
Regards, Andrew