Here's a little something I wrote on the topic a while back. You might find it interesting:
WebSphere Application Server vs. The Apache Web Server
Let’s look at a typical web based interaction with our application server at runtime.
Let’s use a typical scenario where the request is coming in from an Internet browser, such as Internet Explorer, Netscape Navigator or Opera. We’ll also simplify the interaction by eliminating the complications presented by workload enhancements such as the Edge Server components, network sprayers and caching proxies. We’ll just focus on a typical web based interaction between a web based client and our WebSphere Application Server.
From Client to Web Server
When dealing with web based requests, before tunneling through to our application server, a client will always hit a web server first. The WebSphere Application Server does not replace the need for a web server. A web server remains as pivotal a part of the WebSphere architecture as ever.
Web servers are great at doing one thing: serving up files. A web server takes requests from clients, maps that request to a file on the file system, and then sends that file back to the client.
If you want an html file, a web server can efficiently and reliably find that file and send it back to you. If you need an image, a web server can serve it up to you as well. You want to download a zip file or a pdf file quickly and efficiently? A web server can make that happen.
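To make the "map a request to a file" idea concrete, here is a minimal sketch in Java. It is not how Apache or IBM HTTP Server is implemented; the class name StaticFileMapper, the document root and the small content-type table are all made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: how a web server might map a request path to a file on disk.
final class StaticFileMapper {
    // Assumed document root; a real server reads this from its configuration.
    private final Path docRoot;

    // A tiny, illustrative content-type table keyed by file extension.
    private static final Map<String, String> CONTENT_TYPES = Map.of(
        "html", "text/html",
        "jpg", "image/jpeg",
        "pdf", "application/pdf",
        "zip", "application/zip");

    StaticFileMapper(String docRoot) {
        this.docRoot = Paths.get(docRoot).normalize();
    }

    // Resolve the request path against the document root, refusing paths that escape it.
    Path resolve(String requestPath) {
        Path candidate = docRoot.resolve(requestPath.replaceFirst("^/+", "")).normalize();
        if (!candidate.startsWith(docRoot)) {
            throw new IllegalArgumentException("Path escapes document root: " + requestPath);
        }
        return candidate;
    }

    // Pick a content type from the file extension, defaulting to a generic binary type.
    static String contentTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return CONTENT_TYPES.getOrDefault(ext, "application/octet-stream");
    }
}
```

Notice there is no logic here beyond "find the file and label it", which is exactly the point the next paragraph makes.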
Unfortunately though, your web server is about as intelligent as a male model. A web server can serve up static files until the cows come home, but ask your web server to add ‘one plus one’ and you’ll be waiting there for a very, very long time.
[Figure: Request - Response Cycle with a Web Server. A web server responds to a request for static content: static content is requested, and static content (html, jpg, pdf, mp3) is returned.]
If our applications use any images, HTML, pdf or zip files, we like to keep all of those static files on the web server. If we need some logic or dynamic content in our applications, we will delegate to our Servlets, JSPs, EJBs and JavaBeans that are running on our Application Server.
Now here is the dilemma. Our application server contains all of our Servlets and JSPs, but all of the requests go through the web server, and the web server, not being a very clever machine, tries to handle all requests, regardless of whether the request is for an image, html file, or to our detriment, a Servlet or a JSP.
How do you stop a web server from trying to handle requests for our Servlets and JSPs?
The key to stopping a web server from trying to serve up JSPs or Servlets is to install something called the WebSphere Plug-in on the web server.
The general idea, although not a hard and fast rule, is that before you install WebSphere, you should install your web server first. Even when you do a full installation of the Application Server, behind the scenes, WebSphere installs the web server first, and then installs the ‘WebSphere plug-in’ into that web server.
[Figure: Forwarding Servlet Requests to a J2EE Web Container. When a Servlet or JSP is needed, the web server forwards the request to the J2EE server. This all occurs over http/https.]
What exactly does the web server plug-in do?
As was stated earlier, the web server tries to handle every single request that it receives. However, when the WebSphere Application Server comes onto the scene, it introduces itself to the web server and has a conversation that goes something like this:
WebSphere: Hey, WebServer?
Web Server: Ya, what’s up.
WebSphere: Hey, not much.
Web Server: What can I do for you?
WebSphere: Well, I know that you’re really great at serving up static files and all, but you’re going to get some crazy requests for JSPs and Servlets that you won’t be able to find on your file system.
Web Server: Really? What am I going to do? I won’t be able to find any of these JSPs and Servlets, and I’ll end up sending a bunch of 404 errors back to clients, and the clients will be pissed!
WebSphere: Hey, calm down. Here’s what you do: just take those requests and send them to me. I’ll handle each request, generate some HTML, give that HTML back to you, and you can send the HTML back to the client.
Web Server: Kewl. You do the work, but the client thinks it’s me handling the request? I like this arrangement already. How do I know what files to send to you though?
WebSphere: Don’t worry. I’ll make a thorough list and write it all down in a special file called plugin-cfg.xml. Just read that file every once in a while and keep up to date on which files you need to send back to me.
Web Server: Great. But when I do get a request for an item on the list, how will I know where to send it?
WebSphere: Hey, don’t worry. I’ve got it all covered. That plugin-cfg.xml file also contains a list of IP address/port combinations to send the request to. It’s all right there in that plugin-cfg.xml file. And if you have a problem understanding how to use it, here’s a .dll file that explains everything to you as well. Read it every time you start up.
Web Server: Kewl. I think this is going to be a great relationship.
WebSphere: I think so too. It usually is.
From Web Server to WebSphere
When a client makes a request for a JSP or a Servlet, the request initially goes to the web server. The web server reads the plugin-cfg.xml file and realizes that the request that came in should be sent to the application server for processing.
The plugin-cfg.xml file also provides the IP address/port combinations of the listening application servers. The web server, using the http protocol, then sends the request to the WebSphere JVM listening on the appropriate port.
That JVM listening on the appropriate port represents our application server, and the port the JVM listens on can be configured through that JVM’s web container.
The web server handles the incoming request, and matches that request to the application server set up to handle the given Servlet or JSP.
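The routing decision described above can be sketched roughly like this. To be clear, this is not the plug-in's real matching algorithm or the plugin-cfg.xml format; the class PluginRoutingSketch, the pattern rules and the host name in the usage note are assumptions chosen only to show the idea of "match the URI, then pick an address/port".

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the decision the plug-in makes for each incoming request.
final class PluginRoutingSketch {
    // One routing rule: some URI patterns plus the application server that handles them.
    static final class Route {
        final List<String> uriPatterns;
        final String host;
        final int port;

        Route(List<String> uriPatterns, String host, int port) {
            this.uriPatterns = List.copyOf(uriPatterns);
            this.host = host;
            this.port = port;
        }
    }

    private final List<Route> routes;

    PluginRoutingSketch(List<Route> routes) {
        this.routes = List.copyOf(routes);
    }

    // Simplified matching (an assumption): "*.ext" suffix, "/prefix/*" prefix, or exact match.
    static boolean matches(String pattern, String uri) {
        if (pattern.startsWith("*")) {
            return uri.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("/*")) {
            String prefix = pattern.substring(0, pattern.length() - 2);
            return uri.equals(prefix) || uri.startsWith(prefix + "/");
        }
        return pattern.equals(uri);
    }

    // Returns the URL to forward to, or empty if the web server should serve the file itself.
    Optional<String> forwardTarget(String uri) {
        for (Route route : routes) {
            for (String pattern : route.uriPatterns) {
                if (matches(pattern, uri)) {
                    return Optional.of("http://" + route.host + ":" + route.port + uri);
                }
            }
        }
        return Optional.empty();
    }
}
```

With a single route for "*.jsp" and "/shop/*" pointing at a made-up host appserver1 on port 9080, a request for /shop/cart would be forwarded to the application server, while /images/logo.jpg would come back empty and stay with the web server.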
Inside the Web Container
If the Servlet hasn’t been called before, the JVM loads the Servlet and then generates a thread to handle the request.
Servlets are shy little creatures. They sit on the hard drive just minding their own business, and don’t bother anyone if they’ve never been invoked. However, feed a few drinks to those Servlets and get them loaded, and they remain resident in memory until the party ends, which happens when someone pulls the plug on the application server.
So, the request gets sent from the client, to the web server, and the web server passes the request to the Application Server, who in turn invokes and threads the appropriate Servlet.
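The "load on first request, then stay in memory" behaviour can be sketched in plain Java. This is only an illustration and not WebSphere's web container: the Handler interface is a stand-in for a real Servlet, and LazyServletRegistry and its method names are invented for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch of lazy loading: a handler is created on first use and then reused.
final class LazyServletRegistry {
    // Stand-in for a Servlet; the real Servlet interface is much richer.
    interface Handler {
        String service(String request);
    }

    private final Map<String, Supplier<Handler>> definitions = new ConcurrentHashMap<>();
    private final Map<String, Handler> loaded = new ConcurrentHashMap<>();
    // Counts how many handlers have actually been instantiated, for demonstration only.
    final AtomicInteger loadCount = new AtomicInteger();

    // Registering only records how to build the handler; nothing is loaded yet.
    void register(String name, Supplier<Handler> factory) {
        definitions.put(name, factory);
    }

    // The first call for a name loads the handler; later calls reuse the resident instance.
    String dispatch(String name, String request) {
        Handler handler = loaded.computeIfAbsent(name, key -> {
            Supplier<Handler> factory = definitions.get(key);
            if (factory == null) {
                throw new IllegalArgumentException("No servlet named " + key);
            }
            loadCount.incrementAndGet();
            return factory.get();
        });
        return handler.service(request);
    }
}
```

Dispatching to the same name twice bumps loadCount only once, which is the "stays resident until the party ends" part.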
What does our Servlet do?
Well, the Servlet can do pretty much anything the developer wants it to do. When programming Servlets, a developer is only limited by their creativity, and more likely, their Java programming skills.
Typically, a Servlet implements some control logic. For example, a Servlet might figure out what a user typed into some textfields in a web-based form. It might then take that information and save it to a database.
Servlets are intended to be controllers. While Servlets can interact directly with a database, they’re not really supposed to. Instead, Servlets are supposed to delegate to a JavaBean or an EJB to do such things. Let’s say, for the sake of argument, our Servlet calls an EJB.
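Here is a rough sketch of that controller idea, written without the Servlet API so it stands on its own. The names ControllerSketch and UserService, the form field names and the JSP file names are all hypothetical; the only point is that the controller reads the input, hands the real work to someone else, and picks a page to show.

```java
import java.util.Map;

// Hypothetical sketch of a controller that delegates instead of touching the database itself.
final class ControllerSketch {
    // Assumed business interface; in the article's terms this would be a JavaBean or an EJB.
    interface UserService {
        void saveUser(String name, String email);
    }

    private final UserService userService;

    ControllerSketch(UserService userService) {
        this.userService = userService;
    }

    // Reads the form fields, delegates the save, and returns the name of the page to forward to.
    String handleForm(Map<String, String> formFields) {
        String name = formFields.getOrDefault("name", "").trim();
        String email = formFields.getOrDefault("email", "").trim();
        if (name.isEmpty() || email.isEmpty()) {
            return "error.jsp";
        }
        userService.saveUser(name, email);
        return "confirmation.jsp";
    }
}
```

Keeping the persistence behind UserService means the controller does not care whether a JavaBean, an EJB or something else does the saving.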
From a Servlet to an EJB
To call an EJB, especially one residing on an application server in a galaxy far, far away, we must first connect to the naming service of that remote application server.
The naming service is like a gatekeeper for objects running on a server. If someone wants access to a remote EJB, they call on the gatekeeper and ask if an EJB named “com/ibm/UserEJB” is around. If there is indeed an EJB with that name running around the server, you’re in. If not, you get an exception.
So, to call our EJB, we first connect to the EJB’s naming service and ask if it’s Home. If it is, we get a handle to that EJB and can call its methods, just as though it were a regular JavaBean.
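The gatekeeper idea can be imitated with a simple in-memory registry. This is a toy stand-in for a real naming service, not JNDI client code for WebSphere: NamingServiceSketch and UserHome are invented names, and the only real API borrowed is the standard javax.naming.NameNotFoundException.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.naming.NameNotFoundException;

// Hypothetical in-memory imitation of a naming service that hands out objects by name.
final class NamingServiceSketch {
    // Assumed home interface; a real EJB home has its own generated create/find methods.
    interface UserHome {
        String create(String userName);
    }

    private final Map<String, Object> bindings = new ConcurrentHashMap<>();

    // The server side binds an object under a name such as "com/ibm/UserEJB".
    void bind(String name, Object home) {
        bindings.put(name, home);
    }

    // The client side asks the gatekeeper for a name; an unknown name means an exception.
    Object lookup(String name) throws NameNotFoundException {
        Object found = bindings.get(name);
        if (found == null) {
            throw new NameNotFoundException(name + " is not bound");
        }
        return found;
    }
}
```

A caller would look up "com/ibm/UserEJB", cast the result to UserHome, and then call methods on it much as it would on a local object.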
So, what will we do with this remote EJB? Well, probably tell it what the user typed into the text fields that appeared on the user’s web page. The EJB can then shove that information onto a message queue, or if it is an Entity Bean, the data might even get saved to a database. The world is your oyster when you’re using EJBs.
Now you may not have noticed the little sleight of hand that was played on you there, but there was a switch of protocols when you weren’t looking.
When the client request is routed from the client to the web server, http/https is the protocol. When the web server forwards to the web container of the application server, the protocol remains http/https.
[Figure: Servlet Invoking an EJB. Web container (Servlets and JSPs); result returned to calling Servlet.]
However, when a request is made from the Servlet engine to the EJB container, the protocol switches from http to RMI/IIOP (Remote Method Invocation over the Internet Inter-ORB Protocol).
Don’t worry, the whole protocol switch happens behind the scenes, so we don’t have to worry about it in our code, but it is empowering to know what is going on under the covers.
From the Web Container, Back to the Web Server
Once the Servlet is done interacting with the EJBs, JavaBeans or other Java components that might help the Servlet implement control logic, it then figures that it has to send some response back to the client. After all, the client always likes to see a web page that lets them know that everything is working the way it should.
Servlets themselves don’t generate HTML to display to the client. Well, they can, but again, they’re not supposed to. Instead, they forward to a JSP.
The JSP then runs, and when it’s done running, it forwards all of the html it has generated back to the web server. The Web server then forwards the html back to the client along with any images or other files the html page might need to display properly.
And that’s it, a simple round trip for our J2EE application.
What if we use the Edge components?
The above scenario assumes all of your requests are routed through a single web server. Of course, your site might just be so popular that one web server isn’t enough. If that’s the case, you’ll need to set up two or more web servers, install the WebSphere plug-in on each of them, and then get one more machine that will work as an IP sprayer.
An IP sprayer handles all incoming requests, and then sprays those requests across your various HTTP servers. IBM Edge Components provide a software package that allows you to set up a machine to act as an IP sprayer. It also provides a variety of caching mechanisms as well.
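The simplest way to picture "spraying" is plain round-robin, sketched below. This is an assumption for illustration only; the Edge Components support more sophisticated distribution than this, and RoundRobinSprayer and the server names in the usage note are made up.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: hand each incoming request to the next web server in turn.
final class RoundRobinSprayer {
    private final List<String> webServers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSprayer(List<String> webServers) {
        if (webServers.isEmpty()) {
            throw new IllegalArgumentException("Need at least one web server");
        }
        this.webServers = List.copyOf(webServers);
    }

    // floorMod keeps the index valid even after the counter overflows to a negative value.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), webServers.size());
        return webServers.get(index);
    }
}
```

With two servers named web1 and web2, successive calls to pick() return web1, web2, web1, and so on.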
Nortel and Cisco also provide hardware solutions for spraying requests across web servers. The Edge server also provides a few tools that can pull performance information off your web servers and allow the Nortel or Cisco routers to spray the web servers that are most capable of handling requests.
How does security affect overall application flow?
When you turn security on, the WebSphere Security Service will challenge the client for a username and password as soon as a secured resource is requested. The client then provides the appropriate credentials, and WebSphere will validate those credentials against a user registry, most likely an LDAP server.
If your credentials check out, and you are indeed authorized to view the Servlet you selected, WebSphere will invoke that Servlet, and your credentials will even be passed onto any EJBs or JSPs that your Servlet subsequently invokes.
When content is returned to your browser, WebSphere will go so far as to place a little LTPA token in a cookie it plants on your machine. The cookie keeps getting sent back to the server on subsequent requests, and the token tells WebSphere that it can trust you and that it doesn’t need to ask you for your username and password again.
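The general trick of "give the browser a token the server can later trust" can be sketched with a signed value. This is emphatically not the LTPA token format or WebSphere's implementation; it is a generic HMAC-signed token with no expiry, and SignedTokenSketch and its methods are invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a signed cookie value; NOT the real LTPA format.
final class SignedTokenSketch {
    private final byte[] serverKey;

    SignedTokenSketch(byte[] serverKey) {
        this.serverKey = serverKey.clone();
    }

    // After a successful login the server issues "userName.signature" as the cookie value.
    String issue(String userName) {
        return userName + "." + sign(userName);
    }

    // On later requests the server recomputes the signature instead of asking for a password.
    Optional<String> verify(String token) {
        int dot = token.lastIndexOf('.');
        if (dot < 0) {
            return Optional.empty();
        }
        String userName = token.substring(0, dot);
        byte[] expected = sign(userName).getBytes(StandardCharsets.UTF_8);
        byte[] actual = token.substring(dot + 1).getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison of the two signatures.
        return MessageDigest.isEqual(expected, actual) ? Optional.of(userName) : Optional.empty();
    }

    private String sign(String value) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(serverKey, "HmacSHA256"));
            byte[] signature = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 is not available", e);
        }
    }
}
```

A token whose user name has been swapped fails verification, because the signature no longer matches what the server computes with its own key.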
Which Web Servers can you install the plugin into?
The IBM plugin can be installed into web servers from all of the major vendors on the market today. This includes:
- Sun One
- Microsoft IIS
Can I bypass the web server and tunnel directly to my application server?
Yes, you can bypass the web server. WebSphere has what is known as an ‘embedded http server.’ It is this embedded HTTP server that receives the requests forwarded by the actual web server, which itself handles requests on port 80.
The Web server takes requests in on port 80, and then forwards them to the web container of the application server, which is usually handling requests on port 9080 or 9090.
You can explicitly specify the IP address of your server and the port number, and test your Servlets and JSP files directly. It’s a great way to test your applications, but you should never use it in production.
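A quick way to try this from code, rather than a browser, is a small Java 11+ client like the sketch below. The class DirectContainerCheck is hypothetical, and the host and path you pass in are whatever your own setup uses; only port 9080 comes from the description above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical sketch: call a Servlet on the embedded HTTP server, skipping the web server.
final class DirectContainerCheck {
    // Builds a URL that targets the web container's own port instead of port 80.
    static URI directUri(String host, int containerPort, String path) {
        return URI.create("http://" + host + ":" + containerPort + path);
    }

    // Sends a GET straight to the web container and returns the HTTP status code.
    static int fetchStatus(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

For example, fetchStatus(directUri("localhost", 9080, "/myapp/hello")) would hit a made-up Servlet path on a local server; the same path without the port would go through the web server instead.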
In fact, if you look at the admin console for WebSphere, using the default configuration, it runs on port 9090, and is not accessed through a web server.
Charging millions for wrapping open source software and putting your own trademark on it is a very lucrative business to be in.
One more query I have.
I have a project which contains Struts and EJB parts.
Which server (IBM HttpServer or WebSphere AppServer) should I go for?
Will IBM HttpServer work in this scenario?
WebSphere Application Server is a J2EE- or JEE-compliant Java Application Server and is a completely different product. Think of it as a Java plugin for Apache Http Server equivalent to any other language plugin (CGI script, PHP, Ruby, Cobol, etc.). Http Server (Apache/IBM) doesn't understand how to run a Java program but the Application Server plugin does. Http Server passes the entire HTTP request to the plugin to process; Application Server processes the request and passes the HTTP response back to Http Server which in turn passes it to the TCP/IP stack for serializing back to the browser.