Trying to wrap my head around http when authentication is required

 
Ranch Hand
Hello,

I've been trying to find ... really just good reads that help explain the details of what is necessary when interacting with a web server from a Java point of view.

I've played a little with the Apache APIs, some with jutil, and of course native Java, and most of it I've been able to muddle through and understand, but I got stuck trying to access pages on a web site that require authorization. I HAVE successfully written a couple of classes that do in fact authenticate to the server, and I receive the OK response to my credentials. But that's where it stops: after I've authenticated and then try to read pages that are customized for my login, the pages I get back are as if I never logged in. So I'm assuming that when I log in, I need to somehow keep that login token and also SOMEHOW pass it back to the server when I make further requests ... and it's that connection between authenticating and then pulling down information based on that authentication that I cannot seem to locate any information on. Everyone out there talks about getting info from the server, and they also talk a lot about how to log into a server, but I can't find anything that explains in any real detail how to maintain a logged-in session while querying the server.

Any enlightenment would be wonderful, and thank you for taking the time to help.

Sincerely,

Mike Sims
 
Ranch Hand
I think what you're missing is cookies. Here's one of many intros: https://docs.oracle.com/javase/tutorial/networking/cookies/

Joe
 
Michael D Sims
Ranch Hand

Joe Areeda wrote: I think what you're missing is cookies. Here's one of many intros: https://docs.oracle.com/javase/tutorial/networking/cookies/

Joe


Yeah that isn't an overly complicated read... oh boy!

No, what I NEED is a solid example of authenticating to a web site, then pulling up different pages from that site with that login token being sent with each page query.
 
Saloon Keeper
For that kind of programmatic web access, I generally advise using the HtmlUnit library. It makes the kind of back-and-forth that web requests and responses consist of a lot simpler than using an HTTP library like Apache's HttpClient. (And yes, the issue may well revolve around cookies, but HtmlUnit makes using those as easy as a single API call that turns them on, which may even be the default.)
 
Michael D Sims
Ranch Hand

Tim Moores wrote: For that kind of programmatic web access, I generally advise using the HtmlUnit library.


Tim,

I did download and experiment with this library, but I'm still at a loss as to how to authenticate to a website and then make new requests to that site while making sure those requests are sent with my authenticated credentials. The documentation for HtmlUnit seems very ... limited ... perhaps? With no clear examples of what I am trying to do.

Maybe this will help clarify a little: Let's say you're using Chrome and you've gone to a site which asks you to log in. That URL might be something like this https://www.mywebsite.com/auth/login ... now that you're logged in, you need to pull up a page such as https://www.mywebsite.com/myprofile.html ... now with a web browser, I don't have to think about my credentials being passed to the second-page query, it just happens for me - magically it would seem.

This is the behavior I need to model in Java...

Hopefully you were already on the same page as this, but I thought maybe I would state it just in case.

Thank you,

Mike Sims
 
Tim Moores
Saloon Keeper
No worries, that's what I understood your problem to be :-) And no question about it, the docs of that library are sparse, to say the least.

I have various pieces of code using it somewhere around here, and will try to dig those out, but likely not today. The way to provide authentication credentials is shown in http://stackoverflow.com/questions/29760463/htmlunit-basic-auth-issues

And then, cookies need to be enabled, but a brief web search only finds people having trouble doing that. I'll check that later as well.
 
Saloon Keeper
HTTP is not a session connection protocol. That is, you don't make a "permanent" network connection to the server and keep using it.

Instead, it operates more in a hit-and-run mode. Every HTTP(S) request you make connects to the server, pushes the request payload up and receives a response back. Then it disconnects. Repeat over and over again.

Technically. In actual fact, there's generally a "keep-alive" mechanism that reduces some of the work for repeated connections, but as far as the app code is concerned, every request is a fresh connection.

Because of that, you cannot know the requester's identity merely because of the connection, the way you would if you logged in as a time-sharing user on a traditional computer system. So instead each request has to be self-identifying.

The preferred way to do that is to include an identity cookie in the request (jsessionid). The nice thing about this is that the built-in java.net HTTP classes can manage cookies for you automatically once you install a java.net.CookieManager as the default CookieHandler; a stand-alone application has none installed by default. All major web client apps (browsers etc.) manage cookies as well.
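To make the cookie round trip concrete, here is a minimal sketch using only JDK classes. Everything in it is an assumption made up for illustration: the throwaway local server, the /login and /profile paths, and the cookie value demo123. The demo /login does not check any credentials; it only hands out a session cookie so the effect of the cookie handler is visible.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class CookieSessionDemo {

    // Sends a small plain-text body with status 200 and completes the exchange.
    static void reply(HttpExchange exchange, String text) throws IOException {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }

    // Hypothetical stand-in for a real site: /login issues a session cookie,
    // /profile answers differently depending on whether the cookie came back.
    static HttpServer startDemoServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/login", exchange -> {
            exchange.getResponseHeaders().add("Set-Cookie", "JSESSIONID=demo123; Path=/");
            reply(exchange, "logged in");
        });
        server.createContext("/profile", exchange -> {
            String cookie = exchange.getRequestHeaders().getFirst("Cookie");
            boolean known = cookie != null && cookie.contains("JSESSIONID=demo123");
            reply(exchange, known ? "your profile" : "please log in");
        });
        server.start();
        return server;
    }

    // Plain GET with HttpURLConnection; no cookie code appears here at all.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(Proxy.NO_PROXY);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Logs in, then requests /profile, with or without a default cookie handler.
    static String profileAfterLogin(boolean withCookies) {
        CookieHandler.setDefault(withCookies ? new CookieManager() : null);
        try {
            HttpServer server = startDemoServer();
            try {
                String base = "http://127.0.0.1:" + server.getAddress().getPort();
                fetch(base + "/login");
                return fetch(base + "/profile");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without cookies: " + profileAfterLogin(false));
        System.out.println("with cookies:    " + profileAfterLogin(true));
    }
}
```

The only difference between the two runs is the CookieHandler.setDefault(...) call. With a CookieManager installed, the cookie from the login response is stored and sent back on the next request without any extra code in fetch.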

But sometimes that's not possible. There may be legal or physical constraints on using cookies. Or you may be working in a shop where the resident "genius" fatuously declares "Cookies are BAD because 'X' and are expressly forbidden in all our apps". Where "X" is generally some ignorant statement.  

Anyway, where cookies cannot go, there's a technique known as "URL rewriting". Well-designed webapps use this as a fallback mechanism: instead of carrying the session ID in a cookie, it's appended to the URL itself. For example, "https://coderanch.com/forums/posts/reply/1234;jsessionid=99e3ad7c". This is done by feeding the bare URL to the URL rewriting method on HttpServletResponse (encodeURL()). You then put the rewritten links into the response (web page) that you send out, so that clicking on them will carry the identification along with the next request.
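To show the shape of a rewritten URL, here is a small stand-alone sketch. It is not the servlet container's implementation; in a real webapp you would let the container do the rewriting. The helper name withSessionId is made up for this illustration, and it only demonstrates where the ";jsessionid=" path parameter goes: after the path, before any query string or fragment.

```java
class UrlRewriteSketch {

    // Illustration only: inserts ";jsessionid=<id>" after the path and
    // before any "?query" or "#fragment" part of the URL.
    static String withSessionId(String url, String sessionId) {
        int cut = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            cut = query;
        }
        if (fragment >= 0 && fragment < cut) {
            cut = fragment;
        }
        return url.substring(0, cut) + ";jsessionid=" + sessionId + url.substring(cut);
    }

    public static void main(String[] args) {
        System.out.println(withSessionId(
                "https://coderanch.com/forums/posts/reply/1234", "99e3ad7c"));
    }
}
```

A client that cannot use cookies would then simply follow the rewritten links it finds in each page, rather than building such URLs itself.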

Regardless of whether the session ID comes in from a cookie or from a URL, however, the session ID determines which user made the request. The session ID itself carries no data. It's just a "random" hash key into the server-side sessions dictionary, which allows the session ID to resolve to something resembling or containing an HttpSession object, thus allowing the server APIs to find things when they're needed, such as the remoteUser login ID from the JEE security manager or session-scope application objects.
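As a rough mental model of that server-side dictionary, here is a toy sketch. It is not how any particular server implements sessions; the class and method names are invented for illustration, and a plain map of attributes stands in for the HttpSession object.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SessionStoreSketch {

    // Session ID -> per-session attributes (a stand-in for HttpSession).
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    // Creates an empty session and returns its random ID as 32 hex characters.
    String createSession() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        StringBuilder id = new StringBuilder();
        for (byte b : bytes) {
            id.append(String.format("%02x", b));
        }
        sessions.put(id.toString(), new ConcurrentHashMap<>());
        return id.toString();
    }

    // Resolves an ID sent by the client; returns null when the ID is unknown.
    Map<String, Object> lookup(String sessionId) {
        return sessions.get(sessionId);
    }

    public static void main(String[] args) {
        SessionStoreSketch store = new SessionStoreSketch();
        String id = store.createSession();
        store.lookup(id).put("remoteUser", "mike");
        System.out.println(id + " -> " + store.lookup(id));
    }
}
```

The ID the client sends is only the key. Who the user is, and everything else about the session, lives in the map on the server.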

You should never attempt to cache the session ID on the client side. It's only a hash key and it's only guaranteed valid until the next HTTP(S) request is made to the server. In particular, it is a documented security feature in Tomcat that when an application user switches to https, a new session ID is generated and replaces the one that had been used previously. That keeps "man-in-the-middle" attackers from accessing secured resources by using an unsecured session ID.
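This is one reason to let a cookie manager track the ID rather than copying it into a variable of your own. The sketch below feeds two made-up Set-Cookie headers to a java.net.CookieManager by hand, with no network involved, to show that the later value replaces the earlier one. The host example.com and the values old111 and new222 are placeholders.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.HttpCookie;
import java.net.URI;
import java.util.List;
import java.util.Map;

class SessionIdRotationSketch {

    // Applies the given Set-Cookie header values in order, as if they had
    // arrived in successive responses, and returns the JSESSIONID value the
    // cookie manager would use for the next request (null if there is none).
    static String currentSessionId(String... setCookieHeaders) {
        CookieManager manager = new CookieManager();
        URI uri = URI.create("http://example.com/app/");
        try {
            for (String header : setCookieHeaders) {
                manager.put(uri, Map.of("Set-Cookie", List.of(header)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (HttpCookie cookie : manager.getCookieStore().get(uri)) {
            if (cookie.getName().equals("JSESSIONID")) {
                return cookie.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(currentSessionId(
                "JSESSIONID=old111; Path=/",
                "JSESSIONID=new222; Path=/"));
    }
}
```

A session ID you had copied out of the first response by hand would be stale at this point, while the cookie manager already holds the replacement.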

Or in other words, the cookie doesn't go out to the client and stay there. Every HttpServletResponse updates the jsessionid cookie.
 
Tim Moores
Saloon Keeper
Here's code I just hacked together for logging into this very site. The page that gets returned after submitting the login form is your personalized start page, as you can verify when examining the page source (commented out at the moment). This assumes a form-based login; if the login is based on HTTP Basic Auth, see the link I posted earlier.
 
Tim Holloway
Saloon Keeper
One thing to note: If a webapp is using J2EE standard container-managed security, there is no login URL that you can request.

When container-managed security is in control, the trigger that brings up the login form or dialog is a request by an un-authenticated user for a secured URL. When a URL that requires authorization is requested, the server sidelines the request and presents the login and processes the login. Then, if the login succeeds, the sidelined original request is re-activated internally so that the login process is completely transparent to the webapp. In fact, there aren't even any hooks to let a webapp know that a user has logged in or out in the J2EE spec. Partly because if you're using something like a single-signon security system, no such event may ever occur within the webapp.

A login form does have a URL, based on its WAR resource path, but attempting to log in by requesting that URL directly will not work because the server context is not set up properly for login. The page will render, but the proper backend processing will not be dispatched.
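For a client program, that suggests the following order of requests, sketched here with the JDK's java.net.http client. This assumes the webapp uses the servlet spec's standard form-based login, where the form posts the fields j_username and j_password to j_security_check; the appBase and protectedPath parameters are hypothetical, and the exact URL the form posts to should be read from the login page's form action rather than taken from this sketch.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class FormLoginSketch {

    // Builds the URL-encoded body that a standard servlet login form submits.
    static String formBody(String user, String password) {
        return "j_username=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&j_password=" + URLEncoder.encode(password, StandardCharsets.UTF_8);
    }

    // appBase (e.g. "https://host/app") and protectedPath are placeholders.
    static String fetchProtected(String appBase, String protectedPath,
                                 String user, String password)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())          // keeps the session cookie
                .followRedirects(HttpClient.Redirect.NORMAL) // follow the post-login redirect
                .build();

        // Step 1: ask for the secured page first. The server is expected to
        // answer with its login form and a session cookie.
        client.send(HttpRequest.newBuilder(URI.create(appBase + protectedPath)).GET().build(),
                HttpResponse.BodyHandlers.discarding());

        // Step 2: submit the credentials within the same session.
        HttpRequest login = HttpRequest.newBuilder(URI.create(appBase + "/j_security_check"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(user, password)))
                .build();
        HttpResponse<String> response =
                client.send(login, HttpResponse.BodyHandlers.ofString());

        // If the login succeeded, this should be the page requested in step 1.
        return response.body();
    }
}
```

The important part is the order: the secured page is requested first, and the credentials are posted second through the same client, so both requests share one cookie store.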
 