I am currently working on a project that uses the Google Search Appliance (GSA). Although its default usage is to act as a web crawler, it can also be fed so-called "Feeds", which can be anything from a simple list of URLs to crawl to the full content that it should index. These Feeds are in XML format, and are fed to the GSA by doing an HTTP POST to it. I have written the code that does the POST call, but now I have a few problems:
I have no easy access to the GSA server, and it is already used in a live environment, so I am not allowed to do "all sorts of testing". So for now I cannot know for sure that the code I have written (using HttpClient) performs the POST correctly so that the GSA will accept it. By using the DEBUG logging mode I managed to get HttpClient to print out various information about the request it sends (when I tried it against some random web server). But that information wasn't perfectly clear, so I still don't know if it is in exactly the right format.
But even if I could confirm that it *is* in the right format, I would very much like to have a unit test that verifies that, instead of relying on analyzing some debug printouts. So I started looking around, but couldn't find anything really useful in this area. I found a lot of information about how to test a servlet by mocking HTTP requests. But what I want is the exact opposite. I want to mock a servlet that can take real HTTP requests, and then verify that the requests are in a certain format. And I want this mocked servlet to be deployed on the fly in a temporary web server, listening on localhost for the duration of the unit test, so that I don't have any external dependencies.
Does anyone have any tips on how to go about this? Can this be done with Jetty? Also, if someone is an expert in the HTTP protocol and HttpClient, then maybe he/she could tell me right away if my code seems about right:
Here is the documentation for the GSA Feeds:
And here is the section that describes what the HTTP POST should look like:
Here is their example of what the post should look like:
Here is my code that sends the request:
And here is the relevant debug output:
What worries me is that it doesn't output any information about the "feedtype", "datasource" and "data" parts that are added. If I remove the line where I add the "data" part, then it prints out a lot more information, regarding "feedtype" and "datasource":
Is there no way of making it print out this information when I add the "data" part too? Is it because it sends it in a binary form? If so, is there a way to send it in regular urlencoded form? Because that is what the GSA expects. And the file I send *is* a regular plain text XML file. And regarding this, I have no idea why the GSA wants it in urlencoded format, because they say:
You should post the feed using enctype="multipart/form-data". Although the search appliance supports uploads using enctype="application/x-www-form-urlencoded", this encoding type is not recommended for large amounts of data.
...and when sending so called "Content Feeds" the data sent can be quite large. Up to 1GB. Isn't it terribly inefficient to urlencode 1GB text data? But still, that is what they seem to demand that I do, even though they don't want me to specify enctype="application/x-www-form-urlencoded", so that confuses me a bit.
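To make the difference between the two encodings concrete, here is a rough sketch in plain Java (no HttpClient involved) that builds both kinds of body by hand. The field names "feedtype", "datasource" and "data" come from the GSA description above; the class name, the boundary string and the tiny XML snippet are made up for illustration, and a real multipart part for a file would normally carry extra headers such as a filename and a Content-Type. It needs Java 10 or later for the Charset overload of URLEncoder.encode.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodingComparison {

    // Builds one multipart/form-data part; the value is copied in verbatim.
    static String multipartPart(String boundary, String name, String value) {
        return "--" + boundary + "\r\n"
             + "Content-Disposition: form-data; name=\"" + name + "\"\r\n"
             + "\r\n"
             + value + "\r\n";
    }

    // Builds an application/x-www-form-urlencoded body; every value is percent-encoded.
    static String urlEncodedBody(String feedtype, String datasource, String xml) {
        return "feedtype=" + URLEncoder.encode(feedtype, StandardCharsets.UTF_8)
             + "&datasource=" + URLEncoder.encode(datasource, StandardCharsets.UTF_8)
             + "&data=" + URLEncoder.encode(xml, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical feed content, not a real GSA feed.
        String xml = "<feed><record url=\"http://example.com/\"/></feed>";
        String boundary = "hypothetical-boundary";

        String multipart = multipartPart(boundary, "feedtype", "full")
                         + multipartPart(boundary, "datasource", "sample")
                         + multipartPart(boundary, "data", xml)
                         + "--" + boundary + "--\r\n"; // closing delimiter

        System.out.println("multipart body:\n" + multipart);
        System.out.println("urlencoded body:\n" + urlEncodedBody("full", "sample", xml));
    }
}
```

In the multipart body the XML appears unchanged between the boundary lines, while in the urlencoded body every "<", ">", "/" and quote character is expanded to a three-character escape. That expansion is presumably why the documentation advises against the urlencoded form for large feeds.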
Oh, and I use httpcomponents httpclient 4.0.1 if anyone wonders.
David Newton wrote: Sure, it could be done with Jetty, why not? Or any other container in which you could write a servlet (or whatever).
Well, I know it can be done. I just wanted to know if anyone knows of any existing framework that helps me test this. Just as there are frameworks that help produce the HTTP requests when one wants to test a servlet, I thought there might be some framework that helps me test this reverse scenario.
I'm not convinced yet you'd *need* to, though--if you construct a request and want to unit-test that request, aren't you basically unit testing HttpClient?
Well, the way I see it I wouldn't be unit testing httpclient itself, but instead my own usage of it. And as far as I can tell there are no exact instructions on how to use httpclient to produce requests in the format that I need (as described in the GSA documentation, see the link in my first post).
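For what it's worth, the kind of throwaway localhost server described in the first post can be sketched without Jetty, using the com.sun.net.httpserver.HttpServer class that ships with the JDK. This is only a minimal sketch under some assumptions: the "/feed" path, the class name and the Captured holder are hypothetical, and the request is sent with HttpURLConnection purely to keep the example self-contained. In a real test that part would be replaced by a call to the feed-posting code under test, pointed at the port the server reports.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

public class CapturingServerSketch {

    /** What the temporary server saw for one request. */
    public static final class Captured {
        public final String method;
        public final String contentType;
        public final String body;

        Captured(String method, String contentType, String body) {
            this.method = method;
            this.contentType = contentType;
            this.body = body;
        }
    }

    /** Starts a server on a free localhost port, sends one POST, returns what arrived. */
    public static Captured postAndCapture(String contentType, byte[] payload) throws Exception {
        AtomicReference<Captured> seen = new AtomicReference<>();

        // Port 0 asks the OS for any free port, so parallel tests do not collide.
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/feed", exchange -> {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (InputStream in = exchange.getRequestBody()) {
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
            }
            seen.set(new Captured(
                    exchange.getRequestMethod(),
                    exchange.getRequestHeaders().getFirst("Content-Type"),
                    new String(buf.toByteArray(), StandardCharsets.UTF_8)));
            exchange.sendResponseHeaders(200, -1); // 200 with no response body
            exchange.close();
        });
        server.start();

        try {
            // Stand-in for the code under test.
            URL url = new URL("http://localhost:" + server.getAddress().getPort() + "/feed");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", contentType);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload);
            }
            conn.getResponseCode(); // waits until the server has answered
        } finally {
            server.stop(0);
        }
        return seen.get();
    }

    public static void main(String[] args) throws Exception {
        Captured c = postAndCapture("text/plain",
                "feedtype=full".getBytes(StandardCharsets.UTF_8));
        System.out.println(c.method + " " + c.contentType + " " + c.body);
    }
}
```

A unit test could then assert on the captured Content-Type header and check that the body contains the expected "feedtype", "datasource" and "data" parts, which is a check on one's own usage of the client library rather than on the library itself.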