
GZIP deflating response from server failure help!

Ranch Hand
Posts: 55
I have been trying for a week now to use a socket to read a response stream from a server, parse out the GZIP-compressed response body, pass those bytes to a GZIPInputStream, and get my results back as a string. It unzips correctly for certain responses, but others fail. I think I'm incorrectly removing bytes from the stream, perhaps when I dechunk the response. The response is sent in chunked format, so I need to remove the chunk framing bytes and put the pieces back together for the gzip data to be valid. First, here is the gzipped test response I created:

Gzip Unit Test

The contents of gzip.php, which sends that content gzipped:

Here are the response and the parsed content I get from the server, after extracting the response body and stripping out the chunk framing data:

I'm assuming I must be stripping out an important byte somewhere when I clean up the bytes. Here is how I get the bytes above:

To separate the headers and the body of the response, I do the following:
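For illustration only, and not the code from this post: a minimal Java sketch of splitting a raw response into headers and body at the byte level. The class name `HttpResponseSplitter` and its methods are hypothetical. It assumes the whole response is already in a byte array, and it never converts the body to a String, since a round trip through a String can alter gzip bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: splits a raw HTTP response into header text and body bytes. */
final class HttpResponseSplitter {

    /** Returns the index of the first body byte (just after CR LF CR LF), or -1 if not found. */
    static int bodyStart(byte[] response) {
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    /** Returns the status line and header lines as text, without the terminating blank line. */
    static String headers(byte[] response) {
        int start = bodyStart(response);
        if (start < 0) {
            throw new IllegalArgumentException("no header terminator found");
        }
        // ISO-8859-1 maps every byte to exactly one char, so header text is never mangled.
        return new String(response, 0, start - 4, StandardCharsets.ISO_8859_1);
    }

    /** Returns the body as raw bytes; it is never turned into a String. */
    static byte[] body(byte[] response) {
        int start = bodyStart(response);
        if (start < 0) {
            throw new IllegalArgumentException("no header terminator found");
        }
        return Arrays.copyOfRange(response, start, response.length);
    }
}
```

With this sketch, `HttpResponseSplitter.body(raw)` would return the still-chunked, still-gzipped body, ready for the dechunking step.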

To me, that process appears to extract the correct content: the chunked response body shown in the bytes above looks like the content of the response. Now, because this particular response is chunked, I need to remove the chunk headers inside that response body:

I have a sneaking suspicion this is where I'm stripping out a byte, or leaving in a byte that should not be part of the gzipped data. After this, I send those bytes to the GZIP decompression code below:
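Again as an illustration rather than the code from this post: a minimal sketch of decompressing an already dechunked gzip body into a string with `java.util.zip.GZIPInputStream`. The class name `GzipText` is hypothetical, and UTF-8 is an assumption; the real charset should come from the response's Content-Type header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper: decompresses a complete gzip byte array into text. */
final class GzipText {

    static String gunzip(byte[] gzipped) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            // read() returns -1 only after the gzip trailer (CRC-32 and length) has been checked,
            // so a missing or extra byte in the input surfaces here as an IOException.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            // Assumption: the payload is UTF-8 text.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Reading until `read()` returns -1, rather than reading a fixed number of bytes, matters because a single `read` call may return only part of the decompressed data.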

What happens is that, for a response like the one below, I get the following after gzip decompression:

String after decompression, and the exception thrown:

But with a simple change of the s in Lights to an a (Lighta), as in the following response:

I get correctly unzipped content back with no exceptions. Can anyone give me any ideas about what may be wrong? I tried placing that same response content in a file and gzipping it, and it unzips just fine. I even tried comparing the bytes of the gzip from the server against the one from the file, and they are completely different for some reason, even when the server one unzips correctly. So I can't use that comparison to see which byte from the server may be missing.

This is what the gzipped content from the file, which decompresses correctly, looks like in bytes:
[31, -117, 8, 0, 0, 0, 0, 0, 0, 0, -77, -55, 40, -55, -51, -79, -29, -27, -78, -55, 72, 77, 76, 1, -47, -71, -87, 37, -119, 10, 25, 37, 37, 5, -70, -87, -123, -91, -103, 101, -74, 74, -50, -7, 121, 37, -87, 121, 37, -70, 33, -107, 5, -87, 74, 10, -55, 16, -98, -83, 82, 73, 106, 69, -119, 62, 72, -77, -75, -126, -77, -121, 99, 80, -80, 107, -120, 109, 105, 73, -102, -82, -123, 18, -56, -112, -110, -52, -110, -100, 84, 59, -49, -68, -92, -4, 10, 5, 93, 5, -1, -46, -110, -100, -4, -4, 108, -123, -16, -44, 36, 5, -57, -28, -28, -44, -30, 98, 5, -97, -52, -12, -116, -110, 98, 0, -127, 62, 118, 36, 125, 0, 0, 0]

As you can see, the one from the server and the one from the file look very different, so I can't use them for comparison. I'm open to any and all ideas on what to try next.
Ranch Hand
Posts: 245
parseChunks2 is incorrect.
I have made a test program which seems to work after I changed the method to:

It is still not 100% ok. It should use the lengths from chunk headers to determine where each chunk ends. Searching for CR LF is not reliable since the (unchunked) content may also contain CR LF.
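A length-based approach could look roughly like the sketch below. It is not the changed parseChunks2 from this thread; the class name `ChunkedDecoder` is hypothetical. It assumes the complete chunked body is already in memory, and it ignores chunk extensions and any trailer headers after the last chunk.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: decodes an HTTP/1.1 chunked body using the chunk sizes. */
final class ChunkedDecoder {

    static byte[] decode(byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            // CR LF is searched for only in the size line, never inside chunk data.
            int lineEnd = indexOfCrLf(body, pos);
            if (lineEnd < 0) {
                throw new IllegalArgumentException("chunk size line not terminated");
            }
            String sizeLine = new String(body, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            int semicolon = sizeLine.indexOf(';');
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon); // drop chunk extensions
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // chunk sizes are hexadecimal
            if (size < 0) {
                throw new IllegalArgumentException("negative chunk size");
            }
            pos = lineEnd + 2;
            if (size == 0) {
                break; // last chunk; trailers (if any) are ignored in this sketch
            }
            if (pos + size + 2 > body.length) {
                throw new IllegalArgumentException("truncated chunk");
            }
            out.write(body, pos, size); // copy exactly 'size' bytes, whatever they contain
            pos += size;
            if (body[pos] != '\r' || body[pos + 1] != '\n') {
                throw new IllegalArgumentException("missing CR LF after chunk data");
            }
            pos += 2;
        }
        return out.toByteArray();
    }

    private static int indexOfCrLf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }
}
```

Because the data bytes are copied by count, a CR LF pair that happens to occur inside the compressed data is passed through untouched instead of being mistaken for a chunk boundary.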

steve labar
Ranch Hand
Posts: 55
Thank you very much for taking the time to look at the code. I will redo how I parse the chunks and retry!