
Transferring (over HTTP) and reading a file

 
Ryan McClain
Ranch Hand
Posts: 153
1
This concept is still somewhat mystical to me.

You click on a link on a website and the link refers to a JAR/zip file.
-> Your browser then generates a GET request.
-> The server (or CGI program) reads the Bytes from the zip/jar file and copies them over to the response stream and sends back the response to the client.
-> The client browser reads the Bytes from the stream, uses the appropriate MIME-type and successfully renders the result to the user; in this case it prompts the user with a SaveFileDialog box.

I understand that the Bytes are all being copied over, one by one, literally representing the actual file. Is that how computers handle and read files? By their Bytes (representation)? So the 'actual file' is just a series of Bytes? e.g. if you take a JPEG, drag it to notepad and add some text of your own to it and re-save it as JPEG, the OS can successfully read your new JPEG with some embedded information in it? I believe that certain people used this method to create executable JPEG files (malicious embedded code within).
So when the server 'copies' the file it actually just reads all of the file's Bytes as a stream? Is that valid for the client OS receiving it? It's essentially just doing a 'copy' (OS command) of the Bytes? It's enough to just send over Bytes to the client and his OS will have no problem reading those Bytes? I could as well give the client all of the Bytes of my JPEG in text form (let's say img.txt) and all he has to do is download img.txt and save it as img.jpg and it would display a picture instead of text?



 
Ulf Dittmer
Rancher
Posts: 42972
73
Correct, a file is just a sequence of bytes, no matter what it contains.
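
As a rough illustration of the "server copies the bytes to the response stream" step, here is a minimal Java sketch. The class name ByteCopy and the sample bytes are made up for this example, and in-memory streams stand in for the file and the HTTP response; a real server would do more (headers, error handling), but the copy loop itself has this shape:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class ByteCopy {

    // Copies every byte from in to out without interpreting it,
    // and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file: the ZIP local file header signature ("PK", 03, 04)
        // followed by a few arbitrary bytes.
        byte[] original = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, 0x00, (byte) 0x80};

        // Stand-in for the response stream the server writes to.
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(original), response);

        System.out.println(copied + " bytes copied, identical: "
                + Arrays.equals(original, response.toByteArray()));
    }
}
```

The loop never looks at what the bytes mean, which is why the same code works for a JAR, a JPEG or a text file.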

Ryan McClain wrote:if you take a JPEG, drag it to notepad and add some text of your own to it and re-save it as JPEG, the OS can successfully read your new JPEG with some embedded information in it?

Unless Notepad has gained image editing capabilities recently (which I doubt), this won't work. The only way to edit images is to use an application that understands the image file format in question (JPEG in this case). That's also the reason why this won't work:
Ryan McClain wrote:I could as well give the client all of the Bytes of my JPEG in text form

An image is binary data; it can't be treated as text.

Even though all files consist of bytes, they're not equal. A text editor like Notepad can handle only character data, and even then you may have to pay attention to the encoding being used in the text (not all text editors are smart about this).
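
To make the encoding point concrete, here is a small Java sketch of what roughly happens when a text editor opens a file and saves it again. It assumes the editor decodes and encodes as UTF-8, which is only one possibility; the class name TextRoundTrip and the sample bytes are invented for the example:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class TextRoundTrip {

    // Decodes bytes as UTF-8 text and encodes that text back to bytes,
    // approximating "open in a text editor, then save".
    static byte[] throughText(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        // Typical first four bytes of a JFIF-style JPEG: FF D8 FF E0.
        byte[] binary = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        System.out.println("text survives:   " + Arrays.equals(text, throughText(text)));
        System.out.println("binary survives: " + Arrays.equals(binary, throughText(binary)));
    }
}
```

The text bytes come back unchanged, but the JPEG bytes are not valid UTF-8, so the decoder substitutes replacement characters and the original values are lost on the way back.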

Binary files (images, PDF, DOC, XLS, applications etc.) can only be handled by a hex editor, which lets you manipulate the bytes directly, or by an application that understands the file format: Photoshop, Acrobat Pro, Word, Excel etc.
 
Ryan McClain
Ranch Hand
Posts: 153
1
I just dragged a 27 KB BMP file into notepad. I got the following output in notepad:
http://pastebin.com/nFRe0bh8

What exactly is happening here? Notepad is trying to interpret the Bytes as characters?
 
Ulf Dittmer
Rancher
Posts: 42972
73
Yes. Notepad assumes it deals with text files, which a BMP isn't.
 
Henry Wong
author
Sheriff
Posts: 23283
125
Ryan McClain wrote:I just dragged a 27 KB BMP file into notepad. I got the following output in notepad:
http://pastebin.com/nFRe0bh8

What exactly is happening here? Notepad is trying to interpret the Bytes as characters?


Yes. Notepad is trying to interpret the binary data as characters. There are lots of "?" marks because that is what is shown when it hits a character code that can't be rendered.
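
The same effect can be reproduced in Java as a rough sketch. This is not how Notepad is implemented; it only shows the general idea of decoding arbitrary bytes as characters. The class name BytesAsText, the choice of US-ASCII and the sample bytes after the "BM" signature are assumptions made for the example:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class BytesAsText {

    // Interprets raw bytes as US-ASCII text. Bytes outside the ASCII range
    // have no character in that charset, so Java substitutes U+FFFD.
    static String decodeAscii(byte[] data) {
        return new String(data, StandardCharsets.US_ASCII);
    }

    // Counts how many characters in the decoded text are the replacement character.
    static long countReplacements(String text) {
        return text.chars().filter(c -> c == '\uFFFD').count();
    }

    public static void main(String[] args) {
        // "BM" (0x42 0x4D) is the BMP signature; the remaining bytes are
        // made-up stand-ins for header and pixel data.
        byte[] bmpLike = {0x42, 0x4D, (byte) 0x9C, (byte) 0xE1, 0x41, (byte) 0xFF};

        String text = decodeAscii(bmpLike);
        System.out.println(text.length() + " chars, "
                + countReplacements(text) + " of them replacement characters");
    }
}
```

Bytes that happen to match printable characters (such as "BM" and "A" here) show up as letters, and the rest turn into placeholder characters, much like the pastebin output.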

Henry
 