
Can't write a file in its actual format

 
pradipta kumar rout
Ranch Hand
Posts: 43
Sir,
I used the following code to read a file from the web and write it to my computer, but the saved text is not the same as the text on the web page.

I can't find the cause. I think the text file uses ANSI format and the web page uses Unicode or UTF-8. Is this why I am getting different output, or is there some other cause?

I want to read and save the content in whatever format it already has:
if the file is in ANSI, it should be saved in ANSI format;
if the file is in some other format, it should be saved in that format.

Kindly tell me how to do this.

Below is the code I used, which does not read and write in the actual format:



I am getting a file named content_save.txt, but in that file the title of the web page and other things are not the same as on the actual page.
 
Maneesh Godbole
Bartender
Posts: 11445
"I am getting a file named content_save.txt, but in that file the title of the web page and other things are not the same as on the actual page."

What do you mean by "not the same as the actual page"? Can you give an example?

 
pradipta kumar rout
Ranch Hand
Posts: 43
Yes.
Open the web page I mentioned ("http://ru.wikipedia.org/"), view its source, and save that to a file manually.

Then use my code to save the same source to your computer. When you open the resulting .txt file, you will see that the title of the web page and other contents have been changed into some other format.
 
Maneesh Godbole
Bartender
Posts: 11445
Your problem is the encoding.
Try using an OutputStreamWriter with UTF-8 encoding.
 
pradipta kumar rout
Ranch Hand
Posts: 43
Thank you for your response.

Can you give me an example of how to read and write using the OutputStreamWriter class?
 
Maneesh Godbole
Bartender
Posts: 11445
Did you write the code you posted yourself?
Change the PrintStream to an OutputStreamWriter. You can specify the encoding in its constructor. Use its write() method to write to the file.
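
For what it's worth, here is a minimal sketch of that suggestion, assuming the page really is UTF-8. The original code is not shown in this thread, so this is not a patch to it; the class name Utf8TextCopy and the method copyAsUtf8 are made up for illustration. The point is only that both the reader and the writer are given an explicit charset instead of relying on the platform default.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not the original poster's code.
public class Utf8TextCopy {

    // Decodes the input as UTF-8 and encodes the output as UTF-8.
    // Assumption: the source really is UTF-8 text.
    static void copyAsUtf8(InputStream in, OutputStream out) throws IOException {
        try (Reader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8));
             Writer writer = new BufferedWriter(
                     new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Four Cyrillic letters, written as Unicode escapes so the
        // encoding of this source file does not matter.
        byte[] source = "\u0412\u0438\u043a\u0438".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copyAsUtf8(new ByteArrayInputStream(source), sink);
        System.out.println(Arrays.equals(source, sink.toByteArray())); // prints "true"
    }
}
```

For a real download you would pass url.openStream() and a FileOutputStream for content_save.txt in place of the in-memory streams used in main.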
 
Paul Clapham
Sheriff
Posts: 22489
If your goal is simply to download the data from a URL, then converting it from bytes to chars and then from chars back to bytes again is (a) a waste of time and (b) likely to mangle your data if you do it wrong.

And you did do it wrong. You used an InputStreamReader with your system's default charset, whereas the charset of the page is UTF-8 (which you can tell by looking at its contents).
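
To make the mangling concrete, here is a small sketch of what happens when UTF-8 bytes are decoded with the wrong charset. ISO-8859-1 stands in for "your system's default charset" here; that is an assumption, since the thread does not say what the default actually was. The class name is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; ISO-8859-1 is only a stand-in for the
// (unknown) platform default charset of the original poster.
public class CharsetMismatchDemo {

    public static void main(String[] args) {
        // Four Cyrillic letters, written as Unicode escapes.
        String original = "\u0412\u0438\u043a\u0438";

        // What the server sends: two bytes per Cyrillic letter in UTF-8.
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the right charset recovers the text.
        String right = new String(utf8, StandardCharsets.UTF_8);
        // Decoding with the wrong charset turns every byte into its own character.
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(original.equals(right)); // prints "true"
        System.out.println(original.equals(wrong)); // prints "false"
        System.out.println(wrong.length());         // prints "8"
    }
}
```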

So don't do that. Just copy from the URL's input stream to a FileOutputStream.
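
A rough sketch of that byte-for-byte copy, with no Reader or Writer involved, so no charset ever comes into play. The class and method names are made up for illustration, and the URL and file name are taken from the command line rather than hard-coded.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

// Hypothetical helper class: copies raw bytes, never decodes them.
public class RawDownload {

    // Copies every byte from in to out unchanged and returns the byte count.
    static long copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java RawDownload <url> <output-file>");
            return;
        }
        try (InputStream in = URI.create(args[0]).toURL().openStream();
             OutputStream out = new FileOutputStream(args[1])) {
            long written = copyBytes(in, out);
            System.out.println("Wrote " + written + " bytes to " + args[1]);
        }
    }
}
```

With the page from this thread that would be run as: java RawDownload http://ru.wikipedia.org/ content_save.txt. The saved file then has exactly the bytes the server sent, in whatever charset the page uses.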

And forget about ANSI versus not-ANSI. That isn't a useful concept in the real world where documents can be written with any of several dozen charsets.

Also, when you look at the file you write, don't forget to use a display tool which (a) reads the file using the correct charset and (b) is able to display non-Latin scripts.
 