Can't write a file in its original format

 
Ranch Hand
Posts: 43
Sir,
I used the following code to read a file from the web and write it to my computer, but the text saved on my computer is not the same as the text on the web page.

I can't find the cause. I think the text file uses ANSI format and the web page uses Unicode or UTF-8. Is that why I am getting different output, or is there some other cause?

I want to read and save the content in the format it already has:
if the file is in ANSI, it should be saved in ANSI;
if it is in some other format, it should be saved in that format.

Please tell me how to do this.

Below is the code I used, which fails to read and write in the original format:



I am getting a file named content_save.txt, but in that file the title of the web page and other content are not the same as on the actual page.
 
Bartender
Posts: 11497

"I am getting a file named content_save.txt, but in that file the title of the web page and other content are not the same as on the actual page."


What do you mean by "not the same as on the actual page"? Can you give an example?

 
pradipta kumar rout
Ranch Hand
Posts: 43
Yes.
Open the web page I mentioned ("http://ru.wikipedia.org/"), view its source, and save that source to a file manually.

Then use my code to save the same source to your computer. When you open that .txt file, you will see that the title of the web page and other content have been changed into some other format.
 
Maneesh Godbole
Bartender
Posts: 11497
Your problem is the encoding.
Try using an OutputStreamWriter with UTF-8 encoding.
 
pradipta kumar rout
Ranch Hand
Posts: 43
Thank you for your response.

Can you give me an example of how to read and write using the OutputStreamWriter class?
 
Maneesh Godbole
Bartender
Posts: 11497
The code which you posted, did you write it yourself?
Change the PrintStream to an OutputStreamWriter. You can specify the encoding in the constructor. Use its write() method to write to the file.
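For anyone following along, here is a minimal sketch of that suggestion. The original code is not shown in this thread, so the class name Utf8TextCopy and the method copyAsUtf8 are made up for illustration, and the sketch assumes the page really is UTF-8. It names the charset on both the reading and the writing side instead of relying on the system default.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the original poster's code.
class Utf8TextCopy {

    // Decodes the input as UTF-8 and re-encodes it as UTF-8 on the way out.
    // Assumption: the source really is UTF-8; malformed bytes would be
    // replaced by the decoder rather than copied through unchanged.
    static void copyAsUtf8(InputStream in, OutputStream out) throws IOException {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            // read() returns -1 at end of stream
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }
}
```

It could be called with the stream from the URL as the input and a FileOutputStream for content_save.txt as the output.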
 
Marshal
Posts: 28425
If your goal is simply to download the data from a URL, then converting it from bytes to chars and then from chars back to bytes again is (a) a waste of time and (b) likely to mangle your data if you do it wrong.

And you did do it wrong: you used an InputStreamReader with your system's default charset, whereas the charset of the page is UTF-8 (which you can tell by looking at its contents).

So don't do that. Just copy from the URL's input stream to a FileOutputStream.
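A rough sketch of what that byte-for-byte copy might look like follows. The names RawByteCopy, copyBytes and download are invented for this example. No Reader or Writer is involved, so no charset is ever applied and the saved file holds exactly the bytes the server sent.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

// Hypothetical helper, not the original poster's code.
class RawByteCopy {

    // Copies bytes unchanged from in to out and returns how many were copied.
    static long copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // read() returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Opens the URL and saves its raw bytes to the named file.
    static long download(String url, String fileName) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream();
             OutputStream out = new FileOutputStream(fileName)) {
            return copyBytes(in, out);
        }
    }
}
```

For the page in this thread the call would be something like download("http://ru.wikipedia.org/", "content_save.txt").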

And forget about ANSI versus not-ANSI. That isn't a useful concept in the real world where documents can be written with any of several dozen charsets.

Also, when you look at the file you write, don't forget to use a display tool which (a) reads the file using the correct charset and (b) is able to display non-Latin scripts.
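To see why the charset used for reading matters, here is a small illustrative sketch (CharsetMismatchDemo and decode are made-up names). It decodes the same UTF-8 bytes twice: once as UTF-8 and once as ISO-8859-1, which stands in here for whatever single-byte default a system might have. The second result is the kind of garbled text a viewer shows when it guesses the wrong charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
class CharsetMismatchDemo {

    // Interprets the given bytes as text in the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Four Cyrillic letters, written as escapes so the source file's
        // own encoding does not matter. Each takes two bytes in UTF-8.
        byte[] utf8 = "\u0412\u0438\u043a\u0438".getBytes(StandardCharsets.UTF_8);

        // Correct charset: the original four letters come back.
        System.out.println(decode(utf8, StandardCharsets.UTF_8));

        // Wrong charset: every byte becomes its own character.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

The bytes in the file are identical in both cases; only the interpretation differs, which is why the viewing tool's charset setting matters as much as the download code.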
 