Thank you for the reference to the explanation of text encoding. It was very helpful.
My program reads in text from a text file.
This is what part of the text looks like:
農場
farm
fan
fail
field
望む
hope
hold
hour
hop
The program then parses the text and writes several new text files from it.
This is how the program reads in the text:
BufferedReader br = new BufferedReader(new FileReader(block + ".txt"));
String line = "not null";
while (line != null){
    line = br.readLine();
    if (line != null){
        // Puts the line in a String[] and does some other processing
    }
}
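FileReader decodes the file with the platform's default charset, so a mismatch between that default and the file's real encoding is one thing worth ruling out. Below is a minimal sketch of the same read loop with the charset named explicitly. The class name ExplicitCharsetRead, the method readLines, the in-memory stand-in for the file, and the choice of UTF-8 are all assumptions for illustration; the file's actual encoding has not been confirmed.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ExplicitCharsetRead {
    // Reads every line from the stream, decoding bytes with the given
    // charset instead of the platform default that FileReader relies on.
    public static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the file contents: "nojo" and "farm" encoded as UTF-8.
        byte[] data = "\u8FB2\u5834\nfarm\n".getBytes(StandardCharsets.UTF_8);
        // For a real file, pass new FileInputStream(block + ".txt") instead.
        for (String line : readLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```

If the charset passed to readLines does not match the bytes, the Japanese lines come back as different characters, while plain ASCII lines such as farm are unaffected in ASCII-compatible encodings.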
This is how the program writes to one of the output files:
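for (int i = 0; i < osTotal; i++){
    // The attribute name and the row test are assumptions here:
    // BGCOLOR and alternating rows are the likeliest reading.
    bw.write("<TR BGCOLOR=");
    bw.write(34);
    if (i % 2 == 0){
        bw.write("#FFFFFF");
    }else{
        bw.write("#EEEEEE");
    }
    bw.write(34);
    bw.write("><TD>");
    bw.newLine();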
    bw.write(osAra[i].befUnd);
    // osAra[i].befUnd is the Japanese words 農場 and 望む
    bw.newLine();
    bw.write("<FONT COLOR=");
    bw.write(34);
    bw.write("#00FF00");
    bw.write(34);
    bw.write(">");
    bw.write(osAra[i].rightAns);
    // osAra[i].rightAns is the English translation of the Japanese,
    // specifically, farm and hope
    bw.write("</FONT>");
    bw.newLine();
    bw.newLine();
    bw.write("</TD></TR>");
    bw.newLine();
}
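How bw is created is not shown above, and the encoding used on output depends on that. As a rough sketch only, a writer with an explicit charset could be built like this; the class name ExplicitCharsetWrite, the method writeRow, the in-memory stream standing in for the output file, and the choice of UTF-8 are assumptions. It writes '\n' rather than calling newLine() so the result is the same on every platform.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetWrite {
    // Writes one table row for a word pair and returns the encoded bytes.
    public static byte[] writeRow(String word, String answer, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(out, charset))) {
            bw.write("<TR><TD>");
            bw.write('\n');
            bw.write(word);
            bw.write('\n');
            bw.write("<FONT COLOR=");
            bw.write(34); // write(int) emits a single character; 34 is the double quote
            bw.write("#00FF00");
            bw.write(34);
            bw.write(">");
            bw.write(answer);
            bw.write("</FONT>");
            bw.write('\n');
            bw.write("</TD></TR>");
            bw.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) before the bytes are read.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] row = writeRow("\u8FB2\u5834", "farm", StandardCharsets.UTF_8);
        System.out.println(new String(row, StandardCharsets.UTF_8));
    }
}
```

For a real file, a FileOutputStream would take the place of the ByteArrayOutputStream, and the same charset would need to be declared to whatever displays the HTML.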
This is part of the output file. The first four lines show what the program produced for
the words 農場 and farm. The last five lines show how it handled 望む and hope, which
matches how the rest of the text was processed and is what I expected.
<TR ><TD>
・<FONT COLOR="#00FF00">farm</FONT>
場
</TD></TR>
<TR ><TD>
望む
<FONT COLOR="#00FF00">hope</FONT>

</TD></TR>
As you can see, the 農 (no) of 農場 (nojo) is rendered unreadable, and the elements are
switched around.
Instead of
nojo [linebreak] <FONT COLOR="#00FF00">farm</FONT> [linebreak] [linebreak]
I have
[unreadable] <FONT COLOR="#00FF00">farm</FONT> [linebreak] jo [linebreak]
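
To see what the program actually holds in place of the unreadable character, each string could be dumped as code points just before it is written. A rough sketch, where the class name CodePointDump and the method dump are hypothetical names introduced for illustration:

```java
public class CodePointDump {
    // Formats each code point of s as U+XXXX, separated by spaces, so that
    // invisible or control characters become visible in a log.
    public static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump("\u8FB2\u5834")); // prints U+8FB2 U+5834
        System.out.println(dump("farm"));         // prints U+0066 U+0061 U+0072 U+006D
    }
}
```

A correctly decoded 農場 would print as U+8FB2 U+5834. Anything else in the dump of osAra[i].befUnd, such as U+FFFD (the replacement character) or a control character like U+000D, would point to where the text is being damaged.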