what i wanted to do is not easy to answer, sorry for the long post here.
my app was attempting to do crypto analysis of simple substitution.
i already learned how to break vigenere by re-writing an open source python program into java.
so i had some experience.
each language has its own letter frequencies, dutch and english for example since you know these.
using those freq tables, i can estimate how many of each letter to expect in a message of any given length.
for example, if the message is 20 chars long, i might expect 5 e's and 3 a's but no n's at all in english. those expectations will differ for dutch, where the n is more likely, so maybe we expect 2 n's.
(i didn't really compute those numbers, so it's just an example.)
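a minimal sketch of that expected-count idea: multiply each letter's relative frequency by the message length. the class name `ExpectedCounts` and the toy table are made up for illustration, and the numbers are not real english or dutch frequencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ExpectedCounts {
    // freq holds relative frequencies as fractions (0.25 means 25%).
    // returns the expected number of each letter in a message of the given length.
    static Map<Character, Double> expected(Map<Character, Double> freq, int messageLength) {
        Map<Character, Double> out = new LinkedHashMap<>();
        for (Map.Entry<Character, Double> e : freq.entrySet()) {
            out.put(e.getKey(), e.getValue() * messageLength);
        }
        return out;
    }

    public static void main(String[] args) {
        // toy table, NOT real language statistics
        Map<Character, Double> toy = new LinkedHashMap<>();
        toy.put('e', 0.25);
        toy.put('a', 0.15);
        toy.put('n', 0.10);
        System.out.println(expected(toy, 20));
    }
}
```

the expected counts come out as fractions, so they would need rounding before being compared with whole-number observed counts.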
you will not know what the language is beforehand, since it's all just scrambled letters, so you must try decoding for each possible language, trial and error style.
i was just finding the most common letter in the message, then re-mapping it to the letter it was most likely to be in the chosen language.
the hashmap i was creating is for the replacing: fakeLetter -> realLetter
i assumed that replacing by frequency alone would be perfect, but i was wrong, it gives such a poor result that it never creates something readable.
(i didn't ever figure out how to make the hashmap using the streams, i tested it a different way.)
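for anyone curious, a stream-based version of that fakeLetter -> realLetter map could look roughly like the sketch below. it is only an illustration of the rank-by-frequency idea described above: `FrequencyMapper`, `buildMap`, `decode` and the `languageOrder` parameter (the language's letters from most to least common) are all made-up names, and ties between equally common letters are broken alphabetically as an arbitrary choice.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class FrequencyMapper {
    // languageOrder: letters of the guessed language, most common first
    static Map<Character, Character> buildMap(String cipherText, String languageOrder) {
        // count how often each letter occurs in the message
        Map<Character, Long> counts = cipherText.chars()
                .filter(Character::isLetter)
                .mapToObj(c -> (char) Character.toLowerCase(c))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // rank cipher letters by count, highest first (ties: alphabetical)
        List<Character> ranked = counts.entrySet().stream()
                .sorted(Map.Entry.<Character, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<Character, Long>comparingByKey()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());

        // pair the i-th most common cipher letter with the i-th most common language letter
        Map<Character, Character> fakeToReal = new HashMap<>();
        for (int i = 0; i < ranked.size() && i < languageOrder.length(); i++) {
            fakeToReal.put(ranked.get(i), languageOrder.charAt(i));
        }
        return fakeToReal;
    }

    // replace every mapped letter, leave everything else (spaces, punctuation) alone
    static String decode(String cipherText, Map<Character, Character> fakeToReal) {
        StringBuilder sb = new StringBuilder();
        for (char c : cipherText.toLowerCase().toCharArray()) {
            sb.append(fakeToReal.getOrDefault(c, c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // toy message and a toy 3-letter "language order"
        Map<Character, Character> map = buildMap("xxxyyz", "eta");
        System.out.println(decode("xxxyyz", map));
    }
}
```

on a short message the ranking is mostly noise, which fits the poor results described above: only the few most common letters have much chance of landing in the right place.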
so my entire algorithm is a failure. my fault, i didn't look up the correct algorithm, i tried to design it myself.
at this site they say i should use a hill-climbing method and quadgrams.
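as far as i can tell, hill climbing for simple substitution is usually described like this: start from a random key, swap two letters of the key, keep the swap if the decoded text scores better on quadgrams, undo it otherwise, and stop after a long run of failed swaps. the sketch below is only my rough reading of that idea, not the method from any particular site. it assumes a table of quadgram log-probabilities already exists as a `Map<String, Double>`; the names `HillClimbSketch`, `floor` (the score given to unseen quadgrams) and `maxStale` are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class HillClimbSketch {
    // fitness: sum of log10 probabilities of every quadgram in the text
    static double score(String text, Map<String, Double> logProb, double floor) {
        double total = 0;
        for (int i = 0; i + 4 <= text.length(); i++) {
            total += logProb.getOrDefault(text.substring(i, i + 4), floor);
        }
        return total;
    }

    // key[i] is the plaintext letter that cipher letter ('a' + i) decodes to
    static String decode(String cipher, char[] key) {
        StringBuilder sb = new StringBuilder(cipher.length());
        for (int i = 0; i < cipher.length(); i++) {
            char c = cipher.charAt(i);
            sb.append(c >= 'a' && c <= 'z' ? key[c - 'a'] : c);
        }
        return sb.toString();
    }

    static void swap(char[] key, int a, int b) {
        char t = key[a];
        key[a] = key[b];
        key[b] = t;
    }

    // cipher is assumed to be lowercase a-z only
    static char[] climb(String cipher, Map<String, Double> logProb, double floor,
                        int maxStale, Random rnd) {
        char[] key = "abcdefghijklmnopqrstuvwxyz".toCharArray();
        for (int i = 25; i > 0; i--) {
            swap(key, i, rnd.nextInt(i + 1)); // shuffle into a random starting key
        }
        double best = score(decode(cipher, key), logProb, floor);
        int stale = 0; // swaps in a row that did not improve the score
        while (stale < maxStale) {
            int a = rnd.nextInt(26);
            int b = rnd.nextInt(26);
            if (a == b) continue;
            swap(key, a, b);
            double s = score(decode(cipher, key), logProb, floor);
            if (s > best) {
                best = s;
                stale = 0;
            } else {
                swap(key, a, b); // undo the swap, it did not help
                stale++;
            }
        }
        return key;
    }

    public static void main(String[] args) {
        // toy table with two entries, NOT real language data
        Map<String, Double> logProb = new HashMap<>();
        logProb.put("tion", -2.5);
        logProb.put("that", -2.6);
        char[] key = climb("qeboqflk", logProb, -8.0, 1000, new Random());
        System.out.println(decode("qeboqflk", key));
    }
}
```

a single climb can get stuck on a key that no single swap improves, so descriptions of this method usually restart from several random keys and keep the best-scoring result.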
i have asked here before and nobody at java ranch knows anything about crypto or anything related to it, such as hill climbing & chi square tests.
so i am left to figure a lot out on my own, very few websites deal with this, and the practical crypto site which is the best one to consult has no forum at all.
i would have to re-tool my app now for quadgrams, i don't know if i can even find tables for those, if not i would have to gather lots of texts just to analyze and build quadgram freq tables myself.
accurate calculation of this would need tons of plain texts in various languages, 1000 pages of text or more per language i think, not sure. finding all this text would be hard.
where can you download entire books of ascii texts in all languages?
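the table-building part itself looks small once the text is in hand. a rough sketch, assuming the text is already loaded into a String: slide a 4-letter window over the letters, count each quadgram, then turn the counts into log10 probabilities. `QuadgramTable` and its method names are made up, and stripping everything except a-z is a simplifying assumption that would throw away accented letters in other languages.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class QuadgramTable {
    // count every overlapping 4-letter sequence in the text
    static Map<String, Integer> count(String text) {
        // keep only a-z so spaces and punctuation never end up inside a quadgram
        String letters = text.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 4 <= letters.length(); i++) {
            counts.merge(letters.substring(i, i + 4), 1, Integer::sum);
        }
        return counts;
    }

    // convert raw counts into log10 of each quadgram's relative frequency
    static Map<String, Double> logProbabilities(Map<String, Integer> counts) {
        long total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Double> logProb = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            logProb.put(e.getKey(), Math.log10((double) e.getValue() / total));
        }
        return logProb;
    }

    public static void main(String[] args) {
        // toy input, a real table would need a large amount of text
        Map<String, Integer> counts = count("ABCD abcd");
        System.out.println(counts);
        System.out.println(logProbabilities(counts));
    }
}
```

log probabilities are used so that scoring a text becomes a sum instead of a product of many tiny numbers.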
but after all that i would still have to figure out a hill climbing algorithm. so this isn't a very beginner type of project. this is why i just ask how to do certain things and not give much details about what i'm trying to do.
because you all will think i'm crazy for trying it!
and on top of that you won't understand it, which is funny since i had thought everyone here was taught crypto breaking stuff like this at uni. i'm not in uni, i'm just teaching myself.
i have spent a year creating this thing so far. it has charts and graphs to show observed and expected counts, since i coded the ui in JavaFX. i learned a lot doing it even if i can't finish it.
i would be very sad if i can't finish it though since i put so much work into it.