AES without base 64?

 
Ranch Hand
Posts: 36
Does anyone have a source code example showing how to use AES in Java without transforming strings into base 64?  For some reason all of the examples that I've found do that, and when I try not to, I get errors: it sometimes seems to cut strings short or mess them up when converting them into byte arrays.

Anyway, I really need to avoid using base 64, because for one thing it's less efficient for memory, but more importantly, I need to make my files compatible with the equivalent algorithm already implemented in C#, and it does not use base 64, but rather it just takes plain strings (they're ASCII, and I guess Java uses Unicode, but regardless, I really just need a way to pull the string/array in as is, no matter what the data contains, and encrypt it straight from that).

Help would be much appreciated, thank you!
 
Saloon Keeper
Posts: 11472
Encryption works on bytes. So if you have a string, you first need to encode it to a byte array in some way before you can encrypt it.
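A minimal sketch of that first step, assuming the text is encoded with an explicitly named charset. The class name `EncodeFirst` and the sample text are made up for illustration; UTF-8 is only an assumed choice here.

```java
import java.nio.charset.StandardCharsets;

final class EncodeFirst {
    // Turn text into the bytes a cipher will operate on, using a named charset.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Turn decrypted bytes back into text with the same charset.
    static String toText(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("Hello, AES");
        System.out.println(bytes.length);   // number of bytes handed to the cipher
        System.out.println(toText(bytes));  // the same text back
    }
}
```

Only the plain text ever passes through a charset like this. The cipher's output is a separate `byte[]` and is not text.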

How to encrypt the data exactly depends on what you're going to do with it afterwards. Are you going to send it to somebody else? Are you going to store it on the same machine, to decrypt it later? Is the storage long term or short term?
 
Terrance Samson
Ranch Hand
Posts: 36
Oh, sorry, I didn't realize there was a section for security or I would have put it there.

I know all about different algorithms for different purposes and all of that, but it's not really the issue here.

The problem is that I just don't seem to be able to encrypt without first converting to base 64, which I do not want to do, but all examples seem to do it, so I was wondering if anyone could paste or point me to some source code that does it without the conversion.

I think something might possibly be going wrong when I convert between a string and byte array but I'm not sure.  I just know that the result of the whole thing is wrong, and when I use base 64 it seems to be able to encrypt and then decrypt back to the original data, but when I don't use base 64 it can't.  Also it seems like the string might have been somehow cut short when converting to a byte array, but there weren't any null characters or anything that might trick it into thinking that it's the end of the string (though I can't promise that would always be the case - I was just using some test data).
 
Stephan van Hulst
Saloon Keeper
Posts: 11472
Please post what you've tried, including your edits to encrypt a byte array instead of a Base64 string. We'll comment on what the problem is.

Terrance Samson wrote:I know all about different algorithms for different purposes and all of that, but it's not really the issue here.


Well if you want to see example code, it kind of is an issue. I cannot give you a general-purpose algorithm that uses AES, because it will get used in a context for which it was not intended. You need to use an appropriate block cipher mode and padding scheme.
 
Ranch Foreman
Posts: 125
Why do you not want to use Base64?
It's an encoding which can be very easily undone.
 
Stephan van Hulst
Saloon Keeper
Posts: 11472
Well I certainly understand that desire. Base64 has absolutely nothing to do with encryption. Why deal with Base64 at all if all you need to do is encrypt a message?
 
Terrance Samson
Ranch Hand
Posts: 36

Zachary Griggs wrote:Why do you not want to use Base64?
It's an encoding which can be very easily undone.



As I said, it increases the size of the data, and it's also not compatible with the already existing C# version of the application which must be compatible with the Java one that I'm making, so that files written using either application can be read using the other one.

As for example code, I might be able to get you some when I have time but all I really need is something like public String encrypt(String plainText, String key) and public String decrypt(String cipherText, String Key) or byte arrays in place of strings, or whatever.  And I don't really care about the padding mode and all that, because I can adjust those settings as necessary.

Really, the only thing that's not working is that when I use base 64 it encrypts and decrypts just fine, but when I remove the base 64 conversion, it doesn't.
 
Zachary Griggs
Ranch Foreman
Posts: 125
My understanding is that base64 is used to convert the result of the AES algorithm to a String because if you don't, the returned data will have non-printable characters in it. How are you testing your code without base64? Are you copy/pasting the output from the encrypt into the decrypt? If so, it may be charset issues causing the problem, not an actual issue with the algorithm. But yes, without code posted, we can only guess at the problem.
 
Stephan van Hulst
Saloon Keeper
Posts: 11472

Zachary Griggs wrote:How are you testing your code without base64? Are you copy/pasting the output from the encrypt into the decrypt? If so, it may be charset issues causing the problem, not an actual issue with the algorithm.


You can easily pass around byte arrays without transporting them as Base64. Again, Base64 is unrelated to encryption/decryption.

But yes, without code posted, we can only guess at the problem.


Agreed. It's time for Terrance to show us what he's tried.
 
Terrance Samson
Ranch Hand
Posts: 36
Alright, I said I'd supply the code, and here it is.  I pretty much got this from examples on the Internet:





That works fine, but with base 64.  When I try without it I change the return line in encrypt to:

return Base64.getEncoder().encodeToString(cipher.doFinal(strToEncrypt.getBytes("UTF-8")));

And I change the return line in decrypt to:

return new String(cipher.doFinal(Base64.getDecoder().decode(strToDecrypt)));

But doing so causes it to print this to the console:

PêŠSò)ú/{]HK%Ç76bYp片ɨq=

© ï†û8¼­;²ÃPrc‘œLóEYɾ6?³ 7𪾯E
Error while decrypting: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
null

So it seems to encrypt, though possibly not correctly, then there's an error when decrypting, which seems to have to do with padding.  Honestly I didn't remember what it was doing until I saw it just now, but I don't see why there would be an error, since the only difference is whether or not I use base 64.  In any case, I can try using the ToString function instead of the String constructor, but I don't think it makes a difference.
 
author
Posts: 23868
Without the encoding, the result is binary data. It can't be held in a string. Doing so will likely corrupt the binary data, so it can no longer be decrypted correctly.
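The corruption can be reproduced without any cipher at all. The sketch below pushes a few arbitrary bytes (a stand-in for ciphertext, not real output) through a `String` and back. It assumes UTF-8 as the charset; the code in question may be using the platform default, but the effect is the same for any charset that cannot represent every byte sequence.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class StringRoundTrip {
    // Push raw bytes through a String and back, as a String-based decrypt path does.
    static byte[] throughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for ciphertext: arbitrary bytes that are not valid UTF-8.
        byte[] raw = {80, -22, -118, 83};
        byte[] back = throughString(raw);
        System.out.println(Arrays.toString(raw));
        System.out.println(Arrays.toString(back));
        System.out.println(Arrays.equals(raw, back)); // false: the bytes were altered
    }
}
```

Once the bytes differ, the cipher sees a damaged final block, which is consistent with the `BadPaddingException` reported above.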
 
Zachary Griggs
Ranch Foreman
Posts: 125

Stephan van Hulst wrote:

Zachary Griggs wrote:How are you testing your code without base64? Are you copy/pasting the output from the encrypt into the decrypt? If so, it may be charset issues causing the problem, not an actual issue with the algorithm.


You can easily pass around byte arrays without transporting them as Base64. Again, Base64 is unrelated to encryption/decryption.


Yes - but as shown, he is attempting to use Strings, not bytes. This is why Base64 is used. It's not directly related to encryption/decryption, but it is a binary-to-text encoding that is used to make the result of encrypting/decrypting printable. That is the problem the poster is running into.

I would recommend either converting to Base64 or storing the result as bytes rather than a String.
 
Terrance Samson
Ranch Hand
Posts: 36

Henry Wong wrote:Without the encoding, the result is binary data. It can't be held in a string. Doing so will likely corrupt the binary data, so it can no longer be decrypted correctly.



Alright, so then how would I fix those lines of code?  I'm not sure of a better alternative way which would achieve what I want.
 
Zachary Griggs
Ranch Foreman
Posts: 125
The code needs to either directly use bytes, or to base64 encode it (or some similar encoding that can represent all the bytes AES uses)
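A sketch of the "directly use bytes" option: both methods take and return `byte[]`, and no Base64 or `String` ever touches the ciphertext. The transformation `AES/CBC/PKCS5Padding`, the random demo key and IV, and the class name `AesBytes` are all assumptions made for illustration. To interoperate with the C# program, the mode, padding, key and IV would have to match whatever that program actually uses, which this thread does not establish.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class AesBytes {
    // Assumed transformation; change it to whatever the C# side really uses.
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    static byte[] encrypt(byte[] plain, byte[] key, byte[] iv) {
        return run(Cipher.ENCRYPT_MODE, plain, key, iv);
    }

    static byte[] decrypt(byte[] encrypted, byte[] key, byte[] iv) {
        return run(Cipher.DECRYPT_MODE, encrypted, key, iv);
    }

    private static byte[] run(int mode, byte[] input, byte[] key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES operation failed", e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16]; // 128-bit demo key; a real key comes from elsewhere
        byte[] iv = new byte[16];  // one AES block
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] plain = "plain ASCII text".getBytes(StandardCharsets.US_ASCII);
        byte[] encrypted = encrypt(plain, key, iv);     // stays a byte[], never a String
        byte[] decrypted = decrypt(encrypted, key, iv);
        System.out.println(new String(decrypted, StandardCharsets.US_ASCII));
    }
}
```

Only the decrypted plain text is turned back into a `String`; the encrypted array is written to a file or passed along as is.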
 
Terrance Samson
Ranch Hand
Posts: 36

Zachary Griggs wrote:

Stephan van Hulst wrote:

Zachary Griggs wrote:How are you testing your code without base64? Are you copy/pasting the output from the encrypt into the decrypt? If so, it may be charset issues causing the problem, not an actual issue with the algorithm.


You can easily pass around byte arrays without transporting them as Base64. Again, Base64 is unrelated to encryption/decryption.


Yes - but as shown, he is attempting to use Strings, not bytes. This is why Base64 is used. It's not directly related to encryption/decryption, but it is a binary-to-text encoding that is used to make the result of encrypting/decrypting printable. That is the problem the poster is running into.

I would recommend either converting to Base64 or storing the result as bytes rather than a String.



Well like I said, I don't want to use base 64, because that's the whole thing that I'm trying to avoid.  As for storing in bytes, I guess that's alright, except that ultimately I'll need to be able to read it as a string, because:

1: I need to test to make sure that the data is alright, so if I can read a string that makes it a lot easier.
2: Ultimately this will be used for encrypting/decrypting text, among other things, but remember, the files that it reads/writes must be compatible with a C# program which already exists, does not use base 64, and uses ASCII strings (as far as I can tell, but ultimately it stores RTF, but I'm pretty sure it's using ASCII to store the RTF, but with formatting that can make Unicode characters as necessary).
 
Zachary Griggs
Ranch Foreman
Posts: 125
Can you save it as a binary file which then feeds in to your other application?
I recommend unit tests to test functionality, which would not necessarily require strings.
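A sketch of what saving as a binary file and checking it without strings could look like. It uses `Files.write` and `Files.readAllBytes` on a temporary file; the class name, the file name prefix and the sample bytes are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class BinaryFileRoundTrip {
    // Write the bytes to a temporary binary file, read them back, then clean up.
    static byte[] roundTrip(byte[] data) {
        try {
            Path file = Files.createTempFile("encrypted", ".bin"); // hypothetical name
            try {
                Files.write(file, data);          // raw bytes, no charset involved
                return Files.readAllBytes(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {80, -22, -118, 83, 0, 127, -128}; // stand-in for ciphertext
        // A test can compare arrays directly instead of reading a String.
        System.out.println(Arrays.equals(data, roundTrip(data))); // true
    }
}
```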
 
Terrance Samson
Ranch Hand
Posts: 36
Sorry for posting so much, but people keep posting while I do, and then I notice something new to which I want to reply:

Zachary Griggs wrote:The code needs to either directly use bytes, or to base64 encode it (or some similar encoding that can represent all the bytes AES uses)



I can't encode it in base 64 or anything like that!  I keep telling you all that it needs to be in a format that is compatible with the C# version of the program, and that one uses:

- Regular strings/bytes/whatever with NO base 64 conversion
- ASCII strings 1-byte-per-character for storing RTF, which sometimes uses non-ASCII-compatible Unicode characters specified as RTF commands within the string (like \rquote for a closing quotation mark, for example), but I don't think that should affect this because it could just be treated as a regular ASCII string

Basically, the C# program takes a text string in ASCII format, does NOT convert it to base 64 or anything else, and just straight encrypts it and saves it in a file.  It can also read from the file and decrypt the text.  It doesn't even have to be text, because it also can encrypt/decrypt any data, but whenever it is text, it's ASCII default byte-sized characters, with NO base 64 conversion.

What I need the Java program to do is exactly the same thing as that, and whatever is encrypted with either program must be decryptable with the other program, so that it turns out identical to the original, and that if it happens to be text then it must still be readable in both programs.

I'm sorry to sound harsh, but it's just frustrating, and it seems like I have to keep saying that I can't use base 64, and people keep suggesting, "Well why don't you just use base 64 then?"

Also, I do appreciate the suggestions but it would be really nice to get a bit of source code, even if it's just to modify a couple of my lines to get it to do what I want, because if you just explain, "do it like this..." but don't provide any code then I may or may not be able to get it to do what you're saying.  I mean the whole point is that if I could get it to do anything just from a description of what I want then I wouldn't even be having this problem in the first place, because I already know what I want it to do, but I can't seem to figure out how to get it to do it.  Keep in mind that I'm very rusty with Java and haven't used it in about 12 years until I just made this program, so bear with me please.
 
Terrance Samson
Ranch Hand
Posts: 36

Zachary Griggs wrote:Can you save it as a binary file which then feeds in to your other application?
I recommend unit tests to test functionality, which would not necessarily require strings.



Well, I suspect one of my problems is going to also be that Java uses Unicode and C# uses ASCII, so it might also be necessary for me to have functions in Java which can translate back and forth between ASCII and Unicode, but I'm not sure what functions those would be.
 
Henry Wong
author
Posts: 23868

Terrance Samson wrote:
Well, I suspect one of my problems is going to also be that Java uses Unicode and C# uses ASCII, so it might also be necessary for me to have functions in Java which can translate back and forth between ASCII and Unicode, but I'm not sure what functions those would be.



Neither Unicode nor ASCII can hold binary data. You need to go through all the code that uses either to hold the encrypted data and convert it to use something that can hold the data (such as a byte array). You also need to modify all the code that works with that data type to use it correctly.

This is obviously not a one or two line change.

Henry
 
Henry Wong
author
Posts: 23868

Terrance Samson wrote:
1: I need to test to make sure that the data is alright, so if I can read a string that makes it a lot easier.



Unfortunately, this is not possible by converting directly to strings. You need to write code that can check the binary data, which may have to include a loop that displays all the bytes as numbers.
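One way such a loop might look is a plain hex dump, which gives a readable view of the data without ever treating it as text. The class and method names are illustrative and the bytes are sample values.

```java
final class HexDump {
    // Render each byte as two hex digits, separated by spaces.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 3);
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF)); // mask to 0..255 first
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {80, -22, -118, 83};
        System.out.println(toHex(sample)); // prints "50 ea 8a 53"
    }
}
```

The same dump is easy to produce on the C# side, so two files can be compared byte for byte.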

Terrance Samson wrote:
2: Ultimately this will be used for encrypting/decrypting text, among other things, but remember, the files that it reads/writes must be compatible with a C# program which already exists, does not use base 64, and uses ASCII strings (as far as I can tell, but ultimately it stores RTF, but I'm pretty sure it's using ASCII to store the RTF, but with formatting that can make Unicode characters as necessary).



Binary data can't really be held using ASCII. So, there is either some sort of lack of checking on the C# side, or the C# code is using some sort of encoding that is not base64. I would advise against guessing, as you can't replicate it on the Java side, if you are not sure what the C# side is doing.

EDIT: BTW, forgot to mention. If the file is in RTF format, then, your Java side needs to process that too.

Henry
 
Zachary Griggs
Ranch Foreman
Posts: 125
I understand that you don't want to use encoding or direct bytes. I'm simply explaining why that's not going to work as you want it because:

Binary data can't really be held using ASCII



--

When you create your cipher, it returns bytes. These are the values of the first few bytes:
byte[0] = 80
byte[1] = -22
byte[2] = -118
byte[3] = 83
So, how are you going to represent -22 as an ASCII character? You can't. It's out of the spec.
I promise that the C# application isn't taking these bytes and converting them directly to ASCII either. Since it's not possible.
There isn't really a simple way you can make this work in ASCII without an encoding, which is why all examples online posted of this snippet use an encoding to show the value to you.

Can you post the C# source for the section where it encrypts?
 
Terrance Samson
Ranch Hand
Posts: 36

Henry Wong wrote:
Neither Unicode nor ASCII can hold binary data. You need to go through all the code that uses either to hold the encrypted data and convert it to use something that can hold the data (such as a byte array). You also need to modify all the code that works with that data type to use it correctly.



You may have slightly misinterpreted me, or maybe I misspoke, but I didn't mean that it needs to be a string and can't be a byte array.  All I meant is that ultimately a string is essentially a byte array, or an array of numbers, each of which is interpreted as a character, either in ASCII or in Unicode, but the difference is that in ASCII each character is one single byte, so if I convert that to a byte array, I'd think that I'd get a different result than if I convert a Unicode string to a byte array, because then each character would become two bytes in the array, right?  So the resulting array would be different even with the same readable text.  So what I need is to make an ASCII string in Java, or otherwise take a Unicode string (which should only have ASCII compatible characters, theoretically, considering what I'm doing) and convert that into either an ASCII string or a byte array equivalent of an ASCII string, but not a Unicode string.  But why should this be so difficult to do in Java?
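The "two bytes per character" worry only applies if the string is encoded as UTF-16. A short sketch, with an illustrative class name and sample text, comparing the byte counts Java produces for pure ASCII text under three standard charsets:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodedLengths {
    // Number of bytes the text occupies once encoded with the given charset.
    static int length(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "Hello"; // ASCII-only sample
        System.out.println(length(text, StandardCharsets.US_ASCII)); // 5
        System.out.println(length(text, StandardCharsets.UTF_8));    // 5
        System.out.println(length(text, StandardCharsets.UTF_16LE)); // 10
        // For ASCII-only text the ASCII and UTF-8 encodings are byte-for-byte identical.
        System.out.println(Arrays.equals(
                text.getBytes(StandardCharsets.US_ASCII),
                text.getBytes(StandardCharsets.UTF_8)));             // true
    }
}
```

So getting "the byte array equivalent of an ASCII string" is a single `getBytes` call with the right charset argument.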

Henry Wong wrote:
Binary data can't really be held using ASCII. So, there is either some sort of lack of checking on the C# side, or the C# code is using some sort of encoding that is not base64. I would advise against guessing, as you can't replicate it on the Java side, if you are not sure what the C# side is doing.

EDIT: BTW, forgot to mention. If the file is in RTF format, then, your Java side needs to process that too.

Henry



For one thing, I'm absolutely certain that the C# version isn't using base 64, because I never converted to that, and because if I did then it would be taking more bytes, which it isn't.

And I don't think that RTF will make a difference, because it's still just ASCII text.  I know this because if I open a RTF file using Notepad it prints like regular text with a bunch of extra stuff in it for formatting, but that also prints as regular text, so all I'm doing is using that as a text string in C# and I want to do the same thing in Java.

Zachary Griggs wrote:
When you create your cipher, it returns bytes. These are the values of the first few bytes:
byte[0] = 80
byte[1] = -22
byte[2] = -118
byte[3] = 83
So, how are you going to represent -22 as an ASCII character? You can't. It's out of the spec.
I promise that the C# application isn't taking these bytes and converting them directly to ASCII either. Since it's not possible.
There isn't really a simple way you can make this work in ASCII without an encoding, which is why all examples online posted of this snippet use an encoding to show the value to you.

Can you post the C# source for the section where it encrypts?



Well first of all, that tells me that it's using -22 as a signed byte, but I really need unsigned bytes ranging from 0 to 255 because that's what ASCII is, after all.  Unicode has an even greater range, but neither use negative numbers, so I don't know why it's doing that, unless for some reason it just wants to interpret it as a signed byte even though given the context of text, only unsigned bytes are reasonable.  Can't I make an array of unsigned bytes and use that instead?
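On the signed-versus-unsigned point: a Java `byte` is always signed, but it carries the same 8 bits that C# would show as 0 to 255, so no separate unsigned array is needed for the data to be compatible. A small sketch using `Byte.toUnsignedInt` (available since Java 8); the class name is illustrative:

```java
import java.util.Arrays;

final class UnsignedView {
    // View each byte as the 0..255 value that the same bits represent.
    static int[] unsignedValues(byte[] data) {
        int[] out = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = Byte.toUnsignedInt(data[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] sample = {80, -22, -118, 83};
        System.out.println(Arrays.toString(unsignedValues(sample))); // [80, 234, 138, 83]
        System.out.println((byte) 234 == -22); // true: same bit pattern, different reading
    }
}
```

The `int` values exist only for display. What gets encrypted, stored or written to a file is still one byte per element.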

As for posting the C# code, that's really going to be tricky, because for security purposes, I really can't put it online.  The only reason why I can put the Java version online is because at this stage it's just a simple experimental thing, whereas the C# one has a ton of other stuff, and I can't just pull out a little piece of the code, because at the moment it's inaccessible to me (to make an incredibly long story short, it was on an air-gapped computer which needed temporary Internet access, so it had to be thoroughly reformatted first, and that code can't be put back onto it until it no longer needs Internet access and can once again become air-gapped, but at the moment it still needs the Internet, and ideally it would be very convenient if I could figure this out before I re-air-gap the computer).
 
Zachary Griggs
Ranch Foreman
Posts: 125
Unsigned bytes (or unsigned integer types in general, apart from the 16-bit char) don't exist in Java unless you emulate them yourself.
You can probably loop through the array of bytes and wrap any negative numbers around to positive numbers. As long as you do it consistently on the decrypt side it should still work. That would probably allow you to map it to ASCII. You'd have to figure out a way to know that those values were changed from negative to positive, I guess.
I can't say if this will match what the C# program does, though, since I don't know how that behaves.

Here is a sample "solution" which prints it in ASCII.. which is actually just a custom encoding to treat it like unsigned bytes. No guarantee this matches your other application.
It's just an example, it isn't good code

Encrypt:


Decrypt:


intArrayToString


Output:
 
Terrance Samson
Ranch Hand
Posts: 36
Zachary:

Seriously, unsigned doesn't exist in Java?  Well that's hardly compatible with any data from other languages!

Thanks for the code, and I figured it might involve something like that, except that wouldn't translating to ints cause it to use a larger data type, therefore taking extra space and padding all bytes with like 3 extra bytes of 0 before them?

Though I see from your output that somehow it worked even despite that!  EDIT: I think I see why - you're converting back and forth between chars and ints, but you're encrypting while they're ints, right?  So then it will increase the size MUCH larger (and sometimes I'm using files that could be gigabytes in size, so I really don't want to increase them), and also, since the C# version doesn't do that it wouldn't be compatible, anyway.  Please correct me if I'm wrong though, and it's not actually increasing the size.  Actually though, judging by your encrypted string, it looks even smaller!  How did you manage that?

I also looked at my code again and it seems like I actually am converting from Unicode to ASCII by using UTF-8, which supposedly works up to character 127, and as far as I know I'm not using any characters after that (at least in the test data, though I will occasionally in the real data, so it will be necessary, but it wouldn't necessarily explain why testing isn't working as well as I'd hoped).

I also just now tried changing the getBytes() in the decrypt function to getBytes("UTF-8"), and unfortunately that didn't fix it but it did change from a BadPaddingException to an IllegalBlockSizeException, so maybe that's progress.  By the way, I don't know why it takes a string rather than an enumeration to set the mode to UTF-8 - that seems very silly to me.

I think I'll make a simple experimental C# program now, which will encrypt/decrypt and save files, and then I'll put the code here, so that we'll have something to compare to the Java one that I'm trying to make.

EDIT: Looking at it some more, I noticed that a char seems like it can be treated as a positive number which can exceed 127 (and may even be unsigned), so what do you think of the idea of using a char array instead of a byte or int array?  Though I suspect a char will take two bytes, so that might not quite fix the problem.  Also, I suspect that your negation of negative numbers or numbers greater than 127 is causing them to reverse their order, because if you take negative numbers and negate them to make them positive, the ones that were higher will become lower and vice-versa.

EDIT: As I was afraid would happen, I printed the length of the original unencrypted string and it's 48 characters, but then when I printed the length of the toDecrypt array (which is of course encrypted) it's 64 bytes, so it's expanding by one third (I don't know why it's that amount).
 
Henry Wong
author
Posts: 23868

Terrance Samson wrote:
But why should this be so difficult to do in Java?



To start, ASCII codes are *not* a single byte... To be exact, each one is 7 bits long. Historically, most systems (that store and transfer the data) just use 8 bits (a byte), with the high bit being zero. This is because it is a lot easier to repurpose hardware and code for it than to redevelop for 7 bits.

.... and Unicode is backward compatible with ASCII codes. Basically, all valid ASCII codes are Unicode. This is because of how Unicode is defined. Since ASCII codes are 7 bits long (with the 8th bit being zero), the Unicode code points for the ASCII range are the same. If it is valid ASCII, then it is valid Unicode. The enhancement is when the 8th bit is not zero -- in UTF-8, that is when a character takes more than one byte.

So... Java strings are Unicode *and* they are also ASCII strings. And you can't hold 8 bit data with ASCII strings (not valid ASCII anyway).

Henry
 
Henry Wong
author
Posts: 23868

Terrance Samson wrote:As I was afraid would happen, I printed the length of the original unencrypted string and it's 48 characters, but then when I printed the length of the toDecrypt array (which is of course encrypted) it's 64 bytes, so it's expanding by one third (I don't know why it's that amount).



Encrypted data are generally larger than their original data counterpart. This is because encrypted data also hold stuff needed to hide the original data. Additionally, many encryption algorithms purposely increase the size in a way that can't be calculated. This is because the algorithm is protecting the details of the original data -- which includes the size of the original data.

Henry  
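For the specific 48-to-64 case there is a simple arithmetic explanation, assuming the cipher was set up with PKCS5 padding (the earlier `BadPaddingException` suggests padding is in use, but the original code is not shown). That padding always adds between 1 and 16 bytes, so an input that is already a multiple of 16 gains one full block. A sketch that computes the length and checks it against a real cipher; ECB and an all-zero key are used only because they need no IV, not as a recommendation:

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class PaddingGrowth {
    // PKCS5/PKCS7 padding appends 1 to 16 bytes, so the result is the next
    // multiple of the 16-byte AES block size strictly above the input length.
    static int paddedLength(int plainLength) {
        return (plainLength / 16 + 1) * 16;
    }

    // Measure the real ciphertext length for an input of the given size.
    static int measuredLength(int plainLength) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(new byte[16], "AES"));
            return cipher.doFinal(new byte[plainLength]).length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(paddedLength(48) + " " + measuredLength(48)); // 64 64
        System.out.println(paddedLength(47) + " " + measuredLength(47)); // 48 48
    }
}
```

Under that assumption the growth is at most 16 bytes regardless of file size, so the one-third ratio is a coincidence of the 48-byte test string.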
 
Marshal
Posts: 25197
In particular, Base-64 expands the original data by a factor of one-third.
 
Terrance Samson
Ranch Hand
Posts: 36

Henry Wong wrote:
Encrypted data are generally larger than their original data counterpart. This is because encrypted data also hold stuff needed to hide the original data. Additionally, many encryption algorithms purposely increase the size in a way that can't be calculated. This is because the algorithm is protecting the details of the original data -- which includes the size of the original data.

Henry  



Well I know for a fact that the C# program that I'm using does not expand the data at all.  If I start with a 94,820,763 byte unencrypted file then when I encrypt it, I get a 94,820,763 byte encrypted file.  And it's using AES.
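An identical size is informative in itself: 94,820,763 is not a multiple of 16, so the output cannot come from a padded block mode, which always produces whole 16-byte blocks. One possible explanation, and it is only a guess since the C# code is not available, is a mode that needs no padding. The sketch below uses `AES/CTR/NoPadding` purely to show that such length-preserving AES modes exist in Java; the class name, key and IV are illustrative.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class LengthPreserving {
    // CTR turns AES into a stream cipher, so output length equals input length.
    // Assumed mode for illustration only; the C# program's mode is unknown.
    static byte[] apply(int mode, byte[] input, byte[] key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] plain = new byte[11]; // deliberately not a multiple of 16
        byte[] encrypted = apply(Cipher.ENCRYPT_MODE, plain, key, iv);
        System.out.println(encrypted.length); // 11
    }
}
```

Whatever the C# side turns out to use, the Java transformation string has to name the same mode and padding, or the two programs will not read each other's files.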

Henry Wong wrote:To start, ASCII codes are *not* a single byte... To be exact, each one is 7 bits long. Historically, most systems (that store and transfer the data) just use 8 bits (a byte), with the high bit being zero. This is because it is a lot easier to repurpose hardware and code for it than to redevelop for 7 bits.



Are you absolutely certain that ASCII never uses all 8 bits?  I was pretty sure that it does, but that half of the values are stuff like the Greek alphabet and other characters which aren't used as often.

Paul Clapham wrote:In particular, Base-64 expands the original data by a factor of one-third.



That doesn't make sense, because I specifically disabled the part that converts between base 64, so I'm just using regular byte data, like Zachary suggested with his code samples (which I'm using), but it's still one third larger!

EDIT: By the way, I'm trying to get an example C# program to work, but so far I'm getting an exception that says "Padding is invalid and cannot be removed" when I try to decrypt it, so I'll have to get back to you about that when I get it working, and I'll get the code posted here.
 
Henry Wong
author
Posts: 23868

Terrance Samson wrote:
Well I know for a fact that the C# program that I'm using does not expand the data at all.  If I start with a 94,820,763 byte unencrypted file then when I encrypt it, I get a 94,820,763 byte encrypted file.  And it's using AES.



AES is not the primary encryption that I use -- so, I can't give any more information in that regard. Sorry.

Terrance Samson wrote:
Are you absolutely certain that ASCII never uses all 8 bits?  I was pretty sure that it does, but that half of the values are stuff like the Greek alphabet and other characters which aren't used as often.



There have been many extended ASCII formats over the years. Many of these are now replaced with Unicode. If this is one of the formats that you are using, there isn't much that Java can do to help. You will definitely need to use raw bytes -- and deal with all format specific stuff yourself.

Henry
 
Zachary Griggs
Ranch Foreman
Posts: 125
What I was doing in my code was grabbing the encrypted bytes, turning the negatives into positives and adding 127 to them, and then just reversing the operation when decrypting. The string returned doesn't have any additional complexity/length in it. The int is just a necessary conversion in order to treat the values like unsigned bytes.

It's a pain for me that Java doesn't have unsigned stuff as well.
When I need to use an unsigned int, I use a long and then perform some bit manipulation in order to normalize it into what the equivalent unsigned int would be.
For unsigned byte, you just need to use int basically.
This is a code sample of how I used an "unsigned" integer in Java when I needed to use it to replicate an algorithm implemented in C++: https://github.com/zach-cloud/StringHashBreaker/blob/master/src/main/java/SStrHash2.java
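The more common Java idiom for treating signed values as unsigned is a bit mask rather than arithmetic. A minimal sketch follows; the class and method names are made up for illustration and are not taken from the linked repository.

```java
final class UnsignedDemo {
    // Interpret a signed Java byte (-128..127) as an unsigned value (0..255).
    static int toUnsignedByte(byte b) {
        return b & 0xFF; // same result as Byte.toUnsignedInt(b)
    }

    // Interpret a signed Java int as an unsigned 32-bit value held in a long.
    static long toUnsignedInt(int i) {
        return i & 0xFFFFFFFFL; // same result as Integer.toUnsignedLong(i)
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedByte((byte) -1)); // 255
        System.out.println(toUnsignedInt(-1));         // 4294967295
    }
}
```

Note that a byte array handed to a cipher needs no such conversion: the bit patterns are the same in Java and C#, and only the printed interpretation of each byte differs.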

It seems to me that the C# program is not using plain AES. It's using AES + another transformation on the data, which is why the plain AES on the Java side is not working as you expect.
I believe ASCII used to use 7 bits only, but now has an extended charset to use all 8 bits. At least, Java treats it that way.
 
Stephan van Hulst
Saloon Keeper
Posts: 11472
247

Terrance Samson wrote:You may have slightly misinterpreted me, or maybe I misspoke, but I didn't mean that it needs to be a string and can't be a byte array.  All I meant is that ultimately a string is essentially a byte array, or an array of numbers, each of which is interpreted as a character, either in ASCII or in Unicode, but the difference is that in ASCII each character is one single byte, so if I convert that to a byte array, I'd think that I'd get a different result than if I convert a Unicode string to a byte array, because then each character would become two bytes in the array, right?  So the resulting array would be different even with the same readable text. So what I need is to make an ASCII string in Java, or otherwise take a Unicode string (which should only have ASCII compatible characters, theoretically, considering what I'm doing) and convert that into either an ASCII string or a byte array equivalent of an ASCII string, but not a Unicode string. But why should this be so difficult to do in Java?


Ultimately a string is not a byte array. A string is a sequence of characters. Characters can have any internal representation, and are not necessarily encoded as bytes. If you want to convert a string to a byte array, you can only do so by specifying an encoding. This encoding is completely unrelated to whatever Java or C# use to store characters internally. The encoding you use in the Java version of your code must match the encoding you use in the C# version of your code. This is likely to be UTF-8.
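To make the point about encodings concrete, here is a small sketch showing that the same text yields different byte arrays depending on the charset chosen. The class name is hypothetical. On the C# side the corresponding choices would be `Encoding.ASCII`, `Encoding.UTF8` and `Encoding.Unicode` (UTF-16LE); which one the existing C# program uses is not known from this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    // Convert text to bytes using an explicitly named encoding.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String text = "Hello";
        System.out.println(encode(text, StandardCharsets.US_ASCII).length); // 5
        System.out.println(encode(text, StandardCharsets.UTF_8).length);    // 5
        System.out.println(encode(text, StandardCharsets.UTF_16LE).length); // 10
    }
}
```

For text that contains only ASCII characters, the ASCII and UTF-8 encodings produce identical bytes, so either would match a C# program that uses one of them.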

Terrance Samson wrote:2: Ultimately this will be used for encrypting/decrypting text, among other things, but remember, the files that it reads/writes must be compatible with a C# program which already exists, does not use base 64, and uses ASCII strings (as far as I can tell, but ultimately it stores RTF, but I'm pretty sure it's using ASCII to store the RTF, but with formatting that can make Unicode characters as necessary).


We understand that you want to encrypt text messages. But encryption primitives don't operate on text. They operate on bytes. This is true for both Java and C#. So if you are encrypting text messages in C#, you must first convert from text to binary. It's likely that you're using a CryptoStream wrapped in a StreamWriter and you're using the default encoding for the StreamWriter. The default is UTF-8.
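A rough Java counterpart of a C# `StreamWriter` over a `CryptoStream` is an `OutputStreamWriter` over a `CipherOutputStream`. The sketch below assumes AES in CBC mode with PKCS#7 padding (named `PKCS5Padding` in Java) and UTF-8 text; the real C# settings are unknown, and the class and method names are invented for the example.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

final class TextEncryptDemo {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding"; // assumed settings

    // Text -> UTF-8 bytes -> AES, all through one stream chain.
    static byte[] encryptText(String text, byte[] key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (Writer writer = new OutputStreamWriter(
                    new CipherOutputStream(out, cipher), StandardCharsets.UTF_8)) {
                writer.write(text);
            } // closing the writer flushes it and writes the final padded block
            return out.toByteArray();
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // AES -> UTF-8 bytes -> text, using the same key, IV and encoding.
    static String decryptText(byte[] ciphertext, byte[] key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        random.nextBytes(key);
        random.nextBytes(iv);
        byte[] ciphertext = encryptText("Hello", key, iv);
        System.out.println(ciphertext.length);                // 16, one padded block
        System.out.println(decryptText(ciphertext, key, iv)); // Hello
    }
}
```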

The output will also be raw bytes. I don't know what you do with the raw bytes after encryption, but if you're converting it to a string, you must also use an encoding. Using an encoding like ASCII is pretty pointless here, because the data is certain to contain non-printable characters. If you're storing the encrypted data to disk, you might as well store the binary data directly.
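The loss that occurs when raw ciphertext is pushed through a text encoding can be shown in a few lines. This sketch (hypothetical class name) decodes bytes as ASCII and encodes them again; any byte above 0x7F does not survive the trip, which may also explain strings being "messed up" during conversion.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AsciiRoundTripDemo {
    // Returns true only if decoding as ASCII and re-encoding gives back the same bytes.
    static boolean survivesAsciiRoundTrip(byte[] data) {
        String asText = new String(data, StandardCharsets.US_ASCII);
        byte[] back = asText.getBytes(StandardCharsets.US_ASCII);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) {
        byte[] printable = {0x48, 0x69};                       // "Hi"
        byte[] binary = {(byte) 0x9C, 0x00, (byte) 0xFF};      // typical ciphertext-like bytes
        System.out.println(survivesAsciiRoundTrip(printable)); // true
        System.out.println(survivesAsciiRoundTrip(binary));    // false
    }
}
```

Keeping the ciphertext in a `byte[]` and writing it to a file as binary avoids the problem altogether.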

Terrance Samson wrote:Well I know for a fact that the C# program that I'm using does not expand the data at all.  If I start with a 94,820,763 byte unencrypted file then when I encrypt it, I get a 94,820,763 byte encrypted file.  And it's using AES.


Sorry, but you're not using AES. AES is a block cipher that uses 16 byte blocks, and if your data is not a multiple of 16 bytes, you MUST use a padding algorithm that will pad your ciphertext to the next multiple of 16 bytes. Maybe you're using Rijndael in a stream cipher mode, or you're using an algorithm that calls itself AES but is not really AES.

The fact that your encrypted data is exactly the same size as your plaintext worries me for another reason. It means that you're not prepending the initialization vector (IV) to the ciphertext. That implies that you're either not using an IV, or you're using a hard-coded IV. This is extremely insecure and it's just begging to get hacked.
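For reference, this is roughly what prepending a random IV looks like in Java. It is a sketch under assumed settings (AES/CBC with PKCS#7 padding, 16-byte IV stored in front of the ciphertext), with invented names, and it does not claim to match the file layout of the existing C# program.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

final class IvPrependDemo {
    private static final int IV_LENGTH = 16; // AES block size in bytes
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    // Output layout: [16-byte random IV][padded ciphertext].
    static byte[] encrypt(byte[] plaintext, byte[] key) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv); // fresh IV for every message
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] body = cipher.doFinal(plaintext);
            byte[] out = Arrays.copyOf(iv, IV_LENGTH + body.length);
            System.arraycopy(body, 0, out, IV_LENGTH, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV back from the first 16 bytes, then decrypts the rest.
    static byte[] decrypt(byte[] ivAndCiphertext, byte[] key) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(ivAndCiphertext, 0, IV_LENGTH));
            return cipher.doFinal(ivAndCiphertext, IV_LENGTH, ivAndCiphertext.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];
        new SecureRandom().nextBytes(key);
        byte[] encrypted = encrypt(new byte[48], key);
        System.out.println(encrypted.length);               // 80 = 16 (IV) + 64 (padded data)
        System.out.println(decrypt(encrypted, key).length); // 48
    }
}
```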
 
Terrance Samson
Ranch Hand
Posts: 36

Henry Wong wrote:There have been many extended ASCII formats over the years. Many of these are now replaced with Unicode. If this is one of the formats that you are using, there isn't much that Java can do to help. You will definitely need to use raw bytes -- and deal with all format specific stuff yourself.



Well do you happen to know whether RTF stores data as regular ASCII or extended ASCII?  I know it's not Unicode, because whenever it needs a character which is Unicode only, it puts it like "\character-name".

Stephan van Hulst wrote:The output will also be raw bytes. I don't know what you do with the raw bytes after encryption, but if you're converting it to a string, you must also use an encoding. Using an encoding like ASCII is pretty pointless here, because the data is certain to contain non-printable characters. If you're storing the encrypted data to disk, you might as well store the binary data directly.



Well, pointless or not, I MUST use ASCII, because once again, that's what C# is using and the algorithm must be IDENTICAL.

Stephan van Hulst wrote:The fact that your encrypted data is exactly the same size as your plaintext worries me for another reason. It means that you're not prepending the initialization vector (IV) to the ciphertext. That implies that you're either not using an IV, or you're using a hard-coded IV. This is extremely insecure and it's just begging to get hacked.



In C# I'm using the System.Security.Cryptography.Aes class.  I might possibly be using Rijndael but I don't remember.  However, I'm using a block, not a stream.  I don't think I actually did use an IV, or if I did, I set it to some constant.  However, understand that I'm also XORing blocks together, reordering them, etc., and this isn't the only algorithm that I'm using, but rather I'm layering a bunch of completely different and unrelated algorithms together, so it would still be very secure in other ways.  In fact, the AES was kind of an afterthought that it just seemed like, "well, why not throw this in there too?"

Now that I think of it, the encrypted files from C# sometimes are a few bytes larger, which may be accounted for by the padding, but they certainly don't expand by 1/3 the size!

Hopefully I can get together an example for C# and get it actually working on Tuesday, so that we have something to which we can compare my Java.
 
Stephan van Hulst
Saloon Keeper
Posts: 11472
247

Terrance Samson wrote:Well do you happen to know whether RTF stores data as regular ASCII or extended ASCII?  I know it's not Unicode, because whenever it needs a character which is Unicode only, it puts it like "\character-name".


There is no single RTF format. There are RTF formats that use Unicode normally.

Terrance Samson wrote:Well, pointless or not, I MUST use ASCII, because once again, that's what C# is using and the algorithm must be IDENTICAL.


Sure, but we must know WHAT you are using ASCII for. What do you do with the ASCII after you've converted it from encrypted bytes?

Terrance Samson wrote:In C# I'm using the System.Security.Cryptography.Aes class.


Your output should be a multiple of 16 bytes then.

Terrance Samson wrote:However, understand that I'm also XORing blocks together, reordering them, etc., and this isn't the only algorithm that I'm using, but rather I'm layering a bunch of completely different and unrelated algorithms together, so it would still be very secure in other ways.  In fact, the AES was kind of an afterthought that it just seemed like, "well, why not throw this in there too?"


This is the worst form of security. People tend to think that just layering stuff on top of each other makes it safer, but in many instances it actually compromises security. Why are you performing all these transformations? It's not only pointless, it's unsafe.

Terrance Samson wrote:Now that I think of it, the encrypted files from C# sometimes are a few bytes larger, which may be accounted for by the padding, but they certainly don't expand by 1/3 the size!


Only if the encrypted files are a multiple of 16 bytes. The example you quoted had an odd number of bytes, and that will only be the case for stream ciphers or CTS mode.
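To illustrate the stream-cipher case, AES in CTR mode produces exactly as many output bytes as it receives, odd sizes included. The sketch below uses invented names, and nothing in this thread confirms that the C# program works this way.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

final class CtrLengthDemo {
    // CTR turns AES into a keystream generator; encryption and decryption are the same XOR.
    static byte[] applyCtr(byte[] input, byte[] key, byte[] initialCounter) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(initialCounter));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] counter = new byte[16]; // must never repeat for the same key
        random.nextBytes(key);
        random.nextBytes(counter);
        byte[] encrypted = applyCtr(new byte[763], key, counter);
        System.out.println(encrypted.length); // 763, same as the input
    }
}
```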

Terrance Samson wrote:Hopefully I can get together an example for C# and get it actually working on Tuesday, so that we have something to which we can compare my Java.


Yes, please do so. We really can't help you without knowing the original algorithm.
 
Terrance Samson
Ranch Hand
Posts: 36

Stephan van Hulst wrote:
This is the worst form of security. People tend to think that just layering stuff on top of each other makes it safer, but in many instances it actually compromises security. Why are you performing all these transformations? It's not only pointless, it's unsafe.



Yes, I'm aware that in some cases it can be unsafe to layer algorithms, like for example if you use the same or a similar algorithm and the same key, because sometimes doing it twice can just undo the first time or change it in such a way as to make it less secure.  However, if each time you use completely different algorithms, with different keys, and even incompatible block sizes, then it really can't possibly be less secure because they have nothing to do with each other.  In any case, if it did make them less secure then that would make a cryptanalyst's job a whole lot easier, because all they'd have to do is apply various algorithms over top the already encrypted data, and that would weaken it significantly, which means that it wouldn't have been very strong to begin with if it's susceptible to such a ridiculous attack.

Stephan van Hulst wrote:
Only if the encrypted files are a multiple of 16 bytes. The example you quoted had an odd number of bytes, and that will only be the case for stream ciphers or CTS mode.



The example was just a number that I made up to use as an example; I don't have any files of exactly that size.  In any case, I remember for certain that I was using blocks, not streams.

By the way, if I remember from what I read on the Wikipedia page about AES (at the time I was trying to see the inner workings, and then I realized that it's easier to use an algorithm that's already implemented than to write one manually), it seems to cut each block into smaller blocks, and then reorder them and manipulate their data in other ways, but without resizing anything, other than to perhaps add padding, so I'm not sure why you all seem to think that it's supposed to expand by some percentage or fraction like 1/3.
 
Stephan van Hulst
Saloon Keeper
Posts: 11472
247
What I don't understand is, why do you have all these layers in place if you can just encrypt it once and it will be secure? And why is all of this such a hard requirement when you said you added AES as 'an afterthought'?

Terrance Samson wrote:By the way, if I remember from what I read on the Wikipedia page about AES (at the time I was trying to see the inner workings, and then I realized that it's easier to use an algorithm that's already implemented than to write one manually), it seems to cut each block into smaller blocks, and then reorder them and manipulate their data in other ways, but without resizing anything, other than to perhaps add padding, so I'm not sure why you all seem to think that it's supposed to expand by some percentage or fraction like 1/3.


Who of us said that it adds a percentage? That is true for Base64 encoding, not for AES.

You yourself said:

As I was afraid would happen, I printed the length of the original unencrypted string and it's 48 characters, but then when I printed the length of the toDecrypt array (which is of course encrypted) it's 64 bytes, so it's expanding by one third (I don't know why it's that amount).



This may have caused some other posters to continue the discussion using fractions. Anyway, the reason you're getting 64 bytes instead of 48 bytes is because PKCS#7 will use an entire block of padding if your data happens to be an exact multiple of the block size. That means that using such a padding mode, the extra amount of data will always be between 1 and 16 bytes for AES.
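The padding rule can be written as a one-line formula and checked against the JCA. A small sketch, assuming AES/CBC with PKCS#7 padding (Java's `PKCS5Padding`); the class name is hypothetical and the all-zero key and IV are for demonstration only.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

final class PaddingMath {
    private static final int BLOCK = 16; // AES block size in bytes

    // PKCS#7 always adds 1..16 bytes, so the result is the next multiple of 16 above n.
    static long paddedLength(long plaintextLength) {
        return (plaintextLength / BLOCK + 1) * BLOCK;
    }

    // Cross-check: encrypt n zero bytes and report the real ciphertext length.
    static int actualLength(int n) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(new byte[16], "AES"),
                    new IvParameterSpec(new byte[16])); // zero key and IV: demo only
            return cipher.doFinal(new byte[n]).length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(paddedLength(48));          // 64
        System.out.println(actualLength(48));          // 64
        System.out.println(paddedLength(94_820_763L)); // 94820768
    }
}
```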
 
Terrance Samson
Ranch Hand
Posts: 36

Stephan van Hulst wrote:What I don't understand is, why do you have all these layers in place if you can just encrypt it once and it will be secure? And why is all of this such a hard requirement when you said you added AES as 'an afterthought'?



Well what do you mean encrypt it once and it will be secure?  I don't agree with that.  AES or any single form of encryption that anyone uses by itself is not secure by my standards.  For one thing, they use keys which are like 256 bits, whereas I use keys that are more like 256 bytes, and I use about a dozen of them, for substitution and transposition ciphers (alternating between them, by the way, for anyone who thought they'd interfere with each other, because when I transpose, I scatter the bits so wide that they won't fit within the same blocks during the next substitution cipher).  Sure it takes a couple of minutes to encrypt or decrypt anything, but that's a small price to pay for an incredibly high level of security.

And again, even though AES was just used as a bonus after everything else, it's still necessary because like I said, I need the Java version to be compatible with the C# version, and I've already encrypted several thousand files using the C# version, so I really don't want to have to individually decrypt and re-encrypt each and every one of them with the new version (and then I'd actually have to keep two copies of each if I wanted them to stay compatible with both versions).

Stephan van Hulst wrote:
Who of us said that it adds a percentage? That is true for Base64 encoding, not for AES.



Someone said that the C# version of AES which I'm using must not be true AES because the real version always expands each block or something.

Stephan van Hulst wrote:
This may have caused some other posters to continue the discussion using fractions. Anyway, the reason you're getting 64 bytes instead of 48 bytes is because PKCS#7 will use an entire block of padding if your data happens to be an exact multiple of the block size. That means that using such a padding mode, the extra amount of data will always be between 1 and 16 bytes for AES.



Oh, alright.  I had assumed that if I used an exact block size (at least initially for testing) then it wouldn't pad at all, and that would be one less thing for me to worry about while I'm trying to diagnose the first test case, but I guess I was wrong.  I'm not sure what exactly PKCS#7 is (I'm a bit rusty on some of the lingo for this stuff) and I don't remember setting anything to that mode, but I'll look into that.  Is there an alternate mode that I could use which wouldn't put extra padding if none is needed?
 
Lexi-Mae Erickson
Greenhorn
Posts: 9
2
Hello,

I've been following this for quite a while now, and every time I get more and bigger question marks in my head. So, I think some of you already had this in mind, but let me ask it: If your crypto has to be "equal" to what you do in C#, and yes, there most likely will be a way to do it, why not just post your C# code in the first place so we can see what you're actually doing? As you can see, your explanations seem to be ambiguous. You also seem to lack some basic knowledge about charsets, encodings and cryptography.

What really worries me is your statements about the input and output sizes and that you're layering multiple crypto schemes. AES, like most other algorithms, is used as a block cipher for encrypting files, so you will end up with padding. As AES operates on 128-bit blocks, depending on the selected padding scheme you will end up with an output size of either the next 16-byte boundary or an entire additional block. So, when you have an input size of 94,820,763 the output would be at least 94,820,768, or 5,926,298 blocks of 16 bytes, unless you either specify NoPadding, which in Java's block modes only works when the input is already an exact multiple of 16 bytes (and should not be combined with ECB mode, which is insecure), or you set AES to a stream mode instead of a block mode. But this is easier said than done.

I also still struggle to understand why you are so focused on using ASCII, as from what I've read so far you don't seem too familiar with it, with character sets and encodings, or with the fact that encryption is a binary operation, even when done on text input. So, in short: whenever you encrypt something with AES, even if it's plain text, the output will be arbitrary binary data. A method that takes in some ASCII (or a String, for that matter) and returns some ASCII will not work, as the binary data returned by the AES cipher will most likely contain bytes in the range 0x00 - 0x1F, which are "not printable" because they are "control characters", as well as bytes above 0x7F, which are not ASCII at all. If you want printable output you need some additional encoding like Base64 or a hex dump, which consist of only printable characters. That's why you found a lot of examples using such schemes.
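As a sketch of what those printable encodings cost in size (class and method names are made up for the example): Base64 turns every 3 bytes into 4 characters and hex turns every byte into 2, so 48 bytes of ciphertext become 64 Base64 characters, which is exactly a one-third expansion.

```java
import java.util.Base64;

final class PrintableEncodingDemo {
    // Base64: 4 output characters for every 3 input bytes.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Hex: 2 output characters for every input byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF)); // mask so negative bytes print as 80..ff
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] ciphertextLike = new byte[48];
        System.out.println(toBase64(ciphertextLike).length()); // 64
        System.out.println(toHex(ciphertextLike).length());    // 96
        System.out.println(toHex(new byte[] {(byte) 0xFF, 0x00, 0x1F})); // ff001f
    }
}
```

Neither encoding is needed when the ciphertext is written to a file as raw bytes.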

In addition to that, your very first post:

For some reason all of the examples that I've found do that


I highly doubt that. In a quick Google search for the term "java aes example", over half of the links on the first page are in fact code without any form of encoding like Base64 or even a basic hex string. So let me ask: what terms did you search for that only came up with Base64 examples?

As you said you want to use it for file encryption, one secure method is to generate a random AES key, use it for the file encryption, and secure it with an RSA key, pretty much like TLS works, and store the private key protected with a passphrase. There are lots of OpenSSL examples, but here's a complete example in Java. Please note: This example requires the Bouncy Castle crypto lib:

--- limit of 10k - please see next post for code ---
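The scheme described above (random AES key for the data, RSA to protect that key) can be sketched with the standard JCA alone. This is not the Bouncy Castle example referred to; the class name, the choice of AES-GCM and RSA-OAEP, and the three-part return value are assumptions made for illustration, and passphrase protection of the private key is left out.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;

final class HybridDemo {
    private static final String RSA_TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    private static final String AES_TRANSFORMATION = "AES/GCM/NoPadding";

    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns { RSA-encrypted AES key, GCM nonce, AES ciphertext including the 16-byte tag }.
    static byte[][] encrypt(byte[] data, PublicKey recipient) {
        try {
            KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
            keyGenerator.init(256);
            SecretKey fileKey = keyGenerator.generateKey(); // fresh random key per file

            byte[] nonce = new byte[12];
            new SecureRandom().nextBytes(nonce);
            Cipher aes = Cipher.getInstance(AES_TRANSFORMATION);
            aes.init(Cipher.ENCRYPT_MODE, fileKey, new GCMParameterSpec(128, nonce));
            byte[] ciphertext = aes.doFinal(data);

            Cipher rsa = Cipher.getInstance(RSA_TRANSFORMATION);
            rsa.init(Cipher.ENCRYPT_MODE, recipient);
            byte[] encryptedKey = rsa.doFinal(fileKey.getEncoded());
            return new byte[][] {encryptedKey, nonce, ciphertext};
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recovers the AES key with the RSA private key, then decrypts the data.
    static byte[] decrypt(byte[][] parts, PrivateKey owner) {
        try {
            Cipher rsa = Cipher.getInstance(RSA_TRANSFORMATION);
            rsa.init(Cipher.DECRYPT_MODE, owner);
            SecretKey fileKey = new SecretKeySpec(rsa.doFinal(parts[0]), "AES");
            Cipher aes = Cipher.getInstance(AES_TRANSFORMATION);
            aes.init(Cipher.DECRYPT_MODE, fileKey, new GCMParameterSpec(128, parts[1]));
            return aes.doFinal(parts[2]);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newRsaKeyPair();
        byte[][] parts = encrypt("file contents".getBytes(StandardCharsets.UTF_8), pair.getPublic());
        System.out.println(new String(decrypt(parts, pair.getPrivate()), StandardCharsets.UTF_8));
    }
}
```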

In fact, as I already use a bit of Bouncy Castle in this example, it could be rewritten to use only Bouncy Castle, and as BC is also available in C# it would be a 1-to-1 port, so it would work identically and produce the exact same output, which makes them compatible with each other. I know the BC docs are a pain to dig through, so I would only do it if you want/need it (and you may provide a C# BC implementation so I can port it 1-to-1).

As others already mentioned: just because you chain different algorithms together doesn't make the whole process any more secure. If you use AES-256 you get AES-256 security, no matter if you wrap it around or inside other algorithms. Note that the security of the RSA key, although it is encrypted with AES, depends on the chosen passphrase. If you only use a short phrase without special characters it can easily be brute-forced, but it's a common practice and compatible with OpenSSL (at least somewhat; the IV part needs to be handled separately).
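One common way to turn a passphrase into an AES key is PBKDF2, which makes each brute-force guess more expensive. The sketch below uses the JDK's `PBKDF2WithHmacSHA256`; the class name, salt size and iteration count are illustrative assumptions, and this is not claimed to reproduce OpenSSL's own key derivation.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

final class PassphraseKeyDemo {
    // Derive a 256-bit AES key; salt and iteration count must be stored to derive it again.
    static SecretKeySpec deriveAesKey(char[] passphrase, byte[] salt, int iterations) {
        try {
            PBEKeySpec spec = new PBEKeySpec(passphrase, salt, iterations, 256);
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            byte[] keyBytes = factory.generateSecret(spec).getEncoded();
            return new SecretKeySpec(keyBytes, "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt); // random salt, stored alongside the ciphertext
        SecretKeySpec key = deriveAesKey("correct horse battery staple".toCharArray(), salt, 100_000);
        System.out.println(key.getEncoded().length); // 32
    }
}
```

Key stretching slows an attacker down but does not rescue a short, guessable passphrase.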

TL;DR: From what you have posted so far it's hard to understand what you are trying to do / want to accomplish and what your issues are. You state you want something compatible with some C# code: please post it. You also said you layer different crypto: don't do that! You didn't find what you need using Google? I doubt that, as crypto stuff has exploded since Snowden, so I guess there will be answers to your questions, as most likely someone has already done it. And as it seems you lack some of the basic knowledge needed to do this (correctly), you shouldn't try to do your own, as I guess it would result in some insecure DIY crypto, which (and this isn't about me or you but about any crypto) should not be done at all (Don't roll your own crypto!).
 
Lexi-Mae Erickson
Greenhorn
Posts: 9
2
 
Stephan van Hulst
Saloon Keeper
Posts: 11472
247
Welcome to CodeRanch, Lexi-Mae, and thanks for your input!
 