Long Term Storage Encryption

 
Ranch Hand
Hi! I haven't really worked with a lot of Java security stuff before but I need to use it now to encrypt sensitive data for long term storage. We've never really had to deal with this situation before, where we need to store sensitive data for a long time, then retrieve it and decrypt it so we can use it again. The application that reads the data will be running on the same server as the application that writes the data, so the primary concern for this particular post isn't secure transmission; I just need to make sure that if anyone ever *does* gain access to our database, they won't be able to read the data in this particular table. It doesn't need to be NSA-proof 16KMatrix Filtrax-barred 1028-bit Infinisigned secure; it just needs to be encrypted well enough that if someone got the data, they'd have to decide it was worth investing some significant time and resources to read it.

I'm running into some issues though because I'm not entirely familiar with all the encryption methods or how they're supposed to be used, and I'm not entirely sure how to use the Java encryption providers, which ones I should be using, or what kind of encryption I should be using. I feel like I'm wading into a sea of acronyms which mean nothing to me, and somewhere in here is the acronym that's right for me...

So far I've ruled out SHA1, MD5, and PGP because apparently those are only supposed to be used for cases where you're communicating with someone and you can tell them some other random data like the length of the String you should get out or some kind of randomized Initialization Vector or something to make sure someone intercepting couldn't just keep intercepting until they found a pattern. I tried AES, but it looks like AES needs to know the length of the plaintext String it's looking for in order to decrypt it? And uhhhh... The operative word here is * long term * storage; and I can't store the length of the plaintext string along with the hashed string, so unless there's a way to decrypt AES with just the key String, without knowing the byte length of the plaintext String, it looks like AES is out too.

I'm running out of encryption methods. I need an encryption method that can be used to encrypt something based on a secret key, then decrypt it based on that same secret key, and I need it to not need anything else. It sounds like the category of what I'm looking for is "Symmetric Key" encryption, where you don't have any special data passed back and forth; just a secret key that really needs to remain secret, so I'm going to start going through all the methods listed in this article: https://en.wikipedia.org/wiki/Symmetric-key_algorithm#Implementations

The other thing is I'd really like to not have to import any libraries that aren't packaged with Java 7, which is proving to be problematic because most of the examples use BouncyCastle? I apparently already have the sunjce_provider jar in my jre lib folder, so I can use that one, but it looks like I can't do everything I need with that alone.

So this is kind of a two-parter; and the first part is:  Does anyone have recommendations for which encryption method I should be using?

The second part concerns the Java security packages themselves. I've been studying this: https://docs.oracle.com/javase/7/docs/technotes/guides/security/crypto/CryptoSpec.html
But a lot of it isn't making a ton of sense to me yet, so I just want to see if I have this right generally:

Providers are just jar files that know how to hash strings with specific algorithms. In order to use one, you have to ask your Security something or other to get you an instance of an engine that knows how to use the encryption method you're asking for, like say, you'd call getInstance("MD5") and it would look through your loaded security providers and be like "Hey, which one of you knows how to MD5 things?", and you CAN tell it to use a specific provider, but that's frowned upon even though every example I've seen does that, and whenever I don't it yells at me for having no providers loaded?
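To make the provider lookup described above concrete, here is a minimal sketch that uses only classes shipped with the JDK. The class name `ProviderLookup` is made up for illustration, and the providers it lists will depend on your JRE. Note that MD5 is a digest (a one-way hash), so it is requested through `MessageDigest`; encryption engines are requested the same way through `Cipher`.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;

final class ProviderLookup {
    // Ask the JCA for an engine by algorithm name and report which
    // installed provider ended up supplying the implementation.
    static String providerFor(String digestAlgorithm) throws NoSuchAlgorithmException {
        MessageDigest engine = MessageDigest.getInstance(digestAlgorithm);
        return engine.getProvider().getName();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // List every provider the JRE has registered, in preference order.
        for (Provider p : Security.getProviders()) {
            System.out.println("installed: " + p.getName());
        }
        System.out.println("MD5 is served by: " + providerFor("MD5"));
    }
}
```

No provider is named in the `getInstance` call here; the JCA walks the registered providers in order and returns the first match.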

But anyway, once you have your engine you have to get a key through one of a number of processes depending on the encryption method or the provider or something; but generally they all need some form of key to be fed in as a byte array, and then once you give that key to your engine, you can tell the engine to encrypt or decrypt things. I think there's additional steps, but they only apply if you're using the stuff to do stuff with public/private key pairs and non-symmetric encryption?

I feel like I'm still missing a number of steps though... Can anyone tell me if I'm generally steering in the right direction, or maybe point me to a tutorial that's geared towards secure storage rather than secure transmission?
 
Alex Lieb
Ranch Hand
Also for what it's worth, this is what I'm looking at now:

http://www.java2s.com/Code/Java/Security/EncryptionanddecryptionwithAESECBPKCS7Padding.htm

This example does *exactly* what I need to do, and it uses a thing that would work perfectly, but for some reason it looks like this thing needs to know the length of the plaintext string in order to decrypt it? What's up with that?
 
Marshal
What do you mean by Long Term Storage?  To me, that usually means data stored off-line on removable media in a physically secure location.
 
Alex Lieb
Ranch Hand
Gotcha. Sorry; I'm still not totally sure what terminology I should be using. I don't mean "long term" in the sense that it's going to sit in a time capsule for 50 years; I just mean it in the sense that it's going to be stored in one place for months or years rather than sent, received, used, and forgotten.

We're storing it in a database that's used by our web application. While it's sitting there, we will need to use it at random intervals, so we will need to be able to retrieve and decrypt it at will.
 
Bartender

Alex Lieb wrote: http://www.java2s.com/Code/Java/Security/EncryptionanddecryptionwithAESECBPKCS7Padding.htm


That code sample is a good start, but it has some issues. For example, it punts on the issue of encoding, simply assuming that the text will be ASCII. While that's true in the example, it's obviously not true for arbitrary texts. It also treats the ciphertext as text for the purposes of printing it out, whereas it's actually binary data - again, acceptable in a code sample, but not something you should ever do in production code.

... but for some reason it looks like this thing needs to know the length of the plaintext string in order to decrypt it?


It doesn't need it, it just uses it to allocate the array holding the decrypted text. Instead, you could just allocate an array that's maybe 10% larger than the ciphertext, and that should be sufficient. That, too, makes this sample code, not production quality code.
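As a sketch of decrypting without knowing the plaintext length: the single-argument `Cipher.doFinal(byte[])` allocates and returns a correctly sized result, so no length has to be stored. The class name `NoLengthNeeded` is hypothetical, and the example assumes CBC mode with a caller-supplied 16-byte IV rather than the ECB mode of the linked sample. It also pins the text encoding to UTF-8 instead of assuming ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class NoLengthNeeded {
    static byte[] encrypt(byte[] key, byte[] iv, String text) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        // Fix the encoding explicitly instead of relying on the platform default.
        return cipher.doFinal(text.getBytes(StandardCharsets.UTF_8));
    }

    static String decrypt(byte[] key, byte[] iv, byte[] ciphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        // doFinal sizes the result itself; the padding marks where the plaintext ends.
        byte[] plain = cipher.doFinal(ciphertext);
        return new String(plain, StandardCharsets.UTF_8);
    }
}
```

The ciphertext returned by `encrypt` is binary, so it should be stored as bytes or Base64, never passed through `new String(...)`.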

Lastly, the code also punts on the issue of key length (for which 128, 192 and 256 are generally available for AES). You should at least be aware of that issue (and why 192 and 256 may not be available), which https://coderanch.com/how-to/content/AES_v1.html explains in detail.
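A quick way to see which AES key lengths your JRE permits is `Cipher.getMaxAllowedKeyLength`. The sketch below is only a probe; the class name `KeyLengthCheck` is invented for illustration. On older JREs without the unlimited-strength policy files installed, the limit reported for AES is typically 128.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

final class KeyLengthCheck {
    // True if the installed jurisdiction policy allows AES keys of this size.
    static boolean supports(int keyBits) throws NoSuchAlgorithmException {
        return Cipher.getMaxAllowedKeyLength("AES") >= keyBits;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        for (int bits : new int[] { 128, 192, 256 }) {
            System.out.println("AES-" + bits + " allowed: " + supports(bits));
        }
    }
}
```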

To sum it up, read java2s.com with a grain of salt, and don't assume all the code on it is bug-free or of great quality.
 
Bartender
Yes, symmetric encryption is the way to go if you trust both applications.

It's important that encryption shouldn't be done without authentication. Using just AES without validating a signature leaves you open to chosen-ciphertext attacks: an attacker can inject ciphertexts into your database and then use the output of our application to determine the encryption key.

Fortunately, Java provides AES in Galois/Counter Mode (GCM). This is a stream cipher that automatically authenticates the ciphertext before it attempts decryption. It also automatically signs the message upon encryption. The associated transformation is "AES/GCM/NoPadding". Because it is a stream cipher, it requires no padding. It is of utmost importance that you generate a unique initialization vector for each message though, and store that along with the message, otherwise the algorithm reduces to a block cipher in ECB mode. NEVER use block ciphers with ECB mode, such as "AES/ECB/PKCS7Padding".

To encrypt:
  • Convert your message to a byte array.
  • Retrieve your key from a KeyStore. Don't hardcode your key.
  • Generate a cryptographically secure random IV.
  • Construct a GCMParameterSpec using your IV and a desired authentication tag length (signature strength).
  • Init a cipher in encryption mode with your key and your parameter spec.
  • Encrypt your message.
  • Store the IV and ciphertext in your database as binary, or optionally convert them to Base64 first.
  • Destroy all instances of the key, message and ciphertext.
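The "retrieve your key from a KeyStore" step could look roughly like the sketch below. It assumes the JCEKS keystore type, which can hold secret keys, and keeps the store in memory as a byte array purely so the example is self-contained; in practice the bytes would go to a file. The class name `KeyStoreSketch`, the alias, and the use of one password for both the store and the entry are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.KeyStore;
import javax.crypto.SecretKey;

final class KeyStoreSketch {
    // Creates a new keystore holding one secret key and returns its serialized form.
    static byte[] createStore(SecretKey key, String alias, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, password); // a null stream starts an empty keystore
        ks.setEntry(alias, new KeyStore.SecretKeyEntry(key),
                new KeyStore.PasswordProtection(password));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, password);
        return out.toByteArray();
    }

    // Loads the keystore bytes and returns the secret key stored under the alias.
    static SecretKey loadKey(byte[] storeBytes, String alias, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(new ByteArrayInputStream(storeBytes), password);
        return (SecretKey) ks.getKey(alias, password);
    }
}
```

The keystore password still has to come from somewhere, so this moves the secret rather than eliminating it; restricting file permissions on the keystore remains important.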


To decrypt:
  • Retrieve the IV and ciphertext from the database.
  • Retrieve your key from a KeyStore.
  • Construct a GCMParameterSpec using your IV and the same authentication tag length you used to encrypt.
  • Init a cipher in decryption mode with your key and your parameter spec.
  • Decrypt the ciphertext.
  • Destroy all instances of the key and ciphertext.
  • Destroy the message after you're done with it.
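The encrypt and decrypt steps above can be sketched as follows. This is one possible shape, not a definitive implementation: the class name `GcmSketch`, the 12-byte IV, the 128-bit tag, and the choice to store the IV and ciphertext concatenated in one byte array are assumptions. It also assumes your JRE's default provider supports "AES/GCM/NoPadding", which is worth checking on Java 7. Reusing an IV with the same key under GCM is generally understood to break both confidentiality and authentication, which is why a fresh random IV is drawn for every message.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class GcmSketch {
    static final int IV_BYTES = 12;  // 96-bit IV, the usual size for GCM
    static final int TAG_BITS = 128; // authentication tag length

    // Returns IV followed by ciphertext (which includes the tag).
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv); // fresh IV for every message
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plaintext);
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    // Splits off the IV, then decrypts; throws if the tag does not verify.
    static byte[] decrypt(SecretKey key, byte[] stored) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
        return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
    }
}
```

Zeroing the plaintext and key material afterwards (for example with `Arrays.fill`) is left to the caller here.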
    Stephan van Hulst
    Bartender
    When you're done, post your code here. It's easy to get this wrong, without it looking wrong. We can offer some remarks.
     
    Alex Lieb
    Ranch Hand
    Ok! Thank you!

    I finally got it to work with Triple DES encryption; I put it in Base 64 for storage, and it does NOT use a randomized Initialization Vector:




    Also there is some good news here; I'm kind of limited in some ways; ideally I'd like to avoid storing keys for this data, but some of the circumstances surrounding the project may also make things simpler or obviate certain steps that would ordinarily be necessary to ensure secure transmission:

    if you trust both applications.


    Both of the applications using this data are ours. We're building the application that writes the data and we're building the application that reads it, and they, and the database, will all be running on the same server.

    So here's my thoughts on this code based on what I've read and what people have said here:

    Things I know are wrong with it according to normal encryption guidelines:

    1) It doesn't randomize the Initialization Vector. If you had a bunch of data generated with this same, static, key, finding patterns in the hashes and figuring out the key would take less time.

    2) In response to this:

    an attacker can inject ciphertexts into your database and then use the output of our application to determine the encryption key.


    I'm not signing the data at all. If someone had access to our database they could write stuff to it and if it was written with the right key, our site would never be any the wiser.

    Reasons I think some of these are probably ok in this case:

    1) I don't actually know if this is ok. In fact I know it's bad, I just don't know *how* bad it is. I do know that the practical result of this would be that if someone figured out how to decrypt one, they would know how to decrypt them all, but if you had, say, 500 strings about 40 characters long, all encrypted with Triple DES using the same secret key, how long do you think it would take to figure out how to decrypt them? Like, I'm not totally clear on the significance of this, for example:

    otherwise the algorithm reduces to a block cipher in ECB mode



    ... Which is bad? This sounds bad.

    2) The danger with not signing encrypted data, the way I understand this, is that someone could, say, take a String like "TheQuickBrownFoxIsATotalIdiot", hash it with a bunch of keys and put every hashed version of it in the database, then look at how our site decrypts them all?

    From Wikipedia: "In the attack, an adversary has a chance to enter one or more known ciphertexts into the system and obtain the resulting plaintexts."

    The way our site is set up though, only top-level users would be able to see the resulting plaintexts. We *could* theoretically remove even the top-level users' ability to see the unencrypted Strings; once it's entered we need to use it but we don't technically ever need our users to be able to see it again. But if someone had a top level login on our site, this probably wouldn't be our biggest concern anyway.

    Also, some asides, I'm a little confused here:

    and store that along with the message



    If you store the initialization vector along with the message, don't you still have the same problem with the IV itself being unsigned?
     
    Alex Lieb
    Ranch Hand
    Also I just realized I did this:

    > TripleDES/ECB/PKCS5Padding

    ...

    > NEVER use block ciphers with ECB mode

    oops... I think it's time to Wikipedia ECB mode...
     
    Stephan van Hulst
    Bartender

    Alex Lieb wrote: I finally got it to work with Triple DES encryption;


    TripleDES is old and pretty much pointless to use if you have AES.

    I put it in Base 64 for storage, and it does NOT use a randomized Initialization Vector:


    ECB doesn't use IVs. That's why ECB is horrible. You might as well not use encryption at all. If you use a block cipher mode other than ECB, Cipher will automatically generate an IV for you. If you don't use this exact IV in the decryption step, your message will be garbage.
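A small sketch of the auto-generated IV, assuming the default SunJCE provider and CBC mode: initializing the cipher for encryption without parameters lets the provider pick a random IV, which you read back with `getIV()` and must keep for decryption. The class name `AutoIvSketch` and the two-element return array are illustrative choices only.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class AutoIvSketch {
    // Returns { iv, ciphertext }; no IV is supplied, so the provider generates one.
    static byte[][] encrypt(byte[] key, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        byte[] iv = cipher.getIV(); // the IV the provider chose for this message
        byte[] ciphertext = cipher.doFinal(plaintext);
        return new byte[][] { iv, ciphertext };
    }

    // Decryption must be given the exact IV that was used for encryption.
    static byte[] decrypt(byte[] key, byte[] iv, byte[] ciphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(ciphertext);
    }
}
```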

    1) I don't actually know if this is ok. In fact I know it's bad, I just don't know *how* bad it is. I do know that the practical result of this would be that if someone figured out how to decrypt one, they would know how to decrypt them all, but if you had, say, 500 strings about 40 characters long, all encrypted with Triple DES using the same secret key, how long do you think it would take to figure out how to decrypt them?


    Not long. Maybe a few days or weeks.

    ... Which is bad? This sounds bad.


    Yes. Using IVs makes sure that encrypted data looks random. If you don't use an IV, an attacker may find patterns in your ciphertext, helping them to decrypt the message. Worse, they may be able to guess your key and then all messages are compromised. This is the reason why ECB is bad: It never uses an IV.
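To see the kind of pattern ECB leaks, the sketch below encrypts a plaintext made of two identical 16-byte blocks. Under ECB the two ciphertext blocks come out identical; under CBC with a random IV they do not. The class and method names are made up for this demonstration, and the code is meant for experimenting, not for production use.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class EcbPatternDemo {
    static byte[] encryptEcb(byte[] key, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        return cipher.doFinal(plaintext);
    }

    static byte[] encryptCbc(byte[] key, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
    }

    // True when the first two 16-byte ciphertext blocks are byte-for-byte equal.
    static boolean firstTwoBlocksEqual(byte[] ciphertext) {
        return Arrays.equals(Arrays.copyOfRange(ciphertext, 0, 16),
                             Arrays.copyOfRange(ciphertext, 16, 32));
    }
}
```

An attacker who sees repeated ciphertext blocks learns that the underlying plaintext blocks repeat, without needing the key.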

    The way our site is set up though, only top-level users would be able to see the resulting plaintexts. We *could* theoretically remove even the top-level users' ability to see the unencrypted Strings; once it's entered we need to use it but we don't technically ever need our users to be able to see it again. But if someone had a top level login on our site, this probably wouldn't be our biggest concern anyway.


    Fair enough. Signing the messages may not be your highest concern, but why not do it anyway unless you have a good reason not to? I'm pretty sure that if you change the algorithm to use AES/GCM, you get all the benefits with minimal downsides. One downside may be how fast the messages are encrypted and decrypted, but most of the time that actually just makes things more secure.

    If you store the initialization vector along with the message, don't you still have the same problem with the IV itself being unsigned?


    Yes. If you want to, you can generate a MAC for the IV and the ciphertext, and store that alongside the IV and ciphertext. I don't really feel it's worth the effort though, because the worst an attacker can do is inject an IV that by some miracle happens to be the one that was used to encrypt your message, which itself has already been authenticated. In a very hostile environment I would recommend generating the MAC, but it doesn't sound like it's really that important in your use case.
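With GCM specifically, the IV takes part in the tag computation, so a stored IV that has been altered should already make decryption fail. The sketch below is a way to check that on your own JRE; the class name `TamperedIvDemo` and its helper methods are hypothetical. For a mode without built-in authentication, such as CBC, a separate MAC over the IV and ciphertext would still be needed.

```java
import java.security.GeneralSecurityException;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class TamperedIvDemo {
    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(plaintext);
    }

    // Returns false when the authentication tag does not verify.
    static boolean decryptsCleanly(SecretKey key, byte[] iv, byte[] ciphertext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        try {
            cipher.doFinal(ciphertext);
            return true;
        } catch (AEADBadTagException e) {
            return false; // the IV or the ciphertext was modified
        }
    }
}
```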

    Some remarks about your code:
  • Avoid String. Strings can not be easily destroyed and may remain in memory for a long time. It's good practice to destroy all sensitive data as soon as you don't need them any more. That means that passwords and base64 should be passed around as char arrays, which you can zero out when you're done.
  • If you use a password to encrypt messages, use a proper key derivation algorithm. String.getBytes() is NOT a proper key derivation algorithm. Instead, use something like PBKDF2.
  • Using passwords only makes sense if you require the user to enter the password when they use the application. Otherwise, generate a random key and either put it in a KeyStore or hardcode the raw bytes (see next post).
  • As I already mentioned earlier, don't use TripleDES. Use AES.
  • Don't use ECB. If you don't want an authenticated encryption mode like GCM, at least use CBC mode.
  • Java contains built-in Base64 encoder and decoder classes (java.util.Base64, as of Java 8).
  • NoSuchPaddingException and NoSuchAlgorithmException probably warrant an AssertionError if you use guaranteed transformations.
  • You never destroy your key, plaintext, ciphertext or encoded ciphertext.
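Two of the remarks above, deriving a key with PBKDF2 instead of `String.getBytes()` and wiping sensitive arrays, can be sketched like this. The class name `PasswordKeySketch`, the iteration count, and the 16-byte salt are assumptions to adjust for your setup. "PBKDF2WithHmacSHA256" needs Java 8 or later; on Java 7 the available variant is "PBKDF2WithHmacSHA1".

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class PasswordKeySketch {
    static final int ITERATIONS = 100_000; // assumption; tune for your hardware

    // The salt is not secret; store it next to the ciphertext.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives a 128-bit AES key from a password held in a char array.
    static SecretKeySpec deriveAesKey(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 128);
        byte[] raw = null;
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            raw = factory.generateSecret(spec).getEncoded();
            return new SecretKeySpec(raw, "AES"); // SecretKeySpec keeps its own copy
        } finally {
            spec.clearPassword();                      // wipe the spec's password copy
            if (raw != null) Arrays.fill(raw, (byte) 0); // wipe the temporary key bytes
        }
    }
}
```

The caller would wipe its own password array with `Arrays.fill(password, '\0')` once the key has been derived.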
    Stephan van Hulst
    Bartender

    Stephan van Hulst wrote: or hardcode the raw bytes


    Actually, let me get back on that one. *Never* hardcode keys. Instead, you can store the key as base64 in a configuration file, and control access to the configuration file through the OS. That way, only authorized applications can get to it.
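Reading a Base64-encoded key from a configuration file might look like the sketch below. The property name `encryption.key` and the class name `ConfigKeyLoader` are invented for illustration, and `java.util.Base64` requires Java 8 or later. Access control on the file itself is handled by the operating system and is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.Properties;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

final class ConfigKeyLoader {
    // Loads an AES key stored as Base64 under the (hypothetical) property "encryption.key".
    static SecretKey load(Path configFile) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(configFile)) {
            props.load(in);
        }
        String encoded = props.getProperty("encryption.key");
        if (encoded == null) {
            throw new IllegalStateException("encryption.key missing from " + configFile);
        }
        return new SecretKeySpec(Base64.getDecoder().decode(encoded.trim()), "AES");
    }
}
```

Because the key lives outside the code, several programs can share it and it can be rotated without a rebuild.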
     
    Alex Lieb
    Ranch Hand
    Hi! So I'm thinking if I'm going to turn this into a more secure thing it would probably be better to do it on my own time; I showed my boss what I had so far and I read this back to him:

    > Not long. Maybe a few days or weeks.

    And he laughed at it; apparently that's more security than he was initially hoping for. I told him I could make it *more* secure by adding a few more columns to the database to store randomized Initialization Vectors for the fields we need to have encrypted, but he told me not to worry about it. So I don't really have a reason to continue working on this for work purposes, but I feel like it wouldn't really take that much work to make it significantly *more* secure, so it would be a shame to stop now.

    I didn't know you could do this stuff with simple Java libraries like this; I always envisioned it requiring some kind of external resources, like maybe you can set up the encryption and decryption engines yourself but you have, say, a hash dealer you meet in a dark alley on Thursdays at 4 to buy your secure hashes for that week. And sometimes he isn't there because the law caught up with him, and he isn't actually doing anything illegal, but he's one of those tin-foil hat types, so he doesn't like them knowing where he makes his deals.

    I'd still like to figure out how to get the best performance and security out of what I want to call 'genuine country home-grown encryption', so I might continue this at home and try to encrypt casserole recipes or something, but because the immediate problem has been solved, I'm going to go ahead and mark this 'resolved.'

    The one thing I might still fix in the code for work though:

    > *Never* hardcode keys.

    I don't really have enough experience with the security stuff to look at code and instinctively say "Ewww this is gross because it's not secure", but the hardcoded encryption key still feels gross just because it violates ordinary, non-security-related coding standards. This key is going to be used by multiple programs, and it shouldn't be hardcoded.

    Thanks for your help!
     
    Stephan van Hulst
    Bartender
    I get where your boss is coming from, but it's still bad.

    Using IVs doesn't make your application 'more secure' in the same way that using longer keys does. If you've properly set up your encryption algorithm, using longer keys or a different encryption algorithm will have a measurable effect on security. Not using IVs breaks your encryption and there is no telling how much less secure the application will be. Why use encryption at all if you're not planning on using it properly?

    For future reference, please remember the following:
  • Encrypted or hashed data needs to look like random noise. Patterns are BAD.
  • Encrypting the same thing twice needs to yield two completely different ciphertexts.
  • Block ciphers can only fulfill these requirements if you generate either a unique IV or a unique key every time you encrypt something.
  • Encryption without authentication leaves you open to unexpected kinds of attacks. Never even attempt to decrypt a ciphertext without having authenticated the origin first.

    Not adhering to these measures may still yield ciphertexts that look safe, BUT THIS IS A FALSE SENSE OF SECURITY.
     