Re: Passwords in EncryptContent

Andy LoPresto-2

(I added the dev list back in because this is probably of interest to someone and should be documented.)

This started out as a brief email and spiraled into a couple days of work. Part of that is because my Ruby is rusty, but part is because there are some serious underlying issues here (previously quickly noted in NIFI-1465 [1] but not fully expounded upon at the time). The tl;dr of this email is that I have written a Ruby script which will accomplish what you want and it is located here [2]. Read on at your own risk (various parts of this email were written over the course of the last two days, so it may be repetitive/incoherent where the story changed). 

This is a confusing case because it is unusual in execution and there are multiple layers here. I think I didn’t explain it well last time. I’ll try to step through it, but I also apologize in advance, because I found a lot of legacy stuff here that was probably written with no intention of ever being exposed/integrated with an external source. Kerckhoffs would not approve. I will add some of this to NIFI-1465. 

The mechanism that NiFi uses to encrypt the sensitive property values (i.e. passwords for EncryptContent) is as I described previously, but in further investigating to help solve your problem, I realized that the Jasypt [3] StandardPBEStringEncryptor used in StandardFlowSynchronizer uses a random salt generator internally. You can verify this by making a new flow with two EncryptContent processors — even if you set the same password for each, the resulting cipher texts in the flow.xml will be unique because despite the same master key being used, the random salt will cause them to be different. 

Now this is actually good news, because it means the salt must be encoded and transmitted with the cipher text. If it were not, NiFi would not be able to decrypt these values unless it used a fixed salt, and clearly it does not. So as long as your Ruby code generates a salt of the correct length and embeds it in the cipher text, it will be compatible with NiFi. The salt is the first 16 bytes (32 hex characters) and the actual cipher text representing the encrypted processor sensitive property (happens to be another password, but this is irrelevant) is the second 16 bytes. 

Note: because your initial plaintext (the password you are trying to encrypt) is only 11 UTF-8 characters, it can be represented by 11 bytes. This means that when encrypted using AES-CBC (16 byte block size), it requires only and exactly one block (11 bytes of plaintext plus 5 bytes padding). The resulting cipher text is 16 bytes. If the processor password were 16 to 31 characters long, it would be encrypted in two blocks (the padding always adds at least one byte, so a password of exactly 16 characters spills into a second block) and encoded as 32 bytes, or 64 hex characters alone (remember to add the initial 32 chars for the salt for a total of 96 chars). 
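As a rough sketch of the length arithmetic above, the following Ruby computes how many hex characters to expect inside enc{} and splits such a value into its salt and cipher text. The method names `expected_hex_length` and `split_salt_and_ciphertext` are hypothetical helpers, and the code assumes single-byte (ASCII) password characters and the 16 byte salt described above.

```ruby
BLOCK_SIZE = 16 # AES block size in bytes
SALT_SIZE  = 16 # salt length prepended to the cipher text, per the description above

# Hex characters expected for salt || ciphertext, given the plaintext length in bytes.
def expected_hex_length(plaintext_bytes)
  # PKCS padding adds between 1 and 16 bytes, so a plaintext that is already
  # a multiple of the block size gains one full extra block.
  ciphertext_bytes = (plaintext_bytes / BLOCK_SIZE + 1) * BLOCK_SIZE
  (SALT_SIZE + ciphertext_bytes) * 2
end

# Split the hex string found inside enc{...} into raw salt and raw cipher text.
def split_salt_and_ciphertext(hex_value)
  raw = [hex_value].pack('H*')
  [raw.byteslice(0, SALT_SIZE), raw.byteslice(SALT_SIZE, raw.bytesize - SALT_SIZE)]
end
```

For an 11 byte password, `expected_hex_length(11)` gives 64, which matches the length seen in the flow.xml.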

What we are looking for as the output of the Ruby operation is, as you noted, a 64 character hex string, of the format:

output = hex_encode(salt || encrypt(processor_password, master_key, iv))

where the master_key and iv are derived by

(master_key, iv) = md5(master_passphrase || master_salt) || md5(md5(master_passphrase || master_salt) || master_passphrase || master_salt) || md5(md5(md5(…)…)…)

This is an unusual method and is described thusly on the OpenSSL EVP_BytesToKey documentation [4]: 

If the total key and IV length is less than the digest length and MD5 is used then the derivation algorithm is compatible with PKCS#5 v1.5 otherwise a non standard extension is used to derive the extra data.



The key and IV is derived by concatenating D_1, D_2, etc until enough data is available for the key and IV. D_i is defined as:

        D_i = HASH^count(D_(i-1) || data || salt)

where || denotes concatenation, D_0 is empty, HASH is the digest algorithm in use, HASH^1(data) is simply HASH(data), HASH^2(data) is HASH(HASH(data)) and so on.

The initial bytes are used for the key and the subsequent bytes for the IV.

The reason there are multiple MD5 operations above is because we have specified the encryption will use AES-256-CBC, which requires a 256 bit (32 byte) key (32 bytes are represented by 64 hex characters). A single iteration of MD5 only yields 16 bytes (32 hex chars), so we must concatenate it with another invocation. However, as it is deterministic, running it on the same input would return the same output, and the key would just repeat the same 16 bytes. To counter this, the second (and every subsequent) invocation “salts” the input with the result of the previous step. If we substitute some variables for the full expression above, we can see this more clearly:

let x = “master_passphrase || master_salt”

master_key = md5(master_passphrase || master_salt) || md5(md5(master_passphrase || master_salt) || master_passphrase || master_salt)

master_key = md5(x) || md5(md5(x) || x)

let y = “md5(x)”

master_key = y || md5(y || x)

Indeed, the necessary length of the output is greater, as we need another 16 bytes for the IV, so we continue with this series:

let z = “md5(y || x)”

iv = md5(z || x)
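The series can be sketched in Ruby as follows. This is a minimal illustration of the D_i definition quoted from the OpenSSL documentation (MD5, one iteration), in which each round hashes only the previous digest followed by passphrase || salt. The method name `evp_bytes_to_key` is a hypothetical helper, not part of Ruby's OpenSSL bindings, and unlike the OpenSSL function it accepts a salt of any length.

```ruby
require 'digest'

# Derive key and IV by concatenating D_1, D_2, ... until enough bytes exist.
def evp_bytes_to_key(passphrase, salt, key_len = 32, iv_len = 16)
  data = passphrase.b + salt.b # x = master_passphrase || master_salt
  derived = ''.b
  block = ''.b                 # D_0 is empty
  while derived.bytesize < key_len + iv_len
    block = Digest::MD5.digest(block + data) # D_i = MD5(D_(i-1) || x)
    derived << block
  end
  # The initial bytes are the key, the subsequent bytes are the IV.
  [derived.byteslice(0, key_len), derived.byteslice(key_len, iv_len)]
end

key, iv = evp_bytes_to_key('mybigsecretkey', '12345678')
puts "key: #{key.unpack('H*').first}"
puts "iv:  #{iv.unpack('H*').first}"
```

With an 8 byte salt the result should agree with what cipher.pkcs5_keyivgen derives internally for a single iteration, which is a convenient cross-check.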

So to revisit the NiFi encryptor, it requires a random salt to be embedded at the beginning of the cipher text so it can be split off before decryption to seed the cipher object. To mimic its behavior in Ruby, we’ll have to match those parameters. 

Normally this would be simple. You would use the cipher.pkcs5_keyivgen method to derive the encryption key and IV from the master passphrase and salt. You would then perform the encryption normally, using AES-CBC, the key, and the IV, and concatenate the salt with the cipher text, and be good to go. 
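For reference, the “normal” route just described might look like the sketch below, assuming an 8 byte salt (the length OpenSSL expects) and a single MD5 iteration. The method name `encrypt_with_standard_kdf` is hypothetical. Note that this is the conventional approach only; it does not produce values NiFi can read, for the reasons that follow.

```ruby
require 'openssl'
require 'securerandom'

# Conventional OpenSSL flow: derive key and IV with pkcs5_keyivgen,
# encrypt with AES-256-CBC, and return hex(salt || ciphertext).
def encrypt_with_standard_kdf(plaintext, passphrase, salt = SecureRandom.random_bytes(8))
  cipher = OpenSSL::Cipher.new('AES-256-CBC')
  cipher.encrypt
  # A single iteration; the digest defaults to MD5. Sets key and IV on the cipher.
  cipher.pkcs5_keyivgen(passphrase, salt, 1)
  ciphertext = cipher.update(plaintext) + cipher.final
  (salt.b + ciphertext).unpack('H*').first
end

puts encrypt_with_standard_kdf('password123', 'mybigsecretkey')
```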

Here is where it gets nasty. 

As this feature was developed early in NiFi’s history, it leverages a library called Jasypt, specifically a class called StandardPBEStringEncryptor [5] wrapped in a local class StringEncryptor [6], to perform all encryption and decryption. I cannot speak for the developer of Jasypt, but they decided to set the salt size to the block size of whatever cipher was being used. For AES, that means 16 bytes. This is not the worst idea in the world, but it has serious consequences when used in conjunction with an algorithm that requires a specific salt length. In our case, the OpenSSL EVP_BytesToKey method expects (and enforces) an 8 byte salt. Jasypt does not expose a mechanism to provide a custom salt length other than by injecting a new SaltGenerator [7] implementation at initialization time, and this SaltGenerator is not aware of the algorithm selected for the encryptor. 

If we could intercept and override this value (which I was able to do via Groovy reflection [8] and breaking Java access controls), we could set it to 8 bytes, so Jasypt would follow the EVP_BytesToKey implementation. However, we cannot do this for NiFi itself (one, it would require fighting the intention of the library & Java access controls, two, it would be a breaking change, as every existing flow would be unable to decrypt any sensitive properties). When the default protection scheme is improved in NIFI-1465, this will be addressed using a migration tool. 

But the result of that decision is that we cannot simply use the cipher.pkcs5_keyivgen method that wraps all of that logic to generate the key and IV in a “standard” way. 

At this point, I was lucky enough to come across Ola Bini’s work [9] in porting the OpenSSL methods to JRuby. I was able to modify his implementation in Groovy to handle an arbitrary salt length, and then translate that back to Ruby. It is probably not the cleanest or most Ruby-idiomatic implementation because I haven’t touched the language in a few years, so feel free to clean it up, but it is functionally compatible with Jasypt and OpenSSL (for their respective salt lengths). 

You’ll have to adapt the Ruby scripts I provided to handle whatever your key/salt/value input mechanisms are, but currently you can just edit the script, populate the key, salt, and sensitive property values, and run the script. The output is of the form “enc{abcdef…}” so you can immediately populate your templates with it. 
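To make the overall shape concrete, here is a minimal sketch of the whole operation as described in this email: a random 16 byte salt, the MD5-based key and IV derivation, AES-256-CBC, and the enc{} wrapper. It is not the linked script itself. The method names are hypothetical, and compatibility with a given NiFi version is an assumption that should be checked against a real flow.xml before relying on it.

```ruby
require 'digest'
require 'openssl'
require 'securerandom'

SALT_LENGTH = 16 # the salt length described above (the AES block size)

# MD5, single-iteration derivation that tolerates any salt length.
def derive_key_and_iv(passphrase, salt, key_len = 32, iv_len = 16)
  data = passphrase.b + salt.b
  derived = ''.b
  block = ''.b
  while derived.bytesize < key_len + iv_len
    block = Digest::MD5.digest(block + data)
    derived << block
  end
  [derived.byteslice(0, key_len), derived.byteslice(key_len, iv_len)]
end

# Returns "enc{hex(salt || ciphertext)}".
def encrypt_sensitive_value(plaintext, master_passphrase, salt = SecureRandom.random_bytes(SALT_LENGTH))
  salt = salt.b
  key, iv = derive_key_and_iv(master_passphrase, salt)
  cipher = OpenSSL::Cipher.new('AES-256-CBC')
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv
  ciphertext = cipher.update(plaintext) + cipher.final
  "enc{#{(salt + ciphertext).unpack('H*').first}}"
end

# Inverse operation, useful for checking a value before pasting it into a template.
def decrypt_sensitive_value(wrapped, master_passphrase)
  hex = wrapped[/\Aenc\{(\h+)\}\z/, 1]
  raise ArgumentError, 'expected enc{hex}' if hex.nil?
  raw = [hex].pack('H*')
  salt = raw.byteslice(0, SALT_LENGTH)
  key, iv = derive_key_and_iv(master_passphrase, salt)
  cipher = OpenSSL::Cipher.new('AES-256-CBC')
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.update(raw.byteslice(SALT_LENGTH, raw.bytesize - SALT_LENGTH)) + cipher.final
end

puts encrypt_sensitive_value('password123', 'mybigsecretkey')
```

Because the salt is random, two runs with the same inputs produce different output, which mirrors the behavior seen with two EncryptContent processors sharing a password.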

Because unit tests only provide confidence over the specific system under test, I have verified this by making a flow which encrypted data using a key derived from “password123” and decrypted the same data using a key derived from “password456”. Both of these values were encrypted and written to the flow.xml file. Obviously, the decryption was failing. I stopped NiFi, ran the script, unzipped the flow.xml.gz to an XML file, copied the output into the second processor properties, rezipped the XML file to flow.xml.gz, and restarted the flow. The data was now successfully decrypted. 
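The unzip, edit, and rezip step can also be scripted. The following is a small sketch using Ruby's standard Zlib library; the method name `replace_in_gzipped_flow` is hypothetical, and it assumes NiFi is stopped and that the old enc{} value appears verbatim in the file.

```ruby
require 'zlib'

# Swap one enc{...} value for another inside a gzipped flow file.
def replace_in_gzipped_flow(path, old_value, new_value)
  xml = Zlib::GzipReader.open(path) { |gz| gz.read }
  raise ArgumentError, 'old value not found in flow' unless xml.include?(old_value)
  # Block form of sub avoids any backslash interpretation in the replacement.
  updated = xml.sub(old_value) { new_value }
  Zlib::GzipWriter.open(path) { |gz| gz.write(updated) }
end
```

Taking a copy of flow.xml.gz first is a sensible precaution, since the file is rewritten in place.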

I can provide screenshots/logs if necessary. 

I hope this helps. If you have further questions, please let me know. 

Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On Jul 25, 2016, at 1:43 PM, Hite, Brett <[hidden email]> wrote:

Thanks. I might take you up on that if you’d be willing and when you have time. I’ve been working with some of the sample code you’ve linked to so I can get a better understanding. I’ve tried several and I think this is the right track so far, where “Cipher text” would be the value enclosed within enc{}:
require 'openssl'
pass_phrase = 'mybigsecretkey'
# Build an AES-256-CBC cipher object and put it in encryption mode
cipher = OpenSSL::Cipher.new 'AES-256-CBC'
cipher.encrypt
salt = nil
# Derive key and IV from the passphrase: one iteration, MD5 digest by default
cipher.pkcs5_keyivgen pass_phrase, salt, 1
encrypted = cipher.update 'password123'
encrypted << cipher.final
# Hex-encode the raw cipher text
puts "Cipher text: #{encrypted.unpack('H*').first}"
What’s not clicking for me is the length of “Cipher text”. I’m getting 32 hex characters when I need 64 for the enc{} field.
Brett Hite
From: Andy LoPresto [[hidden email]] 
Sent: Monday, July 25, 2016 12:42 PM
To: Hite, Brett
Subject: Re: Passwords in EncryptContent
I’m focused on some feature work to get NiFi 1.0 out the door right now, but in a couple of days I can chat with you if you need more help. Good luck getting it going. 
Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
On Jul 25, 2016, at 7:53 AM, Hite, Brett <[hidden email]> wrote:
Thanks again, Andy. I’m still playing around with it but I feel like I’m on the right track thanks to your input.
Brett Hite
From: Andy LoPresto [[hidden email]] 
Sent: Friday, July 22, 2016 12:20 PM
[hidden email]
Subject: Re: Passwords in EncryptContent
Hi Brett,
I believe the section you want is here — PKCS #5 Password Based Encryption [1]. There is also brief discussion of the method here [2] and full documentation here [3]. 
I don’t have my Ruby environment set up right now, but basically in the example from the first link, the “pass_phrase” is the value of “nifi.sensitive.props.key”, the salt is nil I believe, and the cipher is instantiated with AES-256-CBC unless you have changed the value of “nifi.sensitive.props.algorithm”. The digest param defaults to MD5, and you’ll need to provide an iterations value of 0 (or 1 — it happens once, but when invoking it in Java, the value needs to be 0 for some reason). Play around with these values and one combination will match. The plaintext is then the raw value of the property you want to encrypt. 
Check out this test [4] to see code which verifies the new NiFi implementation is compatible with the legacy key derivation, and this test [5] to see an example of verification of Ruby key derivation using PBKDF2 [5].  
Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
On Jul 22, 2016, at 8:41 AM, Hite, Brett <[hidden email]> wrote:
I’ve been banging my head on this for a while now so reaching out again. Ruby and encryption are topics I can learn more of for sure!
The Ruby code you have below was helpful. Maybe this is my naiveté with Ruby, but I can’t find a way to use EVP_BytesToKey or PBKDF1 with the OpenSSL module [1]. Any tips or resources that you would recommend?
Brett Hite
From: Andy LoPresto [[hidden email]] 
Sent: Thursday, July 14, 2016 2:46 PM
[hidden email]
Subject: Re: Passwords in EncryptContent
Happy to help, Brett. I always like seeing people use the software in a secure manner. If you can, you may want to publish your tool on GitHub. While you don’t have to submit it back for inclusion in NiFi itself (and if it’s a Ruby tool, it may not be correct for inclusion), there are many people who share their personal extensions and tools for administering NiFi with the public. 
I remembered I had written some Ruby code using OpenSSL for key derivation and encryption verification for an earlier ticket, so take a look here [1] for some examples that may help. Basically, you switch out the PBKDF2 invocation with the EVP_BytesToKey (aka PKCS #5 v1.5 PBKDF1) and use an empty salt, and change the cipher to AES-256-CBC. 
Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
On Jul 14, 2016, at 2:23 PM, Hite, Brett <[hidden email]> wrote:
Hi Andy,
I think I need a little time to review your post, but this sounds exactly like what I was looking for. I was looking for a way to create the encrypted value stored within the “enc{ … }” tag. Thank you for translating my question and for the quick response!
Brett Hite
From: Andy LoPresto [[hidden email]] 
Sent: Thursday, July 14, 2016 2:14 PM
[hidden email]
Subject: Re: Passwords in EncryptContent
Hi Brett,
I’m not sure I understand your question completely, so let me try to describe it and you can correct me where I get it wrong. 
You have some deployment system which uses a Ruby process to replace tokens in a flow template with the “real” values, and one of the values that needs to be set is the password used by an EncryptContent processor configured with password-based encryption. (This much makes sense to me). 
What I am confused by is your reference to “hash values”. While in many situations (most web applications, user databases, etc.) cryptographic hashing is the correct way to protect a password or other sensitive value when persisting to disk, this is only appropriate if the raw sensitive value does not need to be retrieved. However, in this scenario, the password must be usable in raw form to derive the key to encrypt content, so it cannot be stored in a “hash value” format (irreversible), but rather encrypted (reversible). 
In order to persist the encrypted form of this password, you need to run the same encryption algorithm and use the same key as NiFi does. These are exposed to you in nifi.properties under the keys “nifi.sensitive.props.key” and “nifi.sensitive.props.algorithm”. By default the key is blank, and the algorithm is “PBEWITHMD5AND256BITAES-CBC-OPENSSL”. In English, that’s Password-Based Encryption using a single iteration of MD5 digest over the password (the previous property) and salt (none in this case), taking the resulting 32 hexadecimal characters (16 bytes) as the first half of a 256 bit (32 byte) key, then calculating the MD5 of this value concatenated with the raw password and raw salt again as the second half. [1][2] That key is now used with AES-256 in CBC mode [3] to encrypt the raw sensitive values and persist them in the form “enc{hex-encoded-ciphertext}” in the flow (see below for example). If you feel at this point that the default key derivation function is not sufficiently strong, know that I agree with you and have opened a Jira to increase the strength of this process [4]. 
Anyway, to answer what I believe is your question: you can write Ruby code to populate your template with the encrypted value by retrieving the sensitive properties key from nifi.properties, using the Ruby OpenSSL bindings [5] to derive the key and encrypt the password, and then encoding it in hexadecimal and wrapping it with the “enc{“ and “}” tags. 
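As an illustration of the first step, the sensitive properties key could be read with something like the sketch below. The method name `read_property` is a hypothetical helper, and it assumes simple key=value lines; it does not handle the full Java properties syntax (escapes, line continuations).

```ruby
# Return the value for a property name from properties-file text, or nil if absent.
def read_property(properties_text, name)
  properties_text.each_line do |line|
    next if line.strip.start_with?('#') # skip comment lines
    key, value = line.chomp.split('=', 2)
    return value.to_s.strip if key && key.strip == name
  end
  nil
end

sample = <<~PROPS
  # sensitive property settings
  nifi.sensitive.props.key=mybigsecretkey
  nifi.sensitive.props.algorithm=PBEWITHMD5AND256BITAES-CBC-OPENSSL
PROPS
puts read_property(sample, 'nifi.sensitive.props.key')
```

In a deployment script the text would come from reading the actual properties file rather than the inline sample used here.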
I would also suggest you look at the Variable Registry [6][7], upcoming encrypted configuration files [8], and deterministic templates [9][10], as these may provide an easier way to perform what you are looking for, or at least inform your next steps if you wish to keep your Ruby template system and move forward in a compatible manner. 
If this didn’t answer your question (or raised others), please reply and I’ll filter my thoughts through someone with a more human understanding of the system. 
      <position x="1101.0" y="176.0"/>
        <name>Encryption Algorithm</name>
Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
On Jul 14, 2016, at 1:23 PM, Hite, Brett <[hidden email]> wrote:
I have a flow file that is created from a Ruby template file (flow.xml.erb). The template contains variables that the user can set that then get populated when NiFi is set up. I have an EncryptContent processor and would like to create a template variable for the Password property. Ideally, the user would say “password = some_password” and the template variable would evaluate to the hash value stored in the actual flow file.
Is there a way that I can calculate the hash value given a plain text password? I’ve looked around and haven’t found too much. The NiFi Administration Guide has an Encryption Configuration section that doesn’t quite answer my question.
Brett Hite
