How can I pass a flowfile attribute to a controller service?

RP
Hi Team,

The following is the combination of processors that I am using:

GetFile + SplitText + ExtractText + UpdateAttribute + ExecuteSQL +
ConvertAvroToJson + PutFile

Basically, I have a properties file which contains 5 comma-separated values
that are required by the 'DBCPConnectionPool' controller service to
establish a connection with the database. Here is the content of my
properties file:

jdbc:mysql://localhost:3306/test,com.mysql.jdbc.Driver,C:\Program Files\MySQL\mysql-connector.jar,root,root

Now, I am extracting the values from this properties file and storing them
in manually created properties. I am using this regex in ExtractText to
store the values in an attribute:

ExtractedData: (.*)

Then I use the UpdateAttribute processor to manually add 5 properties and
get their values from the properties file, like below:

connectionURL  : ${ExtractedData:getDelimitedField(1)}
driverClass    : ${ExtractedData:getDelimitedField(2)}
driverLocation : ${ExtractedData:getDelimitedField(3)}
user           : ${ExtractedData:getDelimitedField(4)}
password       : ${ExtractedData:getDelimitedField(5)}

So, by now the attributes have got their values from the properties file,
and the following values are stored in them:

connectionURL  : jdbc:mysql://localhost:3306/test
driverClass    : com.mysql.jdbc.Driver
driverLocation : C:\Program Files\MySQL\mysql-connector.jar
user           : root
password       : root
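
As a rough illustration of the extraction step above, the following Python sketch splits the sample line into the five attributes. The helper `get_delimited_field` is hypothetical: it only mimics the simple 1-based, comma-delimited case of `getDelimitedField` and ignores any quoting or escape handling the real Expression Language function may apply.

```python
# Sample line from the properties file (raw string keeps the backslashes intact).
line = r"jdbc:mysql://localhost:3306/test,com.mysql.jdbc.Driver,C:\Program Files\MySQL\mysql-connector.jar,root,root"


def get_delimited_field(value: str, index: int, delimiter: str = ",") -> str:
    """Return the 1-based field of a delimited string.

    Hypothetical helper: a plain split, with no quoting or escaping rules.
    Returning "" for an out-of-range index is this sketch's own choice.
    """
    fields = value.split(delimiter)
    return fields[index - 1] if 1 <= index <= len(fields) else ""


# Attribute names as set in UpdateAttribute, in field order.
names = ["connectionURL", "driverClass", "driverLocation", "user", "password"]
attributes = {
    name: get_delimited_field(line, position)
    for position, name in enumerate(names, start=1)
}

for name, value in attributes.items():
    print(f"{name:<15}: {value}")
```

Under these assumptions the sketch yields the same five values listed above, which suggests the attributes themselves are populated correctly on the flowfile.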

Finally, here is what I am trying to achieve. I am trying to use the
above 5 attributes in the DBCPConnectionPool controller service like this:

 Database Connection URL     : ${connectionURL}
 Database Driver Class       : ${driverClass}
 Database Driver Location(s) : ${driverLocation}
 Database User               : ${user}
 Password                    : ${password}

But I am unable to establish the connection and I am getting the error
'Cannot create PoolableConnectionFactory'. It seems that the controller
service is unable to read the value from the attributes. How can I pass a
flowfile attribute to a controller service?



Re: How can I pass a flowfile attribute to a controller service?

Andy LoPresto-2
Hi Rishab,

Someone asked a similar question and I answered it on Stack Overflow [1]. The long and short of it is that while the DBCPConnectionPool properties support expression language, they do not have access to flowfile attributes, because the expression language is evaluated when the controller service is enabled, not for each flowfile a processor operates on. This is the idea behind the separation of concerns between controller services and processors.

If the configuration values are simply different per-environment, I would recommend you use the Variable Registry [2] or environment variables to hold those values and read them when the controller service is enabled. 

If for some reason the values are different per-flowfile, I think you will need to explore other options (primarily, re-organizing your dataflow) as described in the linked answer. 
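
The timing issue described above can be pictured with a small toy model in Python. Everything here is hypothetical and is not the NiFi API: `ToyConnectionPool` stands in for a controller service, `evaluate` handles only plain `${name}` references, and treating an unresolved reference as an empty string is an assumption of the sketch.

```python
import re

# Matches simple ${name} references only (no functions, no nesting).
_REFERENCE = re.compile(r"\$\{([^}:]+)\}")


def evaluate(expression: str, scope: dict) -> str:
    """Substitute ${name} from scope; unknown names become "" (sketch assumption)."""
    return _REFERENCE.sub(lambda m: scope.get(m.group(1), ""), expression)


class ToyConnectionPool:
    """Hypothetical stand-in for a controller service."""

    def __init__(self, url_property: str):
        self.url_property = url_property
        self.url = None

    def enable(self, variables: dict) -> None:
        # Properties are resolved once, here. No flowfile exists at this point,
        # so only registry-style variables are in scope.
        self.url = evaluate(self.url_property, variables)


flowfile_attributes = {"connectionURL": "jdbc:mysql://localhost:3306/test"}

pool = ToyConnectionPool("${connectionURL}")

# Enabled with no matching variable: the flowfile attribute is never consulted.
pool.enable(variables={})
url_from_attributes = pool.url

# Enabled with the value held in a variable registry instead.
pool.enable(variables={"connectionURL": "jdbc:mysql://localhost:3306/test"})
url_from_variables = pool.url

# A processor property evaluated per flowfile would see the attribute.
url_per_flowfile = evaluate("${connectionURL}", flowfile_attributes)
```

In this model the first `enable` call leaves the URL empty, which is consistent with a pool that cannot create its connection factory, while the registry-backed call resolves the URL as intended.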



Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On Apr 10, 2018, at 3:45 AM, Rishab Prasad <[hidden email]> wrote:

> How can I pass a flowfile attribute to a controller service?

Re: How can I pass a flowfile attribute to a controller service?

Matt Burgess-2
+1 to Andy's answer. Note that DBCPConnectionPool isn't a connection
but a connection pool, which keeps some number of connections
available for use by multiple consumers (processors, e.g.). I wouldn't
want to have to keep track of X number of configurations in a single
controller service, which would result in X connection pools, each
with up to Y max connections. Also with

Pierre and I talked about an attempt to support this "hot-swap"
behavior in the following way (the names should be changed, just
trying to be overly obvious with their purpose):

- Add a DelegatingDBCPService interface, extending DBCPService but
adding a method getConnectionFromPoolByName(String name).
- An implementation would probably offer user-defined properties whose
keys are names and whose values are other DBCPConnectionPool instances
- A supporting processor would have to check if the DBCPService is a
DelegatingDBCPService (yuck)
- In that case, the processor could provide a name from a flow file
attribute or something to getConnectionFromPoolByName

To avoid the code smell in the third point, there could be "hot-swap"
versions of the processors, but that doubles the number of them.
Perhaps we'd want to pick and choose, such as only ExecuteSQL and
PutSQL (as it would make downstream handling of QueryDatabaseTable and
GenerateTableFetch pretty messy).
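
The delegating lookup in the list above could be sketched roughly as follows, in Python for brevity. This is only an illustration under stated assumptions: `ToyPool`, `ToyDelegatingService`, `connection_for`, the `pool.name` attribute key, and the example URLs are all hypothetical, the interface inheritance from DBCPService is not modeled, and connections are represented by plain strings.

```python
class ToyPool:
    """Hypothetical stand-in for one configured DBCPConnectionPool."""

    def __init__(self, url: str):
        self.url = url

    def get_connection(self) -> str:
        # A real pool would hand out a database connection; a string is enough here.
        return f"connection to {self.url}"


class ToyDelegatingService:
    """Hypothetical stand-in for the proposed DelegatingDBCPService."""

    def __init__(self, pools_by_name: dict):
        # Mirrors the user-defined properties: key = name, value = pool instance.
        self._pools = dict(pools_by_name)

    def get_connection_from_pool_by_name(self, name: str) -> str:
        try:
            pool = self._pools[name]
        except KeyError:
            raise LookupError(f"no pool registered under {name!r}") from None
        return pool.get_connection()


def connection_for(service, flowfile_attributes: dict) -> str:
    """Processor-side logic, including the type check from the third bullet."""
    if isinstance(service, ToyDelegatingService):
        # "pool.name" is a made-up attribute key for this sketch.
        return service.get_connection_from_pool_by_name(flowfile_attributes["pool.name"])
    return service.get_connection()


service = ToyDelegatingService({
    "orders": ToyPool("jdbc:mysql://localhost:3306/orders"),
    "billing": ToyPool("jdbc:mysql://localhost:3306/billing"),
})
conn = connection_for(service, {"pool.name": "orders"})
```

Each named entry is still a full pool enabled ahead of time, so the flowfile only selects among existing pools and never supplies connection settings itself.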

Alternatively, we could offer in the dropdown for DBCPService
instances an option for "Use dbcp.name attribute", but there can be
name clashes all over the place in NiFi, so it'd likely have to be
"Use dbcp.id attribute" and still you'd need to know how to map your
incoming parameters/attributes (URL, DB, intended destination, e.g.)
to the IDs of the DBCP instances.

Thoughts? Thanks,
Matt

Re: How can I pass a flowfile attribute to a controller service?

RP
Hi,

Thanks for the reply. From the above replies I understand that the flowfile
attributes were not available to the DBCPConnectionPool service because the
expression language I used was evaluated at the time the service was
enabled. The service starts before any processor in the flow, and at that
point the file containing the properties has not yet been read by the
GetFile processor, so the expression language has no content to extract
from.

So basically, the connection to the database is being established even
before the processors have started to run. My question is: why does NiFi
establish the connection before any processor starts running? Also, is
there a way where we can postpone the evaluation of expression language
used til the processor starts running? If we can do that, we can certainly
make the flowfile attributes available to the controller service.
Ultimately, I am looking for a way where we can make the flowfile
attributes available to the controller services. (I know we can achieve
that using 'variable.registry', but every time there is a change in the
file, it requires a restart. I need to avoid the restart.)



--
Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/

Re: How can I pass a flowfile attribute to a controller service?

Mike Thomsen
Rishab,

> Also, is there a way where we can postpone the evaluation of expression
> language used til the processor starts running?

No, because the controller service is a dependency of the processor, and
the expression language in the CS property descriptors must be evaluated
when the controller service is enabled in order to satisfy that
dependency.

Thanks,

Mike

On Mon, Apr 23, 2018 at 7:31 AM rishabprasad005 <[hidden email]>
wrote:

> Ultimately, I am looking for a way where we can make the flowfile
> attributes available to the controller services.

Re: How can I pass a flowfile attribute to a controller service?

Bryan Bende
I'm not sure if this helps, but you mentioned not being able to use
the variable.registry because it requires a restart.

That is true for the file-based variable registry, however it is not
true for the UI-based variable registry [1].

Keep in mind that neither of the variable registries is really
intended to store sensitive values like your DB password, but since
you already seemed OK with putting your password unencrypted into
the file-based variable registry, it wouldn't be much different
putting it unencrypted into the UI-based variable registry.

[1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Variables_Window

On Mon, Apr 23, 2018 at 7:41 AM, Mike Thomsen <[hidden email]> wrote:

> Rishab,
>
>> Also, is there a way where we can postpone the evaluation of expression
>> language used til the processor starts running?
>
> No because the controller service is a dependency of the processor and
> evaluating the expression language in the CS property descriptors must be
> done when enabling the controller service to satisfy the aforementioned
> dependency.