Enrichment plugin for adding attributes from SQL

Brett Ryan
Hi all, having used NiFi for a couple of days I wanted to add some attributes to a FlowFile while not altering the contents of that FlowFile.

I had suggestions to use a script processor but that just sounded like a hack which could become a nuisance to replicate.

Anyway, I figured I'd write a processor to do this; anyone interested can find it here:

https://github.com/brettryan/nifi-drunken-bundle
Feel free to do with it what you please.

I've published to Maven Central but it will take a day to appear in the search.

<dependency>
  <groupId>com.drunkendev</groupId>
  <artifactId>nifi-drunken-nar</artifactId>
  <version>1.0.0</version>
  <type>nar</type>
</dependency>
<dependency>
  <groupId>com.drunkendev</groupId>
  <artifactId>nifi-drunken-processors</artifactId>
  <version>1.0.0</version>
</dependency>
<dependency>
  <groupId>com.drunkendev</groupId>
  <artifactId>nifi-drunken-bundle</artifactId>
  <version>1.0.0</version>
  <type>pom</type>
</dependency>



Re: Enrichment plugin for adding attributes from SQL

Andy LoPresto-2
Hi Brett,

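It’s great that you found it easy to write a new processor for Apache NiFi. It is probably an indicator that we need to improve education/evangelism/documentation, however, that you did not find UpdateAttribute [1], which should do exactly what you were looking for.

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.4.0/org.apache.nifi.processors.attributes.UpdateAttribute/index.html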

Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


Re: Enrichment plugin for adding attributes from SQL

Joey Frazee
Andy, Brett,

Taking a quick glance at the code it looks like it's enriching attributes from a database according to a query.

If that's correct, there's a LookupAttribute processor that delegates lookups to a "LookupService" and adds attributes without altering content. There are a variety of these LookupServices included. I think what you implemented would make sense as a LookupService and then you could just configure the processor to use that. It could also be used with LookupRecord then too so there'd be a double payoff.

-joey
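
The pattern described above (look a value up by key, attach it as an attribute, leave the content alone) can be sketched in plain Java. This is only an illustration under stated assumptions: `SketchFlowFile` and `LookupEnricher` are hypothetical stand-ins invented for this example, not NiFi classes, and the lookup function stands in for whatever backs the service (for instance a SQL query).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical stand-in for a FlowFile: attributes plus opaque content. Not the NiFi class. */
final class SketchFlowFile {
    final Map<String, String> attributes;
    final byte[] content;

    SketchFlowFile(Map<String, String> attributes, byte[] content) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        this.content = content;
    }
}

/** Hypothetical enricher: adds one attribute from a lookup and passes content through. */
final class LookupEnricher {
    // Assumption: the lookup is any key -> value function, e.g. one backed by a database.
    private final Function<String, Optional<String>> lookup;

    LookupEnricher(Function<String, Optional<String>> lookup) {
        this.lookup = lookup;
    }

    SketchFlowFile enrich(SketchFlowFile in, String keyAttribute, String targetAttribute) {
        String key = in.attributes.get(keyAttribute);
        Optional<String> value = (key == null) ? Optional.empty() : lookup.apply(key);
        if (!value.isPresent()) {
            return in; // nothing matched: return the input unchanged
        }
        Map<String, String> attrs = new HashMap<>(in.attributes);
        attrs.put(targetAttribute, value.get());
        // The content array is reused as-is; only the attribute map changes.
        return new SketchFlowFile(attrs, in.content);
    }
}
```

The point of the split is that the enricher does not care where the value comes from, so the same step could sit in front of different backing stores.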

On Jan 4, 2018, 8:44 AM -0800, Andy LoPresto <[hidden email]>, wrote:

> Hi Brett,
>
> It’s great that you found it easy to write a new processor for Apache NiFi. It is probably an indicator that we need to improve education/evangelism/documentation, however, that you did not find UpdateAttribute [1], which should do exactly what you were looking for.
>
> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.4.0/org.apache.nifi.processors.attributes.UpdateAttribute/index.html
>
> Andy LoPresto
> [hidden email]
> [hidden email]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Jan 4, 2018, at 7:03 AM, Brett Ryan <[hidden email]> wrote:
> >
> > Hi all, having used NiFi for a couple days I wanted to add some attributes to a FlowFile while not altering the contents of that FlowFile.
> >
> > I had suggestions to use a script processor but that just sounded like a hack which could become a nuisance to replicate.
> >
> > Anyway, I figured I'd write a processor to do this, anyone interested you can find it here.
> >
> > https://github.com/brettryan/nifi-drunken-bundle
> >
> > Feel free to do with it what you please.
> >
> > I've published to maven central but it will take a day to appear in the search.
> >
> > <dependency>
> >   <groupId>com.drunkendev</groupId>
> >   <artifactId>nifi-drunken-nar</artifactId>
> >   <version>1.0.0</version>
> >   <type>nar</type>
> > </dependency>
> > <dependency>
> >   <groupId>com.drunkendev</groupId>
> >   <artifactId>nifi-drunken-processors</artifactId>
> >   <version>1.0.0</version>
> > </dependency>
> > <dependency>
> >   <groupId>com.drunkendev</groupId>
> >   <artifactId>nifi-drunken-bundle</artifactId>
> >   <version>1.0.0</version>
> >   <type>pom</type>
> > </dependency>
> >
>

Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
Thanks Andy, how would UpdateAttribute be able to get the value from SQL?

Consider a flow where a piece of information needs to be obtained from a DB but I do not want the contents of the current FlowFile to be altered. Using ExecuteSQL anywhere prior would not be possible, because it replaces the FlowFile contents.

What I had was two separate flows: one updates an OAuth key in a DB, keeping it fresh, and the main flow then reads the DB just before an InvokeHTTP.

I was originally using a distributed map cache but had concerns that it might not be secure and was also advised the cache server has been known to go down.


Re: Enrichment plugin for adding attributes from SQL

Andy LoPresto-2
UpdateAttribute doesn’t pull from a database; it uses static or dynamic attribute values and supports NiFi Expression Language. In your original message, you didn’t mention any database interaction, so I thought you were just trying to accomplish “I wanted to add some attributes to a FlowFile while not altering the contents of that FlowFile”, which is indeed what UpdateAttribute does.

If you need to retrieve those values from a database, as Joey mentions, the LookupService is the right tool. 

With your prior setup, the distributed map cache is as secure as the NiFi configuration — if using secured NiFi, the communication between that node and any other is over TLS, and within the node it’s a memory access. 

A big part of the NiFi philosophy is the same as the Unix philosophy — each tool should do one job very well, and to perform complicated tasks, chain those tools together. This helps with provenance reporting, usage reporting, debugging, flow development lifecycle, maintenance, etc. A processor which retrieves attributes from a database and updates the incoming flowfile with them is certainly useful in the use case you describe, but is not a generic pattern. There’s no intent to discourage custom development, and whatever makes your flow work is fine. Just explaining why you likely won’t see a solution like that in the NiFi bundled components. 


Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
In reply to this post by Joey Frazee
Ooo, I shall take a look at this, that sounds great.

Yeah, my inexperience is probably a sore point. You know what would be great: having categories in the Add Processor browser for finding processors. Finding only the enrichment processors is probably the hardest part of identifying the right processor.

Where I’m working they’re on NiFi 1.1 and IIRC they had more than 200 processors. It’s also possible that more processors are available in newer versions.



Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
In reply to this post by Andy LoPresto-2
Cool, thanks for the help. I’ll investigate implementing this as a LookupService, as I don’t think there’s currently a service for connecting to SQL. I could be wrong, but I did try looking at all those available ;)

To help me learn all the processors I actually dragged every processor into process groups of related processors, just to help me remember each of the tools.

I like your analogy to Unix tools; however, with Unix I find there are considerably fewer tools used to chain tasks, and conditional expressions are easier to work out :)


Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
In reply to this post by Andy LoPresto-2
I should qualify what my security concern with the map cache was. Given a shared cache server, any other flow not related to mine could read my keys.


Re: Enrichment plugin for adding attributes from SQL

Andrew Lim
In reply to this post by Brett Ryan
Hi Brett,

Thanks for your feedback on the “Add Processor” window.  I’m sorry you had trouble identifying the right processor to use for your data flow.

There are tags on the left of the “Add Processor” window that categorize many of the processors into functional groups.  For example, selecting the “attributes” tag would limit the processors listed to those related to attributes.  Additionally, there is a Filter field at the top that can also be used to search processors.  More information on using tags and filters can be found in the documentation [1].

Having said that, the UX of finding/adding a processor can definitely be improved.  There are two Jiras related to this effort that I am aware of [2][3].  If you have the opportunity to review those proposed improvements, additional thoughts/suggestions are welcomed and greatly appreciated!

[1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#adding-components-to-the-canvas
[2] https://issues.apache.org/jira/browse/NIFI-3338
[3] https://issues.apache.org/jira/browse/NIFI-4249

-Drew



Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
In reply to this post by Andy LoPresto-2
Looking at using a LookupService, it doesn't seem to support sending multiple keys to the LookupService at the same time.

What I was thinking of doing was implementing a LookupService that took an attribute "sql.query", which it would use to evaluate the query, and then passing in a map of key/value pairs of attribute-name/column-name to set the attributes.

I could implement this as I imagined it to work; however, it would evaluate the SQL expression multiple times for the same query on the same flow.

I am also wondering why LookupService#getRequiredKeys must return a Set<String> when this set must only contain one value.
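
To make the idea above concrete, here is a minimal Java sketch of mapping several columns of one query result onto attributes while running the query only once. Everything in it is an assumption for illustration: `SingleQueryAttributeLookup` is a hypothetical class, not part of NiFi or of the `LookupService` interface, and the `queryRunner` function stands in for real JDBC access that returns the first row as a column-name/value map.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper: one query evaluation feeds many attributes. Not a NiFi API. */
final class SingleQueryAttributeLookup {
    // Assumption: runs the SQL and returns the first row as column name -> value, if any.
    private final Function<String, Optional<Map<String, Object>>> queryRunner;
    // Remembers the row per SQL string so repeated lookups do not hit the database again.
    // Sketch only: not thread-safe, unbounded, and entries never expire.
    private final Map<String, Optional<Map<String, Object>>> rowCache = new HashMap<>();

    SingleQueryAttributeLookup(Function<String, Optional<Map<String, Object>>> queryRunner) {
        this.queryRunner = queryRunner;
    }

    /** Returns attribute name -> value for every mapped column present in the row. */
    Map<String, String> lookup(String sql, Map<String, String> attributeToColumn) {
        Optional<Map<String, Object>> row = rowCache.get(sql);
        if (row == null) {
            row = queryRunner.apply(sql); // evaluated once per distinct query
            rowCache.put(sql, row);
        }
        Map<String, String> attributes = new LinkedHashMap<>();
        if (row.isPresent()) {
            for (Map.Entry<String, String> e : attributeToColumn.entrySet()) {
                Object value = row.get().get(e.getValue());
                if (value != null) {
                    attributes.put(e.getKey(), String.valueOf(value));
                }
            }
        }
        return attributes;
    }
}
```

For a value that is refreshed in the database, such as the OAuth key mentioned earlier in the thread, a cache like this would go stale, so a real version would presumably need an expiry or a per-FlowFile scope.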


On 5 Jan 2018, at 08:32, Andy LoPresto <[hidden email]> wrote:

UpdateAttribute doesn’t pull from a database, it uses static or dynamic attribute values and supports NiFi Expression Language. In your original message, you didn’t mention any database interaction, so I thought you were just trying to accomplish "I wanted to add some attributes to a FlowFile while not altering the contents of that FlowFile”, which is indeed what UpdateAttribute does. 

If you need to retrieve those values from a database, as Joey mentions, the LookupService is the right tool. 

With your prior setup, the distributed map cache is as secure as the NiFi configuration — if using secured NiFi, the communication between that node and any other is over TLS, and within the node it’s a memory access. 

A big part of the NiFi philosophy is the same as the Unix philosophy — each tool should do one job very well, and to perform complicated tasks, chain those tools together. This helps with provenance reporting, usage reporting, debugging, flow development lifecycle, maintenance, etc. A processor which retrieves attributes from a database and updates the incoming flowfile with them is certainly useful in the use case you describe, but is not a generic pattern. There’s no intent to discourage custom development, and whatever makes your flow work is fine. Just explaining why you likely won’t see a solution like that in the NiFi bundled components. 


Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On Jan 4, 2018, at 4:07 PM, Brett Ryan <[hidden email]> wrote:

Thanks Andy, how would UpdateAttribute be able to get the value from SQL?

Consider a flow where a piece of information needs to be obtained from a DB but I do not want the contents of the current FF to be altered. Using ExecuteSQL anywhere prior would not be possible because it replaces the FF contents.

What I had was two separate flows: one updates an OAuth key in a DB, keeping it fresh, and the main flow then reads the DB just before an InvokeHTTP.

I was originally using a distributed map cache but had concerns that it might not be secure, and was also advised the cache server has been known to go down.

On 5 Jan 2018, at 04:01, Joey Frazee <[hidden email]> wrote:

Andy, Brett,

Taking a quick glance at the code it looks like it's enriching attributes from a database according to a query.

If that's correct, there's a LookupAttribute processor that delegates lookups to a "LookupService" and adds attributes without altering content. There are a variety of these LookupServices included. I think what you implemented would make sense as a LookupService and then you could just configure the processor to use that. It could also be used with LookupRecord then too so there'd be a double payoff.

-joey

On Jan 4, 2018, 8:44 AM -0800, Andy LoPresto <[hidden email]>, wrote:
Hi Brett,

It’s great that you found it easy to write a new processor for Apache NiFi. It is probably an indicator that we need to improve education/evangelism/documentation, however, that you did not find UpdateAttribute [1], which should do exactly what you were looking for.

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.4.0/org.apache.nifi.processors.attributes.UpdateAttribute/index.html

Andy LoPresto
[hidden email]
[hidden email]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

Re: Enrichment plugin for adding attributes from SQL

Mike Thomsen
Take a look at the mongo lookup service. I think it could serve as a good
example here.

Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
MongoDBLookupService can't be used with a LookupAttribute processor though: it returns a Record type and has no required keys. The whole purpose of the processor that I originally wrote was to update an attribute from SQL on a FF.

The implementation of LookupAttribute requires

- getRequiredKeys to return a single element set
- getValueType to return String.class
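
The two requirements above can be sketched as a simple check. This is an illustration only: `SimpleLookupService` is a simplified stand-in modelled on the three methods discussed in this thread, not the NiFi interface itself, and `LookupAttributeCompatibility` and its factory methods are hypothetical names.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Simplified stand-in modelled on the methods discussed in the thread; not the NiFi interface.
interface SimpleLookupService<T> {
    Optional<T> lookup(Map<String, String> coordinates);
    Class<?> getValueType();
    Set<String> getRequiredKeys();
}

class LookupAttributeCompatibility {

    // Returns an empty Optional when the service meets both requirements, else the reason it does not.
    static Optional<String> incompatibility(SimpleLookupService<?> service) {
        int keyCount = service.getRequiredKeys().size();
        if (keyCount != 1) {
            return Optional.of("expected exactly one required key, found " + keyCount);
        }
        if (!String.class.equals(service.getValueType())) {
            return Optional.of("expected value type String, found "
                    + service.getValueType().getSimpleName());
        }
        return Optional.empty();
    }

    // A key/value style service: one required key, String values.
    static SimpleLookupService<String> keyValueService() {
        return new SimpleLookupService<String>() {
            @Override
            public Optional<String> lookup(Map<String, String> coordinates) {
                return Optional.ofNullable(coordinates.get("key"));
            }
            @Override
            public Class<?> getValueType() {
                return String.class;
            }
            @Override
            public Set<String> getRequiredKeys() {
                return Collections.singleton("key");
            }
        };
    }

    // A record style service as described above: no required keys, non-String values.
    static SimpleLookupService<Map<String, Object>> recordStyleService() {
        return new SimpleLookupService<Map<String, Object>>() {
            @Override
            public Optional<Map<String, Object>> lookup(Map<String, String> coordinates) {
                return Optional.empty();
            }
            @Override
            public Class<?> getValueType() {
                return Map.class;
            }
            @Override
            public Set<String> getRequiredKeys() {
                return Collections.emptySet();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("key/value: " + incompatibility(keyValueService()));
        System.out.println("record:    " + incompatibility(recordStyleService()));
    }
}
```

Under these assumptions the record-style service is rejected on the first check, which is the situation described for MongoDBLookupService.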

I guess what I could do is

(1) Write a processor that updates attributes on a FF from a Record-based LookupService, say UpdateAttributeFromRecordLookup.
(2) Write a service, SQLLookupService, which given a query as a lookup coordinate would return a Record for the first found entry.

It's a bit more long-winded than what I originally had with a single processor, but I guess UpdateAttributeFromRecordLookup would have the benefit of updating attributes from LookupService implementations that do not meet the LookupAttribute criteria.

If you think this is viable, please let me know and I'll have a look at starting it tomorrow afternoon.



Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
In reply to this post by Andrew Lim

> On 5 Jan 2018, at 10:17, Andrew Lim <[hidden email]> wrote:
>
> Hi Brett,
>
> Thanks for your feedback on the “Add Processor” window.  I’m sorry you had trouble identifying the right processor to use for your data flow.

As it turns out the right processor doesn't yet exist :) I'll look to create it.

> There are tags on the left of the “Add Processor” window that categorize many of the processors into functional groups.  For example, selecting the “attributes” tag would limit the processors listed to the those related to attributes.  Additionally, there is a Filter field at the top that can also be used to search processors.  More information on using tags and filters can be found in the documentation [1].

While this is handy, there are several problems with this approach.

Tag clouds are really hard to read for visually impaired people. I am very visually impaired and my eyes shake, so it's hard to focus on this sort of UX. Lists are always better. Given that, an "enrichment" tag would be handy.

> Having said that, the UX of finding/adding a processor can definitely be improved.  There are two Jiras related to this effort that I am aware of [2][3].  If you have the opportunity to review those proposed improvements, additional thoughts/suggestions are welcomed and greatly appreciated!

Probably my biggest suggestion overall for NiFi is for accessibility support. It's almost non-existent, which makes it really hard for folk like me who are quite visually impaired. NiFi is almost exclusively a mouse-driven tool, which really is hard for the visually impaired.

I would imagine a category-based browser, similar to the Atlassian Confluence macro browser, would be a better approach.


The description field can be very hard to read when the text is long. The tooltip doesn't help much; it would be better for this box to scroll.

Source IMHO isn't really relevant for the non-programmer or someone that didn't install the NAR. End users should be abstracted from the implementation of where the processors came from and good organisational standards would better support this.

One bug that's present is a large amount of whitespace between the description and dialog buttons.

And finally, CamelCase processor names are hard to read in variable-width fonts; it would be better to display a label name instead.

> [1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#adding-components-to-the-canvas
> [2] https://issues.apache.org/jira/browse/NIFI-3338
> [3] https://issues.apache.org/jira/browse/NIFI-4249
>
> -Drew
>
>

Re: Enrichment plugin for adding attributes from SQL

Joey Frazee
In reply to this post by Brett Ryan
Hey Brett,

So as for the LookupAttribute only doing a single key lookup, that was a bit of a coin flip on whether it makes more sense to allow multiple lookups at once, each with a single key, or a single lookup with multiple AND-ed constraints. I opted for the former since that was the use case I was going after and we don’t really have a principled way of doing a map or set-valued property, so we do multiple properties and pretend it’s a map.

I think I considered enabling a flag to choose between the two alternatives, so that’d be the easy, dumb thing I could do and then we could relax those requirements (which are correct to have at the moment). I think the difficult part would really be more in writing a custom validator — and that wouldn’t be hard either.
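
To make the two alternatives concrete, here is a rough sketch over in-memory data. It is an illustration under assumptions, not how LookupAttribute is implemented: `LookupStyles` and both method names are hypothetical, and plain maps stand in for a real LookupService.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of the two alternatives; not NiFi code.
class LookupStyles {

    // Former alternative: several independent lookups, each with a single key.
    // attributeToKey maps attribute-name -> key to look up.
    static Map<String, String> manySingleKeyLookups(Map<String, String> keyValueStore,
                                                    Map<String, String> attributeToKey) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : attributeToKey.entrySet()) {
            String value = keyValueStore.get(entry.getValue());
            if (value != null) {
                attributes.put(entry.getKey(), value);
            }
        }
        return attributes;
    }

    // Latter alternative: one lookup whose constraints must all match the same row.
    static Optional<Map<String, String>> oneLookupWithAndedConstraints(
            List<Map<String, String>> rows, Map<String, String> constraints) {
        for (Map<String, String> row : rows) {
            boolean allMatch = true;
            for (Map.Entry<String, String> constraint : constraints.entrySet()) {
                if (!constraint.getValue().equals(row.get(constraint.getKey()))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return Optional.of(row);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        store.put("FR", "France");
        store.put("EUR", "Euro");
        Map<String, String> wanted = new LinkedHashMap<>();
        wanted.put("country.name", "FR");
        wanted.put("currency.name", "EUR");
        System.out.println(manySingleKeyLookups(store, wanted));

        List<Map<String, String>> rows = new ArrayList<>();
        Map<String, String> row = new LinkedHashMap<>();
        row.put("country", "FR");
        row.put("city", "Paris");
        row.put("manager", "Jean Ricca");
        rows.add(row);
        Map<String, String> constraints = new LinkedHashMap<>();
        constraints.put("country", "FR");
        constraints.put("city", "Paris");
        System.out.println(oneLookupWithAndedConstraints(rows, constraints));
    }
}
```

The first method yields one attribute per key, while the second returns a whole row only when every constraint matches, which is closer to what a SQL-backed lookup would need.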

-joey


Re: Enrichment plugin for adding attributes from SQL

Joey Frazee
In reply to this post by Brett Ryan
Brett, I think it’s great that you brought this up and made some specific suggestions, because it’s easy for people to overlook and hard to know how to do the right thing without that kind of feedback.

-joey

On Jan 7, 2018, 5:51 AM -0600, Brett Ryan <[hidden email]>, wrote:
>
> Probably my biggest suggestion overall for NiFi is for accessibility support. It's almost non-existent, which makes it really hard for folk like me who are quite visually impaired. NiFi is almost exclusively a mouse-driven tool, which really is hard for the visually impaired.

Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
In reply to this post by Brett Ryan

On 7 Jan 2018, at 22:23, Brett Ryan <[hidden email]> wrote:

(1) Write a processor that updates an attribute on a FF from a Record based LookupService. Say I call it UpdateAttributeFromRecordLookup

Ok, I've implemented this now as a PoC and have tested it with the MongoDBLookupService which is working great.

The processor takes a record based lookup service and any number of attributes. The processor has a required attribute that specifies a key prefix (default = "key."), this prefix identifies further attributes as a key lookup, any other attributes are considered names to include as attribute/field pairs for the resulting FF, if none of these non-key custom attributes exist then all fields from the Record are added.

If we take the LookupRecord with MongoDB example:

Key Prefix = "key."
key.id_store = ${id_store}

We would end up with the following attributes added to the resulting FF

address = 177 Boulevard Haussmann, 75008 Paris
address_city = Paris
capacity = 464600
id_store = 1
manager = Jean Ricca
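
To make the behaviour above concrete, here is a rough sketch in plain Java with no NiFi dependencies. It is not the actual processor code: the class name RecordLookupSketch, the method resolveAttributes, and the use of plain maps in place of NiFi's Record and LookupService types are all assumptions made for illustration. It also assumes that for non-key entries the name is the attribute to set and the value is the record field to read.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class RecordLookupSketch {

    /**
     * Hypothetical helper: splits the configured entries into lookup keys
     * (names starting with keyPrefix) and attribute/field pairs, runs the
     * lookup, and returns the attributes that would be added to the FF.
     */
    public static Map<String, String> resolveAttributes(
            String keyPrefix,
            Map<String, String> properties,
            Function<Map<String, String>, Optional<Map<String, Object>>> lookup) {

        Map<String, String> coordinates = new LinkedHashMap<>();
        Map<String, String> wanted = new LinkedHashMap<>(); // attribute name -> record field
        for (Map.Entry<String, String> e : properties.entrySet()) {
            String name = e.getKey();
            if (name.startsWith(keyPrefix)) {
                // "key.id_store" becomes the lookup coordinate "id_store"
                coordinates.put(name.substring(keyPrefix.length()), e.getValue());
            } else {
                wanted.put(name, e.getValue());
            }
        }

        Map<String, String> attributes = new LinkedHashMap<>();
        Optional<Map<String, Object>> found = lookup.apply(coordinates);
        if (!found.isPresent()) {
            return attributes; // no match: nothing is added
        }
        Map<String, Object> record = found.get();
        if (wanted.isEmpty()) {
            // no attribute/field pairs configured: copy every field of the record
            record.forEach((field, value) -> attributes.put(field, String.valueOf(value)));
        } else {
            wanted.forEach((attribute, field) -> {
                if (record.containsKey(field)) {
                    attributes.put(attribute, String.valueOf(record.get(field)));
                }
            });
        }
        return attributes;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the lookup service, using values from the example above
        Map<String, Object> store = new LinkedHashMap<>();
        store.put("id_store", 1);
        store.put("address_city", "Paris");
        store.put("manager", "Jean Ricca");

        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("key.id_store", "1"); // value of ${id_store} after evaluation

        Map<String, String> attributes = resolveAttributes("key.", properties,
                c -> "1".equals(c.get("id_store"))
                        ? Optional.of(store)
                        : Optional.<Map<String, Object>>empty());
        System.out.println(attributes);
    }
}
```

In a real processor the lookup function would be the configured LookupService and the values would go through expression language evaluation first; both are left out here to keep the sketch self-contained.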

Given all this, do you think I'm now on the right track for a more general purpose processor that could be used for multiple lookups?

I could then implement the SQL Lookup Service to feed this.


(2) Write a service SQLLookupService which given a query as a lookup coordinate would return a Record for the first found entry.
(3) Write a processor that updates an attribute from a record lookup, call it UpdateAttributeFromRecordLookup.

It's a bit more long winded to achieve what I originally had with a single processor, but I guess UpdateAttributeFromRecordLookup would have the benefit of updating attributes from LookupService implementations that do not meet the LookupAttribute criteria.

If you think this is viable, please let me know and I'll have a look at starting it tomorrow afternoon.



Re: Enrichment plugin for adding attributes from SQL

Brett Ryan
I have now implemented SQLLookupService, which is a LookupService<Record> implementation.

This works as advertised to provide a service based method of assigning attributes to a flow file based on a Lookup Service.
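
For anyone wondering what "return a Record for the first found entry" means in practice, here is a rough sketch of that contract in plain Java. It is not the SQLLookupService code itself: the class FirstRowLookupSketch and the queryRunner function are made up for illustration, rows are modelled as plain maps rather than NiFi Records, and the lookup method is only shaped loosely after a lookup service. A real service would run the configured query over a database connection where the sketch calls queryRunner.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FirstRowLookupSketch {

    // Hypothetical stand-in for "run the configured SQL with these coordinates"
    private final Function<Map<String, Object>, List<Map<String, Object>>> queryRunner;

    public FirstRowLookupSketch(
            Function<Map<String, Object>, List<Map<String, Object>>> queryRunner) {
        this.queryRunner = queryRunner;
    }

    /** Returns the first matching row, or empty when there are no coordinates or no rows. */
    public Optional<Map<String, Object>> lookup(Map<String, Object> coordinates) {
        if (coordinates == null || coordinates.isEmpty()) {
            return Optional.empty();
        }
        List<Map<String, Object>> rows = queryRunner.apply(coordinates);
        if (rows == null || rows.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(rows.get(0)); // any further rows are ignored
    }

    /** Builds one illustrative row; the column names follow the earlier example. */
    public static Map<String, Object> row(int idStore, String manager) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("id_store", idStore);
        r.put("manager", manager);
        return r;
    }

    public static void main(String[] args) {
        // In-memory "table" with two rows for the same store, to show first-row behaviour
        List<Map<String, Object>> table = Arrays.asList(
                row(1, "first match"), row(1, "second match"), row(2, "other store"));

        FirstRowLookupSketch service = new FirstRowLookupSketch(coords ->
                table.stream()
                     .filter(r -> r.get("id_store").equals(coords.get("id_store")))
                     .collect(Collectors.toList()));

        Map<String, Object> coordinates = new LinkedHashMap<>();
        coordinates.put("id_store", 1);
        System.out.println(service.lookup(coordinates));
    }
}
```

Returning an empty Optional instead of throwing when nothing matches lets the calling processor decide how to handle unmatched FFs.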

The matching LookupAttributeFromRecord

If y'all think this could be worth contributing, I'll invest further time in neatening these up and providing full test cases. At the moment these are just a proven PoC.


