Apache NiFi expression language

Apache NiFi expression language

idioma
Hi,
I would like to understand whether it is possible in NiFi to write regular expressions that match only the first occurrence of a pattern. I have not been able to make this work and wonder whether it is simply not possible. Here is an example of what I mean:

https://regex101.com/r/nF3lE2/2

Thank you

Re: Apache NiFi expression language

Joe Percivall
Hello,

Is there a specific reason you want to do this within Expression Language rather than in a processor? You can use the ExtractText processor, which adds attributes for the first occurrence of the pattern in the content of a FlowFile.
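
As a rough illustration of what "first occurrence" means for a regular expression, here is a minimal plain-Java sketch. It is not the processor's implementation; the class name FirstMatch and the method firstMatch are hypothetical, and only the java.util.regex calls are standard.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not NiFi code.
final class FirstMatch {

    // Returns the text of the first match of the regex in the content,
    // or an empty Optional when the regex does not match at all.
    static Optional<String> firstMatch(String content, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(content);
        // find() stops at the first match; later matches are never visited
        // because find() is called only once.
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("id=17; id=42", "\\d+")); // prints Optional[17]
    }
}
```

Calling find() a single time is what restricts the result to the first occurrence; looping on find() would visit every match instead.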
 
Joe


- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: [hidden email]



On Sunday, May 22, 2016 5:56 PM, idioma <[hidden email]> wrote:



Hi,
I am interested to understand whether in NiFi it is possible to write
regular expressions that only match the first occurrence of a pattern. I do
not seem to be able to make this work and wonder whether it is just
something not possible at all. Here is an example of what I mean:

https://regex101.com/r/nF3lE2/2

Thank you



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Apache-NiFi-expression-language-tp10610.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Apache NiFi expression language

Andy LoPresto-2
Hi Ilaria,

I tried to implement what you were asking for in the expression language and I think I discovered a bug in our implementation, which I have reported as NIFI-1919 [1]. The expression language evaluation delegates to String replace/replaceAll for the respective EL functions, but the invocation of replace compiles a Pattern with Pattern.LITERAL, which means it does not evaluate the regular expression. replaceAll works as expected, but obviously if you only want to match the first occurrence of your pattern, it will not work for you.
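
To illustrate the difference described above, here is a minimal Java sketch. It does not reproduce NiFi's Expression Language code; the class name ReplaceSemantics and its method names are made up for illustration, and only the standard java.util.regex and String calls are real.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only; not NiFi code.
final class ReplaceSemantics {

    // Pattern.LITERAL makes the search string match only itself,
    // character for character; regex metacharacters lose their meaning.
    static String literalReplace(String input, String search, String replacement) {
        return Pattern.compile(search, Pattern.LITERAL)
                .matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    // String.replaceAll interprets the first argument as a regex
    // and replaces every match.
    static String regexReplaceAll(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        String input = "a1b22c333";
        // The input contains no literal text "\d+", so nothing is replaced.
        System.out.println(literalReplace(input, "\\d+", "#"));  // prints a1b22c333
        // Every run of digits is replaced.
        System.out.println(regexReplaceAll(input, "\\d+", "#")); // prints a#b#c#
    }
}
```

Neither variant replaces only the first regex match: the literal form never evaluates the regex, and the regex form replaces all matches.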

I have provided a unit test [2] which demonstrates this issue, and I will be working on it soon. 

Until this fix is in, I think the ExecuteScript processor with a simple Groovy line invoking replaceFirst is your easiest option. 
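
For reference, replaceFirst is a method of java.lang.String, which is also the type behind plain Groovy strings, so its behaviour can be sketched in Java. This is only a sketch of the string operation, not an ExecuteScript script; the class name FirstOccurrence and the method replaceFirstMatch are hypothetical.

```java
// Hypothetical demo class, for illustration only; not NiFi code.
final class FirstOccurrence {

    // String.replaceFirst treats the first argument as a regex and replaces
    // only the first match. In the replacement string, '$' and '\' have
    // special meaning (group references and escapes).
    static String replaceFirstMatch(String input, String regex, String replacement) {
        return input.replaceFirst(regex, replacement);
    }

    public static void main(String[] args) {
        // Only the first run of digits is replaced; the rest is untouched.
        System.out.println(replaceFirstMatch("a1b22c333", "\\d+", "#")); // prints a#b22c333
    }
}
```

If the replacement text may contain '$' or '\', wrapping it in java.util.regex.Matcher.quoteReplacement avoids it being read as a group reference.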


Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On May 24, 2016, at 1:47 PM, Joe Percivall <[hidden email]> wrote:

Hello,

Is there a specific reason you want to do this within Expression Language rather than in a processor? You can use the ExtractText processor, which adds attributes for the first occurrence of the pattern in the content of a FlowFile.

Joe



Re: Apache NiFi expression language

idioma
Andy,
as always, thank you for your reply (and apologies for my delayed response). Just to clarify, I came across this issue while using the ReplaceTextWithMapping processor. I shared a related post with the community:

http://apache-nifi-developer-list.39713.n7.nabble.com/Issues-with-Regex-used-with-ReplaceTextWithMapping-where-am-I-going-wrong-tc10592.html

In my investigation, I also found that ReplaceTextWithMapping does not work well with phrases that contain spaces, but I am not sure whether that is an already known limitation.

I need to match only the first occurrence so that only the first input value is substituted. What would you recommend as a workaround until the fix is in? Running ExecuteScript after the actual substitution? That approach would not be particularly flexible.

Thank you for your help,

Ilaria