Enforcing order across multiple nodes

Karthik Kothareddy (karthikk) [CONT - Type 2]

I have a use case that goes as follows:

InputPort (flat files) --> ProcessFlowfiles (a combination of ExecuteScript processors to filter out a few fields) --> Load to Teradata (using a custom processor)

To do so, I have to be sure that the flowfiles are processed in a certain order across the nodes so that the individual transactions are correct (e.g. an update statement must not execute before the insert statement for that record). I receive all the flat files from a remote NiFi instance, meaning they are distributed across the nodes and there is no way for me to know in what order the files are being processed. Has anyone encountered a similar problem before and know a way out of this scenario?

Also, I came across this JIRA (https://issues.apache.org/jira/browse/NIFI-4155). I see that a patch is already available for it, but I don't see a release version tied to it. Is there a plan to include this patch as part of 1.5.0 or any future release?

Another question: Koji Kawamura suggested the following in the comment section of the above JIRA:
EnforceOrder --> Wait (to block so that only 1 FlowFile can go through) --> Processors required to run serially --> Notify (to release the latch)

Apart from this, is there any other way to enforce order across multiple nodes?
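
To make the ordering requirement concrete, here is a minimal Python sketch of the behaviour I am after. It is not NiFi code and not how EnforceOrder is implemented. It assumes each file carries a sequence number, which is my own assumption for illustration, and the function name `release_in_order` and the sample statements are hypothetical. It also runs in a single process, so it does not solve the cross-node part of the question.

```python
from typing import Dict, Iterable, Iterator, Tuple


def release_in_order(
    arrivals: Iterable[Tuple[int, str]], first_seq: int = 1
) -> Iterator[str]:
    """Yield statements in sequence order, buffering any that arrive early."""
    pending: Dict[int, str] = {}  # statements that arrived ahead of their turn
    next_seq = first_seq          # sequence number allowed through next
    for seq, statement in arrivals:
        pending[seq] = statement
        # Release every buffered statement that is now next in line.
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1


# Hypothetical arrival order after the files were spread across nodes:
# the update for record 1 shows up before its insert.
arrivals = [(2, "UPDATE rec 1"), (1, "INSERT rec 1"), (3, "UPDATE rec 1 again")]
ordered = list(release_in_order(arrivals))
print(ordered)
```

The update that arrives first is held back until the insert with the lower sequence number has been released, which is the guarantee I need before the Teradata load.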