Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Joe Witt
Hello

This exchange occurred today in the #nifi IRC chatroom.  Wanted to
memorialize it here and respond.

Big thanks to Josh Elser for manning the booth and helping out!

The question:
+++++++++++++++++
11:57 < ilesteban> Hi guys. I started to play around with NiFi
recently. I'm planning to start using it for a project I'm currently
working on.
11:57 < ilesteban> I have some bpm background and I'm
trying to see what are the things that bpm and nifi have in common and
what are the things that they don't.
11:57 < ilesteban> One of the first things I noticed is
that, apparently, in Nifi there is no separation between the workflow
definition and the workflow execution environment.
11:57 < ilesteban> You define and execute your workflow in
the same environment.
11:57 < ilesteban> I like the idea, but I'm struggling to
see how this fits into a distributed team environment.
11:57 < ilesteban> Let's say I want to version a workflow
(in git, svn, or whatever) so I can share it among different team
members. I also want to keep versions from different milestones in my
project.
11:57 < ilesteban> How can I do that?
11:57 < ilesteban> Do I have to version the whole NiFi app?
11:57 < ilesteban> Is there any way to extract just the
workflow definition (and maybe configuration) in a more
versioning-friendly format?
11:57 < ilesteban> The way I used to do this with bpm
(jBPM in particular) was to export the process definition into XML and
then version that XML. Another team member could then get that XML,
import it into an editor and con
12:03 < elserj> Last I knew, there wasn't good versioning built into a workflow
12:03 < elserj> and it wasn't serialized in such a way that was
conducive to sharing among users
12:03 < elserj> i'm not sure if that's been changed
12:03 < ilesteban> elserj, I'm thinking about templates
12:03 < elserj> you'd probably be better off asking on
[hidden email] though :)
12:04 < ilesteban> maybe I can export a template and version it
12:04 < ilesteban> elserj, I'll do that
12:14 < elserj> sorry I'm not of much help. I don't do much with Nifi.
I just try to help out where I can :)
+++++++++++++++

Response:

NiFi certainly does take a different approach to the more common
'design and deploy' model.  NiFi provides a real-time modification
capability that affords immediate feedback.  This model is really
powerful in that it lets you immediately see the good or bad of
changes being made.  Further, you can safely copy the live flow of data
off for evaluation and testing without impacting the production flow.
The whole idea is to foster and encourage iteration which we believe
is best done in the fires of real data.

But the point you bring up about being able to capture or save a
design and share it with others is great too.  NiFi supports templates
for this exact reason.  You can make real proven dataflows and then
generate templates of them, version them as you wish, and share them
with others.  They can import them into their flows, improve them, and
share them back and so on.  I think there is more we can do to improve
that but that will come with feedback and based on what folks are
trying to accomplish.

Thanks
Joe

Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Bryan Bende
Joe,

In terms of sharing a flow configuration across a development team, do you
recommend against version controlling the underlying flow.xml.gz file?

I realize it is very specific to the internals of NiFi, but if you need to
replicate an exact flow across environments (developer machine,
integration, production), and possibly automatically deploy a flow, I
wasn't sure if templates solved that problem. Admittedly I don't know much
about templates.

The obvious downside is that the flow.xml.gz is not friendly for merges
when two people modify it, so you have to be extremely careful that you are
modifying the latest one. One approach is to make a rule that everyone on
the Dev team always modifies the flow on a shared NiFi instance, and copies the
flow.xml.gz back to version control from that instance. This way if two people
modify it at the same time, both their changes should be captured.
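
One way to soften the merge problem described above is to commit a decompressed, re-indented copy next to the binary file, so that diffs become readable. The following is only a minimal sketch: it assumes flow.xml.gz is an ordinary gzip-compressed XML document (as the name suggests), and the function name and sample XML are made up for illustration.

```python
import gzip
import xml.dom.minidom


def flow_to_diffable_xml(gz_bytes: bytes) -> str:
    """Decompress gzipped XML and re-indent it with one element per line."""
    raw = gzip.decompress(gz_bytes)
    dom = xml.dom.minidom.parseString(raw)
    pretty = dom.toprettyxml(indent="  ")
    # toprettyxml can emit blank lines when the input already contains
    # whitespace-only text nodes; drop them so the output stays stable.
    lines = [line for line in pretty.splitlines() if line.strip()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Stand-in content; a real flow.xml.gz would be read from disk instead.
    sample = gzip.compress(
        b"<flowController><rootGroup><name>demo</name></rootGroup></flowController>"
    )
    print(flow_to_diffable_xml(sample))
```

This does not make concurrent edits mergeable; it only makes it easier to review what changed between two committed versions.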

Thoughts?

-Bryan

On Thursday, February 19, 2015, Joe Witt <[hidden email]> wrote:


Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Joe Witt
Bryan,

I do think sharing the flow.xml.gz is problematic:

- It contains the sensitive properties (albeit encrypted).  Templates
do not include those values at all.  So with templates you're not
unnecessarily increasing exposure of those values.

- It is all or nothing.  Templates are 'pieces of the flow' and you
can treat them as such, import them in that manner, etc.

- Automatic/continuous deployment for testing, etc.  I think templates
could also be used for this.  All the invocations go through the REST
API anyway, so a client could be written to do the automatic deployment
stuff.

Multiple devs editing a shared instance:

- This is interesting.  So the situation you describe is one where
there is so much live-change activity going on between 2 or more
people that the optimistic-locking model NiFi has is problematic?  One
thing I know we've sort of kicked around previously is that of
isolating the scope of the 'flow lock' to a single process group.
That way two people could work on sibling process groups with no worry
about one getting the lock before the other, as is the case now.  If
this is really a problem you see that is worth solving, though, that is
really an exciting problem for us to have.
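
To make the locking idea above concrete, here is a rough sketch of optimistic locking with one revision counter per scope. It is not NiFi's implementation; the class, function, and scope names are hypothetical and only illustrate why narrowing the scope from the whole flow to a single process group lets two editors work side by side.

```python
class StaleRevisionError(Exception):
    """Raised when an edit is based on an out-of-date revision."""


class RevisionedFlow:
    """Tracks one revision counter per lock scope (hypothetical model)."""

    def __init__(self):
        self._revisions = {}  # scope name -> current revision number
        self._configs = {}    # scope name -> last accepted configuration

    def revision(self, scope):
        return self._revisions.get(scope, 0)

    def update(self, scope, client_revision, config):
        """Accept the edit only if the client saw the latest revision."""
        current = self.revision(scope)
        if client_revision != current:
            raise StaleRevisionError(
                f"{scope}: client has revision {client_revision}, current is {current}"
            )
        self._configs[scope] = config
        self._revisions[scope] = current + 1
        return self._revisions[scope]


def edit_concurrently(flow, scope_a, scope_b):
    """Two editors read their revisions first, then both try to save."""
    rev_a = flow.revision(scope_a)
    rev_b = flow.revision(scope_b)
    results = []
    for scope, rev, who in ((scope_a, rev_a, "alice"), (scope_b, rev_b, "bob")):
        try:
            flow.update(scope, rev, {"edited_by": who})
            results.append("ok")
        except StaleRevisionError:
            results.append("stale")
    return results


if __name__ == "__main__":
    # One lock for the whole flow: the second save is rejected.
    print(edit_concurrently(RevisionedFlow(), "flow", "flow"))
    # One lock per process group: sibling groups do not collide.
    print(edit_concurrently(RevisionedFlow(), "group-1", "group-2"))
```

With a single flow-wide scope the second editor has to refresh and retry; with per-group scopes both saves go through because each group keeps its own revision.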

Thanks
Joe


On Thu, Feb 19, 2015 at 11:13 PM, Bryan Bende <[hidden email]> wrote:


Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Toivo Adams
Often there are strictly separate environments:
  Test environment
  Production environment

Let's assume we have 10+ flow templates which describe different processes.
Each of them contains connections to a database, to a JMS broker and to remote servers.

A test environment should be as close to production as possible.
Often it is not allowed to do any testing in the production environment.
This means that templates should be exactly the same for the test and production environments.
Only the database, JMS broker and remote servers are different.

It would be extremely helpful to have some kind of named resources.
For example, a template contains only a reference to a named resource DATABASE.

When we use the template in the test environment, DATABASE will be mapped to the test database.
And when we move the template to the production environment, it will use the live database instead.

This way we can easily test, modify and deploy without changing configuration in each template when switching between test and live.
And we need not worry about possible small mistakes and differences in configuration.
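
The named-resource idea could look roughly like the sketch below. It is only an illustration of the request, not an existing NiFi feature: the `${NAME}` reference syntax, the resource names, and the URLs are all assumptions made for the example.

```python
from string import Template

# Hypothetical per-environment bindings; every name and URL here is made up.
RESOURCES = {
    "test": {
        "DATABASE": "jdbc:postgresql://test-db.example/app",
        "JMS_BROKER": "tcp://test-jms.example:61616",
    },
    "production": {
        "DATABASE": "jdbc:postgresql://prod-db.example/app",
        "JMS_BROKER": "tcp://prod-jms.example:61616",
    },
}


def bind_resources(template_text, environment):
    """Replace ${NAME} references with the values bound for one environment.

    Template.substitute raises KeyError for an unbound name, so a missing
    resource fails loudly instead of yielding a half-configured flow.
    """
    return Template(template_text).substitute(RESOURCES[environment])


if __name__ == "__main__":
    shared_template = "<property name='Database URL'>${DATABASE}</property>"
    print(bind_resources(shared_template, "test"))
    print(bind_resources(shared_template, "production"))
```

The template text stays byte-for-byte identical between environments; only the binding table differs, which is the property being asked for.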

Thanks
toivo

Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Joe Witt
Toivo,

The concept of some sort of named resources is understandable.  In
general though the distinction of production versus test is something
that is very blurry in the world of dataflow.  Our context for
dataflow here is highly sensor driven continuous automated flows of
information from sensors/datasources, to processing/analysis systems,
to repositories.  In that world the idea of production vs test is
'data you can't lose' versus 'data you can tolerate to lose'.
Bringing that to your example, one way to think of this is that the 'dev'
database and the 'production' database are delivered data from the
same NiFi.  There wouldn't be a 'production' NiFi and a 'test' NiFi.
It would be the dataflow cluster.  And it would serve development and
production activities equally.  It is designed for just that.

Ultimately the idea of production versus test often means fairly large
protracted cycles from idea to production outcome.  NiFi is tuned to
make execution of idea to production as fast as possible by enabling
live iterative improvements with immediate feedback.  This is really a
core value proposition of NiFi.

You can edit the live flow in fine-grained ways, adding production
destinations and development destinations, and each of them gets an
appropriate quality of service.  You won't need named resources to
represent TEST vs PRODUCTION because those destinations live on the
graph.  When your test flow is ready to go live you click and drag
the flow of data over to production and there you go.

I'd be happy to explain it further over a Google Hangout or whatever.

This is really getting to the heart of our dataflow philosophy and it
is rather different than traditional ETL or messaging approaches.

Thanks
Joe

On Sun, Feb 22, 2015 at 10:53 AM, Toivo Adams <[hidden email]> wrote:


Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Toivo Adams
Joe,

Thank you for the explanation.

I hope I understand the main idea of NiFi.
It's really well thought out and implemented.
I like it very much.

But.
“Ultimately the idea of production versus test often means fairly large
protracted cycles from idea to production outcome.”
Welcome to my world.
I work for a financial institution.

Maybe I am naive, but I do see NiFi as a general way to do software development in many business domains.
I would hate to hear that NiFi is not usable in a financial institution.

I've seen a lot of different business applications.
Among the worst of them are EJB-style monolithic web applications.
I have learned that software should be implemented as side-effect-free components which do only one thing but do it very well. Software development is very costly, and reusing components is the key to keeping development cost at a reasonable level.
Debugging and scaling are also much simpler with such components.
So NiFi is a perfect fit.

IDEAL world
Development and operations are separated.
Almost no one is allowed to see live data because it contains highly sensitive customer data.
Breaking the rules may lead to immediate firing. (This is not theory; it has happened a few times.)
So development must use scrambled test data.
Before even a single simple change can be applied to the live system, it must pass through a strict change management procedure. Yes, this usually takes 1-2 days when everything is OK.
And the deployment is done by operations (not by developers).
The positive side is that all changes to live systems are recorded and a rollback can be done quickly at any time (s*it happens).
Once a new application or change is live, it might run for many years without any effort from the development team.

REAL world
Sometimes something still goes wrong, whatever the reason is.
Usually monitoring will raise an alert and forward the problem to a predetermined person (or persons).
When operations (administration) is unable to resolve the problem, it ends up on some lead developer's desk. In this case the developer is authorized to use live data to solve the problem quickly.
Here NiFi can again be very, very valuable.

Summary
I want to use NiFi, but at the same time I must follow our strict test/live environment rules.
Otherwise NiFi would not be accepted at all.


Thanks
toivo

Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Corey Flowers
Good afternoon Toivo,

I work NiFi operations/integration daily, on very vital datasets,
and can tell you that we too had to change the views and procedures of our
customers/leadership to accept this type of thinking. NiFi is a step
forward in the evolution of software development, not a piece of software
to be placed into previous software development methodologies. Our team has
had this fight numerous times, from canned data, to test environments, to
upgrading policies. We even had one customer tell us we needed a 17-day
notice just to restart NiFi, because all previous versions of their software
had to be started in a sequence and took 2 hours to start back up,
and this was their current operating posture. In our daily activities our
operations personnel work hand in hand with the developers to integrate new
processors into production environments, usually by cloning production data
into grouped development processors on the production graph. This allows us
to expedite the integration process, save money on building test
environments, and see real load on the production
suite/cluster. There is no doubt that this is a change not only in development
processes but also in the mindset around software development in general. I
can only assure you the fight is well worth it in the end.

Corey

On Sun, Feb 22, 2015 at 4:46 PM, Toivo Adams <[hidden email]> wrote:


--
Corey Flowers
Vice President, Onyx Point, Inc
(410) 541-6699
[hidden email]

-- This account not approved for unencrypted proprietary information --

Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Bryan Bende
What if you are making significant changes to a custom processor?

Using Toivo's example, if there is a custom processor for inserting into a
relational database, and the graph has one instance inserting into a
production database, and another inserting into a test database, is there a
good way to test your latest NAR and not affect the production part of the
flow?

-Bryan


On Sun, Feb 22, 2015 at 5:59 PM, Corey Flowers <[hidden email]>
wrote:


Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Mark Payne
In reply to this post by Joe Witt
Toivo,


I think Corey’s e-mail here is a response directly to your comment of “Before even single simple change can be applied to Live system, strict change management procedure must be pass through. Yes, this takes usually 1-2 days when everything is OK.” I think the only thing that I can add in this regard is that changing your flow is NOT a change to software. It is simply a change to software configuration. If a configuration change requires days then you may have an uphill battle, but just as Corey is recommending, I still think it’s worth a fight.


You made the point that “development must use scrambled test data.” I don’t think this is necessarily a problem. We suggested teeing off a copy of the data and pushing to both a production database and a test database. That doesn’t mean that the exact same data must go to both. You could easily have a processor ahead of the “Put to Test Database” processor that performs the obfuscation for you. For example, if the data is XML or JSON, perhaps the standard processors will be all you need to obfuscate it. Otherwise, it’s possible that you'd need to create a new processor that can handle the obfuscation. Then your flow would send the actual data to the production database and a modified version to the test database. This way, while you will not see the actual live data as-is, you will see something as close as possible to the real thing. This still allows you to operate on the live stream of data as it comes in, so that you can determine your ability to handle the load as well as the structure of the data (account numbers, etc. would be randomly generated, for instance, but the JSON would at least have the same structure). This is still far better than “test data” that was generated a while ago.
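
As a rough illustration of the obfuscation step described above, the sketch below masks selected string fields in a JSON-like record while keeping its structure, field lengths, and punctuation. It is plain Python rather than a NiFi processor; the field names and salt are hypothetical, and a salted hash used this way is not a vetted anonymization scheme.

```python
import hashlib

# Hypothetical field names; a real flow would configure its own list.
SENSITIVE_KEYS = {"account_number", "customer_name"}


def mask_text(text, salt):
    """Replace digits with digits and letters with lowercase letters,
    leaving length and punctuation intact."""
    digest = hashlib.sha256((salt + text).encode("utf-8")).digest()
    masked = []
    for i, ch in enumerate(text):
        b = digest[i % len(digest)]
        if ch.isdigit():
            masked.append(str(b % 10))
        elif ch.isalpha():
            masked.append(chr(ord("a") + b % 26))
        else:
            masked.append(ch)
    return "".join(masked)


def obfuscate(node, salt="demo-salt"):
    """Return a copy of a parsed JSON value with sensitive strings masked."""
    if isinstance(node, dict):
        return {
            key: mask_text(value, salt)
            if key in SENSITIVE_KEYS and isinstance(value, str)
            else obfuscate(value, salt)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [obfuscate(item, salt) for item in node]
    return node


if __name__ == "__main__":
    record = {
        "account_number": "1234-5678",
        "amount": 10.5,
        "owner": {"customer_name": "Alice"},
    }
    print(obfuscate(record))
```

The copy sent to the test database keeps the same keys and value shapes as the production record, so downstream parsing and load behave as they would on real data.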


However, I am arguing that this obfuscated test data is great for testing new processors and new code. It should not be needed to test the flow. The “test” version of the flow should be built alongside the production flow; after it has been verified, you just make the switch and delete the old version.


I come from an extremely stringent environment as well. This ability of NiFi to take a live stream of data, process it to meet production needs, and then modify the data to send to a test environment (or perhaps filter data and only send data that is deemed OK as-is to a test environment) is in my opinion one of the most powerful features.


Does this make sense?








From: Corey Flowers
Sent: Sunday, February 22, 2015 6:01 PM
To: [hidden email]





Good afternoon Toivo,

        I work nifi operations/integration daily, on very vital datasets,
and can tell you that we too had to change the views and procedures of our
customers/leadership to accept this type of thinking. NIFI is a step
forward in the evolution of software development not a software to be
placed in previous software development methodologies. Our team has had
this fight numerous times, from canned data, to test environments to
upgrading policies. We even had one customer tell us, we needed a 17 day
notice just to restart NIFI because all previous versions of their software
had to be started in a sequence, previously took 2 hours to start back up
and this was their current operating posture. In our daily activities our
operations personnel work hand in hand with the developers to integrate new
processors into production environments, usually by cloning production data
into grouped development processors on the production graph. This allows us
to expedite the integration process, save money on building test
environments and also allows us to see real load on the production
suite/cluster. There is no doubt, this is not only a change in development
processes but also the mindsets around software development in general. I
can only assure you the fight is well worth it in the end.

Corey

On Sun, Feb 22, 2015 at 4:46 PM, Toivo Adams <[hidden email]> wrote:

> Joe,
>
> Thank you for explanation.
>
> I hope I understand NiFi main idea.
> And it's really well thought and implemented.
> I like it very much.
>
> But.
> “Ultimately the idea of production versus test often means fairly large
> protracted cycles from idea to production outcome.”
> Welcome to my world.
> I work for financial institution.
>
> Maybe I am naive but I do see NiFi as general way how to do software
> development in many business domains.
> I hate to hear NiFi is not usable in financial institution.
>
> I've seen a lot of different business applications.
> One of worst of them are EJB style monolithic web applications.
> I have learned that software should be implemented as side-effect free
> components which will do only one thing but do it very well. Software
> development is very costly and reusing components is the key to
> keep development cost at reasonable level.
> Also debugging and scalability is much simple using such components.
> So NiFi is perfect fit.
>
> IDEAL world
> Development and operations are separated.
> Almost none are allowed to see live data because it contains highly
> sensitive customer data.
> Breaking rules may lead immediate firing. (this is not theory, it has
> happen
> few times)
> So development must use scrambled test data.
> Before even single simple change can be applied to Live system, strict
> change management procedure must be pass through. Yes, this takes usually
> 1-2 days when everything is OK.
> And deploy will be done by operations (not by developers).
> Positive is what all changes to live systems are recorded and anytime roll
> back can be done quickly (s*it happens).
> When whatever new application or change is in live, it might be running
> many
> years without any efforts from development team.
>
> REAL world
> Sometimes something still goes wrong whatever the reason is.
> Usually monitoring will get alert and will forward problem to predetermined
> person(persons)
> When operations (administrations) are unable to resolve problem this will
> end up to some developer leader desk. In this case developer is authorized
> to be use live data to solve problem quickly.
> Now NiFi can again be very, very valuable.
>
> Summary
> I want to use NiFi but at the same time I must follow our strict test/live
> environment rules.
> Or NiFi would not be accepted at all.
>
>
> Thanks
> toivo
>
>
>
>
> --
> View this message in context:
> http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/Great-question-on-nifi-IRC-room-today-NiFi-BPM-sharing-configuration-tp787p801.html
> Sent from the Apache NiFi (incubating) Developer List mailing list archive
> at Nabble.com.
>



--
Corey Flowers
Vice President, Onyx Point, Inc
(410) 541-6699
[hidden email]

-- This account not approved for unencrypted proprietary information --

Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Mark Payne
In reply to this post by Joe Witt
Bryan,


This is largely what I was trying to touch on in my last response to Toivo. It’s also what I was touching on in the last blog that I wrote: https://blogs.apache.org/nifi/entry/say_good_bye_to_canned


This is largely how I test all of my software at my “day job.” I spawn a duplicate feed to push to a test instance. That duplicate feed could then be modified/obfuscated as needed for the test cluster. Or you could filter out only part of it to send to the test cluster.


Does this make sense? Or am I just confusing people? 😊








From: Bryan Bende
Sent: Sunday, February 22, 2015 7:44 PM
To: [hidden email]





What if you are making significant changes to a custom processor?

Using Toivo's example, if there is a custom processor for inserting to a
relational database, and the graph has one instance inserting to a
production database, and another inserting to a test database, is there a
good way to test your latest Nar and not affect the production part of the
flow?

-Bryan



Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Mark Payne
In reply to this post by Joe Witt
I think one of the things here that may be confusing people is that we are talking about “testing” here in general terms. There are really two types of testing that are relevant here: testing of code (new nars), and testing of dataflow configuration.


To test a new dataflow configuration, I would create the test flow alongside the production flow on the production instance. When you are happy with what you see, you can switch the production flow to the new one and kill off the old flow.


To test new code, as you are discussing here, I would run the new code in a test instance of NiFi. I would then have the production instance push a copy of the production data (possibly a filtered subset of the data or an obfuscated version of the data, as requirements may dictate) to the test instance, and verify that the code works well. This is the approach that I discussed in the blog post.
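
To make the "push a copy to the test instance" idea concrete, here is a small Python sketch under stated assumptions: `tee_to_test`, `keep`, and `transform` are hypothetical names, and the destinations are plain labels standing in for the production flow and the test instance. It is not how NiFi implements this internally.

```python
def tee_to_test(records, keep=lambda r: True, transform=lambda r: r):
    """Yield (destination, record) pairs.

    Every record goes to production unchanged. Records accepted by `keep`
    are also copied to test, after `transform` (e.g. obfuscation) is applied.
    """
    for record in records:
        yield ("production", record)
        if keep(record):
            # Work on a copy so the production record is never modified.
            yield ("test", transform(dict(record)))


feed = [{"id": 1, "ok": True}, {"id": 2, "ok": False}]
routed = list(
    tee_to_test(
        feed,
        keep=lambda r: r["ok"],               # filtered subset for test
        transform=lambda r: {**r, "id": 0},   # stand-in for obfuscation
    )
)
for destination, record in routed:
    print(destination, record)
```

Production always receives the full, unmodified stream; only the copy is filtered or altered.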


Thanks

-Mark






Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Joe Witt
In reply to this post by Mark Payne
Toivo: Congrats and thanks for kicking off a really important thread.
This is going to be an important thing for us to understand and
communicate effectively.  I think several of us have been doing it
differently now for a while and so we think and talk about it
differently as a result.

I totally accept that the scenario and needs you describe are real.
Just want to really understand the details and see if there is a more
'with the grain' approach.  I definitely hope NiFi works out to be a
really useful tool for your case and if we can find a clean and
consistent way to support your scenario great.  We're trying to be
awesome at the sensor driven use cases.  There are lots of cases to
suggest that we have strength as a general dataflow solution, but
we'll see.  So as long as you'll keep working with us to understand
we'll explore it and figure it out.  We need to be good as a community
at both accepting new ideas and also accepting that some things just
don't fit well.

Keep in mind that our base design being oriented on Flow Based
Programming encourages highly cohesive and loosely coupled processes.
This focus on cohesion tends to dramatically ease the testing
challenges that exist.  True scale tests are almost impossible to
emulate until you're in the full heat of the production fire.  But in
any event the FBP model does a great job of helping you knock down the
vast majority of the issues through truly effective unit testing.
These cohesive elements are also great for reuse as you mentioned
previously.

For now is the only thing you're requesting that you'd like to see
variables that can be substituted?  We've talked about doing something
like this before not necessarily for replacement purposes but because
we wanted people to be able to set a value once and reuse it often.
But I could see that supporting this too.  I want to really understand
your case better before we head down that route.

Thanks
Joe


Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Toivo Adams
Corey,
Thank you for sharing your experience.

Bryan,
This is often the case.
A changed custom processor must be tested in the test environment using the flow template.
Only after successful testing and the bureaucracy can the new nar be moved to the live system.
The template (flow) itself might not change at all.

Mark,
Right, adding a scrambling component to the flow is a good idea for automatically getting data for testing and problem solving. I think our security team will accept this.
But mixing test and live environments is strictly prohibited.
Often it is not even possible: the live and test environments are behind different firewalls, and direct links between them are not possible.
So we can get data from the live system, but testing must be done in the test environment.

“If there's one thing that I've learned in my years as a software developer, it's that no matter how diligent we are in testing our code, we get data in production that we just haven't accounted for.”
Yes, painfully true and happens too often.
It's especially embarrassing after spending so many hours on testing. And you must explain to business people why a well-tested application has problems in live.
:)
But rules are rules and we must follow them.
Over time, once NiFi is accepted, hopefully some rules will change.
Before any rule is changed, the reason must be well explained: benefits, risks, etc.

Joe,
Some of our current processes are implemented using the Spring framework.
A Spring XML configuration file describes how Java beans are tied together to create a sort of 'processing flow'.
Of course such a configuration is very static and primitive compared to NiFi.
We keep one set of configuration files in the test environment and a separate set in the live environment. Sometimes they go out of sync. Keeping them in sync is manual work, and people do make mistakes.

We have a few new applications under development where we have the concept of Test.conf and Live.conf.
The application configuration is exactly the same for test and live. Only Test.conf and Live.conf contain different information, mostly just 'endpoints', for example DATABASE.
When the application is in the test environment it picks Test.conf automatically.
This is very handy. Test.conf and Live.conf change very rarely.
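
A minimal Python sketch of that Test.conf / Live.conf selection might look like the following. How the application actually detects its environment is not stated here, so the `APP_ENV` variable and the `config_path` helper are purely assumptions for illustration.

```python
import os


def config_path(environ=os.environ):
    """Pick the endpoint file for the current environment.

    Assumes a hypothetical APP_ENV variable set to "test" or "live";
    defaults to test so a misconfigured host never touches live endpoints.
    """
    env = environ.get("APP_ENV", "test").lower()
    if env not in ("test", "live"):
        raise ValueError(f"unknown environment: {env!r}")
    return f"{env.capitalize()}.conf"


print(config_path({}))                   # Test.conf
print(config_path({"APP_ENV": "LIVE"}))  # Live.conf
```

Everything else in the application configuration stays identical between environments; only the file chosen here differs.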

So now I'll examine the possibilities.
I always like to reduce the opportunities for mistakes and reduce development time.
Remove error-prone manual work, eliminate boilerplate code, DRY, prefer immutable data, side-effect free code, etc.

“For now is the only thing you're requesting that you'd like to see
variables that can be substituted? ”
Yes, it would be nice to have something like that.
It's not a show-stopper.
I don't know how this should be implemented.
One possible workaround is to create a special Controller Service (or services)?
For example, https://issues.apache.org/jira/browse/NIFI-322
(New Database Connection Pooling Controller Service)
will offer some of this functionality.

I have other ideas as well, but let's move step by step.
:)

Thanks for the thinking and support
toivo

Re: Great question on #nifi IRC room today: NiFi , BPM, sharing configuration

Joe Witt
Toivo,

Please consider the following as an alternative to using substitution
of variables for things like PROD db and TEST db:

Part of the power of NiFi is the real-time command and control
enabling immediate fine-grained modification but also clear and
concise visualization of what is happening in the flow.  There are
often little variations between a flow in production and a flow in
test environments because it is indeed so difficult to emulate it all
perfectly.  So, please consider setting up your flow such that you use
a RouteOnAttribute processor to send data to your 'production'
database or your 'test' database based on an environment variable.

So your flow could be something like

<do stuff>  --> Route Based on Environment --> Production DB
                                           --> Test DB

The 'Route Based On Environment' processor could be:

Add a property with name:
    production
And value:
   ${PROD_ENV:matches('TRUE')}


That way, when you have an environment variable called "PROD_ENV" set
to "TRUE", data leaving that processor will go to the 'production'
relationship; otherwise it will go to 'unmatched', which you can route
to test.

This way it is the same flow configuration in both environments but
the path the data will take will vary based on your environment.  This
has the same effect as substitution but is more intentional and visual
and it supports more complex differences between prod and test
environments.
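
The routing decision described above can be sketched in a few lines of Python. This only emulates the logic and is not NiFi code: the `route` function is hypothetical, the fall-through relationship is labeled `unmatched` in this sketch, and `re.fullmatch` is used on the assumption that `matches` must match the whole value.

```python
import re


def route(environment, pattern="TRUE"):
    """Return the relationship a record would follow.

    `environment` is a mapping of variable names to values (for example
    os.environ). A missing PROD_ENV is treated as no match.
    """
    value = environment.get("PROD_ENV", "")
    return "production" if re.fullmatch(pattern, value) else "unmatched"


print(route({"PROD_ENV": "TRUE"}))   # production
print(route({"PROD_ENV": "FALSE"}))  # unmatched
print(route({}))                     # unmatched
```

The same function runs unchanged on both hosts; only the value of `PROD_ENV` decides which database receives the data.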

Thanks
Joe
