Thread Management within Processors

Thread Management within Processors

Brian Ghigiarelli
Hi all,

Is there a consensus on creating thread pools and managing them within
processors (as opposed to using the Concurrent Tasks to handle thread
management)?

In particular, we've rolled a different version of the GetKafka processor,
but it doesn't use the getConcurrentTasks() like the current version does.
Instead, we extend AbstractSessionFactoryProcessor, create our own thread
pool in onTrigger, and handle shutdowns / restarts with the remaining
lifecycle hooks.  In any case, it's outside of the context of the
concurrent tasks managed in NiFi.  Goodness / badness?
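
A minimal sketch of the pattern described above, using only the JDK and not the actual processor: a component that creates its own pool on start and tears it down on stop. The class name SelfManagedWorkers, its methods, and the placeholder polling loop are hypothetical. In a real processor, start and stop would presumably be driven by NiFi lifecycle hooks such as @OnScheduled and @OnStopped rather than called by hand.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a processor that owns its thread pool.
final class SelfManagedWorkers {
    private final AtomicLong processed = new AtomicLong();
    private ExecutorService pool; // guarded by "this"

    // Assumed to be called from a "start" lifecycle hook.
    synchronized void start(int threads) {
        if (pool != null) {
            return; // already running
        }
        ExecutorService p = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            p.execute(this::consumeLoop);
        }
        pool = p;
    }

    // Placeholder for polling an external source such as a Kafka stream.
    private void consumeLoop() {
        while (!Thread.currentThread().isInterrupted()) {
            processed.incrementAndGet();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag so the loop exits
            }
        }
    }

    // Assumed to be called from a "stop" hook; returns true if all workers exited in time.
    synchronized boolean stop(long timeoutMillis) throws InterruptedException {
        ExecutorService p = pool;
        if (p == null) {
            return true; // nothing to stop
        }
        pool = null;
        p.shutdownNow(); // interrupts the worker loops
        return p.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    long processedCount() {
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        SelfManagedWorkers workers = new SelfManagedWorkers();
        workers.start(2);
        Thread.sleep(100);
        boolean stopped = workers.stop(2000);
        System.out.println("stopped cleanly: " + stopped
                + ", iterations: " + workers.processedCount());
    }
}
```

The point of the sketch is that stop has to interrupt the workers and wait for them, since the framework knows nothing about threads created this way.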

Thanks,

--
Brian Ghigiarelli

Re: Thread Management within Processors

Mark Payne
Brian,

I'd say managing your own thread pool in general is bad -- but not
necessarily a super terrible thing.
We've done similar things before. It really just comes down to: do the
benefits outweigh the cons?

Thanks
-Mark

------ Original Message ------
From: "Brian Ghigiarelli" <[hidden email]>
To: [hidden email]
Sent: 4/29/2015 9:14:24 PM
Subject: Thread Management within Processors

>Hi all,
>
>Is there a consensus on creating thread pools and managing them within
>processors (as opposed to using the Concurrent Tasks to handle thread
>management)?
>
>In particular, we've rolled a different version of the GetKafka
>processor,
>but it doesn't use the getConcurrentTasks() like the current version
>does.
>Instead, we extend AbstractSessionFactoryProcessor, create our own
>thread
>pool in onTrigger, and handle shutdowns / restarts with the remaining
>lifecycle hooks. In any case, it's outside of the context of the
>concurrent tasks managed in NiFi. Goodness / badness?
>
>Thanks,
>
>--
>Brian Ghigiarelli

Re: Thread Management within Processors

Adam Taft
In reply to this post by Brian Ghigiarelli
Note that it's not uncommon for an ingress processor to have its own
thread pool.  For example, any processor that opens a listening server
socket would likely maintain its own thread pool -- Jetty, for instance,
uses its own.

If the processor developer wants to control the pace of ingress and "knows
better" how to control the threads, I don't think it's terrible to run your
own thread pool (ideally with daemon threads, just in case).
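
To make the daemon-thread suggestion concrete, here is a minimal sketch using only the JDK. The helper class DaemonPools and the "ingress-worker" name prefix are hypothetical, not part of NiFi. It builds a fixed-size pool through a ThreadFactory that marks every thread as a daemon, so a pool that is never shut down cannot keep the JVM alive on its own.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of NiFi.
final class DaemonPools {
    private DaemonPools() {}

    // Fixed-size pool whose threads are named "<prefix>-N" and flagged as daemons.
    static ExecutorService newDaemonPool(int size, String namePrefix) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, namePrefix + "-" + counter.incrementAndGet());
            t.setDaemon(true); // must be set before the thread is started
            return t;
        };
        return Executors.newFixedThreadPool(size, factory);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = newDaemonPool(2, "ingress-worker");
        Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
        System.out.println("worker is daemon: " + pool.submit(probe).get());
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Daemon threads are only a safety net here; an orderly shutdown from the processor's stop hooks is still the primary mechanism.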

In general, an "internal" processor sitting between the boundaries of the
dataflow (neither ingress nor egress) should likely only ever use the
NiFi-configured threads. But for ingress/egress processors sitting on the
border, there might very well be justification for your own thread pool.

That's my two cents,

Adam



Re: Thread Management within Processors

Brian Ghigiarelli
Thanks for the responses, guys.  Coincides with our own understanding as
well, and it's nice to have the community weigh in for a sanity check.

On Thu, Apr 30, 2015 at 4:22 PM, Adam Taft <[hidden email]> wrote:

> Note that, it's not uncommon for an ingress processor to have its own
> thread pool.  For example, any processor which opens a listening server
> socket would likely maintain its own thread pool -- jetty, for example,
> uses its own thread pool.
>
> If the processor developer wants to control the pace of ingress and "knows
> better" how to control the threads, I don't think it's terrible to run your
> own thread pool (ideally with daemon threads, just in case).
>
> In general, an "internal" processor sitting between the boundaries of the
> dataflow (neither ingress or egress) should likely only ever use the NIFI
> configured threads. But for ingress/egress processors sitting on the
> border, there might very well be justification for your own thread pool.
>
> That's my two cents,
>
> Adam

--
Brian Ghigiarelli
570-878-9139