load balancer on nifi cluster

load balancer on nifi cluster

pradeepbill
Hello folks,

We have a 3-node cluster, and I am looking for a load-balancer setup in which the load is distributed evenly between the nodes, but so far I could not find any guide on how to set one up.
My idea is that all the logs get sent to a load balancer, and from there the load is split equally across the cluster. Right now, logs arrive on individual nodes of the cluster: for example, log sources 1-10 are sent to node 1, sources 11-20 to node 2, and sources 21-30 to node 3. I am not sure whether the nodes are load balanced automatically without a load balancer.

Please advise.

Thanks
Pradeep

Re: load balancer on nifi cluster

Bryan Bende
Hi Pradeep,

Can you clarify what you are trying to load balance? How is data
entering your flow?

I'm going to assume you are talking about load balancing data coming
in to the system, and not end users accessing the web UI, but correct
me if I am wrong.

Load balancing the data depends a lot on how the data is being brought
into the system. This article [1] explains approaches for how to
distribute the data across the cluster.

Thanks,

Bryan

[1] https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html

On Fri, Apr 21, 2017 at 10:15 AM, pradeepbill <[hidden email]> wrote:

> hello folks, We have a 3 node cluster, and I was looking for a load balancer
> kind of scenario, where the load is distributed evenly between the nodes,
> but so far I could not find any guides on how to set up a load balancer. Can
> you please advise.
>
> Thanks
> Pradeep
>
>
>
> --
> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/load-balancer-on-nifi-cluster-tp15511.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: load balancer on nifi cluster

pradeepbill
Thanks for the article, Bryan.

Ours is a push mechanism: data gets sent to an IP:PORT of a node in the cluster, and we use the ListenSyslog processor to listen on that port. Yes, we are trying to load balance the data so as not to overwork any particular node in the cluster. The article says we can put a load balancer in front of the cluster; where can I get information on that?

Also, the article says:

"In an Apache NiFi cluster, every node runs the same dataflow and data is divided between the nodes."

Does this mean we do not have to load balance, since we have the same processor running on all the nodes?

Thanks again.
Pradeep

Re: load balancer on nifi cluster

Bryan Bende
Correct. Depending on how you are using ListenSyslog, you need a load
balancer that supports TCP or UDP.

NGINX is one of them, although I am not sure if the free version
supports TCP/UDP:

https://www.nginx.com/resources/admin-guide/tcp-load-balancing/#upstream

HAProxy supports TCP:

http://www.haproxy.org/
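
As a rough illustration of what a TCP load balancer in front of
ListenSyslog does, here is a minimal Python sketch that hands each new TCP
session to the next node in turn. The node names, ports, and helper names
are hypothetical, and it has no health checks or UDP support; a production
setup would use a real load balancer such as the ones linked above.

```python
import asyncio
import itertools

# Hypothetical addresses: one ListenSyslog (TCP) listener per NiFi node.
NIFI_NODES = [("nifi-node1", 10514), ("nifi-node2", 10514), ("nifi-node3", 10514)]


def make_round_robin(nodes):
    """Return a function that yields the next node on every call."""
    cycle = itertools.cycle(nodes)
    return lambda: next(cycle)


async def relay(reader, writer):
    """Copy bytes from reader to writer until the sending side closes."""
    try:
        while True:
            chunk = await reader.read(65536)
            if not chunk:
                break
            writer.write(chunk)
            await writer.drain()
    finally:
        writer.close()


def make_handler(pick_node):
    """Build a connection callback that forwards each session to one node."""

    async def handle(client_reader, client_writer):
        host, port = pick_node()  # one node per new TCP session
        try:
            node_reader, node_writer = await asyncio.open_connection(host, port)
        except OSError:
            client_writer.close()  # node unreachable: drop this session
            return
        await asyncio.gather(
            relay(client_reader, node_writer),
            relay(node_reader, client_writer),
            return_exceptions=True,
        )

    return handle


async def serve(listen_port=5514, nodes=NIFI_NODES):
    """Listen on listen_port and spread incoming sessions across nodes."""
    handler = make_handler(make_round_robin(nodes))
    server = await asyncio.start_server(handler, "0.0.0.0", listen_port)
    async with server:
        await server.serve_forever()
```

Calling asyncio.run(serve()) would start the forwarder; syslog senders
would then point at the forwarder's port instead of at an individual node.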



On Fri, Apr 21, 2017 at 10:50 AM, pradeepbill <[hidden email]> wrote:

> thanks for the article Bryan,
>
> ours is a push mechanism, data gets sent to an IP:PORT of a node in  the
> cluster, and we use listensyslog processor to listen on that port, Yes ,we
> are trying to load balance data, and not to over work a particular node in
> the cluster.From the article it says we can create a load balancer before
> the cluster, where can I get info on that ?.
>
> Thanks again.
> Pradeep
Reply | Threaded
Open this post in threaded view
|

Re: load balancer on nifi cluster

Pierre Villard
Other resources that could be useful:

- https://pierrevillard.com/2017/02/10/haproxy-load-balancing-in-front-of-apache-nifi/
- http://ijokarumawak.github.io/nifi/2016/11/01/nifi-cluster-lb/

Pierre



2017-04-21 17:15 GMT+02:00 Bryan Bende <[hidden email]>:

> Correct, depending how you are using ListenSyslog, you need a load
> balancer that supports TCP or UDP.
>
> NGINX is one of them, although I am not sure if the free version
> supports TCP/UDP:
>
> https://www.nginx.com/resources/admin-guide/tcp-load-balancing/#upstream
>
> HAProxy supports TCP:
>
> http://www.haproxy.org/

Re: load balancer on nifi cluster

pradeepbill
In reply to this post by Bryan Bende
Thanks, Bryan. One more clarification about your article, please.

It says: "In an Apache NiFi cluster, every node runs the same dataflow and data is divided between the nodes." Does this mean the nodes are load balanced automatically, since we will have the same ListenSyslog component running on all the nodes?

Re: load balancer on nifi cluster

Bryan Bende
No, there isn't any automatic load balancing.

That statement means that NiFi does not take the graph of processors
and run some processors on one node and some on another node (as
Storm and other stream processing systems do).

What you see in the UI is running on every node, and you have to design
the data flow in a way that makes use of the cluster appropriately.
How you do that depends on how you bring data into the system.
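
To make the point concrete, here is a small Python sketch, assuming the
source-to-node split from the original question and made-up event rates.
Every node runs the same flow, but each node only processes what is sent
to it, so uneven sources mean uneven load.

```python
from collections import Counter


def events_per_node(source_to_node, events_per_source):
    """Sum the events each node receives under a fixed source-to-node mapping."""
    totals = Counter()
    for source, node in source_to_node.items():
        totals[node] += events_per_source.get(source, 0)
    return dict(totals)


# Mapping from the original question:
# sources 1-10 -> node1, 11-20 -> node2, 21-30 -> node3.
source_to_node = {s: f"node{(s - 1) // 10 + 1}" for s in range(1, 31)}

# Hypothetical rates: sources 1-10 are ten times chattier than the rest.
events_per_source = {s: (1000 if s <= 10 else 100) for s in range(1, 31)}

print(events_per_node(source_to_node, events_per_source))
# -> {'node1': 10000, 'node2': 1000, 'node3': 1000}
```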

On Fri, Apr 21, 2017 at 11:22 AM, pradeepbill <[hidden email]> wrote:

> also from your article , one more clarification needed please
>
> *In an Apache NiFi cluster, every node runs the same dataflow and data is
> divided between the nodes.*, does this mean the nodes are auto load balanced
> ?, as we will have the same listensyslog component running on all the nodes.

Re: load balancer on nifi cluster

pradeepbill
Thanks, Bryan.

Re: load balancer on nifi cluster

Andre
In reply to this post by pradeepbill
Pradeep,

As Bryan mentioned, load balancing a cluster boils down to the input
protocol you are using and the load-balancing outcome you want.

In general, if you use TCP-based load balancing, you will have to decide
how to configure your load balancer. Common approaches are round robin,
byte counts, response time, or health checks.
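
A minimal Python sketch of two of these policies, round robin and byte
counts, both skipping nodes that fail a health check. The Node and
Balancer names are hypothetical and are not part of NiFi or of any load
balancer product; response-time balancing is left out.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Node:
    name: str
    healthy: bool = True   # result of the latest health check
    bytes_sent: int = 0    # bytes forwarded to this node so far


class Balancer:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._turn = count()  # monotonically increasing pick counter

    def _healthy(self):
        """Return the nodes that currently pass their health check."""
        healthy = [n for n in self.nodes if n.healthy]
        if not healthy:
            raise RuntimeError("no healthy nodes available")
        return healthy

    def round_robin(self):
        """Rotate through the healthy nodes, one per new session."""
        healthy = self._healthy()
        return healthy[next(self._turn) % len(healthy)]

    def least_bytes(self):
        """Pick the healthy node that has received the fewest bytes."""
        return min(self._healthy(), key=lambda n: n.bytes_sent)
```

Round robin is the simplest choice when sessions are roughly equal in
size; the byte-count policy helps when a few senders are much chattier
than the rest.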

Since the NiFi ListenSyslog processor doesn't have any particular
requirements for load balancing (and assuming you are happy with a round
robin setup), you could simply put your TCP-based load balancer (F5,
HAProxy, ELB, etc.) in front of the cluster nodes and assign each new
session to one of the nodes, as you would do with any TCP server.

One of our committers, Koji, has put together a great example of how to
achieve load balancing with NiFi, HAProxy, and Docker (adjusting it to ELB
or F5s shouldn't be very hard).

http://ijokarumawak.github.io/nifi/2016/11/01/nifi-cluster-lb/


Round robin by itself would not solve the issue of spreading the load
evenly across nodes, but if I recall correctly, you could use something like

ListenSyslog -> Remote Process Group (i.e. Input Port)

Input port -> "Do something" -> "Do more stuff" -> etc


The rationale here is that, since ListenSyslog is reasonably lightweight,
you may still be able to complement a sub-optimal load balancing strategy
(e.g. round robin) with Site-to-Site cluster-aware load balancing mechanisms.

Cheers


On Sat, Apr 22, 2017 at 12:50 AM, pradeepbill <[hidden email]>
wrote:

> thanks for the article Bryan,
>
> ours is a push mechanism, data gets sent to an IP:PORT of a node in  the
> cluster, and we use listensyslog processor to listen on that port, Yes ,we
> are trying to load balance data, and not to over work a particular node in
> the cluster.From the article it says we can create a load balancer before
> the cluster, where can I get info on that ?.
>
> Thanks again.
> Pradeep