NiFi kubectl for launching container jobs

Erik Anderson
I have heard about NiFi-Fn (B23 Kubernetes Operator for NiFi-Fn).

Has anyone built a NiFi kubectl processor, and possibly a nice NiFi "remote jobs" base Docker container, that can be used to control a remote NiFi processor/job that conforms to Apache NiFi input and output mechanisms (flow file format)?

I know we would need a way to marshal the NiFi flowfile format in and out of a container, but if we had one, we could launch remote Python processes that scale well using cloud-native mechanisms (DevOps).
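
To make the marshaling idea concrete, here is a minimal sketch in Python. It is not NiFi's own flowfile packaging format; the envelope layout (a 4-byte length, a JSON attribute header, then the raw content) and the function names `pack_flowfile` and `unpack_flowfile` are assumptions made up purely for illustration.

```python
import json
import struct


def pack_flowfile(attributes, content):
    """Serialize attributes (dict of str -> str) and content (bytes) into one blob.

    Hypothetical envelope, NOT the NiFi flowfile package format:
    4-byte big-endian header length, UTF-8 JSON attributes, raw content.
    """
    header = json.dumps(attributes, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header)) + header + content


def unpack_flowfile(blob):
    """Inverse of pack_flowfile: return (attributes, content)."""
    (header_len,) = struct.unpack(">I", blob[:4])
    header = blob[4:4 + header_len]
    content = blob[4 + header_len:]
    return json.loads(header.decode("utf-8")), content
```

A container entry point could read such a blob from stdin, transform the content, and write a new blob to stdout, so the attributes travel with the data in both directions.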

We built a native Python 2.7/3.7 NiFi processor that lets you quickly chain together Java and Python flows. This is powerful because most data infrastructure is in Python, not Java, especially for geospatial data. Of course, this won't scale, because only so many Python processors can run on a NiFi node, but it lets you get things working quickly. In 2 days you can do some amazing things.

If I can now offload that Python processing via Kubernetes (kubectl), we can use automated DevOps scaling for some really large jobs, possibly using a NiFi processor that wraps https://github.com/kubernetes-client/java
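
As a rough sketch of what the offload step might submit, the Python below builds a Kubernetes `batch/v1` Job manifest as a plain dictionary. The helper name `build_job_manifest`, the image name, the job name, and the container arguments are all hypothetical placeholders; only the manifest field names come from the Kubernetes Job API.

```python
import json


def build_job_manifest(job_name, image, args, parallelism=1):
    """Return a batch/v1 Job manifest (as a dict) for one containerized worker."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name},
        "spec": {
            "parallelism": parallelism,  # number of pods run at the same time
            "backoffLimit": 2,           # retries before the Job is marked failed
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "worker", "image": image, "args": list(args)}
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image and arguments, for illustration only.
    manifest = build_job_manifest(
        "geo-job-1", "example.org/nifi-remote-python:latest", ["--task", "reproject"]
    )
    print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON manifests, the printed output could be piped into `kubectl apply -f -`; a processor built on the Java client would construct the equivalent Job object instead.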

Why all this jazz?
Real use case: geospatial data (GeoJSON, ESRI Shapefile, etc.). Processing it requires standard Python "pip install blah-blah" packages.

Thoughts? Please throw tomatoes at the idea. I welcome constructive and destructive criticism because that means people care.

Erik Anderson
Bloomberg

Re: NiFi kubectl for launching container jobs

Joe Witt
Erik,

The pattern/concept described is definitely a thing and a powerful model.
The stateless-nifi construct is a key enabler of this, combined with
seamless integration of traditional NiFi with it, the registry, and a
powerful Kubernetes operator.

Thanks

On Fri, Jun 28, 2019 at 9:10 AM Erik Anderson <[hidden email]> wrote:

> [...]