TL;DR: How to set up Apache NiFi on Kubernetes in 1 minute
Apache NiFi is one of the most widely used data processing tools. With its great web-based GUI, you can design and deploy complex data flows easily.
The core package includes a lot of operators (connectors). For example, you can get tweets for a specific hashtag, load a file from S3, call an HTTP REST API, or send an email.
It’s a Java application with a web UI, and it’s very simple to launch and to start building and deploying pipelines.
One of its main advantages is the queues between processing nodes. If a processing node is stopped or busy with earlier data, incoming data waits in the queue until the node can handle it.
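To make the queueing idea concrete, here is a minimal Python sketch. It is not NiFi code: the `Connection` and `Step` classes are hypothetical names invented for this illustration, and the sketch only shows the general behaviour described above, where data piles up in front of a stopped step and is processed in order once the step starts.

```python
from collections import deque


class Connection:
    """A FIFO queue that sits between two processing steps (hypothetical name)."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def get(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


class Step:
    """A processing step that reads from one connection and writes to another."""

    def __init__(self, func, inbox, outbox):
        self.func = func
        self.inbox = inbox
        self.outbox = outbox
        self.running = False

    def run(self):
        # A stopped step does nothing, so its input keeps accumulating.
        if not self.running:
            return 0
        processed = 0
        while len(self.inbox) > 0:
            self.outbox.put(self.func(self.inbox.get()))
            processed += 1
        return processed


def demo():
    incoming, outgoing = Connection(), Connection()
    upper = Step(str.upper, incoming, outgoing)

    # Upstream keeps producing while the step is stopped.
    for line in ["a", "b", "c"]:
        incoming.put(line)
    upper.run()
    queued_while_stopped = len(incoming)

    # Once the step starts, it drains the backlog in arrival order.
    upper.running = True
    upper.run()
    results = [outgoing.get() for _ in range(len(outgoing))]
    return queued_while_stopped, results


if __name__ == "__main__":
    print(demo())
```

Nothing is lost while the step is stopped: the three items stay in the incoming connection and come out in the same order once the step runs.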
On the other hand, its scalability is very limited.
Basically, you can’t run individual processing units on different machines.
The new kid on the block, Apache Airflow, also has a lot of operators and can scale out easily. It’s powerful and is gaining ground on Apache NiFi. We’ll explore it some other time.
What we need
We need a Kubernetes cluster with an ingress controller. You can follow these posts to create one.
How to create a Kubernetes cluster with Rancher on Hetzner
TL;DR: In 15 minutes you can have a lab cluster ready to test or to deploy your projects cheaply and easily.
Free SSL certificate for your Kubernetes cluster with Rancher
Thanks to Let’s Encrypt we can create SSL certificates for our HTTPS or TLS services for... FREE!
Setting up NiFi is very simple. The only slightly tricky part is the ingress setup.
The following gist creates the NiFi deployment, service and ingress.
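If you prefer to generate the manifests from a script, the three objects can be sketched in Python as plain dictionaries and printed as JSON, which `kubectl` accepts as well as YAML. This is only an illustration under several assumptions, not a copy of the gist: the `apache/nifi` image without a pinned tag, the `NIFI_WEB_HTTP_PORT` variable to serve plain HTTP on port 8080 (behaviour may differ between image versions), the placeholder host `nifi.example.com`, and no TLS section on the ingress.

```python
import json

# All of these values are assumptions for illustration; adjust them to your cluster.
APP = "nifi"
IMAGE = "apache/nifi"          # assumed image, no tag pinned
HTTP_PORT = 8080               # assumed plain-HTTP port for the NiFi UI
HOST = "nifi.example.com"      # placeholder hostname

LABELS = {"app": APP}


def deployment():
    """A single-replica Deployment running the NiFi container."""
    container = {
        "name": APP,
        "image": IMAGE,
        "ports": [{"containerPort": HTTP_PORT}],
        # Assumption: the image reads this variable to enable HTTP on this port.
        "env": [{"name": "NIFI_WEB_HTTP_PORT", "value": str(HTTP_PORT)}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": APP, "labels": LABELS},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": LABELS},
            "template": {
                "metadata": {"labels": LABELS},
                "spec": {"containers": [container]},
            },
        },
    }


def service():
    """A Service that exposes the NiFi pod inside the cluster."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": APP},
        "spec": {
            "selector": LABELS,
            "ports": [{"port": HTTP_PORT, "targetPort": HTTP_PORT}],
        },
    }


def ingress():
    """An Ingress that routes the placeholder host to the Service."""
    backend = {"service": {"name": APP, "port": {"number": HTTP_PORT}}}
    path = {"path": "/", "pathType": "Prefix", "backend": backend}
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": APP},
        "spec": {"rules": [{"host": HOST, "http": {"paths": [path]}}]},
    }


def manifest_list():
    """Bundle the three objects into one List so they can be applied together."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [deployment(), service(), ingress()],
    }


if __name__ == "__main__":
    print(json.dumps(manifest_list(), indent=2))
```

Saved as, say, `nifi_manifests.py` (a hypothetical file name), the output could be applied with `python nifi_manifests.py | kubectl apply -f -`. The part to double-check is the ingress: the host must resolve to your ingress controller, and the backend port must match the service port.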