Deploying Node-RED on Kubernetes with Persistent Storage and Data Synchronization
Deploying applications on Kubernetes offers scalability, flexibility, and efficiency for managing containerized workloads. However, ensuring persistent storage and consistent data across cluster nodes can be challenging. This post details how to deploy Node-RED in a Kubernetes cluster, ensure persistent storage with Kubernetes Persistent Volumes (PV) and Persistent Volume Claims (PVC), and synchronize data across nodes to maintain consistency.
Setting Up Node-RED with Persistent Storage on Kubernetes
Node-RED is a popular programming tool for wiring together hardware devices, APIs, and online services. Here's how to get it up and running on Kubernetes, with the data preserved across pod restarts and migrations.
1. Create a Node-RED Deployment
First, define a Kubernetes deployment for Node-RED. This deployment specifies the container image and configures it to use a persistent volume for data storage.
Create nodered-deployment.yaml with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodered
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodered
  template:
    metadata:
      labels:
        app: nodered
    spec:
      containers:
        - name: nodered
          image: nodered/node-red:latest
          ports:
            - containerPort: 1880
          volumeMounts:
            - name: nodered-storage
              mountPath: /data
      volumes:
        - name: nodered-storage
          persistentVolumeClaim:
            claimName: nodered-pvc
This configuration mounts the persistent volume at /data, Node-RED's default directory for user data.
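Once everything from the next step has been applied and the pod is running, a quick way to confirm the mount is to list /data from inside the container. kubectl exec accepts a deployment name as the target and picks one of its pods (shown here with plain kubectl; prefix with microk8s if you use MicroK8s):

kubectl exec deploy/nodered -- ls -la /data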
2. Define the Persistent Storage
Next, ensure data persists across pod restarts by defining a PV and a PVC.
PersistentVolume (nodered-pv.yaml):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nodered-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /home/pi/kubeconfigs/data/node-red
PersistentVolumeClaim (nodered-pvc.yaml):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nodered-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: manual
Apply these configurations with kubectl apply, ensuring Node-RED has the necessary persistent storage setup.
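Concretely, assuming the three manifests sit in your current directory, the sequence below creates the volume and claim first so the claim can bind, then the deployment, and finally checks that everything is in place (with MicroK8s, substitute microk8s kubectl for kubectl):

kubectl apply -f nodered-pv.yaml
kubectl apply -f nodered-pvc.yaml
kubectl apply -f nodered-deployment.yaml

# The claim should show STATUS Bound and the pod should reach Running
kubectl get pv,pvc
kubectl get pods -l app=nodered -o wide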
Synchronizing Data Across Nodes
To maintain data consistency across all nodes, we employ a synchronization mechanism using rsync and cron. This approach periodically syncs the Node-RED data directory (/home/pi/kubeconfigs/data/node-red) across all nodes.
Setting Up Passwordless SSH
To enable seamless data synchronization across your Kubernetes cluster nodes using rsync, it's crucial to establish passwordless SSH access among the nodes. This setup allows scripts to remotely execute commands without manual password entry, facilitating automated synchronization tasks. Here’s how you can set up passwordless SSH:
1. Generate an SSH Key Pair
On the node from which you'll initiate the synchronization (your primary node), generate a new SSH key pair if you don't already have one:
ssh-keygen -t rsa -b 2048
When prompted to "Enter a file in which to save the key," press Enter to accept the default location (~/.ssh/id_rsa). When asked for a passphrase, press Enter to leave it empty, ensuring truly passwordless SSH access.
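If you would rather skip the prompts altogether, for example when scripting the setup, the key can also be generated non-interactively with standard OpenSSH flags (-N "" sets an empty passphrase, -f picks the default path):

ssh-keygen -t rsa -b 2048 -N "" -f ~/.ssh/id_rsa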
2. Copy the SSH Public Key to Other Nodes
Next, use ssh-copy-id to copy the public SSH key to each of the other nodes in your cluster. This command appends the public key to the ~/.ssh/authorized_keys file on the target node, granting the keyholder access.
ssh-copy-id pi@node2
Replace pi@node2 with the username and hostname or IP address of each target node. You will be prompted to enter the user's password on the target node this one time.
Repeat this step for each node in your cluster that you wish to synchronize data with.
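With more than a couple of nodes, a small loop saves some typing; kube2 and kube3 below are simply the other two nodes in this example cluster, so substitute your own hostnames:

for NODE in kube2 kube3; do
    ssh-copy-id pi@"$NODE"
done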
3. Verify Passwordless SSH Access
To ensure that passwordless SSH has been correctly set up, attempt to SSH into each of the other nodes from your primary node:
ssh pi@node2
If the setup was successful, you should gain SSH access to node2 without being prompted for a password. Repeat this test for each node you configured.
Creating the Sync Script
Create a script, sync_data.sh, that uses rsync for synchronization. The script runs the sync only from the node that currently hosts the Node-RED pod, and rsync itself transfers only files that have changed.
#!/bin/bash

# Sync the Node-RED data directory to the other nodes, but only from the
# node that is currently running the Node-RED pod.
NODES=("kube1" "kube2" "kube3") # Use your actual node hostnames
SOURCE_DIR="/home/pi/kubeconfigs/data/node-red/"
LOG_FILE="/home/pi/kubeconfigs/data/node-red/sync_data.log"
CURRENT_NODE=$(hostname)

# Adjust the grep pattern to match your Node-RED pods' naming convention
if /snap/bin/microk8s.kubectl get pods -o wide | grep nodered | grep -q "$CURRENT_NODE"; then
    echo "(1.0) Node-RED is running on the current node ($CURRENT_NODE). Proceeding with sync..." >> "$LOG_FILE"
    for NODE in "${NODES[@]}"; do
        if [ "$CURRENT_NODE" != "$NODE" ]; then
            #echo "Syncing to $NODE" >> "$LOG_FILE"
            rsync -avz --delete --exclude 'sync_data.log' "$SOURCE_DIR" pi@"$NODE":"$SOURCE_DIR" >> "$LOG_FILE" 2>&1
        fi
    done
else
    echo "(1.0) Node-RED is not running on the current node ($CURRENT_NODE). Skipping sync..." # >> "$LOG_FILE"
fi
Make the script executable with chmod u+x sync_data.sh.
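Before handing the script over to cron, it's worth running it once by hand on the node that currently hosts the Node-RED pod and checking the log it writes (the path below matches the LOG_FILE variable in the script):

./sync_data.sh
tail -n 20 /home/pi/kubeconfigs/data/node-red/sync_data.log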
Automating Synchronization with Cron
Set up a cron job to run the script every 10 minutes:
*/10 * * * * /home/pi/sync_data.sh >> /home/pi/sync_data.log 2>&1
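To install the entry, run crontab -e as the pi user and paste in the line above, or append it non-interactively as sketched below. Either way, repeat this on every node, since the script itself decides at runtime whether the current node should perform the sync:

(crontab -l 2>/dev/null; echo '*/10 * * * * /home/pi/sync_data.sh >> /home/pi/sync_data.log 2>&1') | crontab -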
This setup ensures that Node-RED's data directory is kept consistent across all nodes, enhancing the resilience and availability of your Node-RED deployment in Kubernetes.
Ensuring Proper User Permissions for MicroK8s
When managing a Kubernetes cluster with MicroK8s, especially in a multi-node setup like ours with kube1, kube2, and kube3, it's crucial that users executing MicroK8s commands have the appropriate permissions. This is essential not just for interactive command-line operations but also for scripts running automation tasks, such as our data synchronization script.
By default, MicroK8s requires sudo privileges for its commands. This can be inconvenient and potentially problematic for automated scripts. To streamline operations and avoid running automation with full root privileges, we grant specific users permission to execute MicroK8s commands without sudo.
Granting Permission to the pi User
On each of our nodes, we have a user named pi that we use for administration and script execution. To allow pi to run MicroK8s commands without sudo, we add pi to the microk8s group. Here's the command to do this:
sudo usermod -a -G microk8s pi
This command modifies the user pi, appending it to the microk8s group (-a -G). Here’s a breakdown:
sudo: Executes the command with superuser privileges.
usermod: A utility for modifying a user's system account.
-a -G microk8s pi: Appends (-a) the user pi to the group microk8s (-G).
After running this command, pi will need to log out and log back in for the change to take effect. Alternatively, the user can execute newgrp microk8s to start a new shell session with the updated group membership.
Applying the Change Across Nodes
To maintain consistency and ensure our scripts run smoothly across all nodes (kube1, kube2, and kube3), we replicate this permission setup on each node. This harmonizes our environment, ensuring that wherever our Node-RED synchronization script runs, it does so with the necessary permissions to interact with MicroK8s effectively.
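Since passwordless SSH is already configured between the nodes, one way to roll the change out (a sketch, assuming pi has sudo rights on every node) is to apply it locally on the primary node and push it to the others over SSH:

# On kube1, apply the change locally
sudo usermod -a -G microk8s pi

# Push the same change to the other nodes; -t allocates a terminal in
# case sudo asks for a password there
for NODE in kube2 kube3; do
    ssh -t pi@"$NODE" 'sudo usermod -a -G microk8s pi'
done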
Verifying the Setup
After adding pi to the microk8s group, we verify that the user can execute MicroK8s commands without sudo. This test ensures our setup is correct and our scripts can run as intended:
microk8s kubectl get nodes
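If that command still complains about permissions, checking group membership directly (plain Linux tooling, nothing MicroK8s-specific) usually shows what is going on: id pi reports the groups recorded for the account, while groups with no argument reports what the current shell session actually has. If microk8s appears in the first but not the second, a fresh login or newgrp microk8s is still needed.

id pi      # groups recorded for the user account
groups     # groups active in the current shell session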