Predictive maintenance is not new. Factories have been using vibration analysis and oil sampling to predict equipment failures for decades. What has changed is where the intelligence runs. Instead of sending all your sensor data to the cloud and waiting for a prediction, you can now run machine learning models directly on a small device sitting next to the pump.
This matters because pumps do not wait for your cloud server to respond. When a bearing starts to degrade, the vibration pattern changes in milliseconds. When a seal starts leaking, the pressure drop is immediate. An on-device model catches these changes in real time, with no network dependency and no cloud processing delay.
This guide walks through how to implement on-device ML for pump predictive maintenance, from sensor selection to model deployment.
Why pumps are the perfect starting point
If you are going to invest in on-device ML for your factory, pumps are the best place to start. Here is why.
Pumps are everywhere. A typical mid-sized manufacturing plant has 20 to 100 pumps. Water treatment plants have even more. The sheer count means even a small improvement per pump multiplies across the fleet.
Pump failures are expensive. A single unplanned pump failure in a process plant costs anywhere from 1 lakh to 10 lakh rupees in downtime, emergency repairs, and lost production. Centrifugal pump repairs alone account for nearly 30 percent of maintenance budgets in many plants.
Pump failures are predictable. Unlike random electrical faults, mechanical pump failures follow degradation curves. Bearings wear gradually. Impellers erode over weeks. Seal leaks grow progressively. These are exactly the kind of slow-developing faults that ML models excel at detecting.
And most importantly, the sensor data from pumps is well understood. Vibration, temperature, pressure, and flow rate are the four measurements that capture 90 percent of pump health. You do not need exotic sensors or complex data fusion.
The sensor setup
For on-device predictive maintenance on a centrifugal pump, you need four sensors.
A triaxial accelerometer mounted on the bearing housing. This is your primary sensor. It captures vibration in three axes at sampling rates of 1 to 10 kHz. MEMS accelerometers like the ADXL355 or the IIS3DWB are well suited for industrial vibration monitoring. Mount it as close to the bearing as possible, on a flat machined surface, using a stud mount or industrial adhesive.
A temperature sensor on the bearing housing and one on the motor body. PT100 RTDs or industrial-grade thermocouples give you accurate readings. Temperature rise in the bearing housing is one of the earliest indicators of lubrication problems or misalignment.
A pressure sensor on the pump discharge. This tells you about impeller health, cavitation, and flow blockages. A 4-20 mA pressure transmitter with the right range for your system is standard.
Optionally, a flow sensor on the discharge line. Flow rate combined with pressure gives you pump efficiency, which is a leading indicator of impeller wear and internal recirculation. If you already have flow meters installed for process control, you can tap into those signals.
Choosing the edge compute hardware
The ML model runs on an edge device connected to these sensors. The choice of hardware depends on model complexity.
For simple models like anomaly detection using statistical methods or small neural networks with fewer than 50,000 parameters, a microcontroller is enough. An ESP32-S3 or an STM32H7 with a few hundred KB of RAM can run TinyML inference comfortably. Power consumption is low, cost is under 1,000 rupees, and the device fits inside the same enclosure as your sensor node.
For more complex models that include FFT processing, spectral analysis, and multi-sensor fusion, a Linux-capable single board computer is better. A Raspberry Pi 4 or a BeagleBone with 1-2 GB of RAM gives you room for Python-based inference, proper signal processing libraries like NumPy and SciPy, and enough storage for model versioning. Cost is 3,000 to 5,000 rupees.
For fleet-wide inference where one device handles models for 10 to 20 pumps simultaneously, an industrial edge gateway or NVIDIA Jetson Nano provides the necessary compute density. These cost 10,000 to 25,000 rupees but serve multiple assets.
The Akran IQ platform supports all three tiers and handles OTA model updates so you can retrain and deploy improved models without physical access to the device.
Building the ML model
The predictive maintenance model for pumps has three stages, and each one builds on the previous.
Stage 1: Baseline learning
Before your model can detect problems, it needs to understand what normal looks like for each specific pump. Pumps are not identical even when they are the same model. Installation differences, piping layout, fluid properties, and load patterns make each pump unique.
Run your sensors for 2 to 4 weeks while the pump is in normal healthy operation. Collect vibration, temperature, pressure, and flow data at your target sampling rate. This becomes your training dataset for the normal operating envelope.
Feature extraction during this phase is critical. From the raw vibration signal, compute: RMS amplitude, peak-to-peak value, crest factor, kurtosis, and frequency domain features via FFT. Track the energy in specific frequency bands: 1x running speed (imbalance), 2x running speed (misalignment), bearing defect frequencies (BPFO, BPFI, BSF, FTF), and vane pass frequency for the impeller.
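To make the feature extraction concrete, here is a minimal NumPy sketch. The test signal and band edges are illustrative; real band centers come from your pump's running speed and bearing geometry.

```python
import numpy as np

def time_domain_features(signal):
    """RMS, peak-to-peak, crest factor, and kurtosis of one vibration frame."""
    rms = np.sqrt(np.mean(signal ** 2))
    peak_to_peak = signal.max() - signal.min()
    crest_factor = np.max(np.abs(signal)) / rms
    centered = signal - signal.mean()
    kurtosis = np.mean(centered ** 4) / np.mean(centered ** 2) ** 2
    return rms, peak_to_peak, crest_factor, kurtosis

def band_energy(signal, fs, f_lo, f_hi):
    """Vibration energy in the band [f_lo, f_hi] Hz via FFT."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return spectrum[(freqs >= f_lo) & (freqs <= f_hi)].sum()

# Illustrative signal: a pure 50 Hz '1x running speed' tone at 10 kHz sampling
fs = 10_000
t = np.arange(fs) / fs
vib = np.sin(2 * np.pi * 50 * t)
rms, pk, cf, kurt = time_domain_features(vib)
```

On a Linux-class device this runs as-is; on a microcontroller the same math ports to C with a fixed-point FFT library.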
Stage 2: Anomaly detection
The simplest and most robust approach for on-device deployment is an autoencoder trained on normal data. The autoencoder learns to reconstruct normal vibration signatures. When the actual signature starts deviating from normal, the reconstruction error increases, and you have your anomaly score.
A small autoencoder with 3 dense layers (input 64, hidden 16, output 64) trained on frequency domain features runs comfortably on an ESP32-S3. It processes one inference per second using less than 100 KB of memory.
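The core idea fits in a short NumPy sketch: a linear autoencoder with a 16-unit bottleneck is mathematically equivalent to PCA, so we can "train" it in closed form and score frames by reconstruction error. The synthetic data below stands in for your baseline features; a real deployment would train a small nonlinear autoencoder in Keras and export it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 'normal' baseline: 500 frames of 64 band-energy features that
# vary along 16 latent directions, standing in for weeks of healthy data
latent = rng.normal(size=(500, 16))
mixing = rng.normal(size=(16, 64))
X_train = latent @ mixing + 0.01 * rng.normal(size=(500, 64))

# Closed-form 'training': keep the top 16 principal directions of the baseline
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
W = Vt[:16]                              # 64 -> 16 encoder; W.T decodes

def anomaly_score(frame):
    """Reconstruction error of one 64-feature frame: higher = less normal."""
    recon = mean + (frame - mean) @ W.T @ W
    return float(np.mean((frame - recon) ** 2))

normal_frame = X_train[0]
faulty_frame = normal_frame + rng.normal(scale=2.0, size=64)  # drifted signature
```

The score-by-reconstruction-error logic stays the same once you swap in the quantized nonlinear model on the microcontroller.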
For the microcontroller deployment, convert your trained model using TensorFlow Lite for Microcontrollers, or use Edge Impulse, which handles the conversion and optimization automatically. The key constraint is model size: keep it under 200 KB for microcontrollers.
Stage 3: Failure classification and remaining life estimation
Once you detect an anomaly, the next questions are which failure mode is developing and how much time you have. This requires a classification model trained on labeled failure data.
If you have historical maintenance records with failure modes (bearing inner race, outer race, misalignment, cavitation, seal leak), you can train a classifier on the vibration signatures associated with each failure type. A random forest or a small 1D convolutional neural network works well for this.
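A scikit-learn sketch shows the shape of such a classifier. The labels, band indices, and data below are synthetic and purely illustrative, since real training data has to come from your own maintenance records.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic labeled data: 64 band-energy features per frame, with each
# failure mode concentrating extra energy in a different (illustrative) band
X_parts, y = [], []
for label, band in [("healthy", None), ("imbalance", 5),
                    ("misalignment", 10), ("bearing", 40)]:
    frames = 0.1 * rng.normal(size=(100, 64))
    if band is not None:
        frames[:, band] += 1.0          # fault energy in one band
    X_parts.append(frames)
    y += [label] * 100

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(np.vstack(X_parts), y)

# A new frame with elevated energy in the 'imbalance' band
test_frame = 0.1 * rng.normal(size=(1, 64))
test_frame[0, 5] += 1.0
```

A random forest this small exports cleanly to C arrays for microcontroller inference if you need it on the smallest tier.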
Remaining useful life (RUL) estimation uses the anomaly trend. Plot the anomaly score over time. Fit a degradation model (linear, exponential, or a learned curve) and extrapolate to the failure threshold. This gives your maintenance team a window: "Schedule replacement in the next 15 to 20 days."
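The extrapolation itself is only a few lines. The scores, slope, and failure threshold below are synthetic, but the fitting step is the kind of calculation that runs on the trend log.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic anomaly-score trend: one score per day, drifting upward
days = np.arange(30)
scores = 0.1 + 0.02 * days + rng.normal(scale=0.01, size=30)

FAILURE_THRESHOLD = 1.0   # illustrative score at which failure is expected

# Fit a linear degradation model and extrapolate to the threshold
slope, intercept = np.polyfit(days, scores, 1)
crossing_day = (FAILURE_THRESHOLD - intercept) / slope
rul_days = crossing_day - days[-1]
```

Late in the degradation curve an exponential fit is often more realistic; fitting a line to the log of the scores gives you that with the same few lines.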
This is the same principle behind cognitive sensor layers that transform raw measurements into actionable maintenance intelligence.
Deploying the model on-device
Once your model is trained, deploying it to the edge device follows a clear process.
Export the trained model in the right format. For microcontrollers, use TensorFlow Lite (.tflite) quantized to INT8. For Linux devices, use ONNX Runtime or TensorFlow Lite for ARM. INT8 quantization shrinks a float32 model roughly 4x with minimal accuracy loss.
Write the inference loop. On the edge device, the loop reads sensor data at the configured interval, extracts features (RMS, FFT, statistical features), runs the model, and takes action based on the output. Actions include: logging the prediction locally, sending an alert via MQTT if the anomaly score exceeds a threshold, and updating the trend data for RUL estimation.
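A stripped-down sketch of such a loop, with placeholder sensor and model functions standing in for the real TFLite call and MQTT client:

```python
import json
import time

ANOMALY_THRESHOLD = 0.5     # illustrative; tuned during the baseline phase
trend = []                  # rolling (timestamp, score) history for RUL

def read_features():
    """Placeholder for sensor reads + feature extraction (RMS, FFT, stats)."""
    return [0.1] * 64

def run_model(features):
    """Placeholder for TFLite/ONNX inference returning an anomaly score."""
    return 0.7

def publish_alert(payload, publish=print):
    """Hand an alert to the transport, e.g. an MQTT client's publish method."""
    publish(json.dumps(payload))

def inference_step(now=None):
    now = time.time() if now is None else now
    score = run_model(read_features())
    trend.append((now, score))            # log locally, update the RUL trend
    if score > ANOMALY_THRESHOLD:         # alert only above the threshold
        publish_alert({"ts": now, "anomaly_score": score})
    return score
```

In production the loop sleeps to the configured interval and wraps each step in the sensor-failure guards described next.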
Handle edge cases. What happens when the pump is off? When it is running at unusual speed? When a sensor fails? Your inference loop needs guards for these scenarios. A simple state machine (pump off, starting, running, shutting down) prevents false alerts during transient conditions.
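The state machine can be as small as one transition function keyed on measured shaft speed. The RPM thresholds here are illustrative, not universal.

```python
OFF, STARTING, RUNNING, STOPPING = "off", "starting", "running", "stopping"

def next_state(state, rpm, nominal_rpm=2900):
    """Classify the pump's operating state from measured shaft speed."""
    if rpm < 0.05 * nominal_rpm:
        return OFF
    if rpm < 0.95 * nominal_rpm:
        # Below steady-state speed: ramping up if we were off or starting,
        # otherwise coasting down
        return STARTING if state in (OFF, STARTING) else STOPPING
    return RUNNING

def alerts_enabled(state):
    """Score anomalies only in steady-state running; mute transients."""
    return state == RUNNING
```

Gating the anomaly score on `alerts_enabled` is what keeps start-up and coast-down vibration from looking like bearing faults.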
Test with known faults. If you can safely simulate faults in a test environment (imbalance by adding weight, misalignment by shimming the motor, bearing damage using a test rig), validate that your model detects them. If you cannot simulate, validate against historical data where faults were later confirmed by maintenance records.
Connecting to the cloud for fleet intelligence
On-device ML handles individual pump predictions. But the real power comes from connecting all your pump models to a cloud platform for fleet-wide intelligence.
The edge device sends only predictions, anomaly scores, and summary statistics to the cloud. Not raw sensor data. This reduces bandwidth by 80 to 90 percent compared to cloud-only architectures.
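For a sense of scale, a per-interval uplink might look like this. The field names and values are hypothetical, not a fixed Akran IQ schema.

```python
import json

# Compact edge-to-cloud summary: predictions and statistics, no raw waveforms
payload = {
    "pump_id": "P-101",               # hypothetical asset tag
    "anomaly_score": 0.42,
    "predicted_mode": "bearing_wear",
    "rul_days": 18,
    "stats": {"vib_rms_g": 0.23, "bearing_temp_c": 61.5},
}
msg = json.dumps(payload)             # a few hundred bytes per interval
```

A few hundred bytes per interval, versus continuously streaming kilohertz-rate triaxial waveforms, is where the bandwidth savings come from.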
The cloud layer aggregates predictions across all pumps and identifies patterns. If three pumps on the same cooling loop all show temperature anomalies, it is probably a cooling system problem, not three individual pump issues. This kind of cross-asset reasoning is what makes cognitive IoT different from simple edge AI.
The Akran IQ platform handles this aggregation. It collects edge predictions from your entire pump fleet, runs fleet-level analytics, retrains models based on confirmed outcomes, and pushes improved models back to the edge devices. Your models get smarter over time without manual intervention.
Real numbers from the field
On-device ML for pump predictive maintenance is not theoretical. Here are typical results from deployments across industrial facilities.
Detection lead time: 2 to 6 weeks before failure. This is enough to order parts and schedule maintenance during planned shutdowns instead of emergency stops.
False positive rate: under 5 percent after the first 3 months of operation. The model gets more accurate as it accumulates data specific to your pumps.
Cost savings: 40 to 60 percent reduction in pump maintenance costs. The savings come from fewer emergency repairs, no overtime labor, no expedited parts shipping, and no secondary damage from running a failing pump to breakdown.
Downtime reduction: 25 to 35 percent for pump-related outages. This number is consistent across manufacturing, water treatment, and process industries.
Getting started
Start with one critical pump. Install the sensors. Deploy the edge device. Run baseline learning for 2 to 4 weeks. Then let the model work. You will see results within the first quarter.
Contact our team and tell us about your pump fleet. We will recommend the right sensor and edge hardware configuration and get your first predictive maintenance deployment running within 3 to 4 weeks.
