Xomnia’s Machine Learning Engineers Michiel Bonnee, Martijn Beeks, and Ludger Visser have completed the AI for Wind Energy Challenge. Together, they formed a modeling team and worked alongside data engineering and feature engineering teams to develop two models that can support predictive maintenance of wind turbine blades.
One of the proposed models is based on XGBoost, and the other on an autoencoder. Both aim to identify failures in wind turbine blades right before or just after they happen, thereby minimizing turbine downtime and extending the life expectancy of the blades.
The models were trained using two datasets provided by Tarruca, an IoT startup focused on the wind energy and composite materials industries. The initiative was organized by FruitPunch, a community of data specialists who use AI to come up with solutions for challenges related to the Sustainable Development Goals. Xomnia is one of FruitPunch's partners, and we are proud to collaborate regularly on AI for Good challenges.
The challenge: Anomaly detection on wind turbine blades using photonic sensors
Most of the wind turbines in the Netherlands are located at sea. Damage to the blades of a turbine is detected through on-site inspection or by drones that scan the blades. Such monitoring approaches are expensive and time-consuming, and they currently make up a large part of the maintenance costs.
Tarruca has developed an innovative idea that uses photonic sensors to analyze vibration signals reverberating across a blade, thus detecting any damage or potential failures. This idea is currently unique in the market and provides a promising business case to reduce maintenance and inspection costs.
The challenge for the three participating teams was to develop a model that can accurately indicate whether a specific wind turbine blade is damaged and, if so, to what degree. To develop this solution, Tarruca provided the following datasets, which were collected from blades fitted with sensors and actuators:
- Dataset A contained vibration data collected in an experimental indoor environment from an undamaged and a damaged blade
- Dataset B was collected from an actual wind turbine blade in an undamaged and a damaged state
Tarruca also supplied metadata about the surrounding environment at the time both datasets were collected, such as weather conditions and temperature. In many academic approaches, such metadata is very valuable to models because operational factors can significantly influence vibration signals.
The predictive models
The modeling team, made up of Xomnia's machine learning engineers and supervised by Xomnia’s Machine Learning Engineer Vincent Roest, used dataset B to develop two models. They trained each model both with and without the metadata, which resulted in four model variants.
In other, more academic approaches, research institutions have used an actuator to apply a force to the blade and observe differences in its structural characteristics. In those setups, the actuator is attached on top of the wind turbine blade. The dataset provided for this project contained such signals, but only signals recorded without an actuator were used. This was done to make sure that the models could be productionized, as fitting an actuator to a real-world wind turbine is not feasible.
The team started with both a binary and a multiclass classification approach. The binary classifier outputs 0 (undamaged input) or 1 (damaged input). The multiclass model involved five classes: undamaged, slightly damaged, damaged, very damaged, and repaired.
Model 1: XGBoost
This is a traditional classification approach that involves engineering a number of features intended to separate the damaged from the undamaged cases. These features were engineered from the raw signal data. As the raw signal contains over 500k data points per 120-second experiment, the sequence was first transformed into the frequency domain using a Fast Fourier Transform (FFT).
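The transformation to the frequency domain can be sketched as follows. This is only an illustration on a synthetic signal, not the team's actual pipeline: the function name `amplitude_spectrum`, the 1 kHz sampling rate, and the two test tones are all assumptions made for the example.

```python
import numpy as np


def amplitude_spectrum(signal, sample_rate_hz):
    """Return (frequencies, amplitudes) of the one-sided spectrum of a real signal."""
    signal = np.asarray(signal, dtype=float)
    # Remove the mean so the 0 Hz bin does not dominate the spectrum.
    centered = signal - signal.mean()
    spectrum = np.fft.rfft(centered)
    freqs = np.fft.rfftfreq(centered.size, d=1.0 / sample_rate_hz)
    # Scale so that a pure sine of amplitude A shows up as a peak of height A
    # (the 0 Hz and Nyquist bins are not scaled exactly by this shortcut).
    amplitudes = np.abs(spectrum) * 2.0 / centered.size
    return freqs, amplitudes


# Synthetic stand-in for a vibration recording (hypothetical values):
# 2 seconds sampled at 1 kHz, with tones at 50 Hz and 120 Hz.
sample_rate_hz = 1000.0
t = np.arange(2000) / sample_rate_hz
signal = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 120 * t)

freqs, amplitudes = amplitude_spectrum(signal, sample_rate_hz)
dominant_hz = freqs[np.argmax(amplitudes)]
print(f"dominant frequency: {dominant_hz:.1f} Hz")
```

A long time series is thereby reduced to a spectrum in which each bin describes how strongly one frequency is present, which is a more compact starting point for feature engineering than the raw samples.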
Based on a preliminary analysis, the low-frequency part of the spectrum contained the most useful information for classification. This low-frequency part was used to compute the head and tail, and to compile a variety of features that describe the underlying pattern. Using these features, an XGBoost classifier was trained and evaluated on a holdout set. This approach gave excellent results on both the binary and the multiclass task, with the latter reaching an f1-score of 91%.
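To give an idea of what features over the low-frequency band could look like, here is a minimal sketch. The article does not list the features the team actually used, so the function `low_frequency_features`, the chosen features (band power, peak, spectral centroid), and the cutoff value are all hypothetical.

```python
import numpy as np


def low_frequency_features(freqs, amplitudes, cutoff_hz):
    """Summarize the part of an amplitude spectrum at or below cutoff_hz."""
    freqs = np.asarray(freqs, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=float)

    # Keep only the low-frequency band.
    mask = freqs <= cutoff_hz
    band_freqs, band_amps = freqs[mask], amplitudes[mask]

    power = band_amps ** 2
    total_power = power.sum()
    # Power-weighted mean frequency; defined as 0.0 for an all-zero band.
    centroid = (band_freqs * power).sum() / total_power if total_power > 0 else 0.0

    return {
        "band_power": float(total_power),
        "peak_hz": float(band_freqs[np.argmax(band_amps)]),
        "peak_amplitude": float(band_amps.max()),
        "centroid_hz": float(centroid),
    }


# Tiny hand-made spectrum (hypothetical values); the 40 Hz bin is cut off.
freqs = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
amplitudes = np.array([0.0, 1.0, 3.0, 1.0, 5.0])

features = low_frequency_features(freqs, amplitudes, cutoff_hz=30.0)
print(features)
```

One such feature dictionary per experiment would form one row of the table that a gradient-boosted classifier such as XGBoost is trained on.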
Model 2: Autoencoder
Performing anomaly detection by using an autoencoder can be done by training a neural network on signals from an undamaged wind turbine blade. The idea is that an autoencoder reconstructs the input signal and thereby captures details of an ‘undamaged’ signal. Thus, when a damaged signal is provided, the reconstruction loss should be significantly higher than when reconstructing a signal from an undamaged blade. These principles have been displayed in the figure below:
As the current state of the art in signal processing involves a lot of feature engineering based on domain knowledge, this approach tries to perform automatic feature engineering using large neural networks. By encoding the data and forcing it through the narrower layers of the network, the network learns which parts of the signal are worth capturing.
“We did this to extract the really valuable data first, then reconstruct the signal while only keeping this part of the underlying data structure,” comments Machine Learning Engineer Martijn Beeks.
The MLEs trained an autoencoder architecture on signals from an undamaged wind turbine blade and subsequently fed it signals from a damaged blade. Based on the resulting reconstruction losses, they set a threshold to classify whether a signal came from a damaged or an undamaged blade. This approach worked quite well, reaching an f1-score of 80% on the binary classification task.
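The thresholding idea can be illustrated with a deliberately simplified stand-in. Instead of a neural network, the sketch below uses a linear autoencoder fitted with an SVD (equivalent to PCA), and it runs on synthetic data. The 99th-percentile threshold, the data dimensions, and all function names are assumptions for the example, not the team's choices.

```python
import numpy as np

rng = np.random.default_rng(0)


def fit_linear_autoencoder(healthy, n_components):
    """Fit a linear encoder/decoder (PCA via SVD) on healthy signals only."""
    mean = healthy.mean(axis=0)
    _, _, vt = np.linalg.svd(healthy - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(x, mean, components):
    """Mean squared error between each signal and its reconstruction."""
    centered = x - mean
    codes = centered @ components.T       # encode into the bottleneck
    reconstructed = codes @ components    # decode back to signal space
    return np.mean((centered - reconstructed) ** 2, axis=1)


def f1_score_binary(y_true, y_pred):
    """F1 score for the positive class (1 = damaged)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


# Synthetic data: healthy signals lie close to a 2-D subspace of a 20-D space;
# "damaged" signals carry much stronger off-subspace disturbances.
basis = rng.normal(size=(2, 20))


def make_signals(n, noise_std):
    return rng.normal(size=(n, 2)) @ basis + rng.normal(scale=noise_std, size=(n, 20))


healthy_train = make_signals(500, noise_std=0.05)
healthy_test = make_signals(100, noise_std=0.05)
damaged_test = make_signals(100, noise_std=0.5)

mean, components = fit_linear_autoencoder(healthy_train, n_components=2)

# Assumed rule: flag anything above the 99th percentile of healthy training errors.
threshold = np.percentile(reconstruction_error(healthy_train, mean, components), 99)

x_test = np.vstack([healthy_test, damaged_test])
y_true = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
errors = reconstruction_error(x_test, mean, components)
y_pred = (errors > threshold).astype(int)

f1 = f1_score_binary(y_true, y_pred)
print(f"threshold={threshold:.4f}, f1={f1:.2f}")
```

Because the model only ever sees healthy signals during fitting, no labeled damage examples are needed for training; they are only needed to choose and evaluate the threshold. The clean separation in this toy data is far easier than what real blade signals would offer.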
Limitations to be addressed
The proposed models are still in the early stages of development. To become a minimum viable product, some challenges need to be addressed. The first challenge is making sure that the models classify reliably without relying on metadata, in order to prevent any inaccuracy resulting from data leakage. The second challenge is training the models on datasets that cover all possible damage scenarios, rather than only a few specific amounts of damage.
“We had datasets from blades that had 15 cm, 30 cm, and 45 cm long cracks, but nothing in between,” explained Michiel Bonnee, an MLE from the modeling team. “You don’t really know what happens between those damages, for instance, how a crack goes from 0 to 15 cm, which is essential data to conduct anomaly detection.”