 Original research
 Open Access
A dynamic-model-based fault diagnosis method for a wind turbine planetary gearbox using a deep learning network
Protection and Control of Modern Power Systems volume 7, Article number: 22 (2022)
Abstract
The planetary gearbox is a critical part of wind turbines, and has great significance for their safety and reliability. Intelligent fault diagnosis methods for these gearboxes have made some achievements based on the availability of large quantities of labeled data. However, the data collected from the diagnosed devices are always unlabeled, and the acquisition of fault data from real gearboxes is time-consuming and laborious. As some gearbox faults can be conveniently simulated by a relatively precise dynamic model, the data from dynamic simulation contain features that are related to those from the actual machines. As a potential tool, transfer learning adapts a network trained in a source domain to its application in a target domain. Therefore, a novel fault diagnosis method combining transfer learning with a dynamic model is proposed to identify the health conditions of planetary gearboxes. In the method, a modified lumped-parameter dynamic model of a planetary gear train is established to simulate the resultant vibration signal, while an optimized deep transfer learning network based on a one-dimensional convolutional neural network is built to extract domain-invariant features from different domains to achieve fault classification. Various groups of transfer diagnosis experiments of planetary gearboxes are carried out, and the experimental results demonstrate the effectiveness and the reliability of both the dynamic model and the proposed method.
1 Introduction
Wind energy has become one of the vital energy sources in the world, while wind power generation systems have been widely studied and applied [1]. The planetary gearbox is one of the critical components in the transmission system of wind turbines (WTs) because of its advantages of compact structure, high power density and desirable transmission efficiency [2]. However, in operation, planetary gearboxes are prone to failure and have high maintenance costs under dynamic load and frequently changing operating conditions [3]. Therefore, accurate gearbox fault diagnosis is of great significance to improve the safety, reliability and economy of WTs [4].
In recent years, many intelligent methods have been investigated for gearbox fault diagnosis [5,6,7,8,9], while the proposed methods have two assumptions: (1) the training and testing data are derived from the same probability distribution; (2) enough labeled history data with fault information can be obtained [10]. However, in industrial applications, it is impractical to satisfy those two assumptions because of operating condition change, equipment wear degradation, and environmental noise interference, leading to differences of data in the probability distribution [11] and unlabeled data collected from the diagnosed devices [12].
Therefore, to overcome the above two limitations, some studies have introduced transfer learning into fault diagnosis of mechanical equipment [13,14,15,16]. The transfer learning tasks consist of two datasets, one from the source domain and the other from the target domain. The data in the target domain are distributed differently from the data in the source domain but contain relevant knowledge. Thus, the goal of transfer learning is to improve the performance of the predictive model for the target domain by using the common knowledge of the source and target domains. With the theoretical research of deep learning, deep hierarchical models are applied to learn transferable features from the cross-domain data automatically [17,18,19].
Hence, transfer learning can use the learned common knowledge from the source domain to solve a related task in the target domain [20,21,22]. Accordingly, some transfer-learning-based methods that mainly concentrate on the transfer tasks between different operation conditions are applied in [11, 23, 24]. Also, the transfer fault diagnoses among different devices have been studied. Reference [12] proposes a transfer learning method for bearing fault diagnosis, and its effectiveness is verified by the datasets acquired from three different machines. In [25], a transfer method is presented and the health conditions of bearings used in actual devices are classified with the help of the diagnosis knowledge from those used in the laboratory. Based on such methods, the fault diagnosis model trained with labeled data obtained from one machine can be generalized to the unlabeled data obtained from other similar machines. However, in the fault diagnosis of WT gearboxes, the above methods will encounter the following two problems:

(1)
Labeled fault data from similar machines are hard to obtain. The planetary gearboxes in WTs will not be allowed to run to failure since such a fault could lead to the breakdown of a WT or even serious accidents. In addition, gearboxes often undergo a long degradation process from normal to failure. Therefore, the acquisition of fault data is time-consuming and laborious.

(2)
Experimental data acquisition of WT gearboxes is costly. WTs are usually large in size, so it is expensive to build experimental platforms similar to the actual ones, while in the laboratory, when the type and extent of the faults are changed, new components are required and this is costly.
Such problems lead to insufficient samples in the actual fault diagnosis task. As a result, the performance of the deep transfer learning models will deteriorate and even fail to complete the diagnostic task. To solve the problems, an easier method is needed to get signals containing actual fault features. To gain an insight into the signal characteristics of the gearboxes, various dynamic models have been presented, including multibody models and lumped parameter models [26,27,28]. Dynamic simulation of the planetary gearbox with different faults has been realized, and it proves that the dynamic model can show many features of the actual signals [29,30,31,32,33]. The above studies have a common approach of introducing the influences of gear faults into the dynamic models by changing the mesh stiffness function of the mesh pair. In order to get a more accurate vibration response, reference [32] constructs a vibration signal model that can express the effect of transmission path by using a modified Hamming function.
In this paper, a dynamic-model-based method for WT planetary gearbox fault diagnosis using a deep transfer learning network (DTLN) is proposed. A modified lumped-parameter dynamic model is established to simulate the vibration signals of a planetary gear train, and the resultant vibration response is analysed by considering the transmission path of the signals. Then, an optimized DTLN based on a one-dimensional deep convolutional neural network (1D CNN) is built. The DTLN comprises three modules: health condition recognition module, domain classifier and distribution discrepancy metrics. With the proposed three modules, the DTLN can extract domain-invariant features from the simulation data and the actual data, and the fault classification of actual datasets is realized. The introduction of simulation datasets makes up for the possible influence of insufficient samples in fault diagnosis models. Finally, multiple transfer diagnosis experiments are performed to verify the feasibility of the proposed method.
The main insights and contributions of this paper are summarized as follows.

(1)
A novel fault diagnosis method combining transfer learning with the dynamic model is proposed. This aims to remove the difficulty in obtaining enough labeled fault samples in applications. The cross-domain-invariant features of the simulation signal and the actual signal are learned by a deep transfer learning network, so as to realize the fault diagnosis of the actual signals.

(2)
The optimized DTLN comprises three parts: health condition recognition module, domain classifier and distribution discrepancy metrics. The health condition recognition module is based on a 1D CNN built to learn the deep features of the input data, while the domain classifier and distribution discrepancy metrics are applied to help the network learn more domain-invariant features.

(3)
The proposed diagnosis method is based on unsupervised transfer learning theory. Labeled samples are not needed in the target domain. In practical applications of fault diagnosis, the data obtained from the devices to be diagnosed are always unlabeled. Therefore, the method is appropriate for actual real-time diagnostic scenarios.

(4)
A combined model-driven and data-driven approach is proposed. The application of traditional artificial intelligence methods relies on a large number of labeled samples of devices to be diagnosed. Compared with traditional artificial intelligence methods, which are purely data-driven, the proposed method requires only unlabeled samples to be diagnosed, and the required number of samples in the target domain is greatly reduced. Therefore, the requirement on the dataset is reduced and the value of the method in practical application is increased.
The rest of this paper is organized as follows. The dynamic model is presented in Sect. 2, and the proposed fault diagnosis framework is described in Sect. 3. In Sect. 4, the proposed dynamic model is validated, and its feasibility is validated on various experimental scenarios. Section 5 draws the conclusions.
2 Dynamic model of planetary gearbox
In this section, a modified lumped-parameter dynamic model is established. The model has the following characteristics: (1) the horizontal and vertical displacements of ring, planet and carrier that have limited influence on the resultant vibration response are ignored, (2) the lumped virtual spring-damping units are adopted in the model, (3) the effects of planet gear faults are introduced into the model by modifying the mesh stiffness function of the mesh pairs.
2.1 Lumped-parameter model for a single stage planetary gear train
The lumped-parameter model is shown in Fig. 1. The system consists of one ring gear ‘r’, one sun gear ‘s’, one carrier ‘c’ and N equally spaced planet gears ‘p_{n}’. Herein, Oxy is the coordinate system rotating at the speed of ω_{c} with the x axis going through the center of p_{1}. The sun gear has three degrees of freedom, i.e., two lateral motions (x, y) and one torsional motion (u). The other components have only torsional motion (u). According to Newton’s second law, the motion of a planetary gear train can be written as several second-order differential equations:
with:
The second-order nonlinear differential equations of motion can be solved by a fourth-order variable-step Runge–Kutta method after non-dimensionalization.
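As a minimal sketch of the numerical idea only, the following code integrates a single-degree-of-freedom spring-damper equation with the classical fourth-order Runge–Kutta scheme. It uses a fixed step rather than the variable step mentioned above, and the oscillator, the function names and all parameter values are illustrative assumptions, not the equations of motion of the gear train.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y); y is a list."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def oscillator(m, c, k):
    """First-order form of m*u'' + c*u' + k*u = 0 with state [u, v].

    Hypothetical stand-in for one torsional degree of freedom.
    """
    def f(t, y):
        u, v = y
        return [v, -(c * v + k * u) / m]
    return f

def integrate(f, y0, t_end, n_steps):
    """Advance the state from t = 0 to t_end with n_steps fixed RK4 steps."""
    h = t_end / n_steps
    t, y = 0.0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# Undamped unit oscillator: the exact solution is u(t) = cos(t).
state = integrate(oscillator(1.0, 0.0, 1.0), [1.0, 0.0], 1.0, 100)
```

A second-order equation is rewritten as two first-order equations before stepping, which is the same reduction a variable-step solver would need.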
2.2 Time-varying mesh stiffness
Time-varying mesh stiffness is one of the main sources of vibration response in a dynamic system. When the gearbox is free of any defects, the meshing stiffness of the gear is a function of its angular displacement and can be approximated by a square waveform. If a tooth is defective, partial contact loss will occur when the faulty tooth engages, leading to a local reduction of the mesh stiffness function [27]. Four planetary gear conditions are considered, including normal condition (NC), chipped tooth fault (CTF), surface wear fault (SWF) and missing tooth fault (MTF). The mesh stiffness losses, denoted as ΔK, are different under diverse faults. As the meshing stiffness is periodic, it can be written as a Fourier series defined by (6) and (7), and the effects of the faulty gears can then be introduced into the system.
Because of the partial reduction of meshing stiffness, the amplitude and phase modulation effects appear in the vibration response spectrum in the form of sidebands, whose frequency locations depend on the fault location and fault type. These sidebands are also reflected in the vibration signals of actual planetary gearboxes. This is discussed in detail in Sect. 4.
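The square-wave approximation and the local stiffness loss can be illustrated with a short sketch. It evaluates the stiffness directly in the angle domain rather than through the Fourier series of (6) and (7), and the duty ratio, the stiffness levels, the tooth indexing and the way the loss is subtracted are all assumptions made for illustration.

```python
import math

def mesh_stiffness(theta, n_teeth, k_min, k_max, duty=0.5,
                   faulty_tooth=None, delta_k=0.0):
    """Square-wave mesh stiffness at gear angle theta (rad).

    All arguments are hypothetical: k_max holds for the first `duty` fraction
    of each mesh cycle and k_min for the rest; delta_k is subtracted while the
    tooth with index `faulty_tooth` is engaged.
    """
    pitch = 2 * math.pi / n_teeth          # angle spanned by one mesh cycle
    pos = theta % (2 * math.pi)
    tooth = int(pos // pitch)              # index of the tooth in mesh
    phase = (pos % pitch) / pitch          # position inside the cycle, 0..1
    k = k_max if phase < duty else k_min
    if faulty_tooth is not None and tooth == faulty_tooth:
        k -= delta_k                       # local reduction caused by the fault
    return k

pitch = 2 * math.pi / 20
healthy = mesh_stiffness(3.25 * pitch, 20, 1.0, 2.0)
faulty = mesh_stiffness(3.25 * pitch, 20, 1.0, 2.0, faulty_tooth=3, delta_k=0.5)
```

A local fault such as CTF or MTF would affect one tooth index only, whereas a distributed fault would need a loss applied over many teeth; the sketch covers the local case.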
2.3 Resultant signal model
In a planetary gearbox, the mesh vibrations along the torsional motion action lines are the main vibration sources. Consequently, the mesh vibration acceleration signals of sun-planet and ring-planet mesh pairs are chosen to establish the resultant vibration signal model. The transmission path of a vibration signal is composed of two parts [29]: the first part is from the meshing vibration sources to the case, while the second part is from the case to the transducer location. The influence of the first part on the vibration signals can be modeled by an attenuation coefficient, while the second part can be modeled by a modified Hamming function. Therefore, the resultant vibration signals of a planetary gearbox at the sensor location can be described as:
where the Hamming function is W_{n} = 0.54 − 0.46cos(ω_{c}t + ψ_{n}), a_{spn} and a_{rpn} are the acceleration signals of sun-planet and ring-planet mesh pairs, respectively, and S_{spn} and S_{rpn} are the attenuation coefficients from the mesh pairs to the case. ξ is used to control the bandwidth of the Hamming function.
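The Hamming weighting can be evaluated directly from the expression for W_{n}. The sketch below covers only that window; it does not reproduce the full resultant signal equation or the role of ξ. The phase rule ψ_{n} = 2π(n − 1)/N is an assumption based on the planets being equally spaced, and the function names are hypothetical.

```python
import math

def hamming_weight(t, omega_c, psi_n):
    """W_n = 0.54 - 0.46*cos(omega_c*t + psi_n) for one planet."""
    return 0.54 - 0.46 * math.cos(omega_c * t + psi_n)

def planet_phases(n_planets):
    """Assumed phases for equally spaced planets: psi_n = 2*pi*(n-1)/N."""
    return [2 * math.pi * n / n_planets for n in range(n_planets)]

# Weights of three planets at t = 0 for a carrier speed of 1 rad/s (illustrative).
weights = [hamming_weight(0.0, 1.0, psi) for psi in planet_phases(3)]
```

The weight stays between 0.08 and 1.0, so each planet's contribution is attenuated once per carrier revolution but never vanishes completely.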
3 Proposed fault diagnosis framework
In this section, the proposed fault diagnosis framework based on the dynamic model and the DTLN is introduced in detail.
3.1 Transfer learning problem definition
In order to clearly describe the problem, some concepts are introduced as follows. We take the source domain as D_{S} = {(x_{i}^{S}, y_{i}^{S})}, where x_{i}^{S} ∈ χ_{S} is a data sample and y_{i}^{S} ∈ Y_{S} is its corresponding label, and the target domain as D_{T} = {(x_{i}^{T})}, where x_{i}^{T} ∈ χ_{T} is a data sample. D_{S} and D_{T} are drawn from distributions P_{S}(X) and P_{T}(X), and P_{S}(X) ≠ P_{T}(X) because of the domain bias. The same label space is used in different domains, i.e., Y_{T} = Y_{S}. In fault diagnosis, the goal of transfer learning is to improve the probabilistic prediction function of the domain D_{T} using the knowledge that can be learned in the domain D_{S}.
In this fault diagnosis, the target domain samples are the data obtained from the equipment to be diagnosed. In practical application, these data are unlabeled. The source domain samples are the available failure experimental data of similar equipment or the simulation data from the simulation model of the equipment to be diagnosed. These are labeled. The target task can be described as realizing the condition recognition of the target domain samples, that is, adding condition labels to the samples to be diagnosed.
3.2 Structure and training process of the DTLN
As shown in Fig. 2, the optimized DTLN consists of three parts: health condition recognition module, domain classifier and distribution discrepancy metrics. These are briefly described below.
3.2.1 Health condition recognition module
The health condition recognition module is based on a 1D CNN, which performs feature extraction and condition classification. In the 14-layer 1D CNN, the first 13 layers are used for feature extraction and collation, and the last layer can be regarded as the condition classifier. In the convolutional layer, feature extraction is carried out, where the rectified linear unit is used as an activation function. Then a maximum pooling operation is introduced to reduce the feature dimension and enhance the robustness of the features. The fully connected layer and softmax regression are used at the end of the network to perform classification tasks. In summary, the output of the health condition recognition module can be defined as the output probability of the softmax function:
where f_{2} is the output of the fully connected layer FC2, w_{i} denotes the weight matrix that connects to the i^{th} output neuron, b is the bias vector, and K is the number of health condition categories of the dataset.
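For illustration, the softmax output can be sketched as follows. The input `logits` stands for the K pre-activation values formed from f_{2}, the weights and the bias, and is a hypothetical name; subtracting the maximum is a common numerical safeguard and is an addition of this sketch.

```python
import math

def softmax(logits):
    """Return class probabilities for a list of K logits."""
    m = max(logits)                       # shift to avoid overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Four logits, one per health condition (illustrative values only).
probs = softmax([1.0, 2.0, 3.0, 4.0])
predicted = probs.index(max(probs))
```

The predicted health condition is the index of the largest probability, and the probabilities sum to one by construction.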
3.2.2 Domain classifier
The domain classifier is a binary classifier that distinguishes the source domain from the target domain. As shown in Fig. 2, the domain classifier consists of a fully connected layer and a binary output layer. A binary classifier with logistic regression is employed to distinguish between the source domain and target domain. The logistic regression is calculated as:
where w_{d} is the weight matrix of the classifier, b_{d} is the corresponding bias vector and f_{3} is the output of the layer FC3.
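A minimal sketch of such a logistic output, assuming a single output neuron so that w_{d} reduces to a weight vector and b_{d} to a scalar, could look like this. The function name and the example values are hypothetical.

```python
import math

def domain_probability(f3, w_d, b_d):
    """Logistic regression on the FC3 features: sigmoid(w_d . f3 + b_d)."""
    z = sum(w * x for w, x in zip(w_d, f3)) + b_d
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)                       # stable branch for negative z
    return e / (1.0 + e)

# Probability that a feature vector belongs to one of the two domains.
p_domain = domain_probability([0.2, -0.1], [1.0, 1.0], 0.0)
```

Which domain the value 1 denotes is a labeling convention that has to match the domain labels used in the loss.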
3.2.3 Distribution discrepancy metrics
In order to realize the extraction of domain-invariant features, a metric is required to represent the distribution difference between the features extracted from the source domain and those from the target domain. Here we use the Wasserstein distance to measure the distribution discrepancy between the two datasets. Let P(f_{2}^{(S)}) and Q(f_{2}^{(T)}) be the probability distributions, where f_{2}^{(S)} and f_{2}^{(T)} are the features learned by the 1D CNN from the source domain and the target domain, respectively. According to the Kantorovich–Rubinstein dual theorem, the Wasserstein distance between the two distributions is computed as:
where G_{L} is the 1-Lipschitz function.
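To build intuition for the metric, the sketch below computes the Wasserstein-1 distance between two one-dimensional samples of equal size, where it reduces to the mean absolute difference of the sorted values. This closed form is a swapped-in simplification: the network described here works on high-dimensional features and estimates the distance through the dual form with a learned 1-Lipschitz function, which this code does not implement.

```python
def wasserstein_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1D samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    # Optimal 1D transport pairs the i-th smallest values of both samples.
    return sum(abs(a - b) for a, b in zip(xs_sorted, ys_sorted)) / len(xs)

# Two feature samples shifted by 3 units (illustrative values).
distance = wasserstein_1d([0.0, 1.0], [4.0, 3.0])
```

A pure shift between the two samples yields a distance equal to the shift, which is why the metric gives a useful gradient even when the distributions do not overlap.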
For the three components of the DTLN introduced above, each corresponds to an optimization object.
3.2.4 Object 1
Minimize the health condition classification error of the softmax classifier on source data. The objective function can be defined as the regression loss of a standard softmax classifier, as:
where m is the batch size of the data samples, k is the number of health condition categories, w_{i} denotes the weight matrix that connects to the i^{th} output neuron, and I[·] is an indicator function.
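As a sketch under the assumption that the loss is the standard softmax cross-entropy averaged over the batch, the indicator function simply selects the predicted probability of the true class for each sample. The function name and the example values are hypothetical.

```python
import math

def softmax_cross_entropy(probs, labels):
    """Mean negative log-probability of the true class over a batch.

    probs:  list of per-sample probability lists (softmax outputs)
    labels: list of integer class indices, one per sample
    """
    m = len(labels)
    # The indicator I[y == k] keeps only the term of the true class k = y.
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / m

loss = softmax_cross_entropy([[0.5, 0.5], [1.0, 0.0]], [0, 0])
```

Only source-domain samples contribute to this term, since the target-domain samples carry no labels.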
3.2.5 Object 2
Maximize the domain classification error on the source and target domain datasets. The loss function of the binary classifier can be represented as:
where l_{i} denotes the real domain label, and d(x_{i}) is a function that represents whether x_{i} comes from the source domain or the target domain. The objective function can be written as:
where f_{2,i}^{(S)} and f_{2,j}^{(T)} are the features learned from the source domain and the target domain, respectively.
3.2.6 Object 3
Minimize the Wasserstein distance between features extracted from the source and target domain datasets. Considering the gradient penalty term, the calculation formula is given as:
where γ is the trade-off parameter, n_{s} and n_{t} are the respective numbers of training samples from the source domain and target domain, and \(\hat{H}\) is a uniform sampling from the feature representations.
In conclusion, in order to extract as many cross-domain-invariant features as possible, the final optimization object can be combined as:
where λ and μ are the hyperparameters, and θ_{f}, θ_{c}, and θ_{d} are the parameters of the feature extractor, health condition classifier, and domain classifier, respectively.
Based on (16), in the backpropagation process, the parameters θ_{f}, θ_{c}, and θ_{d} are updated as:
where ε denotes the learning rate.
After training, the classifier can recognize the unlabeled samples from the target domain even if the learned domain-invariant features have equivocal domain categories and domain discrepancy. As shown in Fig. 3, the DTLN uses labeled samples from the source domain and unlabeled samples from the target domain for training. The invariant features of the domain are learned first, and then the classifier determines the category based on the learned features. After the training, the trained network will be tested by the sample set from the target domain.
3.3 Proposed fault diagnosis framework
The framework of the proposed method is illustrated in Fig. 4. As shown, the method includes three parts, as introduced below.
3.3.1 Part 1: data acquisition and preprocessing
In this part, the source domain and target domain are constructed, where the target domain data samples are obtained from the gearbox to be diagnosed, and the source domain data samples are obtained by analyzing the dynamic model. It is worth noting that the relevant parameters of the dynamic model are taken from the device to be diagnosed. After acquiring the vibration signal, the samples are processed and the frequency-domain samples are used as the input of the DTLN. This is because frequency-domain samples are more robust to noise than time-domain samples and contain more domain-invariant features. This will be demonstrated in detail in Sect. 4.
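The conversion of a time-domain sample into a frequency-domain sample can be sketched with a plain discrete Fourier transform. The text does not specify the exact preprocessing, so the one-sided amplitude spectrum, the scaling and the function name are assumptions; a fast Fourier transform would be used in practice for samples of 2000 points.

```python
import cmath
import math

def amplitude_spectrum(signal):
    """One-sided amplitude spectrum of a real signal via a direct DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        scale = 1.0 / n if k == 0 else 2.0 / n   # DC bin is not doubled
        spectrum.append(abs(acc) * scale)
    return spectrum

# A unit-amplitude sine with exactly 5 cycles in 64 points peaks at bin 5.
sample = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
spectrum = amplitude_spectrum(sample)
```

Because the spectrum discards phase, two signals with the same sideband structure but different starting angles map to similar inputs, which is one plausible reason frequency-domain samples transfer better.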
3.3.2 Part 2: network training and fault classification
In order to extract more domain-invariant features of the source domain and target domain, frequency-domain samples are used to train the DTLN, and the trained network can be obtained. The training process is based on (17–19). The trained network is tested by unlabeled testing samples from the target domain and outputs classified results.
3.3.3 Part 3: output of the diagnostic results
The trained diagnostic model is applied to the fault diagnosis of experimental equipment to output the diagnosis results. In order to show the feasibility of the proposed method, the above classification results are analyzed visually.
According to the above fault diagnosis framework, the dynamic model is used to construct the source domain in the transfer learning method. This is helpful for fault diagnosis of true devices. In the following, the rationality and advantages of the proposed method are verified.
4 Experimental results and comparisons
In this section, similarities between the simulation signal and the actual signal are analyzed. Multiple experiments are performed to validate the network and fault diagnosis framework.
4.1 Validation and analysis of the simulation model
4.1.1 Planetary gearbox fault experiment
The gearbox dataset is collected from the drivetrain dynamic simulator (DDS) shown in Fig. 5a. The planetary gearbox has two stages, and the faults of planet gears in the first stage are studied. For vibration signal acquisition, an acceleration transducer is mounted. Experiments are carried out on planet gears in four health conditions, shown from left to right in Fig. 5b as NC, CTF, SWF and MTF, respectively. The sampling frequency of the transducer is set at 12 kHz.
4.1.2 Simulation parameters
The basic design parameters are listed in Table 1. The planetary gearbox has three planet gears (N = 3) with a fixed ring gear. The mesh damping of the sun-planet and ring-planet pairs is set as 242.6 N·s/m and 410.3 N·s/m, respectively. The bearing stiffness and damping of the sun gear are assumed to be 15 N·mm^{−1} and 9.2 N·s·m^{−1}, respectively. The constant torque acting on the carrier is 1.26 Nm, and the sampling frequency of the simulation signal is 12 kHz. In the resultant signal model, according to the structure of the DDS, S_{spn} and S_{rpn} are set as 0.4 and 0.9, while ξ is −1.
To simplify the analysis, the order spectra are represented by normalizing with the rotational frequency of the carrier. The mesh order H_{m} = Z_{r} denotes the mesh frequency f_{m} (H_{m} = f_{m}/f_{c} = Z_{r}f_{c}/f_{c} = Z_{r}, where Z_{r} is the number of teeth of the ring gear). The rotation period of the carrier T_{c} is equal to Z_{r}T_{m}, where T_{m} is the mesh period. For the planetary gear train, Z_{r} = 100, f_{m} = 100f_{c}, T_{c} = 100T_{m}. With the above settings, the vibration response can be determined.
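The order relations above can be checked with a few lines of arithmetic. The carrier frequency of 2 Hz used in the example is a hypothetical value chosen only to make the numbers concrete, and the function name is likewise illustrative.

```python
def mesh_quantities(z_ring, f_carrier):
    """Mesh frequency, mesh order and periods for a fixed ring gear."""
    f_mesh = z_ring * f_carrier           # f_m = Z_r * f_c
    return {
        "f_mesh": f_mesh,
        "mesh_order": f_mesh / f_carrier,  # H_m = f_m / f_c = Z_r
        "t_carrier": 1.0 / f_carrier,      # T_c
        "t_mesh": 1.0 / f_mesh,            # T_m, so that T_c = Z_r * T_m
    }

# Z_r = 100 as in the text; f_c = 2 Hz is an assumed example value.
q = mesh_quantities(100, 2.0)
```

With these inputs the mesh frequency is 200 Hz and one carrier revolution spans 100 mesh periods, independent of the chosen carrier speed.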
4.1.3 Spectrum analysis of simulation signal
As described in Sect. 2, modulation effects appear in vibration response because of the partial reduction of meshing stiffness, in the form of sidebands in the vibration spectrum. Therefore, the frequency spectra are analyzed.
When planet gear faults occur, fault features will appear near the meshing frequency, while the frequency locations depend on the fault location and type. As shown in Fig. 6a, when working under NC, the sidebands are located at f_{m} ± nf_{c} (n is an integer) because of the modulation of the transmission path. After introducing failures of the planet, some impulsive signals appear. As a result, the spectrum contains some additional frequency components. As shown in Fig. 7c and d, when the planet gear has local faults such as CTF and MTF, these sidebands are at the locations of f_{m} ± mf_{p} ± nf_{c} (m is an integer), which also exist in the actual signal. This signifies that the signals are modulated by the fault of the planet gear and the transmission path. It is worth noting that MTF causes more characteristic frequencies than CTF in both simulation and experiment. As shown in Fig. 6b, the global fault causes the characteristic frequency f_{s} of the sun gear. The amplitudes of the order spectrum are located at f_{m} ± kf_{s} ± nf_{c} (k is an integer). As seen, the simulation and experimental results are consistent. From the above analyses, it is verified that the simulation results of the dynamic model contain some features of the actual signals. This is the basis for the applicability of transfer learning theory, i.e., the source domain and target domain contain common diagnostic knowledge. In addition, fault features can be detected from the order spectrum. Therefore, the fault identification of frequency-domain signals can help the overall diagnosis decision.
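The sideband locations for a local planet fault can be enumerated in the order domain, where every frequency is divided by f_{c}. In the sketch below, `planet_order` stands for f_{p}/f_{c}; its value in the example is a hypothetical placeholder, since the planet fault frequency is not given numerically here.

```python
def sideband_orders(z_ring, planet_order, m_max, n_max):
    """Orders of f_m + m*f_p + n*f_c for |m| <= m_max and |n| <= n_max.

    In units of the carrier frequency: f_m -> z_ring, f_p -> planet_order,
    f_c -> 1. Negative m and n cover the lower sidebands.
    """
    orders = set()
    for m in range(-m_max, m_max + 1):
        for n in range(-n_max, n_max + 1):
            orders.add(z_ring + m * planet_order + n)
    return sorted(orders)

# Healthy case (m_max = 0): only the carrier sidebands around order 100.
healthy = sideband_orders(100, 3.0, 0, 1)
# Local planet fault with an assumed planet order of 3.
faulty = sideband_orders(100, 3.0, 1, 1)
```

Setting `m_max` to zero recovers the f_{m} ± nf_{c} pattern of the normal condition, so the additional orders in the second list are the ones attributable to the fault.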
4.2 Transfer fault diagnosis experiments
The three datasets required for validation and 16 diagnosis experiments are described in this section.

(1)
A: Experimental Planetary Gearbox Dataset. The dataset is collected from the DDS, which contains four working conditions with various motor speeds and a certain load: 1200 r/min (A_{1}), 1800 r/min (A_{2}). Each health condition, i.e., NC, CTF, MTF and SWF, has 800 samples. Thus, this dataset has a total of 3200 samples, each of which has 2000 data points.

(2)
B: Planetary Gearbox Dataset Used by Another Group. The dataset is collected from a similar planetary gearbox under different working conditions. This was provided by Yan’s group [34]. Four health conditions in the dataset are selected, and the working conditions are investigated with the rotating speed and system load set at 20 Hz–0 V (B_{1}) and 30 Hz–2 V (B_{2}). Similarly, 800 samples are extracted for each condition, and each sample contains 2000 points.

(3)
C: Dataset Acquired by Simulation. The dataset is acquired by dynamic simulation, and rotation frequencies of the sun gear are set at 20 Hz (C_{1}) and 30 Hz (C_{2}), each of which also has four health conditions. The conditions and properties of the samples are the same as those of datasets A and B.
Sixteen transfer fault diagnosis experiments are shown in Table 2. Taking the task A_{1} → B_{1} for example, A_{1} is the source domain, and B_{1} is the target domain. The standard assessment protocol for unsupervised transfer learning tasks is adopted. In each transfer task, the training dataset consists of all labeled data samples from the source domain and half of the unlabeled data samples from the target domain, while the testing dataset is composed of the other half of the samples from the target domain. Among the 16 experiments, the groups of Class 1 are the transfer diagnosis from one machine to another under similar working conditions, while those in Class 2 are under different working conditions. Class 3 is the transfer fault diagnosis between the dynamic model and actual machines at the same speed, while Class 4 is the diagnostic experiments of the proposed method at different rotational speeds.
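The assessment protocol can be sketched as a small data-splitting helper. How the two halves of the target domain are chosen is not stated, so the seeded random shuffle is an assumption, as are the function and key names.

```python
import random

def build_transfer_split(source_samples, source_labels, target_samples, seed=0):
    """Split data for one unsupervised transfer task.

    Training uses all labeled source samples plus half of the target samples
    without labels; testing uses the remaining half of the target samples.
    """
    rng = random.Random(seed)
    idx = list(range(len(target_samples)))
    rng.shuffle(idx)                      # assumed: halves are drawn at random
    half = len(idx) // 2
    return {
        "train_source": list(zip(source_samples, source_labels)),
        "train_target_unlabeled": [target_samples[i] for i in idx[:half]],
        "test_target": [target_samples[i] for i in idx[half:]],
    }

# Toy example: 2 labeled source samples and 10 target samples.
split = build_transfer_split(["s0", "s1"], [0, 1], list(range(10)))
```

Target labels never enter the training split; they are needed only afterwards to score the predictions on the test half.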
The detailed parameters of the DTLN can be found in Table 3, in which 64 × 1 conv denotes the size of the convolutional kernel, 2 × 1 maxpool stands for the size of the max-pooling operation, and 16[2000 × 1] represents 16 feature maps of size 2000 × 1. In order to restrain noise and extract useful knowledge, a wide kernel is used in C1. As shown in Fig. 7a, the hyperparameters λ and μ in (16) are set to gradually increase from 0 to 1, and the calculation formula is 2/(1 + exp(−10 × p)) − 1, where p denotes the training progress. In order to minimize the loss function, Adam is used as the optimization algorithm and the learning rate is set as 0.001. The batch size is set as 512, and the number of training epochs is 3000. Taking the experiment C_{1} → A_{1} for example, the loss function during the training process is drawn in Fig. 7b. It is clear that the loss function converges after about 1500 steps.
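A schedule that rises from 0 towards 1 with training progress can be sketched as below. The code assumes the commonly used form with a negative exponent, 2/(1 + exp(−10p)) − 1, because that is the sign for which the value increases from 0 to 1; the function name is hypothetical.

```python
import math

def tradeoff_schedule(p):
    """Weight for the adaptation terms at training progress p in [0, 1]."""
    return 2.0 / (1.0 + math.exp(-10.0 * p)) - 1.0

# Weights at the start, middle and end of training.
weights = [tradeoff_schedule(p) for p in (0.0, 0.5, 1.0)]
```

The weight starts at zero, so early training is driven by the classification loss alone, and the adaptation terms are phased in once the features have become informative.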
To reduce the influence of chance on the results, each transfer fault diagnosis experiment is carried out 10 times, and the results are shown in Table 4. It is worth noting that DTLN-T indicates that the input of the network is time-domain samples, while DTLN-F indicates that the input is frequency-domain samples. The figures in Table 4 represent the average accuracy rate and standard deviation of 10 repeated classification experiments. The accuracy rate reflects the reliability of the method, and the standard deviation reflects the stability of the method. In the transfer experiments between different devices, i.e., A → B, B → A, the diagnostic accuracies of the proposed method are over 91%. In addition, in the transfer experiments from the dynamic model to actual devices, i.e., C → A, C → B, the average diagnostic accuracy of Class 3 is 90.9%, which indicates that the proposed method is feasible.
4.3 Results comparison and visual analysis
In order to demonstrate the effectiveness and the feasibility of the proposed method, two other networks are chosen for comparison. Among them, the basic convolutional network has the same structure and parameters as the 1D CNN introduced above, and uses source data for training and then tries to classify target data. In addition, a domain adversarial neural network (DANN), which is a commonly used transfer learning method, is also tested, and its parameter setting refers to [35]. The learning rate of the CNN is 0.01, and is 2 × 10^{−4} for the DANN. The Adam algorithm is used as the optimization algorithm in both methods. It is worth noting that the inputs of the CNN, DANN and DTLN-F are all frequency-domain samples. The accuracies and standard deviations on the 16 transfer fault diagnosis experiments are shown in Table 4. The average classification accuracies of tenfold cross validation for each class in Table 2 are shown in Fig. 8. Taking the task C_{1} → A_{1} for example, the classification accuracy and standard deviation of various methods are compared in Fig. 9. By comparing the results, three observations can be made:

(1)
For the transfer fault diagnosis missions where unlabeled data are retrievable in the target domain, the networks based on transfer learning are superior. It suggests that transfer learning can be an effective instrument to facilitate the practical application of intelligent diagnostics. Additionally, compared with DANN, DTLN obtains higher classification accuracies and lower standard deviations, as shown in Fig. 10. It indicates that DTLN reduces the distribution discrepancy between different domains more effectively and is relatively stable.

(2)
Compared with experiments where the networks are trained by data from machines, the accuracies of experiments that replace the actual labeled data with the simulation data are reduced. This indicates that the fault signals of similar equipment contain more domain-invariant features, but the simulated signals contain fewer. However, compared with the time-domain samples, the classification accuracy obtained when the frequency-domain samples are used as the input is higher, as shown in Fig. 8. Therefore, it can be inferred that when the samples are in the frequency domain, it is more favorable for the DTLN to extract the domain-invariant features. Therefore, the method proposed in this paper adopts the frequency-domain samples as the input. As a result, the transfer experiments from the simulation model to actual devices can realize relatively high accuracy. When the set speed of the dynamic model is the same as the actual speed, the average classification accuracy is 90.9%. Under different speed settings, the average classification accuracy is 88.9%. This indicates that the diagnosis method combining the dynamic model with a deep transfer learning network has practical value. In similar cross-domain transfer diagnosis experiments, an 84.32% recognition rate for bearing faults in mechanical equipment is achieved in [25], while reference [12] achieves an 86.3% diagnosis accuracy. Therefore, compared with the existing research results, the proposed method has certain advantages.

(3)
The accuracies obtained from transfer learning between datasets under similar working conditions are higher. This is exemplified by the fact that the accuracy of Class 2 is lower than that of Class 1, and that of Class 4 is lower than that of Class 3. The working condition is therefore one of the important factors influencing transfer learning in practical applications. To improve diagnostic accuracy, the rotational speed of the dynamic model can be set to match that of the actual equipment to be diagnosed, so as to obtain more accurate fault discrimination.
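The frequency-domain input discussed in item (2) can be illustrated with a small sketch. The following is not the preprocessing used in the paper, whose details are not given in this section; it only shows, under stated assumptions, how a time-domain vibration sample might be turned into a single-sided amplitude spectrum. The function name `amplitude_spectrum`, the test tone, the sampling rate and the sample length are all hypothetical, and a direct discrete Fourier transform is used only so that the example needs nothing beyond the standard library.

```python
import cmath
import math


def amplitude_spectrum(signal):
    """Return the single-sided amplitude spectrum of a real-valued signal.

    A direct O(N^2) discrete Fourier transform is used to keep the sketch
    dependency-free; a practical pipeline would use an FFT routine instead.
    """
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        # Scale so that a unit-amplitude sinusoid yields a peak close to 1.0.
        scale = 1.0 / n if k == 0 else 2.0 / n
        spectrum.append(abs(acc) * scale)
    return spectrum


# Hypothetical test signal: a 50 Hz tone sampled at 1 kHz (200 points).
fs = 1000
n_points = 200
tone = [math.sin(2 * math.pi * 50 * i / fs) for i in range(n_points)]

spec = amplitude_spectrum(tone)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
peak_hz = peak_bin * fs / n_points  # frequency resolution is fs / n_points
```

In this toy case the spectrum concentrates the energy of the tone in one bin, which hints at why spectral samples may expose shared fault signatures more directly than raw waveforms.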
To show the classification effect intuitively, the t-distributed stochastic neighbor embedding (t-SNE) algorithm is introduced. It maps the high-dimensional features into 2D space so that the distribution of features can be plotted directly. Taking the task C_{1} → A_{1} as an example, the transferable features learned by CNN, DANN, DTLN-T and DTLN-F are shown in Fig. 10 via t-SNE. In addition, the confusion matrices for the transfer results on dataset A_{1} are shown in Fig. 11.
In Fig. 10a, the features learned by CNN show a clear distribution discrepancy. As a result, when CNN is trained with C_{1}, its recognition of A_{1} is close to random guessing. For DANN, the cross-domain distribution discrepancy is reduced to a certain extent, as shown in Fig. 10b, so the accuracy of DANN for A_{1} is much higher than that of CNN. Fig. 10c and d show that the proposed DTLN is able to reduce the distribution discrepancy between the learned features of different datasets. However, because the transferability differs between the subclass samples of the source domain and those of the target domain, the distribution discrepancy of the subclass samples is corrected unevenly. For example, in Fig. 10c, the distribution discrepancy of the cross-domain samples with CTF is still severe after correction by DTLN-T. The confusion matrices in Fig. 11c and d show that neither DTLN-T nor DTLN-F classifies perfectly, but DTLN-F performs better.
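For readers unfamiliar with confusion matrices such as those in Fig. 11, the following minimal sketch shows how one can be built for the four health conditions (NC, CTF, SWF, MTF) and how per-class recall is read from it. The labels below are made up purely for illustration and are not results from the paper; the helper names `confusion_matrix` and `per_class_recall`, and the convention that rows are true classes and columns are predicted classes, are assumptions of this sketch.

```python
CLASSES = ["NC", "CTF", "SWF", "MTF"]


def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """Count samples per (true class, predicted class) pair.

    Rows index the true class and columns the predicted class.
    """
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Made-up labels for illustration only; not data from the experiments.
y_true = ["NC", "NC", "CTF", "CTF", "SWF", "SWF", "MTF", "MTF"]
y_pred = ["NC", "NC", "SWF", "CTF", "SWF", "SWF", "MTF", "CTF"]

cm = confusion_matrix(y_true, y_pred)
recall = per_class_recall(cm)
accuracy = sum(cm[i][i] for i in range(len(CLASSES))) / len(y_true)
```

A class whose feature distributions remain misaligned across domains would appear here as a row with large off-diagonal counts and a low recall.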
5 Conclusion
In this paper, a dynamic-model-based transfer learning fault diagnosis method for WT planetary gearboxes is proposed. This method introduces a dynamic simulation dataset into the application of transfer learning and produces a diagnosis of unlabeled fault data obtained from actual machines. To verify the feasibility of the proposed method, spectrum analysis of the simulated and experimental signals is carried out, and 16 groups of transfer fault diagnosis experiments are completed. From the results, the following conclusions can be drawn.

(1)
The spectrum analysis shows that the vibration response solved by the dynamic model contains some features of the actual fault vibration signal, i.e., the domain-invariant features required by transfer learning.

(2)
The proposed DTLN can effectively realize the recognition of unlabeled fault data from the target domain. In the application of transfer fault diagnosis, the classification accuracy and stability of the DTLN are better than those of the DANN.

(3)
The proposed method combining a dynamic model with the deep transfer learning network can identify four kinds of faults of the planetary gearbox. When the set speed of the dynamic model is the same as the actual speed, the average classification accuracy is 90.9%.
The results indicate that the proposed method, which combines transfer learning theory with a dynamic model, is feasible, although the proposed dynamic model can be further optimized. With the dynamic model, varied labeled fault data can be obtained, and faults can be set more independently and conveniently. This gives the method practical application value.
Artificial intelligence algorithms have developed rapidly. However, because of factors such as the operating environment, the working conditions and the difficulty of data acquisition, artificial intelligence methods in the field of fault diagnosis are developing slowly. Therefore, how to combine artificial intelligence methods with practical condition recognition to achieve higher accuracy is a direction for future research.
Availability of data and materials
Data and materials are not public, but can be uploaded and made public if necessary.
Abbreviations
m_{s}: Mass of the sun gear.
x_{s}, y_{s}: Translational displacement of x-axis and y-axis for sun gear.
u_{i}: Torsional displacements for each component (i = s, r, c, p_{n}).
k_{ij}, c_{ij}: Supporting stiffness and damping for each component at different motions (j = x, y, u).
k_{spn}(t), k_{rpn}(t): Stiffness function between sun-planet and ring-planet meshing pairs (n = 1, ···, N).
c_{spn}, c_{rpn}: Constant damping between sun-planet and ring-planet meshing pairs.
e_{spn}(t), e_{rpn}(t): Transmission errors of the nth sun-planet and ring-planet pairs.
x_{spn}, x_{rpn}: Relative displacements along the torsional motion action lines of sun-planet and ring-planet.
I_{i}: Inertias of each component.
r_{i}: Base circle radii of each component.
T_{s}(t), T_{c}(t): External torques on the sun gear and the carrier.
α_{s}: Meshing angles of the sun-planet gear mesh.
ψ_{n}: Circumferential position angle of the nth planet gear.
q: Number of harmonic terms.
k_{spnm}, k_{rpnm}: Mean values of time-varying mesh stiffness of sun-planet and ring-planet meshing pairs.
ω_{m}: Mesh frequency.
γ_{spn}: Relative phase between the n^{th} sun-planet mesh and the first sun-planet mesh.
γ_{rpn}: Relative phase between the n^{th} ring-planet mesh and the first sun-planet mesh.
ϕ_{ek}: Phase difference between e_{spn}(t) and k_{spn}(t), and between e_{rpn}(t) and k_{rpn}(t).
K^{(l)}_{spna}, K^{(l)}_{spnb}, K^{(l)}_{rpna}, K^{(l)}_{rpnb}: Harmonic coefficients of Fourier series.
WT: Wind turbine
DTLN: Deep transfer learning network
1D CNN: One-dimensional deep convolutional neural network
NC: Normal condition
CTF: Chipped tooth fault
SWF: Surface wear fault
MTF: Missing tooth fault
DDS: Drivetrain dynamic simulator
DANN: Domain adversarial neural network
t-SNE: t-distributed stochastic neighbor embedding
References
[1] Nadour, M., Essadki, A., & Nasser, T. (2020). Improving low-voltage ride-through capability of a multi-megawatt DFIG based wind turbine under grid faults. Protection and Control of Modern Power Systems, 5(4), 102–114.
[2] Salameh, J. P., Cauet, S., Etien, E., Sakout, A., & Rambault, L. (2018). Gearbox condition monitoring in wind turbines: A review. Mechanical Systems and Signal Processing, 111, 251–264.
[3] Leite, G. D. N. P., Araújo, A. M., & Rosas, P. A. C. (2018). Prognostic techniques applied to maintenance of wind turbines: A concise and specific review. Renewable and Sustainable Energy Reviews, 81, 1917–1925.
[4] Desai, J. P., & Makwana, V. H. (2021). A novel out of step relaying algorithm based on wavelet transform and a deep learning machine model. Protection and Control of Modern Power Systems, 6(4), 500–511.
[5] Jiang, G., He, H., Yan, J., & Xie, P. (2019). Multiscale convolutional neural networks for fault diagnosis of wind turbine gearbox. IEEE Transactions on Industrial Electronics, 66(4), 3196–3207.
[6] Saufi, S. R., Ahmad, Z. A. B., Leong, M. S., & Lim, M. H. (2020). Gearbox fault diagnosis using a deep learning model with limited data sample. IEEE Transactions on Industrial Informatics, 16(10), 6263–6271.
[7] Liu, R., Yang, B., Zio, E., & Chen, X. (2018). Artificial intelligence for fault diagnosis of rotating machinery: A review. Mechanical Systems and Signal Processing, 108, 33–47.
[8] Zhao, M., Kang, M., Tang, B., & Pecht, M. (2018). Deep residual networks with dynamically weighted wavelet coefficients for fault diagnosis of planetary gearboxes. IEEE Transactions on Industrial Electronics, 65(5), 4290–4300.
[9] Su, X., Shan, Y., Zhou, W., & Fu, Y. (2021). GRU and attention mechanism-based condition monitoring of an offshore wind turbine gearbox. Power System Protection and Control, 49(24), 141–149. (in Chinese).
[10] Ding, S., Li, X., Hang, J., Wang, Y., & Wang, Q. (2020). Deep learning theory and its application to fault diagnosis of an electric machine. Power System Protection and Control, 48(8), 172–187. (in Chinese).
[11] Jiao, J., Zhao, M., & Lin, J. (2020). Unsupervised adversarial adaptation network for intelligent fault diagnosis. IEEE Transactions on Industrial Electronics, 67(11), 9904–9913.
[12] Guo, L., Lei, Y., Xing, S., Yan, T., & Li, N. (2019). Deep convolutional transfer learning network: A new method for intelligent fault diagnosis of machines with unlabeled data. IEEE Transactions on Industrial Electronics, 66(9), 7316–7325.
[13] Mao, W., Liu, Y., Ding, L., Safian, A., & Liang, X. (2021). A new structured domain adversarial neural network for transfer fault diagnosis of rolling bearings under different working conditions. IEEE Transactions on Instrumentation and Measurement, 70, 1–13.
[14] Chen, Z., He, G., Li, J., Liao, Y., Gryllias, K., & Li, W. (2020). Domain adversarial transfer network for cross-domain fault diagnosis of rotary machinery. IEEE Transactions on Instrumentation and Measurement, 69, 8702–8712.
[15] Shen, C., Wang, X., Wang, D., Li, Y., Zhu, J., & Gong, M. (2021). Dynamic joint distribution alignment network for bearing fault diagnosis under variable working conditions. IEEE Transactions on Instrumentation and Measurement, 70, 1–13.
[16] Feng, L., & Zhao, C. (2021). Fault description based attribute transfer for zero-sample industrial fault diagnosis. IEEE Transactions on Industrial Informatics, 17, 1852–1862.
[17] Li, X., Zhang, W., Ding, Q., & Li, X. (2020). Diagnosing rotating machines with weakly supervised data using deep transfer learning. IEEE Transactions on Industrial Informatics, 16, 1688–1697.
[18] Xu, G., Liu, M., Jiang, Z., Shen, W., & Huang, C. (2020). Online fault diagnosis method based on transfer convolutional neural networks. IEEE Transactions on Instrumentation and Measurement, 69, 509–520.
[19] Wang, J., Zhao, R., & Gao, R. X. (2020). Probabilistic transfer factor analysis for machinery autonomous diagnosis cross various operating conditions. IEEE Transactions on Instrumentation and Measurement, 69, 5335–5344.
[20] Yang, B., Lei, Y., Jia, F., Li, N., & Du, Z. (2020). A polynomial kernel induced distance metric to improve deep transfer learning for fault diagnosis of machines. IEEE Transactions on Industrial Electronics, 67(11), 9747–9757.
[21] Persello, C., & Bruzzone, L. (2016). Kernel-based domain-invariant feature selection in hyperspectral images for transfer learning. IEEE Transactions on Geoscience and Remote Sensing, 54(5), 2615–2626.
[22] Chen, Z., Gryllias, K., & Li, W. (2020). Intelligent fault diagnosis for rotary machinery using transferable convolutional neural network. IEEE Transactions on Industrial Informatics, 16(1), 339–349.
[23] Lu, W., Liang, B., Cheng, Y., Meng, D., Yang, J., & Zhang, T. (2017). Deep model based domain adaptation for fault diagnosis. IEEE Transactions on Industrial Electronics, 64(3), 2296–2305.
[24] Xie, J., Zhang, L., Duan, L., & Wang, J. (2016). On cross-domain feature fusion in gearbox fault diagnosis under various operating conditions based on transfer component analysis. In Proc. IEEE Int. Conf. Prognostics Health Manage, pp. 1–6.
[25] Yang, B., Lei, Y., Jia, F., & Xing, S. (2019). An intelligent fault diagnosis approach based on transfer learning from laboratory bearings to locomotive bearings. Mechanical Systems and Signal Processing, 122, 692–706.
[26] Inalpolat, M., & Kahraman, A. (2010). A dynamic model to predict modulation sidebands of a planetary gear set having manufacturing errors. Journal of Sound and Vibration, 329(4), 371–393.
[27] Hong, L., Dhupia, J. S., & Sheng, S. (2014). An explanation of frequency features enabling detection of faults in equally spaced planetary gearbox. Mechanism and Machine Theory, 73, 169–183.
[28] Eritenel, T., & Parker, R. G. (2012). An investigation of tooth mesh nonlinearity and partial contact loss in gear pairs using a lumped-parameter model. Mechanism and Machine Theory, 56, 28–51.
[29] Liu, X., Yang, Y., & Zhang, J. (2018). Resultant vibration signal model-based fault diagnosis of a single stage planetary gear train with an incipient tooth crack on the sun gear. Renewable Energy, 122, 65–79.
[30] Luo, Y., Baddour, N., & Liang, M. (2019). Dynamical modeling and experimental validation for tooth pitting and spalling in spur gears. Mechanical Systems and Signal Processing, 119, 155–181.
[31] Parra, J., & Vicuña, C. M. (2017). Two methods for modeling vibrations of planetary gearboxes including faults: Comparison and validation. Mechanical Systems and Signal Processing, 92, 213–225.
[32] Liang, X., Zuo, M. J., & Liu, L. (2016). A windowing and mapping strategy for gear tooth fault detection of a planetary gearbox. Mechanical Systems and Signal Processing, 80, 445–459.
[33] Park, J., Ha, J. M., Oh, H., Youn, B. D., Choi, J., & Kim, N. H. (2016). Model-based fault diagnosis of a planetary gear: A novel approach using transmission error. IEEE Transactions on Reliability, 65(4), 1830–1841.
[34] Shao, S., McAleer, S., Yan, R., & Baldi, P. (2019). Highly accurate machine fault diagnosis using deep transfer learning. IEEE Transactions on Industrial Informatics, 15(4), 2446–2455.
[35] Han, T., Liu, C., Yang, W., & Jiang, D. (2019). A novel adversarial learning framework in deep convolutional neural network for intelligent diagnosis of mechanical faults. Knowledge-Based Systems, 165, 474–487.
Acknowledgements
Not applicable.
Authors' information
Dongdong Li received his B.S. and Ph.D. degrees in electrical engineering from Zhejiang University and Shanghai Jiao Tong University in 1998 and 2005, respectively. He is currently a professor and dean of the College of Electric Engineering at Shanghai University of Electric Power, Shanghai, China. His current research interests include analysis of electric power systems, renewable energy systems, smart grids and power electronization of power systems.
Yang Zhao received his B.S. degree from Shanghai University of Electric Power in 2019. He is currently pursuing the M.S. degree in electrical engineering at Shanghai University of Electric Power. His current research interests include artificial intelligence algorithms and fault diagnosis of wind turbine planetary gearboxes.
Yao Zhao received the B.S. degree in automation from Anhui University, Hefei, China, in 2009, the M.S. degree in electrical engineering from Shanghai Maritime University, Shanghai, China, in 2011, and the Ph.D. degree from the Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2016. He is currently a Lecturer with the Shanghai University of Electric Power, Shanghai, China. His main research interests include electrical machines and power electronization of power systems.
Funding
Natural Science Foundation of Shanghai (21ZR1425400), Shanghai Rising-Star Program (21QC1400200), National Natural Science Foundation of China (51977128), Shanghai Science and Technology Project (20142202600).
Contributions
Not applicable.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Li, D., Zhao, Y. & Zhao, Y. A dynamic-model-based fault diagnosis method for a wind turbine planetary gearbox using a deep learning network. Prot Control Mod Power Syst 7, 22 (2022). https://doi.org/10.1186/s41601-022-00244-z
Keywords
 Wind turbine planetary gearbox
 Lumped-parameter dynamic model
 Intelligent fault diagnosis
 Convolutional neural network
 Transfer learning theory