A novel out of step relaying algorithm based on wavelet transform and a deep learning machine model

Out-of-step protection of one or a group of synchronous generators is unreliable in a power system that has significant renewable power penetration. In this work, an innovative out-of-step protection algorithm using wavelet transform and deep learning is presented to protect synchronous generators and transmission lines. Specific patterns are generated from stable and unstable power swings and from three-phase faults using the wavelet transform technique. Data containing 27,008 continuous samples of 48 different features is used to train a two-layer feed-forward network. The proposed algorithm gives an automatic, setting-free, and highly accurate classification of three-phase faults, stable power swings, and unstable power swings through pattern recognition within a half cycle. The proposed algorithm is tested on the Kundur 2-area system and a 29-bus electric network under different swing center locations and levels of renewable power penetration. Hardware-in-the-loop (HIL) tests show the hardware compatibility of the developed out-of-step algorithm. The proposed algorithm is also compared with recently reported algorithms. The comparison and test results on different large-scale systems show that the proposed algorithm is simple, fast, accurate, HIL tested, and not affected by changes in power system parameters.


Desai and Makwana, Protection and Control of Modern Power Systems (2021) 6:40. Open Access.

Introduction
Power swing is a phenomenon that usually occurs because of the sudden disconnection of heavy loads or the tripping of transmission lines following faults in the system. Protective elements must detect the power swing condition accurately and quickly, because unstable power swings mainly lead to maloperation of transmission line distance relays and damage to generators and turbine-generator units. Unstable power swings can also cause cascade failure of numerous transmission lines, transformers, and generators. Thus, additional devices and functions, namely out-of-step protection systems, are usually added to avoid the consequences of unstable power swings.
The essential parts of out-of-step protection relaying are the power swing blocking (PSB) function, the out-of-step (OOS) tripping function of the transmission line, and OOS protection of the synchronous generator. It is difficult to detect a symmetrical fault during a power swing [1], while the low frequency oscillation of the power system can result in a loss of stability or a blackout.
Low frequency oscillations in a power swing are detected in [2] using the Daubechies-4 (db4) wavelet, and the results are compared with Prony and eigenvalue analysis. The discrete wavelet transform (DWT)-based approaches proposed in [2] can identify the onset of the initial disturbance in the power system and the modes present during low frequency power oscillation. The future scope of wavelets for blackout prevention by avoiding unnecessary tripping using wavelet-based protection is described in [3], and different wavelet families are considered and compared. This shows that high accuracy can be achieved using the db4 mother wavelet. Reference [1] also projects that the transient energy can be captured in levels d1 to d4 of the voltage waveform during the events, and the detail coefficient-9 (d9) of the current waveform can track the variation in current. The proposed algorithm in [1] thus uses detail coefficient-1 (d1) to coefficient-4 (d4) for the detection of the fault, and d9 as an indicator of the power swing. Some thresholds are also used for a final decision, though the identification of an unstable power swing using a detail coefficient is lacking in [1]. In addition, the calculation of such thresholds is not described.
The algorithm in [1] also relies on fixed settings, which make it rigid and demand close attention. The difficulties of using only a few specific features and threshold settings are explained in Sect. 3 of this paper. In contrast, an optimized deep learning (DL) machine model that works on pattern recognition is designed in this work.
The blinder-based out-of-step relay for synchronous generator protection is typically designed using fixed settings. The work in [3] describes the impact of integrating solar and wind power generation on small-signal oscillation in the modern power system. Reference [4] proposes a protection scheme that differentiates the type of fault from load change events. The harmonic magnitudes of the voltage signal and its fundamental part provide sufficient information for fault detection, location, classification, and zone identification. An artificial neural network (ANN) is used once the support vector machine (SVM) identifies the fault type. However, combining two separate techniques makes the model complicated and inefficient for hardware. The algorithm in [4] cannot detect unstable power swings and is used only as a PSB relay. Furthermore, the algorithm in [4] has not been validated on large-scale systems with significant renewable power penetration.
ANN offers more prompt responses and requires a quarter of the fault signal cycle to identify the type of fault [5]. Thus, an ANN-based distance relay can provide fast and precise operation [6]. Nevertheless, reference [6] does not address the other distance relay problems, such as maloperation of a distance relay during a power swing and for a system with large renewable integration.
In [1]-[6], once the power swing has been distinguished from a fault, no further identification of the swing type is carried out. However, this is the most important part in a modern power system. Stable power swings can be damped by advanced power system controllers and there is no need for rapid disconnection of affected elements. In contrast, advanced power system controllers cannot damp out unstable power swings, so rapid and correct detection is required.
Reference [7] proposes the use of voltage and current signals for feature extraction for the machine model, and recommends avoiding the Fast Fourier Transform (FFT) signal in order to reduce hardware complexity. The ANN model in [8] uses a multilayer perceptron with a back-propagation algorithm. It describes the neural network structure and shows that the use of ANN for online power system dynamic security and vulnerability assessment is quite realistic. ANN also improves performance in terms of adaptiveness and relay coordination [9]. Two significant application areas of the wavelet transform are power system protection and power quality. Reference [10] chooses db5 as a mother wavelet for detecting a short duration and fast decaying fault-generated transient signal, and uses both high and low frequency approximations to avoid confusion between fault and non-fault events. It also shows some limitations, such as dependence on a specific system structure and the need for further adjustment of the algorithm, indicating the non-adaptability of the method.
With incorrect input data, the method proposed in [11] produces dubious output and lacks information, and thus user intervention is necessary. The process can be used for significant system fault section identification as it does not depend on the size of the electric network. The identification of a failed device helps to restore the system quickly after faults. A multilayer perceptron neural network is used to solve the failed device identification problem in [12]. However, the proposed network only handles 32 alarms, so the ANN needs to be developed further to deal with complex system emergencies. Reference [13] uses multivariate analysis and data mining techniques for synchronous generator islanding protection, while a phasor measurement unit (PMU)-based adaptive out-of-step protection algorithm is presented in [14]. The use of a PMU increases the accuracy and reliability of the out-of-step protection, but at the expense of increased overall protection cost.
Reference [15] presents an adaptive concentric power swing blocker with two concentric circles. It shows that the current signal's static phasor estimation error can be used to find the first pair of concentric circle locations. However, the rate of change of impedance used in [15] depends on power system parameters such as voltage regulator, governor, fault type, and renewable power penetration [18]. Furthermore, the conventional power swing blocking (CPSB) used in [15] may maloperate under renewable integration [16].
In the present work, the signal is analyzed up to the 12th level using the db4 mother wavelet. The algorithm is designed such that, with an appropriate choice of mother wavelet, it can be used for protective relaying under other conditions such as topological change, loading, and fault locations. The MATLAB environment is used for the testing and development of the proposed ANN-based algorithm using a PC with an Intel i5 processor, 8 GB of RAM, and a 64-bit operating system with a 256 GB solid-state drive (SSD). The rest of the paper is organized as follows. Section 2 describes the system and power swing conditions. In Sect. 3, the uniqueness in the training data selection and data pre-processing is explained, and the unique mathematical modeling of the DL machine model and the final proposed algorithm are presented. Unknown disturbances are manually applied to test the proposed technique in Sect. 4. The development and large-scale validation results are presented in Sect. 5. Comparison of the proposed algorithm with recently reported algorithms is provided in Sect. 6, with the results expressed in four categories: (1) results during training, testing, and validation, (2) results of testing with unknown disturbances, (3) hardware-in-the-loop (HIL) test results, (4) large-scale validation. Finally, Sect. 7 draws conclusions.

Testing system and power swing conditions
The Kundur two-area system is considered for the wavelet analysis of three-phase faults, and stable and unstable power swings, as shown in Fig. 1 [17]. The unstable power swing is produced by applying three-phase faults longer than the critical clearing time (CCT) near the high voltage (HV) side of the generator transformer unit (GTU) of generator-1 (G1). The three-phase current and voltage waveforms produced at bus B1 during an unstable power swing after the fault is removed are shown in Figs. 2 and 3, respectively.
The process of developing the proposed relay is explained in four steps: (1) Training data selection and pre-processing; (2) Design of the mathematical structure of the deep learning neural network; (3) Design of the proposed relaying algorithm; (4) Training, validation, and testing.

Training data selection and pre-processing
The fault events are transients, which are reflected in voltage and current waveforms by the change in their frequency and magnitude. The wavelet transform of a signal gives information about both frequency and time of the transient event. However, it is challenging to detect three-phase faults during a power swing because such a fault is only minimally reflected in the transient stage of the voltage signal. These changes depend on the location of the fault, type of fault, and instant of fault, while the frequency of a transient is much higher than the nominal frequency of the system. Further, during ground faults, the phase-to-ground voltage magnitude of the faulty phase is near zero, while the current increases considerably. Hence, the wavelet scale covering the fault frequency has higher energy than the scale covering the current wave's nominal frequency, whereas the power swing has a relatively low frequency ranging from 3 to 7 Hz [18]. The energy distributions up to the 12th level during the three-phase fault, stable power swing, and unstable power swing are shown in Table 1 using the current waveform and in Table 2 using the voltage waveform.
The wavelet scales must be selected such that they cover the low frequencies, which capture patterns of the low energy levels, and the high frequencies, which capture patterns of the high energy levels. The db4 wavelet decomposes the current/voltage signals up to the 12th level with a sampling rate of 20 kHz, which gives enough resolution of time-frequency variation in current and voltage during events.
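To make the relation between decomposition level and frequency concrete, the following minimal Python sketch lists the ideal dyadic band covered by each detail level for a 20 kHz sampling rate and 12 levels. The function name `dwt_bands` is hypothetical, and the band edges assume ideal half-band filters; a real db4 filter bank has overlapping responses, so ranges reported from an actual decomposition can differ somewhat from these values.

```python
def dwt_bands(fs_hz, levels):
    """Return ideal (low, high) band edges in Hz for d1..dN and the final approximation aN.

    Assumes ideal half-band filters: each level halves the band of the previous one.
    """
    bands = {}
    nyquist = fs_hz / 2.0
    for level in range(1, levels + 1):
        high = nyquist / 2 ** (level - 1)
        low = nyquist / 2 ** level
        bands[f"d{level}"] = (low, high)
    # The last approximation keeps everything below the lowest detail band.
    bands[f"a{levels}"] = (0.0, nyquist / 2 ** levels)
    return bands


bands = dwt_bands(20_000, 12)
for name, (low, high) in bands.items():
    print(f"{name}: {low:.2f} to {high:.2f} Hz")
```

Under these assumptions the lowest detail levels reach down to a few hertz, which is the range in which power swings are expected, while the first levels cover the kilohertz range of fault transients.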
During three-phase faults, a minor part of the energy of the current wave lies between 302 and 646 Hz, reflected in detail coefficient d5, and this is absent in a power swing for the same range. Similarly, a minor part of the energy of the voltage waveform lies between 302 and 646 Hz for a three-phase fault and is absent for a power swing in the same range. Analysis of detail coefficients at levels 3 to 5, 9 and 10 shows the unique patterns of unstable and stable power swings, and a three-phase fault. Figure 4 shows the significant pattern differences between the three-phase fault and power swing in the level 5 current signal's detailed resolution. Similarly, the time-frequency resolution of the voltage waveform at level 9 shows the significant pattern differences between stable and unstable swing, as shown in Fig. 5. A minor pattern difference at each level between fault, stable swing and unstable swing also exists, as shown in Tables 1 and 2. The minor pattern difference is also useful for training the machine model and should not be ignored.
Further analysis of all features verifies that the mean and median values of the d7 and a12 coefficients of the current signals can differentiate between a three-phase fault and a power swing, as shown in Table 3. The stable and unstable swings can be classified using the mean and median values of a1 and d12 of the voltage waveform, as shown in Table 4. First, a few selected detail and approximate coefficients of the voltage and current waveforms are used to train the DL algorithm using Tables 3 and 4. However, it has been observed that a few selected coefficients are not able to classify all the events, and some of the chosen features show no significant difference between the stable and unstable power swings. Thus, more coefficients from Tables 1 and 2 are used and tried with the feature extraction method. This improves the pattern and reveals some distinct differences in each pattern of fault and power swing.
Finally, in this work, the patterns of the detailed coefficients from d1 to d12 and approximate coefficients a1 to a12 are considered. The input vector gives the unique pattern for a 3-phase fault, and stable and unstable power swings. This pattern can differentiate them completely.

Design of the mathematical structure of deep learning neural network
The proposed deep learning machine model has a (48 × 1) input vector with elements x1, x2, ..., x48, as shown in Fig. 6. To optimize performance and training time, the neural network uses one input layer, two hidden layers, and one output layer, and each hidden layer has ten neurons. The optimization is achieved by minimizing the cross-entropy while changing the number of neurons at each layer. The input vector x(j) (at the jth sample) is weighted by the respective weights and biases at the hidden layers, as shown in Fig. 6. The best possible weights and biases are determined such that they minimize the loss function. The proposed deep learning machine model uses a scaled conjugate gradient method for parameter estimation. The forward pass repeatedly applies a linear combination followed by a non-linear activation at each layer to obtain the prediction. Once a prediction is reached, the next step is to find the loss.
The loss is propagated in the reverse direction to calculate the gradient with respect to the connections that feed the output layer directly. The chain rule of differentiation is then applied successively to find the losses at the intermediate layers.
The proposed deep learning model uses an adaptive learning rate. This gives faster convergence than the classical machine learning algorithm. The loss function used is of the non-convex type, and hence the proposed model uses a momentum-based strategy. It applies an early stopping technique, so training stops early when generalization stops improving. Each neuron has two parts of activation: (1) linear combination and (2) non-linear activation.

The linear function at the first hidden layer for the ith neuron and the jth sample is given by:

$$z_i^{(j)} = w_i x_i^{(j)} + b_i$$

where $b_i$ is the bias at the ith neuron, $w_i$ is the weight at the ith neuron, and $x_i^{(j)}$ is the input vector at the ith neuron for the jth sample.

The non-linear activation function used at the first layer is a tan-sigmoid transfer function, which for the ith neuron and jth sample is described as:

$$a_i^{(j)} = \frac{2}{1 + e^{-2 z_i^{(j)}}} - 1$$

The linear function at the second layer for the ith neuron and jth sample is:

$$n_i^{(j)} = w_i a_i^{(j)} + b_i$$

where $w_i$ and $b_i$ are here the weight and bias of the ith neuron of the second layer.

Layer 2 has a non-linear activation function known as a SoftMax transfer function. The SoftMax transfer function for the ith neuron is described as:

$$\sigma(n)_i = \frac{e^{n_i}}{\sum_{j=1}^{K} e^{n_j}} \quad \text{for } i = 1, 2, \ldots, K \text{ and } n = (n_1, \ldots, n_K) \in \mathbb{R}^K$$

Finally, the output at the jth sample, $y^{(j)}$, takes one of three sets of binary outputs, i.e., [1 0 0], [0 1 0], and [0 0 1] for unstable swing, stable swing and fault, respectively. The highest probability in the output set [y1, y2, y3] is considered equal to 1, while all other values are considered to be 0.
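The forward pass through a tan-sigmoid layer followed by a SoftMax layer can be illustrated with a small Python sketch. It is not the trained model: the weights are random placeholders, the helper names (`tansig`, `softmax`, `dense`, `forward`, `to_one_hot`) are hypothetical, and a single ten-neuron tan-sigmoid layer feeding a three-output SoftMax layer is assumed for brevity.

```python
import math
import random


def tansig(n):
    """Tan-sigmoid transfer function, numerically equal to tanh(n)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


def softmax(n):
    """SoftMax over a list of values, shifted by the maximum for numerical stability."""
    m = max(n)
    exps = [math.exp(v - m) for v in n]
    total = sum(exps)
    return [e / total for e in exps]


def dense(weights, biases, inputs):
    """Linear combination: one weighted sum plus bias per neuron (one row of weights each)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(x, w1, b1, w2, b2):
    """Tan-sigmoid layer followed by a SoftMax layer; returns class probabilities."""
    hidden = [tansig(z) for z in dense(w1, b1, x)]
    return softmax(dense(w2, b2, hidden))


def to_one_hot(probabilities):
    """Set the highest probability to 1 and all other outputs to 0."""
    winner = probabilities.index(max(probabilities))
    return [1 if i == winner else 0 for i in range(len(probabilities))]


# Placeholder sizes and random weights (assumptions, not trained values).
N_IN, N_HIDDEN, N_OUT = 48, 10, 3
rng = random.Random(0)
w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [[rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
b2 = [0.0] * N_OUT

x = [rng.uniform(-1.0, 1.0) for _ in range(N_IN)]  # stand-in for one 48-feature sample
probs = forward(x, w1, b1, w2, b2)
label = to_one_hot(probs)
```

With trained weights, `label` would correspond to [1 0 0], [0 1 0], or [0 0 1] as defined in the text; with the random placeholders used here it only demonstrates the data flow.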

Design of the proposed relaying algorithm
The detail and approximation coefficients at level l, denoted CD_l and CA_l respectively, are calculated from the voltage and current input signals supplied by the voltage and current transformers, respectively.
The signals pass through a high pass filter (HPF) and a low pass filter (LPF), and the outputs from both filters are used to obtain the detail and approximation coefficients at level 1 (d1 and a1). The approximation coefficients are then sent to the second stage to repeat the procedure [2]. Finally, the signal is decomposed up to the 12th level, and the input vector is created. The training data is prepared using the input vector x(j) and the output vector y(j), where x(j) is the set of wavelet coefficients (d1(j) to d12(j), a1(j) to a12(j)) at the jth sample for the voltage and current waveforms at a sampling rate of 20 kHz or higher, and y(j) ∈ (0, 1).

Figure 7 shows the flow chart of the proposed algorithm. The previously trained pattern recognition machine model senses the input x(j), and gives three binary outputs [1 0 0], [0 1 0] and [0 0 1] for unstable swing, stable swing and three-phase fault, respectively. If an unstable swing is classified, the trip signal is sent to the associated breaker at the point of separation. In the case of a stable power swing, the algorithm produces a PSB command and sends it to the transmission line distance relay. Once the PSB command is sent, the algorithm starts sampling the next data and continues until further identification. If a fault is classified, the tripping decision of the distance relay is allowed with its zone delay settings for transmission line protection.

Table 5 shows the performance in terms of percentage error during training, validation, and testing. The 0% error indicates that no sample is misclassified.
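The cascaded HPF/LPF decomposition described in this section can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the implementation used in the paper: the filter taps are taken to be the standard 8-tap Daubechies-4 scaling coefficients, the high-pass filter is derived from them as a quadrature mirror filter, the signal is extended periodically at its boundaries, and the output is decimated by two at each stage. The names `analysis_step`, `wavedec`, and `energy`, and the 60 Hz test tone, are hypothetical choices.

```python
import math

# Assumed standard db4 scaling (low-pass) filter taps; their sum is sqrt(2).
DB4_LO = [
    0.23037781330885523, 0.7148465705525415, 0.6308807679295904,
    -0.02798376941698385, -0.18703481171888114, 0.030841381835986965,
    0.032883011666982945, -0.010597401784997278,
]
# Quadrature mirror high-pass filter: g[k] = (-1)^k * h[N - 1 - k].
DB4_HI = [(-1) ** k * DB4_LO[len(DB4_LO) - 1 - k] for k in range(len(DB4_LO))]


def analysis_step(signal):
    """One filter-bank stage: filter with LPF and HPF, keep every second output."""
    n = len(signal)
    approx, detail = [], []
    for k in range(0, n, 2):
        # Periodic boundary extension via the modulo index (an assumption).
        approx.append(sum(h * signal[(k + i) % n] for i, h in enumerate(DB4_LO)))
        detail.append(sum(g * signal[(k + i) % n] for i, g in enumerate(DB4_HI)))
    return approx, detail


def wavedec(signal, levels):
    """Repeat the stage on the approximation; return (a_levels, [d1, ..., d_levels])."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = analysis_step(approx)
        details.append(detail)
    return approx, details


def energy(values):
    """Sum of squared coefficients."""
    return sum(v * v for v in values)


FS_HZ = 20_000
# Synthetic test signal: a 60 Hz tone plus a slow 5 Hz component (illustrative only).
samples = [math.sin(2 * math.pi * 60 * t / FS_HZ)
           + 0.2 * math.sin(2 * math.pi * 5 * t / FS_HZ) for t in range(4096)]

a12, details = wavedec(samples, 12)
level_energy = [energy(d) for d in details]  # energy per detail level d1..d12
```

Because this periodized filter bank is orthogonal, the energies of d1 to d12 and a12 add up to the energy of the input, which is what makes a per-level energy distribution a meaningful description of an event.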

Training, validation, and testing
The cross-entropy needs to be minimized during training, validation, and testing. The development stops when the training, validation, and testing curves intersect at minimum cross-entropy. The best validation performance is achieved after 614 epochs, with a cross-entropy of 0.00013, as shown in Fig. 8.
The confusion matrix presented in Fig. 9 shows the performance in terms of the output class against the target class. As seen, the output class and target class completely match during training, validation, and testing. A misclassified sample appears in an off-diagonal cell of the confusion matrix, whereas a correctly classified sample appears on the diagonal.
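The two performance measures used in this section can be illustrated with a short Python sketch. It uses the textbook categorical cross-entropy averaged over samples (a particular toolbox may normalize the value differently) and a confusion matrix whose rows are assumed to be output classes and whose columns are assumed to be target classes. The function names and the four example samples are hypothetical.

```python
import math


def cross_entropy(target_labels, probabilities, eps=1e-12):
    """Mean categorical cross-entropy for integer class labels and predicted probabilities."""
    total = 0.0
    for label, row in zip(target_labels, probabilities):
        total -= math.log(max(row[label], eps))  # only the true class contributes
    return total / len(target_labels)


def confusion_matrix(target_labels, output_labels, n_classes):
    """Count samples per (output class, target class) pair; rows are output classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for target, output in zip(target_labels, output_labels):
        matrix[output][target] += 1
    return matrix


# Classes: 0 = unstable swing, 1 = stable swing, 2 = three-phase fault.
targets = [0, 1, 2, 2]
probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.20, 0.20, 0.60],
    [0.10, 0.70, 0.20],  # a fault sample wrongly scored as a stable swing
]
outputs = [row.index(max(row)) for row in probs]

loss = cross_entropy(targets, probs)
matrix = confusion_matrix(targets, outputs, 3)
off_diagonal = sum(matrix[r][c] for r in range(3) for c in range(3) if r != c)
```

In this made-up example one sample lands off the diagonal; a result like the one reported in the text corresponds to `off_diagonal` being zero and a cross-entropy close to zero.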

Performance validation by unknown events
The signals used during machine model development correspond to a swing center arising at the HV terminal of the GTU of G1. After developing the proposed algorithm with the required performance, many unknown signal data of stable swing, unstable swing, and three-phase fault are considered. The modified Kundur two-area system is used to find the effect of different renewable penetration levels on the proposed out-of-step relay. Four identical doubly fed induction generators (DFIGs) are connected at buses BG1, BG2, BG3, and BG4 such that the total power flow remains the same from area 1 to area 2 in Fig. 1 [16].

Test using unknown data of different fault locations
The following test cases produce unknown signals for testing:
1. Swing center arises at the HV terminal of the GTU of G2
2. Swing center appears at the HV terminal of the GTU of G3
3. Renewable power penetration at a different level
4. Swing center appears at the middle of the transmission line
An unknown input vector of features from the full set of sampled signal data is applied to the algorithm. The output of the algorithm is rounded to its closest prediction using the following conditions: (1) if the binary output is a fractional value < 0.9, it is considered to be 0; (2) if the binary output is a fractional value ≥ 0.9, it is considered to be 1. If the same binary result continues for a half cycle, it is taken as the final prediction. It is worth noting that the same algorithm can also be expanded for fault classification. Table 6 shows the list of test cases used to verify the proposed algorithm with the unknown input data.
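The rounding rule and the half-cycle confirmation described above can be sketched in Python as follows. The 0.9 threshold and the 20 kHz sampling rate come from the text; the 60 Hz nominal frequency, the function names, and the synthetic output stream are assumptions made only for illustration.

```python
CLASS_BY_CODE = {
    (1, 0, 0): "unstable swing",
    (0, 1, 0): "stable swing",
    (0, 0, 1): "three-phase fault",
}

# Relaying action per class, as described for the proposed algorithm.
ACTION_BY_CLASS = {
    "unstable swing": "send trip signal to breaker at point of separation",
    "stable swing": "send PSB command to distance relay",
    "three-phase fault": "allow distance relay trip with zone delay",
}


def binarize(outputs, threshold=0.9):
    """Round each model output: values >= threshold become 1, all others 0."""
    return tuple(1 if value >= threshold else 0 for value in outputs)


def confirm_event(output_stream, fs_hz=20_000, f0_hz=60.0, threshold=0.9):
    """Return (class, sample index) once one class persists for a half cycle.

    f0_hz is an assumed nominal frequency; returns (None, None) if nothing is confirmed.
    """
    needed = int(round(fs_hz / (2.0 * f0_hz)))  # samples in one half cycle
    run_code, run_length = None, 0
    for index, outputs in enumerate(output_stream):
        code = binarize(outputs, threshold)
        if code in CLASS_BY_CODE and code == run_code:
            run_length += 1
        elif code in CLASS_BY_CODE:
            run_code, run_length = code, 1   # a new candidate class starts
        else:
            run_code, run_length = None, 0   # ambiguous output resets the count
        if run_length >= needed:
            return CLASS_BY_CODE[run_code], index
    return None, None


# Synthetic stream: 50 ambiguous samples, then a steady stable-swing indication.
stream = [[0.5, 0.4, 0.1]] * 50 + [[0.02, 0.95, 0.03]] * 200
result = confirm_event(stream)
action = ACTION_BY_CLASS[result[0]]
```

Resetting the counter on any ambiguous sample is one conservative design choice; a deployed relay could tolerate isolated outliers instead, which the text does not specify.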

Hardware-in-the-loop test of the proposed out of step relay
The hardware-in-the-loop (HIL) test is an essential step before committing to the relay's final hardware design. Simulink with the Waijung open-source Simulink package is the desktop software for generating both the C and hardware description language (HDL) code used for real-time testing. The real-time hardware system (Arm Cortex-M4) is used as a hardware relay for running code from Simulink models using Simulink Real-Time [19]. The Simulink relay consists of an input vector (48 × 1) applied to the DL machine model, which generates the binary output. The hardware relay only consists of the coded DL machine model using Simulink Real-Time. The data transmitter and receiver are connected at the PD8 and PD9 pins of the hardware relay, respectively. The sampling time for data transmission is 0.005 s. Figure 10 compares the outputs of the developed Simulink-based and hardware-based relays. The tripping command of the hardware relay is delayed by 0.005 s because of the sampling time set in data transmission. If this delay is taken into account, the software and hardware relays have the same instantaneous response once the event has been confirmed by the same binary output persisting for a half cycle. Figure 11 shows the hardware setup used for real-time HIL testing.

This newly developed wavelet and deep learning-based machine model is tested with unknown power swings and three-phase fault signals. The results show that the proposed algorithm development procedure is a standard one and can be applied to a system of any scale. Also, the detection of power swings and three-phase faults is highly accurate with an unknown input vector. It is found that the overall accuracy is reduced by 1.4% during the training, testing, and validation process on the 29-bus system. The details of errors in each stage are shown in Table 5, and the training, testing, and validation results on the 29-bus system are shown using the confusion matrix in Fig. 13. The cross-entropy found on the 29-bus

Comparison between different algorithms
Comparisons of the proposed algorithm with the wavelet-based algorithm in [1], and the SVM and ANN-based algorithms in [4] are summarized in Table 7. Both PSB and OOS tripping are very important for a synchronous generator. The proposed algorithm can identify the type of power swing once a power swing is detected. This is not possible in the methods reported in [1] and [4]. Hence, the methods in [1] and [4] are suitable for power swing blocking (PSB) but not for OOS tripping. Further, it is imperative for an out-of-step relay to be adaptive to changes in system topology, power flow, and renewable power penetration level. The proposed scheme is validated with different levels of renewable power penetration, whereas the methods in [1] and [4] are only tested on a small system (fewer than nine buses) without the integration of renewable power resources.
The proposed algorithm does not require the threshold calculation that the method in [1] does. The method in [4] needs detection of the 3rd, 5th and 7th harmonics, and a fault classifier for detecting the type of fault. These are more likely to maloperate under renewable integration. The operating times of the method in [4] and the proposed one are the same, while the relay operating time is not fixed for the technique in [1] as it depends on the fault location.
The proposed DL model development steps are standard ones irrespective of the scale of the power system. However, in [1] and [4], the algorithms have not been tested at larger scales for development standardization. Similarly, the out-of-step relay performance in series and shunt compensated lines needs to be evaluated. This is lacking in [1] and [4], whereas the proposed method works correctly under this condition as verified on the 29-bus system. In addition, the proposed algorithm works independently while the method in [1] uses the conventional scheme, and [4] uses the fault classifier.
The proposed algorithm works on patterns, so protection engineers do not need to study the system topology, system parameters, source nature, and fault location. However, the methods in [1] and [4] are more complex, as the threshold settings fluctuate with fault location [1], and the Kalman filter design changes under increased harmonic injection by non-linear loads or unknown renewable sources [4].
In addition, HIL test results of the proposed algorithm are provided to show the hardware suitability and actual operation speed, whereas the methods in [1] and [4] have not been tested for HIL.

Discussion
The proposed pattern recognition machine model using wavelet transform gives favorable results during development. Tests using events that are not a part of the algorithm's development are carried out. Table 6 shows all the different unknown event cases, including stable power swing, unstable power swing, three-phase faults, and sudden load change. The proposed algorithm correctly identifies each class of power swing and three-phase fault for all the unknown events, and only needs half a cycle or less to decide the type of power swing. The training time of the algorithm using the high-end processor is a fraction of a second, while the total time required for training, validation, and testing is around 2 s. The proposed algorithm can handle a higher sampling rate depending on the processor on which it is deployed. When the proposed algorithm is applied in a hardware relay, it gives detection and tripping within 0.01 s with high accuracy. HIL tests confirm that the developed relaying algorithm is ready for the hardware production stage and provides the same response as the Simulink-based model. The proposed algorithm's accuracy on the 29-bus system is reduced slightly during training, testing, and validation.

Conclusion
The detail and approximate coefficients captured using the db4 wavelet up to the 12th level of resolution during unstable and stable power swings and three-phase faults provide a unique pattern for each event, with an input vector of d1 to d12 and a1 to a12 of current and voltage in the given order. The deep learning machine model is designed to recognize the pattern and to discriminate stable swings, unstable swings, and three-phase fault events automatically with high accuracy. The proposed algorithm is not affected by unusual circumstances because it does not rely on individual signal features, and the nature of the pattern remains the same. The proposed algorithm is based on wavelet transform with the DL machine model. This can detect any uncommon power swings that are due to the impact of renewable power integration, while strange power swings

Table 7 Comparison of the proposed algorithm with existing methods

Point of comparison
The proposed algorithm