A simple decision tree-based disturbance monitoring system for VSC-based HVDC transmission link integrating a DFIG wind farm

Fault detection and classification is a key challenge for the protection of High Voltage DC (HVDC) transmission lines. In this paper, the Teager–Kaiser Energy Operator (TKEO) algorithm associated with a decision tree-based fault classifier is proposed to detect and classify various DC faults. The Change Identification Filter (CIF) is applied to the average and differential current components to detect the first instant of fault occurrence (above threshold) and register a Change Identified Point (CIP). Further, if a CIP is registered for a positive or negative line, only three samples of currents (i.e., the CIP and one sample on each side of it) are sent to the proposed TKEO algorithm, which produces eight indices through which the fault can be detected and classified. The new approach enables quicker detection, allowing utility grids to be restored as soon as possible. It also reduces computing complexity and the time required to identify and classify faults. The importance and accuracy of the proposed scheme are thoroughly tested and compared with other methods for various faults on HVDC transmission lines.


Introduction
Almost all industrial processes, as well as various aspects of daily life, rely on electrical energy [1,2]. Electricity consumption is ever increasing, and energy demands are even higher during peak hours, making it difficult to guarantee supply to consumers [1]. The adoption of distributed energy resources (DER), such as wind, solar, and fuel cells, has proved to be a realistic alternative given several concerns, including rising energy consumption, exhaustion of conventional energy resources (such as fossil fuels and coal) and pollution [3]. To address these concerns, integrated wind farms have been proposed. HVDC transmission systems outperform HVAC transmission systems at high power ratings [4]. The losses in HVAC transmission lines increase with transmission distance because of increased resistance, inductance and capacitance [5], so the transmission efficiency is reduced over long distances, while the skin effect and corona loss are also observed in HVAC [6,7].
The use of an HVDC transmission system addresses the above-mentioned loss issue. One of the most critical issues in an HVDC transmission system is fault identification and classification [3]. The entire power system could fail if the fault current on the HVDC transmission link is not interrupted for an extended period. It is challenging to distinguish the faulty system from the healthy components if a proper methodology is not used [8,9]. To restore system stability and limit economic losses, the fault on the transmission line and its type should be determined as soon as possible. The purpose of this study is to detect and examine the four types of faults that can occur on HVDC transmission lines and to evaluate the robustness of the Teager-Kaiser Energy Operator (TKEO) algorithm with a Simple Decision Tree-based mechanism for accurate results with low computing complexity and a reduced time for fault identification and classification.

Literature review
For HVDC transmission systems, many fault detection and classification approaches have been proposed. However, because of the aforementioned problems, techniques for protecting HVDC transmission lines are more limited than methods for conventional transmission systems [10]. A detailed literature review is provided here to have a better understanding of the proposed fault detection and classification methods for HVDC transmission lines. From the survey, a gap in the available fault detection and classification systems for HVDC transmission lines is identified. As fault detection and classification methods in HVDC transmission lines are influenced by a variety of parameters, these factors are investigated from several perspectives, with each one being examined separately. To ensure a fair review, the methods are divided into two categories, i.e. model-based and data-driven-based strategies.

Group A: data-driven-based techniques
Data-driven approaches are rooted in examining data pertinent to a system or determining the relationship between its input and output state variables [11]. Because they are complex and require a large quantity of data, real-time protection schemes based on these technologies are not commonly used in HVDC transmission lines [12]. However, when deep knowledge of the system is lacking, these methods are sometimes adopted to detect abnormalities that model-based methods may not be able to detect.

Fuzzy-based techniques
For fault identification in an HVDC transmission line, a combination of wavelet singular entropy and fuzzy logic is described in [13]. Similarly, in [14,15], differential protection techniques based on fuzzy inference processors are proposed. However, the following are some of the challenges associated with the fuzzy method in fault detection and classification: (i) Finding accurate membership functions and fuzzy rules is difficult; (ii) To evaluate and validate the fuzzy-based system, extensive hardware testing is required.

Decision tree and ANN-based techniques
To detect the faults, reference [16] employs local current measurements with wavelet transform and a decision tree. In addition, for fault classification, a sequence analyzer is employed to extract negative and zero sequence components. For HVDC transmission lines, a data-mining-based technique on two decision trees is described in [17], and artificial neural networks (ANNs) are used to detect and classify faults in HVDC transmission lines in [18]. The current signal is sent to two distinct ANNs that have been trained to detect and classify faults in [19]. However, the following are some of the potential drawbacks of using these methods: (i) Extensive data are necessary for the training stage; (ii) There may be inadequate (or missing) training data to derive estimates in the majority of cases.

Group B: model-based techniques
Model-based approaches aim to determine whether the evaluated variables are consistent with the model [20]. These methods can be further categorized based on the detection method used to identify the fault in various modes of operation [21].

Differential-based techniques
In [22], a differential protection strategy is proposed, in which two time-frequency transformations, i.e., the Hilbert-Huang transform and the S-transform, are compared to calculate the difference in the spectral energy content of modified contours on two sides of a feeder. Using the average cumulative sum and transient estimation methods, a differential transient current-based fault detection method for HVDC transmission lines is proposed in [23]. The following are the primary issues that differential-based techniques face: (i) In the event of a communication breakdown, backup protection is needed to protect the HVDC transmission line; (ii) The system cost increases as a result of the communication systems.

Local variable-based methods
In [24], the loop type HVDC transmission lines are protected using the inherent characteristics of the local variable current and its derivation. The inverter output current is used as a local variable to calculate the recursive least squares and mathematical morphology (MM). However, the local variable-based approaches, in general, have some drawbacks:

Adaptive methods
In adaptive approaches, the updated mode of operation is checked at the relay point when the configuration changes. Current signals are obtained with current transformers (CTs) in [25] and compared using the cycle-by-cycle comparison method. The following are the key drawbacks of adaptive approaches: (i) When the HVDC transmission system changes between different modes of operation, it is necessary to re-adjust the settings of the protection devices; (ii) It is costly to use communication channels for setting updates and monitoring; (iii) All feasible HVDC transmission configurations must be known prior to operation.

Traveling wave-based techniques
In [26], a traveling wave-based protection mechanism based on MM is proposed. Because MM technology only executes a few summations and subtractions, the introduced approach offers quick fault detection. However, high sampling-rate measuring equipment is required for traveling wave-based approaches. Although these procedures are quick and accurate, the need for high sampling-rate measurement instruments significantly reduces their benefits.

Aims and contribution
The aim of the proposed method is to detect and classify faults more quickly and to reduce computational complexity. A novel protection mechanism for HVDC transmission lines is proposed to detect and classify disturbances. When there is no fault, the differential current is zero, while it varies when there is a fault. The method begins with the calculation of differential and average currents when a CIP is identified. In the next stage, the numerical values for the different faults are determined using a TKEO-based scheme. The method is based on the "Teager Energy" tracked by the TKEO algorithm, from which the eight indices are extracted.
In addition, TKEO processes only three samples of data (at the CIP and on either side of it), resulting in a low computing burden and good time resolution. The next step is to generate eight separate indices based on the "Teager Energy" of the differential and average currents to distinguish faulty from healthy sections, as well as the type of fault and the faulty line. Simulation and experimental systems are used to test the proposed method. The following are the main contributions of the proposed method: (i) Processing only the current signal with simple rules; (ii) Low computing burden and cost efficiency because no communication lines are required; (iii) Evaluation of the procedure under a variety of challenging conditions.

Research gaps identified
Following a review of the literature, the following research gaps for fault analysis of an HVDC transmission system are identified:
• Distributed energy resources (DER) are the future of electricity production, but research on how to transmit this energy through HVDC lines with different design models is limited.
• A few studies used raw signal data processing units, which take much longer to detect and classify the fault.
• According to several studies, the fault detection efficiency is limited to around 83%. Data is generated by a limited number of algorithms, and this may mislead the entire system.
• Several studies have concentrated on short-distance HVDC lines only.
• The majority of studies fail to account for the computational burden and time required to diagnose and classify the fault.
• Very little research on fault analysis has concentrated on the issue of noise interference in the faulty signal.

Novelties of the paper
The above-mentioned research gaps and limits must be addressed to obtain exact identification and classification of faults and to conduct a realistic fault analysis of an HVDC transmission line. The proposed technique allows quick identification and classification of the fault type in an HVDC transmission link, as:
• The time required to detect and classify the fault is only about 10 ms.
• The efficiency is improved to 98.75% in terms of precise fault type and classification.
• For the first time in an HVDC transmission system, the study uses the TKEO method in combination with a Simple Decision Tree-based fault classifier for fault analysis.
• The proposed strategy overcomes the disadvantages of existing methods, such as computational complexity and the long time required to find and classify the fault.
• When performing a fluctuating DC analysis, determining the magnitude of the threshold setting value is quite challenging.
Figure 1 depicts a single line diagram of a bipolar HVDC transmission system using a Voltage Source Converter (VSC) that is fed by a combination of offshore wind farms.

Description of the designed model
As can be seen, a 150 km HVDC transmission line is presented, and the fault analysis performed on this line is the topic of this research. The HVDC system is fed by a wind farm of four units, each with a capacity of 9 MW. Each unit has 6 sub-units of capacity 1.5 MW, i.e., 6 × 1.5 = 9 MW, so the total generating capacity is 4 × 9 = 36 MW. During simulation, the wind speed is kept constant at 15 m/s. A 30 km transmission line (TL1) with a 47 MVA step-up transformer (T1) of 120 kV/25 kV transmits power from offshore to onshore. A 150 km transmission line (TL2) with a capacity of 200 MVA, ± 100 kV, connects the two converter stations with two 8 mH smoothing reactors. The AC voltage from the HVDC inverter is connected at bus B4 and to the utility via transformer T3.

Various faults on HVDC Transmission link
The four types of faults that can occur on the HVDC lines are positive pole to ground (PG), positive pole to negative pole (PN), negative pole to ground (NG), and positive pole to negative pole to ground (PNG). Figs. 6, 7, 8 and 9 depict the simulation results of the DC differential currents ($I_{\mathrm{diff}}$) for the PG, PN, NG and PNG faults, respectively. Time is shown on the X-axis and the DC differential current ($I_{\mathrm{diff}}$) on the Y-axis. As can be seen, the four faults result in different magnitudes of the DC differential current at the same location of 25 km. The TKEO algorithm generates different "Teager Energies", from which 8 indices are generated. All four types of faults are simulated for each kilometer, and the resulting graphs and numerical data are saved. For simple presentation, only one graph of each fault, at the same location of 25 km, is given.
The initialization of the proposed system is depicted in Fig. 10. The AC voltage from the wind generator is converted to a DC voltage to check that the designed model is in a stable condition. The converted wind farm DC voltage is not the same as the DC voltage of the transmission line. The DC output voltage of the designed system is not stable initially because of transient behavior, which can last up to 1 s; as can be seen, the steady state is reached after 1 s. Fault analysis can be performed only once the designed system has reached a stable condition.
Figure 11 shows the three-phase current and voltage waveforms before connecting the proposed system to the HVDC rectifier. As seen, the voltage waveform becomes stable after 0.15 s while the current is stable after 0.2 s. The HVDC rectifier is enabled only after the voltage and current become stable. Figure 12 shows the three-phase current and voltage before enabling the inverter. The voltage is stable from the beginning, while the current waveform becomes stable after 0.15 s.
Figure 14 shows the three-phase currents and voltages at the faulty inverter. For the duration of the fault, the differential current spikes can be seen. Due to the transient effect (threshold value), the waveform is not stable at first, i.e., the current wave settles down after 0.05 s. In Figs. 10, 11, 12 and 14, the values of voltage and current are in per unit.

Proposed methodology
The whole working method can be understood from Fig. 15. The magnitude of the DC differential current ($I_1 - I_2$) is almost zero under unfaulty conditions throughout the DC line. However, the magnitude increases rapidly, to $I_1 - (-I_2)$, when there is a fault on the HVDC line at any distance. The methodology contains 5 steps as follows:
Step 1 Whenever a fault occurs, the CIF technique is applied to the HVDC transmission lines to detect the change in the current waveform and register a CIP.
Step 2 Extracting the "Teager Energy" from either the differential or the average current at the CIP, at any distance on the HVDC transmission line, is the primary task for fault analysis of non-linear and non-stationary signals.
Step 3 TKEO, which tracks the "Teager Energy" of the respective signal at the CIP with high time resolution, is very efficient in terms of processing time, as only three current samples (the CIP and one sample on either side of it) are used for fault analysis.
Step 5 In the final step, a Simple Decision Tree-based fault classifier is used. The numerical data sets of the 8 indices are passed through it for fault detection and classification.

Computation of average and differential current
The average and differential current inputs are the two important components that differential current relays require for operation and supervision in a differential protection scheme. The HVDC instantaneous average and differential currents are given by Eqs. (1) and (2), where f is the sampling instant.
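As a minimal sketch of this computation, assume that $I_1$ and $I_2$ denote the currents sampled at the two ends of the DC line, that the differential current is $I_1 - I_2$ (as in the methodology description), and that the average current is their arithmetic mean. The last definition, the function name and the sample values are assumptions made for illustration only.

```python
def average_and_differential(i1, i2):
    """Return (I_avg, I_diff), sample by sample, for two equal-length current records."""
    if len(i1) != len(i2):
        raise ValueError("current records must have the same length")
    # Assumed definitions: I_avg(f) = (I1(f) + I2(f)) / 2 and I_diff(f) = I1(f) - I2(f)
    i_avg = [(a + b) / 2 for a, b in zip(i1, i2)]
    i_diff = [a - b for a, b in zip(i1, i2)]
    return i_avg, i_diff


# Illustrative per-unit samples (not taken from the paper)
i_avg, i_diff = average_and_differential([1.0, 1.0, 1.5], [1.0, 1.0, 0.5])
```

With these illustrative samples the differential current stays at zero until the third sample, where the two records diverge.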

Proposed algorithm
The proposed TKEO technique addresses the shortcomings highlighted in the research gaps identified previously, namely high computing complexity and the long time required for fault identification and classification. This is because only three samples (the CIP and one sample on either side of it) are processed. TKEO is more sensitive to fluctuations in the signals under investigation. In this case, TKEO with the Simple Decision Tree-based classifier is a better alternative than other methods since it gives higher resolution and reduces the time taken to find faults. The "energy" tracking operator was devised by Teager and formalized by Kaiser; here it is used to extract indices such as mean, energy, amplified energy, maximum amplitude, standard deviation, kurtosis, entropy, and variance from nonlinear signals. Compared with other commonly used algorithms, the TKEO algorithm achieves higher accuracy in fault detection and classification. The TKEO algorithm has formerly been employed in speech signal processing systems, but, to the authors' knowledge, this is the first time it has been applied to the nonstationary and nonlinear signals of an HVDC transmission system. TKEO is a simple method that is temporally localized, easy to compute, and capable of correctly tracking the signal's instantaneous changes in amplitude with respect to time. In contrast, the conventional methods process a large amount of data, which takes longer and increases the computational burden of fault analysis.
Change Identification Filter (CIF) working process: Assume Z is the total number of samples of a signal x ($I_{\mathrm{avrg}}$ or $I_{\mathrm{diffe}}$), with the initial sample S equal to 0. The CIF of the signal x is then described mathematically in terms of the iteration number j and the sample number m, which starts from the first sample, i.e., S + 1, with the initial value $\mathrm{CIF}_x(0) = 0$. To identify the change on any HVDC line (positive or negative), the CIF formulation is applied to the current signal.
The TKEO algorithm procedure is as follows. The numerical value of TKEO can be calculated using only 3 samples of the signal (the CIP and one sample on either side of it).
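The CIP registration and the three-sample hand-off to TKEO can be sketched as below. The sketch assumes that the CIF output is already available as a list of values and that a CIP is the first sample whose magnitude exceeds the threshold. The function names, the threshold and the sample values are hypothetical, and the CIF formula itself is not reproduced here.

```python
def register_cip(cif_values, threshold):
    """Return the index of the first CIF value whose magnitude exceeds the threshold, or None."""
    for index, value in enumerate(cif_values):
        if abs(value) > threshold:  # magnitude test is an assumption
            return index
    return None


def three_sample_window(signal, cip):
    """Return the samples at CIP - 1, CIP and CIP + 1, or None near the record edges."""
    if cip is None or cip < 1 or cip + 1 >= len(signal):
        return None
    return signal[cip - 1:cip + 2]


cif = [0.0, 0.01, 0.02, 0.9, 1.4]      # hypothetical CIF output
current = [0.0, 0.0, 0.1, 1.2, 2.0]    # hypothetical differential current (p.u.)
cip = register_cip(cif, threshold=0.5)
window = three_sample_window(current, cip)
```

Only the three samples in `window` would be forwarded to the energy operator, which is what keeps the processing burden low.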
The discrete energy of the signal h(l) can be calculated as:

$$\Psi[h(l)] = h(l)^2 - h(l-1)\,h(l+1)$$

where h(l − 1) is the delayed sample and h(l + 1) is the advanced sample of h(l).
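A short sketch of the three-sample operator follows. The function names are illustrative. The sinusoid check uses the known property that the operator applied to A·cos(ωl) returns the constant A²·sin²(ω).

```python
import math


def tkeo(prev_sample, sample, next_sample):
    """Discrete Teager-Kaiser energy: h(l)^2 - h(l-1) * h(l+1)."""
    return sample * sample - prev_sample * next_sample


def tkeo_series(signal):
    """Teager energy for every interior sample of a record."""
    return [tkeo(signal[l - 1], signal[l], signal[l + 1])
            for l in range(1, len(signal) - 1)]


# Sanity check on a pure sinusoid A*cos(w*l): the operator gives A^2 * sin(w)^2
amplitude, omega = 2.0, 0.3
wave = [amplitude * math.cos(omega * l) for l in range(50)]
energy = tkeo_series(wave)
expected = amplitude ** 2 * math.sin(omega) ** 2
```

Because each output value depends on only three neighbouring samples, the operator responds to an abrupt change within one sample of its occurrence.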
The time and the instantaneous amplitude are then obtained from the Teager energy.

Indices extraction from "Teager Energy": Consider a signal K(x) comprising n samples, with x = 1, 2, ..., n. The eight indices are defined as follows.
Energy (P1): The energy of the samples, defined as the sum of the squares of the samples: $P1 = \sum_{x=1}^{n} K(x)^2$.
Amplified energy (P2): The sum of the products of the sample number x and the square of the sample K(x): $P2 = \sum_{x=1}^{n} x\,K(x)^2$.
Mean (P3): The ratio of the sum of observations to the number of current signal samples: $P3 = \frac{1}{n}\sum_{x=1}^{n} K(x)$.
Standard Deviation (P4): A measure of the spread between readings acquired from repeated measurements, which quantifies the scatter of the data set values about their mean.
Kurtosis (P5): A measure of the tailedness of a random variable. It is the fourth central moment, i.e., the moment about the mean P3, normalized by the fourth power of the standard deviation: $P5 = \frac{1}{n}\sum_{x=1}^{n} \left(\frac{K(x) - P3}{P4}\right)^4$, where P4 = standard deviation and P3 = mean.
Entropy (P6): A measure of the randomness of a random variable.
Variance (P7): A measure of how far a set of random numbers deviates from their mean value.
Maximum Amplitude (P8): The maximum amplitude value of the samples.
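The eight indices can be sketched in a few lines. Several conventions in this sketch are assumptions: the variance and standard deviation use the 1/n normalisation, the entropy is taken as the Shannon entropy of the normalised squared samples, and the maximum amplitude is the largest absolute value. The function name and the input values are illustrative.

```python
import math


def extract_indices(k):
    """Compute the eight indices P1..P8 for samples K(1), ..., K(n)."""
    n = len(k)
    p1 = sum(v * v for v in k)                             # energy
    p2 = sum(x * v * v for x, v in enumerate(k, start=1))  # amplified energy
    p3 = sum(k) / n                                        # mean
    p7 = sum((v - p3) ** 2 for v in k) / n                 # variance (1/n, assumed)
    p4 = math.sqrt(p7)                                     # standard deviation
    # Kurtosis: fourth central moment over P4^4 (set to 0.0 for a constant signal)
    p5 = sum((v - p3) ** 4 for v in k) / n / p4 ** 4 if p4 > 0 else 0.0
    # Entropy (assumed form): Shannon entropy of the normalised squared samples
    probs = [v * v / p1 for v in k if v != 0] if p1 > 0 else []
    p6 = -sum(p * math.log(p) for p in probs)
    p8 = max(abs(v) for v in k)                            # maximum amplitude (assumed |.|)
    return {"P1": p1, "P2": p2, "P3": p3, "P4": p4,
            "P5": p5, "P6": p6, "P7": p7, "P8": p8}


indices = extract_indices([1.0, 2.0, 3.0])  # illustrative samples
```

Returning the indices as a dictionary keyed by P1 to P8 keeps the feature names aligned with the text when the values are passed on to a classifier.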

Simple Decision Tree-based fault classifier
Equations (1) and (2) are used to determine the differential and average current signals at the CIP. The performance indices are calculated using the TKEO energy from the differential current signals. For fault detection and classification, the proposed approach passes the indices described above through an event classifier. The eight indices are handled as shown in the flow chart in Fig. 16. On a VSC-based HVDC bipolar transmission system, the Decision Tree-based fault classifier can quickly identify the fault and its type. The fault classifier's decision-making process is described in detail in the tree diagram. The data sets are generated one for each kilometer of the 150 km HVDC link. The two currents (differential and average currents) are used to tabulate each index data set at the faulty condition, each comprising 596 × 8 feature data sets. For the system, a total of (596 × 8) × 2 feature data sets have been produced. The indices data sets for the four faults at fault positions of 30 km, 60 km, 90 km, 120 km, and 149 km are provided in the tables for ease of understanding. Tables 5, 6, 7 and 8 clearly indicate that the ranges of numerical values produced at various distances are associated with various types of indices, allowing for accurate fault detection and fault classification.
For a better understanding of the proposed strategy, a few example distances and their respective values for all eight indices (P1-P8) are provided. Because different attributes take unique values, the fault can be detected and classified accurately. Every individual index has its own range of values that does not overlap with those of the other indices, so fault detection and classification can be performed both accurately and quickly.
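The traversal of a threshold-based decision tree over the indices can be sketched as below. This is a generic illustration, not the tree of Fig. 16: the node layout, the choice of indices and every threshold are hypothetical placeholders, since the actual split values come from the tabulated data sets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """An internal node tests one index against a threshold; a leaf carries a label."""
    label: Optional[str] = None
    index: Optional[str] = None
    threshold: float = 0.0
    below: Optional["Node"] = None  # branch taken when value <= threshold
    above: Optional["Node"] = None  # branch taken when value > threshold


def classify(node, indices):
    """Walk the tree until a leaf is reached and return its label."""
    while node.label is None:
        node = node.below if indices[node.index] <= node.threshold else node.above
    return node.label


# Hypothetical two-level tree; index choices and thresholds are placeholders only
tree = Node(index="P1", threshold=0.1,
            below=Node(label="no fault"),
            above=Node(index="P8", threshold=2.0,
                       below=Node(label="PG"),
                       above=Node(label="PN")))

result = classify(tree, {"P1": 0.5, "P8": 1.2})
```

Each classification costs only a handful of comparisons, which is consistent with the low computational burden claimed for the classifier.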

Performance evaluation of the proposed strategy
The performance of the proposed classifier is evaluated using indices generated from the "Teager Energy" using the TKEO method. The Simple Decision Tree-based fault classifier framework has a training and testing mechanism. During the training phase, initial parameters are optimized, and these values are then tested. The total number of fault samples of differential current ($I_{\mathrm{diffe}}$) is 298 × 4 = 1192, of which 60% (715) are used for training and the remaining 40% (477) are chosen at random for testing, to check the efficiency and accuracy of fault detection and fault type identification. The fault-finding efficiency of the proposed approach is 98.75%. The efficiencies of the proposed approach, calculated using Eq. (17), are displayed in Table 9. The 1192 average-current samples are not used, to reduce the computational burden of fault analysis. In total, 1192 + 1192 = 2384 data sets are generated for the 8 indices using the differential and average currents at the CIP at each km.
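The 60/40 split can be reproduced numerically as follows. This is a sketch only: the sampling routine, the fixed seed and the function name are assumptions, and sample numbers stand in for the actual feature data sets.

```python
import random

TOTAL_SAMPLES = 298 * 4  # 1192 differential-current fault samples
TEST_FRACTION = 0.40


def split_samples(total, test_fraction, seed=0):
    """Randomly split sample numbers 1..total into (train, test) lists."""
    rng = random.Random(seed)  # fixed seed only for repeatability
    test_size = round(total * test_fraction)
    test = sorted(rng.sample(range(1, total + 1), test_size))
    test_set = set(test)
    train = [s for s in range(1, total + 1) if s not in test_set]
    return train, test


train, test = split_samples(TOTAL_SAMPLES, TEST_FRACTION)
```

Rounding 40% of 1192 gives the 477 test samples quoted above, leaving 715 for training.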
The procedure for calculating efficiency is as follows:
• Step 1: Let B be the test sample size and A the number of correctly classified data sets; initialize A = 0.
• Step 2: Generate a random number x, along with an initial guess. If x falls between data sets 1 and 298, the fault type is PG and u is set to 1. If x is between 299 and 596, the fault is of PN type and u is set to 2. If x is between 597 and 894, the fault type is NG and u is set to 3. Otherwise, if x is between 895 and 1192, the fault is of PNG type and u is set to 4.
• If the algorithm determines that the fault is caused by a PG short circuit, v is set to 1.
• If the program detects a short circuit between the negative pole and the positive pole (PN), v is set to 2.
• If the program detects a short circuit fault between ground and the negative pole (NG), v is set to 3.
• If the program detects a short circuit between ground, the negative pole, and the positive pole (PNG), v is set to 4.
• Step 5: The efficiency is calculated as:

$$\eta\,\% = \frac{\text{Number of correctly classified data samples}}{\text{Number of randomly picked samples from the total group set}} \times 100 \qquad (17)$$

The proposed approach is compared to existing methods in terms of efficiency to confirm that it provides improved protection efficiency, as illustrated in Table 10.
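The efficiency procedure described above can be sketched as follows, assuming that a sample counts as correctly classified when the predicted code v equals the true code u. The stand-in classifier, the function names and the sample numbers are hypothetical and serve only to exercise the calculation.

```python
FAULT_TYPES = {1: "PG", 2: "PN", 3: "NG", 4: "PNG"}


def true_label(x):
    """Map a sample number x in 1..1192 to its fault code u (298 samples per fault type)."""
    if not 1 <= x <= 1192:
        raise ValueError("sample number out of range")
    return (x - 1) // 298 + 1


def efficiency(test_samples, classifier):
    """Percentage of test samples whose predicted code v equals the true code u."""
    correct = 0  # A
    for x in test_samples:
        u = true_label(x)
        v = classifier(x)  # predicted fault code 1..4
        if u == v:
            correct += 1
    return 100.0 * correct / len(test_samples)  # A / B * 100


# Stand-in classifier (hypothetical): right for every sample except number 10
def mock(x):
    return 2 if x == 10 else true_label(x)


eta = efficiency([1, 10, 298, 299, 596, 597, 894, 895, 1192, 600], mock)
```

Here one of the ten illustrative samples is misclassified, so the computed efficiency is 90%.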
The following conventional methods are considered:
1. An approach based on Park theory and a wavelet transform [28]. Part of the process is to convert line voltage or current signals into dq0 components and analyze their behavior during faults to find patterns that signal the start of a fault. By filtering one of the dq0 components using the wavelet transformation and isolating the frequency bands of interest, the finite difference between samples of the filtered signal can be used to detect faults.
2. The method based on mathematical morphology [26]. This method detects and classifies faults by applying the MM concept's dilation and erosion median filters to the current signal.
3. The method based on the correlation concept [29]. To detect and classify faults, this technique combines synchronized measured line currents and the correlation notion. The statistical cross-alienation coefficients for the measured current signals at the transmitting and receiving ends of each feeder are determined using this method. Changes in the synchronized and discretized waveforms of the current signals inside a moveable window (one-fourth cycle) are taken into account in the fault detection and classification procedure.
4. The method based on reactive energy [30]. This technique employs the Hilbert transform to calculate the superimposed reactive energy (SRE), which is the integral of superimposed reactive power over a given time period. To identify faults in HVDC lines, several ratios are defined based on SRE.
The performance of the proposed strategy is compared in Table 11 to the aforementioned methods using the four criteria of accuracy, required average time, computational complexity and robustness to operate. Accuracy is defined as:

$$\text{Accuracy}\,\% = \left(1 - \frac{\text{Number of incorrect discriminations}}{\text{Number of whole cases from the total group set}}\right) \times 100 \qquad (18)$$

Based on the results in Table 11, it is clear that the effectiveness and classification efficiency of the proposed system have improved, while its computational complexity has been reduced. Table 12 shows that the average time taken to identify the fault with different fault resistances is 10 ms, highlighting the superiority over other methods.

Conclusions
To detect and classify power system faults on an HVDC transmission link, a novel "Teager-Kaiser Energy Operator" (TKEO) method combined with a Simple Decision Tree-based fault classifier has been investigated. The differential and average current components are subjected to a Change Identification Filter (CIF), which detects the first instant of fault incidence (greater than the threshold value). This approach increases fault identification efficiency while improving fault classification accuracy. It also reduces the computing complexity, and the average time required to identify faults is 10 ms, as only three samples are required. The importance and significance of the proposed scheme have been thoroughly tested and compared with some conventional methods for various faults on HVDC transmission lines. The outputs are satisfactory, demonstrating the real-time applicability of the proposed scheme. This can be useful for wide area protection.