kNN based fault detection and classification methods for power transmission systems
Protection and Control of Modern Power Systems, volume 2, Article number: 32 (2017)
Abstract
This paper presents two new methods, based on the kNN algorithm, for fault detection and classification in distance protection. In these methods, the fault occurrence time and the faulty phases are determined by finding the distance between each sample and its fifth nearest neighbor in a predefined window. In both the detection and the classification procedures, the maximum of these distances is compared with a predefined threshold value. The main advantages of the methods are simplicity, low computational burden, acceptable accuracy, and speed. The performance of the proposed scheme is tested on a typical system in MATLAB Simulink. Various fault types are simulated for different fault resistances, fault inception angles, fault locations, short-circuit levels, X/R ratios, and source load angles. In addition, the proposed classification method is compared with six similar well-known classification techniques using a large set of simulation data.
Introduction
Distance protection is one of the major protection schemes of power systems and is used for detection, classification, and location of short-circuit faults. In the detection stage, any change caused by normal or abnormal conditions is recognized. Then, in the classification stage, the fault type (Ag, Bg, Cg, ABg, BCg, CAg, AB, BC, or CA) is determined.
In the fault location stage, the distance between the fault and the relay is determined. Because the speed and accuracy of the fault detection and classification units are important, many investigations have been dedicated to these fields.
When a fault occurs in the power system, variables such as current, power, power factor, voltage, impedance, and frequency change. Many detection techniques detect fault occurrence by comparing the post-fault values of these variables with their values during normal system operation. Some fault detection methods are based on the Kalman filter [1], the first derivative method, the Fourier transform (FT), and least squares [2]. Others are based on differential equations [2], travelling waves [3, 4], phasor measurement [5], the discrete wavelet transform [6], fuzzy logic, genetic algorithms [7], and neural networks [8].
Many efforts have also been made in the field of fault classification; they can be broadly divided into two main groups. The first group comprises methods based on signatures of the signals and the definition of some criteria, such as the discrete wavelet transform (DWT) [9,10,11,12,13], Fourier transform (FT), S-transform [14], adaptive Kalman filtering [15], sequential components [16, 17], and synchronized voltage and current samples [18]. The second group includes methods based on artificial intelligence techniques, such as Artificial Neural Networks (ANN) [19,20,21], fuzzy logic [22, 23], Support Vector Machines (SVM) [24,25,26], and decision trees [27].
In this paper, two new methods are presented for detection and classification of faults. A moving window with a length of half a cycle of the power frequency is considered, and the RMS value of the current samples in the window is computed. The RMS value obtained in the last window before the fault, in which the fault instant is the last sample, is saved. The current waveforms are divided by the saved RMS value. Then, the kNN algorithm is applied to these normalized waveforms in the classification method and to their squares in the detection method.
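The normalization step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the synthetic 100 A test current, and the assumption that the fault index is already known are introduced here only for the example.

```python
import math

FS = 10_000                   # sampling frequency in Hz (from the paper)
F0 = 50                       # power frequency in Hz (from the paper)
HALF_CYCLE = FS // (2 * F0)   # 100 samples per half cycle


def window_rms(samples):
    """RMS value of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def normalize_by_prefault_rms(current, fault_index):
    """Divide a phase current by the RMS of the half-cycle window
    whose last sample is the (assumed known) fault index."""
    start = fault_index - HALF_CYCLE + 1
    if start < 0:
        raise ValueError("not enough pre-fault samples")
    reference = window_rms(current[start:fault_index + 1])
    return [x / reference for x in current]


# Hypothetical test signal: a healthy 50 Hz current with a 100 A peak.
current = [100.0 * math.sin(2 * math.pi * F0 * n / FS) for n in range(400)]
normalized = normalize_by_prefault_rms(current, fault_index=299)
squared = [x * x for x in normalized]   # input to the detection method
```

After this step a healthy sinusoid has unit RMS and a peak of about 1.414, regardless of the load current level, which is what allows fixed thresholds to be used afterwards.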
In the detection method, a moving window with a length of half a cycle is considered. In the window, the fifth nearest neighbor of each point of the squared normalized currents is found, together with the distance between each point and that neighbor. The fault is detected by comparing the maximum distance in each window with a predefined threshold.
The classification method follows a similar procedure, but the kNN algorithm is applied to the instantaneous values of the normalized three-phase currents and the length of the window is three quarters of a cycle.
Various scenarios, including different fault types, fault inception angles, fault resistances, fault locations, source phase angles, X/R ratios, and short-circuit levels, are used to evaluate the performance of the methods in a simulated typical five-bus power system. In addition, the proposed classification method is compared with six other similar methods. The methods are compared in terms of delay time and accuracy using a data set of 410 different cases. Besides their simplicity, the proposed techniques have a small computational burden and high accuracy. Moreover, the performance of the methods is preserved under different conditions.
The remainder of this paper is organized as follows. Section 2 presents the power system under study. In Section 3, the basis of kNN and its application to fault detection, as well as an improved fault detection algorithm, are presented. In Section 4, the proposed classification algorithm is introduced. The simulation results are presented in Section 5. A comparison between the performance of the proposed method and some other similar methods is presented in Section 6. Finally, the main conclusions are presented in Section 7.
Simulated power system
A five-bus power system is modeled in MATLAB Simulink. A schematic single-line diagram of the system under study is presented in Fig. 1. The modeled system comprises two generators, four transformers, and active and reactive loads connected to buses 4 and 5. Detailed specifications of the system components are as follows:

Generators: Rated line-to-line voltage is 20 kV, three-phase short-circuit power is 1000 MVA, frequency is 50 Hz, and X/R ratio is 10. It is also assumed that the angles of sources 1 and 2 are 0 and −10 degrees, respectively.

Transformers: Rated power is 600 MVA, and the voltage ratio is 20/230 kV with a delta/star-grounded connection. The primary and secondary impedances are 0.06 + j0.3 Ω and 0.397 + j2.12 Ω, respectively.

Lines: All line impedances are 0.02 + j0.15 Ω/km. Lines 1–2, 2–3, 3–4, 4–1, and 5–2 are 200, 70, 120, 40, and 50 km long, respectively.

Loads: The active and reactive powers of load 1 are 400 MW and 100 MVAr, respectively. The active and reactive powers of load 2 are 100 MW and 50 MVAr, respectively.

Sampling frequency: 10 kHz.
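The line data and sampling rate listed above translate directly into the quantities used later in the test cases. The following Python sketch collects them; the constant and function names are chosen here for illustration, and the series impedance of a line section is computed simply as impedance per kilometre times length, ignoring shunt elements.

```python
# Values taken from the system description; names are illustrative only.
Z_PER_KM = complex(0.02, 0.15)   # line impedance in ohm/km
LINE_LENGTH_KM = {"1-2": 200, "2-3": 70, "3-4": 120, "4-1": 40, "5-2": 50}
FS_HZ = 10_000                   # sampling frequency
F0_HZ = 50                       # power frequency


def line_impedance(line, fraction=1.0):
    """Series impedance (ohm) of the first `fraction` of the given line."""
    return Z_PER_KM * LINE_LENGTH_KM[line] * fraction


samples_per_cycle = FS_HZ // F0_HZ      # samples in one power-frequency cycle
z_full = line_impedance("1-2")          # whole line 1-2
z_mid = line_impedance("1-2", 0.5)      # fault at the middle of line 1-2
```

For example, line 1–2 has a total series impedance of about 4 + j30 Ω, so a fault at its midpoint lies behind roughly 2 + j15 Ω of line impedance.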
The proposed change detection scheme
k-Nearest Neighbor algorithm (kNN)
The kNN algorithm is a non-parametric classification method that can achieve high classification accuracy in problems with non-normal and unknown distributions. For a particular sample, the k closest points in the data are found. Usually, the Euclidean distance is used, in which the components of one point are compared with the corresponding components of another.
The basis of the kNN algorithm is a data matrix of N rows and M columns, where N is the number of data points and M is the dimension of each data point. Given a query point, the k points closest to it are searched for within this data matrix.
In general, the Euclidean distance between the query and each point in the data matrix is calculated. This yields N Euclidean distances, one for each point in the data set. The k nearest points to the query are then found by sorting the distances in ascending order and retrieving the k points with the smallest distances.
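The brute-force search just described (compute all N distances, sort, keep the k smallest) can be written compactly. The sketch below uses only the Python standard library; the function name `k_nearest` and the small data matrix are hypothetical and serve only to illustrate the procedure.

```python
import math


def k_nearest(data, query, k):
    """Return the k rows of `data` closest to `query` as
    (distance, row_index) pairs, nearest first."""
    distances = [(math.dist(row, query), index) for index, row in enumerate(data)]
    distances.sort()              # ascending Euclidean distance
    return distances[:k]


# Hypothetical data matrix with N = 4 points of dimension M = 2.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
nearest = k_nearest(data, query=(0.0, 0.0), k=2)
```

Sorting all N distances costs O(N log N) per query, which is acceptable here because the windows used in this paper contain only 100 to 150 samples.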
The proposed fault detection algorithm
With a fixed sampling frequency, the Euclidean distance between each sample and the other samples of a sliding window varies when a change occurs. In effect, the Euclidean distance represents the differences between the sample values, so the kNN algorithm can use the variation of this distance for change detection. In this work, a sliding window with a length of half a cycle of the power frequency is moved along the squared normalized current waveform of each phase. The kNN algorithm is applied to the samples of each window, and for each sample the fifth nearest neighbor and the distance to it are obtained. With a sampling frequency of 10 kHz, there are 100 samples in each half cycle, resulting in 100 different distances. The maximum of these distances is selected for each phase, named M_{a,D}, M_{b,D}, and M_{c,D}, and compared with a threshold value to detect a fault condition. Different simulations confirm that the fifth nearest neighbor gives the best accuracy.
When a change occurs, the sample corresponding to the change enters the end of the window. It is observed that after three or four samples, the maximum distance of some or all of the phases exceeds the threshold value. With an appropriate threshold, the fault can therefore be detected within about 0.3 ms to 0.4 ms at the 10 kHz sampling frequency. In this study, I_{th,D} = 0.0667 is selected as the fault detection threshold. The flowchart of the proposed change detection algorithm is shown in Fig. 2.
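A single-phase version of this detection loop might look as follows. It is a sketch under stated assumptions rather than the authors' code: samples are compared by value only (one-dimensional Euclidean distance), a sample is not counted as its own neighbor, and the test signal is a synthetic sinusoid whose amplitude is multiplied by five at sample 300. The window length, k = 5, and the threshold 0.0667 come from the text; all names are illustrative.

```python
import math

K = 5             # fifth nearest neighbor (from the text)
WINDOW = 100      # half cycle at 10 kHz and 50 Hz (from the text)
I_TH_D = 0.0667   # detection threshold (from the text)


def kth_neighbor_distance(window, i, k=K):
    """Distance from window[i] to its k-th nearest *other* sample.
    Assumption: samples are compared by value only."""
    distances = sorted(abs(window[i] - x) for j, x in enumerate(window) if j != i)
    return distances[k - 1]


def max_distance(window, k=K):
    """Largest k-th-neighbor distance in the window (M_{a,D} for phase a)."""
    return max(kth_neighbor_distance(window, i, k) for i in range(len(window)))


def detect(squared_normalized, threshold=I_TH_D, window_len=WINDOW):
    """Index of the newest sample of the first window that exceeds the
    threshold, or None if no window does."""
    for end in range(window_len, len(squared_normalized) + 1):
        if max_distance(squared_normalized[end - window_len:end]) > threshold:
            return end - 1
    return None


def sample(n):
    """Hypothetical normalized current: unit RMS before sample 300, 5x after."""
    gain = 1.0 if n < 300 else 5.0
    return gain * math.sqrt(2.0) * math.sin(math.pi * n / 100.0)


signal = [sample(n) ** 2 for n in range(400)]
detected_at = detect(signal)
delay_ms = (detected_at - 300) * 1000.0 / 10_000
```

For a healthy unit-RMS sinusoid the largest fifth-neighbor distance in any half-cycle window is sin(2π/100) ≈ 0.0628, just below the threshold, so the sketch stays silent until the change arrives. In this synthetic case the threshold is first exceeded three samples after the change.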
In Fig. 3, the proposed criterion is presented for several different fault cases. The instants of change occurrence and the corresponding detection times are shown.
The proposed fault classification scheme
The general approach for fault classification is the same as for the detection method. However, in the classification method the kNN algorithm is applied to the normalized current waveforms in a window with a length of three quarters of a cycle, called the analysis window. The k value and the length of the analysis window are selected based on different simulations to achieve the best classification accuracy and speed.
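As a quick check of the two window lengths, using the 10 kHz sampling frequency and 50 Hz power frequency of the simulated system (the symbols f_s, f_0, N_D, and N_C are introduced here only for this calculation):
$$ N_D=\frac{1}{2}\cdot \frac{f_s}{f_0}=\frac{1}{2}\cdot \frac{10000}{50}=100\ \text{samples}\ (10\ \text{ms}),\kern1em N_C=\frac{3}{4}\cdot \frac{f_s}{f_0}=\frac{3}{4}\cdot \frac{10000}{50}=150\ \text{samples}\ (15\ \text{ms}) $$
The analysis window therefore spans 15 ms of post-fault data, which matches the 15 ms average delay time reported for the proposed classifier.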
In Fig. 4, the three-phase distance values are presented for several fault types with negligible resistance and an inception instant of 0.2002 s. In these figures, the fifth nearest neighbor for each sample of the analysis window is shown.
Clearly, the distance between each current sample and its fifth nearest neighbor is a suitable criterion for fault classification. By choosing the maximum distance for each phase (M_{a,C}, M_{b,C}, and M_{c,C}) and comparing it with a threshold value, the type of fault can be determined. The values of M_{a,C}, M_{b,C}, and M_{c,C} are obtained in the same way as in the detection method, but from the normalized currents rather than their squares and in a window with a length of three quarters of a cycle. The best threshold value is selected using different simulations.
Some other considerations are taken into account for the classification method, which are as follows:

1. To discriminate between two-phase faults (LL) and grounded two-phase faults (LLg), the mean of the corresponding current samples of the three phases is computed over the analysis window, and the maximum mean is used as follows:
$$ M_i=\max \left(\frac{i_a+i_b+i_c}{3}\right)\kern0.5em \text{in the analysis window} $$
For grounded faults (LLg), M_i > 100 A, whereas for two-phase faults (LL), M_i < 1 A. This criterion discriminates between LL and LLg faults with very high accuracy.

2. To omit the initial transient behavior of the signal, the first twenty samples of the window are not considered.
The flowchart of the classification method is presented in Fig. 5. The threshold I_{th,C} is set to 0.1108.
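The decision logic can be sketched as below. The threshold 0.1108 and the 100 A and 1 A limits come from the text, but the flowchart of Fig. 5 is not reproduced here, so the rule "a phase is faulty when its maximum distance exceeds the threshold" and the mapping from the number of faulty phases to a fault type are assumptions made for this illustration. All function names are hypothetical, and the ground indicator is computed on currents in amperes because its limits are given in amperes.

```python
I_TH_C = 0.1108   # classification threshold (from the text)


def ground_indicator(ia, ib, ic):
    """Maximum over the window of the mean of the three phase currents (A)."""
    return max((a + b + c) / 3.0 for a, b, c in zip(ia, ib, ic))


def is_grounded(mi, grounded_min=100.0, ungrounded_max=1.0):
    """True for LLg, False for LL, None when the value lies between the two
    limits, a range for which the text gives no rule."""
    if mi > grounded_min:
        return True
    if mi < ungrounded_max:
        return False
    return None


def classify(m_a, m_b, m_c, mi, threshold=I_TH_C):
    """Assumed decision rule: phases whose maximum distance exceeds the
    threshold are faulty; two faulty phases are split into LL and LLg."""
    phases = "".join(p for p, m in zip("ABC", (m_a, m_b, m_c)) if m > threshold)
    phases = {"AC": "CA"}.get(phases, phases)   # use the paper's naming
    if len(phases) == 0:
        return "no fault"
    if len(phases) == 1:
        return phases + "g"
    if len(phases) == 3:
        return phases
    grounded = is_grounded(mi)
    if grounded is None:
        return phases + "?"
    return phases + ("g" if grounded else "")


# Hypothetical values for illustration only.
example_llg = classify(0.50, 0.40, 0.01, mi=300.0)
example_ll = classify(0.50, 0.40, 0.01, mi=0.2)
```

The quantity (i_a + i_b + i_c)/3 is the instantaneous zero-sequence current, which is why it stays near zero for an ungrounded phase-to-phase fault and grows when the fault involves ground.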
Test cases and simulation results
Case 1: Various fault types
Different fault types are applied at the middle of line 1–2 of the power system shown in Fig. 1. The faults are solid and are applied at the same inception instant, 0.2002 s. The results, including the discrimination criterion (M_i) and the maximum distance of each phase, are presented in Table 1. They show that the proposed method is able to classify the different faults using the rules described above.
The results within each group of phase-to-ground, phase-to-phase-to-ground, and phase-to-phase faults are similar. Therefore, hereafter only four fault types are considered: Ag, ABg, AB, and ABC.
Case 2: Various inception instants
In Table 2, the results for different inception instants are presented for the four fault types. The inception instant is varied in steps of 3 ms, and the faults are solid. The results confirm that the proposed method is able to classify faults at different inception instants.
Case 3: Various fault resistances
In Table 3, the results of this case study are shown for fault resistances of 10, 30, 50, 70, and 90 Ω. The faults are applied at the same inception instant, 0.2002 s. The results confirm that the proposed method has acceptable performance for fault resistances up to 90 Ω. Although the technique can also classify faults with resistances above 90 Ω, its performance may fall below the acceptable level.
Case 4: Various fault locations
Another challenge that should be considered for a fault identification technique is the location of the fault on the transmission line. In this test case, the system is analyzed with a fault applied at 0%, 20%, 40%, 60%, 80%, and 100% of transmission line 1–2. Results for the four fault types are shown in Table 4. The faults are solid and are applied at the same inception instant, 0.2002 s.
In addition, several faults at locations beyond 100% are simulated. The faults are applied at 105%, 110%, and 120% of transmission line 2–5 at the same inception instant, 0.2002 s. The results are tabulated in Table 5.
From the results, it can be concluded that the performance of the proposed method is preserved even for locations beyond 100%. It should be noted, however, that the performance degrades for locations beyond 120%.
Case 5: Various sources load angles
The results for various source load angles, with different inception instants, fault resistances, and fault types, verify that the proposed method classifies the faults for different values of the source load angles. For brevity, the results of this case are not presented.
Case 6: Various X/R ratios
The impact of different X/R ratios on the performance of the proposed method is also investigated, considering different inception instants, fault resistances, and fault types. The results show that the accuracy of the proposed method is preserved for different X/R ratios.
Case 7: Various short circuit levels
The performance of the proposed method is also evaluated for various source short-circuit levels. The algorithm performs well in these cases too.
Case 8: Various load levels
In Table 6, the results of some simulated cases are shown for no load and for loads at fractions of the nominal value. For each load, different load values are considered while the other load is set to no load. All the faults are applied at 80% of transmission line 1–2. The results show that the performance of the proposed method is preserved at different load levels.
Case 9: Current transformer saturation
The performance of the method is also evaluated during current transformer saturation. Two typical cases are considered. The faults are solid and are applied at the same inception instant, 0.2345 s. The classification criteria for both cases are shown in Fig. 6 and Table 7. It is observed that the proposed method is able to classify the faults during current transformer saturation.
A comparison with other techniques
The performance of the proposed method is compared with six other similar approaches in this Section. All of the methods are evaluated on an identical data set under similar conditions. The six methods are briefly reviewed as follows:
a. Sequence Component [16]: This technique classifies faults using the phase differences between the positive and negative sequences. In addition, the relative magnitudes of the negative and zero sequences from the pre-fault to the fault stage are used to distinguish between phase-to-phase (LL) and phase-to-phase-to-ground (LLg) faults.
b. Alienation Coefficients [28]: In this algorithm, the alienation technique is applied to two successive half cycles with the same polarity. The alienation coefficients of the successive half cycles, treated as two dependent variables, are calculated. This technique can classify faults using only the three-phase current waveforms, and its delay time is half a cycle of the power frequency. Another version of this approach is presented in [29].
c. Discrete Wavelet Transform [23]: The Daubechies wavelet family is used in this technique. The third-level output of the decomposition is used, and the summation of the detail current signals is obtained for each phase (S_{a}, S_{b}, and S_{c}). If the sum of S_{a}, S_{b}, and S_{c} is equal to zero, the fault is either three-phase or LL; otherwise, it is a phase-to-ground (Lg) or LLg fault.
d. Fuzzy Logic [22]: This technique requires the fault occurrence time as a prerequisite. Using measured current samples, some specific characteristics of the samples are defined for fault classification. The technique takes three quarters of a cycle to classify the fault.
e. RMS Values of Current: A simple approach to fault classification is to compare the RMS values of the three-phase current waveforms with a certain threshold. The RMS values of the phases are obtained using the Fourier transform in a half-cycle window after fault occurrence. LL and LLg faults are discriminated using the zero-sequence component of the current, which is large for LLg and zero for LL.
f. RMS Values of Voltage: This technique is the same as the previous method, applied to the three-phase voltage signals. The fault type is determined when the RMS values of the voltages fall below a certain threshold.
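Methods a and e rely on sequence components. For readers unfamiliar with them, the standard symmetrical-component (Fortescue) transform of three phase phasors is sketched below in Python; it is a textbook formula and not code from any of the cited works, and the phasor values are hypothetical.

```python
import cmath

A = cmath.exp(2j * cmath.pi / 3)   # 120-degree rotation operator


def sequence_components(ia, ib, ic):
    """Zero, positive, and negative sequence phasors of three phase phasors."""
    i0 = (ia + ib + ic) / 3
    i1 = (ia + A * ib + A * A * ic) / 3
    i2 = (ia + A * A * ib + A * ic) / 3
    return i0, i1, i2


# Balanced positive-sequence set: only the positive sequence is non-zero.
balanced = sequence_components(1 + 0j, A * A, A)

# Idealized ungrounded BC fault (ib = -ic, ia = 0): zero sequence vanishes.
ll_fault = sequence_components(0j, 5 + 0j, -5 + 0j)
```

The second example shows why the zero-sequence current separates LL from LLg faults: without a path to ground the three phase currents sum to zero.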
The performance of the proposed method is compared with the above-mentioned methods based on the following factors; the results are tabulated in Table 8:

Fault resistances

Fault inception instants

Fault locations

Generators X/R ratios

Phase difference between two generators

Generators short circuit levels

Delay operation time

Error percentage
The total number of cases considered in this Section is 410: 200 cases for different fault resistances and inception instants, 50 cases for different fault locations, 70 cases for different source X/R ratios, 50 cases for different source angles, and 40 cases for different short-circuit levels.
In Table 8, the error percentage for each of the above-mentioned factors is calculated as the ratio of the number of malfunctions to the number of relevant cases. The total error percentage for each method is then calculated as the ratio of the total number of malfunctions to the total number of cases.
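The two ratios can be expressed in a few lines of Python. The case counts below are those stated in this Section; the per-factor malfunction counts are hypothetical placeholders, since Table 8 is not reproduced here, and the function names are illustrative.

```python
# Case counts from the text.
CASES = {
    "fault resistance and inception instant": 200,
    "fault location": 50,
    "X/R ratio": 70,
    "source angle": 50,
    "short-circuit level": 40,
}


def error_percentage(malfunctions, cases):
    """Malfunctions as a percentage of the relevant cases."""
    return 100.0 * malfunctions / cases


def total_error_percentage(malfunctions_by_factor, cases_by_factor=CASES):
    """Total malfunctions as a percentage of all cases."""
    total_malfunctions = sum(malfunctions_by_factor.values())
    return error_percentage(total_malfunctions, sum(cases_by_factor.values()))


# Hypothetical malfunction counts, chosen only to illustrate the arithmetic.
hypothetical = {
    "fault resistance and inception instant": 4,
    "fault location": 2,
    "X/R ratio": 1,
    "source angle": 1,
    "short-circuit level": 0,
}
total = total_error_percentage(hypothetical)
```

With 410 cases, a total of 8 malfunctions corresponds to about 1.95% and a total of 2 to about 0.49%, so each misclassified case changes the total error by roughly 0.24 percentage points.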
Techniques a and d have a delay time of 15 ms, and techniques b, c, e, and f have a delay time of 10 ms. Among the methods with a 15 ms delay time, fuzzy logic performs very well, with only 0.49% error.
The proposed technique performs well, with an error percentage of 1.95% and an average delay time of 15 ms. Based on the calculated total error percentage and delay time, it is confirmed that the proposed method has acceptable performance in comparison with the other methods.
Conclusion
Two simple methods for fault detection and classification are presented in this paper. The methods are based on the kNN algorithm. A large number of simulations were used to evaluate the performance of the methods, and the proposed classification method was compared with six other similar methods. The results confirm the good accuracy and speed of the methods. The classification technique has an accuracy of about 98% for the considered data set, with an average delay time of 15 ms.
References
 1.
Chowdhury, F. N., Christensen, J. P., & Aravena, J. L. (1991). Power system fault detection and state estimation using Kalman filter with hypothesis testing. IEEE Transactions on Power Delivery, 6(3), 1025–1030.
 2.
Öhrström, M., & Söder, L. (2002). Fast fault detection for power distribution systems. Power and Energy Systems (PES), Marina del Rey, USA, May 13–15.
 3.
Magnago, F. H., & Abur, A. (1999). A new fault location technique for radial distribution systems based on high frequency signals. IEEE in Power Engineering Society Summer Meeting, 1, 426–431.
 4.
Xiangjun, Z., Yuanyuan, W., & Yao, X. (2010). Faults detection for power systems. In W. Zhang (Ed.), Fault Detection (pp. 512). InTech. ISBN 9789533070377. doi:10.5772/56395. https://www.intechopen.com/books/faultdetection
 5.
Gopakumar, P., Reddy, M. J. B., & Mohanta, D. K. (2015). Transmission line fault detection and localisation methodology using PMU measurements. Journal of IET, Generation, Transmission & Distribution, 9(11), 1033–1042.
 6.
Bezerra Costa, F. (2014). Fault-induced transient detection based on real-time analysis of the wavelet coefficient energy. IEEE Transactions on Power Delivery, 29(1), 140–153.
 7.
Haghifam, M. R., Sedighi, A. R., & Malik, O. P. (2006). Development of a fuzzy inference system based on genetic algorithm for high-impedance fault detection. Journal of IEE Proceedings-Generation, Transmission and Distribution, 153(3), 359–367.
 8.
Baqui, I., Zamora, I., Mazón, J., & Buigues, G. (2011). High impedance fault detection methodology using wavelet transform and artificial neural networks. Journal of Electric Power Systems Research, 81(7), 1325–1333.
 9.
Shaik, A. G., & Pulipaka, R. R. V. (2015). A new wavelet based fault detection, classification and location in transmission lines. International Journal of Electrical Power & Energy Systems, 64, 35–40.
 10.
Torabi, N., Karrari, M., Menhaj, M. B., & Karrari, S. (2012). Wavelet based fault classification for partially observable power systems. IEEE, In Asia-Pacific Power and Energy Engineering Conference (APPEEC) (pp. 1–6).
 11.
Usama, Y., Lu, X., Imam, H., Sen, C., & Kar, N. (2013). Design and implementation of a wavelet analysis-based shunt fault detection and identification module for transmission lines application. IET Journal of Generation, Transmission & Distribution, 8(3), 431–444.
 12.
Guillen, D., Arrieta Paternina, M. R., Zamora, A., Ramirez, J. M., & Idarraga, G. (2015). Detection and classification of faults in transmission lines using the maximum wavelet singular value and Euclidean norm. IET Journal of Generation, Transmission & Distribution, 9(15), 2294–2302.
 13.
Liu, Z., Han, Z., Zhang, Y., & Zhang, Q. (2014). Multiwavelet packet entropy and its application in transmission line fault recognition and classification. IEEE Transactions on Neural Networks and Learning Systems, 25(11), 2043–2052.
 14.
Dash, P. K., Das, S., & Moirangthem, J. (2015). Distance protection of shunt compensated transmission line using a sparse S-transform. IET Journal of Generation, Transmission & Distribution, 9(12), 1264–1274.
 15.
Girgis, A., & Makram, E. B. (1988). Application of adaptive Kalman filtering in fault classification, distance protection, and fault location using microprocessors. IEEE Transactions on Power Systems, 3(1), 301–309.
 16.
Adu, T. (2002). An accurate fault classification technique for power system monitoring devices. IEEE Transactions on Power Delivery, 17(3), 684–690.
 17.
Rahmati, A., & Adhami, R. (2014). A fault detection and classification technique based on sequential components. IEEE Transactions on Industry Applications, 50(6), 4202–4209.
 18.
Esmaeilian, A., & Kezunovic, M. (2014). Transmission-line fault analysis using synchronized sampling. IEEE Transactions on Power Delivery, 29(2), 942–950.
 19.
Butler, K. L., Momoh, J. (1993). Detection and classification of line faults on power distribution systems using neural networks. IEEE Proceedings of the 36th Midwest Symposium, In Circuits and Systems. (pp. 368–371).
 20.
Upendar, J., Gupta, C. P., Singh, G. K. (2008). ANN based power system fault classification. IEEE, In Region 10 Conference (TENCON), November, (pp. 1–6).
 21.
Tayeb, E. B. M., Rhim, O. A. A. A. (2011). Transmission line faults detection, classification and location using artificial neural network. IEEE, international conference, utility exhibition on power and energy systems: Issues & prospects for Asia (ICUE), September.
 22.
Mahanty, R. N., & Gupta, P. D. (2007). A fuzzy logic based fault classification approach using current samples only. Journal of Electric power systems research, 77(5), 501–507.
 23.
Reddy, M. J., & Mohanta, D. K. (2007). A wavelet-fuzzy combined approach for classification and location of transmission line faults. International Journal of Electrical Power & Energy Systems, 29(9), 669–678.
 24.
Shahid, N., Aleem, S. A., Naqvi, I. H., Zaffar, N. (2012). Support vector machine based fault detection & classification in smart grids. IEEE, In Globecom Workshops (GC Wkshps), December, (pp. 1526–1531).
 25.
Livani, H., Evrenosoğlu, C. Y. (2012). A fault classification method in power systems using DWT and SVM classifier. IEEE PES, In Transmission and Distribution Conference and Exposition (T&D), May, 1–5.
 26.
Moravej, Z., Pazoki, M., & Khederzadeh, M. (2015). New pattern-recognition method for fault analysis in transmission line with UPFC. IEEE Transactions on Power Delivery, 30(3), 1231–1242.
 27.
Swetapadma, A., & Yadav, A. (2015). Data-mining-based fault during power swing identification in power transmission system. Journal of IET Science, Measurement & Technology, 10(2), 130–139.
 28.
Masoud, M. E., & Mahfouz, M. M. A. (2010). Protection scheme for transmission lines based on alienation coefficients for current signals. IET Journal of Generation, Transmission & Distribution, 4(11), 1236–1244.
 29.
Samet, H., Shabanpour-Haghighi, A., & Ghanbari, T. (2017). A fault classification technique for transmission lines using an improved alienation coefficients technique. doi:10.1002/etep.2235. http://onlinelibrary.wiley.com/doi/10.1002/etep.2235/abstract.
Author information
Contributions
All authors read and approved the final manuscript.
Corresponding author
Correspondence to Haidar Samet.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Short circuit faults
 Fault detection
 Fault classification
 K nearest neighbor algorithm