Human reliability analysis in maintenance team of power transmission system protection

Abstract

The requirement for reliable electrical energy supply increases continuously because of its vital role in our lives. However, events due to various factors in the power grid can cause energy supply to be interrupted. One of these factors is human error, and thus human reliability analysis is an essential element in the industry. The first step is to identify the roots of human error, on which there has been limited research. In this paper, the potential and actual causes of human error in maintenance teams of power transmission system protection are identified and predicted within the framework of the human factors analysis and classification system (HFACS) method. Then, human error factors are ranked to help improve human reliability. The proposed method is implemented in the Fars Electricity Maintenance Company.

Introduction

Increase in electrical energy consumption requires more stable and reliable power systems, and any interruption or disturbance in the supply of modern sensitive loads may lead to high cost.

According to the annual reports from NERC [1] and the Iran Grid Management Company [2], about 70% of electrical outages are due to equipment failures or problems in power grids, and about 9 to 17% of the outages are rooted in human error. Various studies have been carried out to identify the cause of these interruptions. Most of these analyses attempt to find the technical roots of equipment failures and solutions. However, less attention has been paid to the investigation of human error in the power transmission industry [3, 4]. Surveys show that human error can affect the safety of personnel and equipment, as well as reduce the reliability of the network. It can also affect the income of electricity companies through loss of energy transmission and electricity market penalties. The impact of human error on the safety of personnel, in terms of health, psychological, and social integrity, is much more important than the technical aspects of failures and errors. Human reliability analysis (HRA) is therefore needed to reduce the causes of human error [5].

Research on HRA began in the 1950s. The probability of activities carried out correctly by a person over a given period under certain working conditions is called human reliability [6]. The first step in HRA is to identify the roots of human error, and many relevant studies have been carried out in different industries, especially those in nuclear, structure, aviation, and petroleum [7, 8]. However, in the power transmission industry, despite its wide range, human factor studies and root identification have not been carried out comprehensively.

The relevant studies on power systems are mainly limited to analyzing and monitoring human error and its effect on the failure of power transmission systems. However, as far as we know, there is no comprehensive report studying the root causes of human error. In this paper, a method for identifying the potential and actual root causes of human error in power transmission system maintenance is proposed.

To control the human factors, it is necessary to properly recognize the potential errors. Various models (such as Technique for Human Error Assessment (THEA), Predictive Human Error Analysis (PHEA), etc.) have been used in recent years to identify and analyze human error [9, 10]. Among these methods, the human factors analysis and classification system (HFACS) method follows a systematic procedure to find the possible causes of human error, e.g., decision error, and is suitable for analyzing human error in a power system blackout. The main advantage of this method is the division of various human error factors into a comprehensive framework of errors made by persons involved in maintenance operations (repair workers, supervisors, managers, and administration personnel). By analyzing past events and examining the conclusions of experts, HFACS can present a comprehensive framework of human error at the four levels of unsafe acts, precondition for unsafe acts, unsafe supervision, and organizational influence.

Background and related work

One of the main causes of events in most industries, including nuclear power plants, aviation, chemical industries, etc., is human error. These are unintended errors that could lead to sudden failure [11].

Colombia’s blackout in 2007 left 41 million people without power for 4.5 h. This was caused by a human error during the maintenance of a protective device in a 230 kV substation. In [12], the roots of human error in this event were identified as deficiencies related to operator training, protection settings and coordination, protection and control schemes, and instructions for scheduling and performing maintenance. Also, according to the studies in [13], 32% of major blackouts in some parts of the world from 2011 to 2019 were due to human error or equipment violations. Therefore, human error is an important factor affecting the reliability of power systems. Reference [14] proposes an evaluation model that assesses the reliability of a power system and includes human error and protection system failure. The evaluation results show that, to have a reliable system, it is necessary to pay attention to human factors in the power grid and to manage human error. Following the above discussion of human reliability and human error in power grids, existing studies can be grouped under five topics as follows:

  1. Identifying and investigating the causes of human error

Reference [15] examines the perspectives of five short notes on human factors in electric utility dispatch control centers. According to the analysis, the common denominators of all the notes are stressors and stressful conditions. Reference [16] investigates human reliability during events that occurred in the Chinese power system, and the human error factors corresponding to each event are identified using the CREAM method. The fuzzy-clonal method is then applied to classify the identified factors and determine the worst factor. However, no solution is proposed to solve the problem. Various factors that affect human reliability in maintenance, such as environmental, organizational, and job factors, personal characteristics, etc., are introduced in [17], which also shows the extent to which human reliability is affected by changes in these factors. The research presented in [18] introduces “motivation” and “competence” as the most important human factors influencing the performance of power transmission maintenance personnel. In [19], fatigue, knowledge, experience, and time pressure are recognized as the most important human factors, while [20] shows that older operators’ unwillingness to use personal safety instructions or equipment, due to over-reliance on their experience, increases error. In [21], human factors, including the complexity of human-machine interaction and conscious and unconscious human error, are considered as one of the five risk elements in the development of a new energy power system. It also shows that the complexity of human-machine interaction is more important than the other two factors, and suggests that employees fully follow operating rules and spend more time in training. In [22], a study on job stress in human resource management is conducted and the results show that psychological factors are also very important and effective.

  2. Quantitative calculation of human reliability

In [23], a suitable method for quantitative assessment of human reliability is presented. However, in the proposed method, organizational factors and inter-dependency between operators in power system switching operations are not considered [6, 24]. To improve the method in [23], references [6, 24] propose methods for measuring and analyzing human reliability in a power system, methods which take into account organizational factors and the interdependency between operators during a switching operation. The results indicate that the probability of human error is much closer to the actual situation recorded in statistics. The probability of human error is estimated in [25] by combining the two methods of a Success Likelihood Index Model and a Bayesian Network.

  3. Consequences of human error

The studies in [3, 26] verify and analyze the effects of human error on power system reliability and conclude that it is essential to consider human factors when determining maintenance policy. It is shown in [14] that two power system reliability indicators, the loss of load probability (LOLP) and the expected power not supplied (EPNS), increase with human error.

  4. Evaluating personnel performance

Nowadays, electricity company operators need to manage large volumes of data because of the sensitivity of electrical energy supply, and are required to solve the problems related to unexpected events in a power grid quickly. In this way the impact of these unexpected events on consumers can be minimized. Hence, in [27], a method is designed with the help of technical and economic indicators to evaluate the performance of distribution network operators. In the case study, it is shown that the impact of human error on the consequent interruption duration for priority consumers and revenue reduction is more severe than others. The study in [28] proposes a new method for evaluating the impact of dispatchers’ excessive workload on human error. This method evaluates the dispatchers’ workload from the four dimensions of information comprehension, speech output, action output, and attention. Results from the study on ten dispatchers reveal the probability of human error due to inappropriate workload.

  5. Methods of reducing human error

Reference [19] proposes an approach to identify effective solutions for reducing human error in maintenance activities based on cost-benefit analysis at the Kenya power plant. The paper divides the causes of human error into 11 factors, the most important of which are the use of instructions, fatigue, transfer of knowledge and experience, and time pressure. One of the maintenance activities of power transmission lines is their inspection, which can cause fatigue to the human inspectors because of the long distances involved and sometimes impassable routes. Therefore, power line inspection software is presented in [29] to reduce human error. Experiments performed with the software show that inspectors’ workload is significantly reduced, as is human error. One way to reduce the fatigue of maintenance personnel is to use digital substations, as remote testing eliminates the need to travel to the substation. In [30], digital substations, maintenance testing, and remote testing are defined. Reference [31] shows that the effectiveness of a human operator in automated systems depends on the characteristics of the workplace and the working environment. It shows that taking measures to improve the working environment, in addition to improving ergonomic indicators, is profitable for businesses. Awareness of the situation at the time of error is one of the important factors in developing the security of the power system, because it allows the operator to make successful, timely, and error-free decisions and to prevent cascading outages. In [32], increasing situational awareness by locating faults through fault passage indicators (FPI) is investigated.

Although various research has been carried out on human error and its effects, many issues have not been fully addressed. The main purpose of this paper is to clarify the following issues:

  1. “Human error” is the consequence of circumstantial and situational factors that affect human performance, and is not itself the cause of events [4]. Therefore, it is necessary to identify these situational factors from all aspects, such as organization, personnel, workplace, etc., to reduce human error. Research has so far only identified the causes of human error to a limited extent.

  2. In September 2011, a maintenance error and weaknesses in operation planning caused a power outage for more than six million households from Southern California and Arizona to northwestern Mexico [4]. According to the report in [12], the cause of two of the 14 major accidents in the world from 2003 to 2015 is human error of the maintenance groups. Studies of human factors in the maintenance of the aviation industry have been extensively conducted, while other industries have been slow to integrate human factors into their maintenance performance measurements [18]. Studying the factors affecting power industry maintenance groups differs from studying the factors affecting power grid operators or maintenance groups in other industries for the following reasons:

    • Working in an HV electric environment.

    • Existence of miscellaneous types of protection equipment from different manufacturers.

    • Location of power transmission substations in different geographical and climatic zones.

    • Working with RPMT at unusual times such as at night or on weekends.

  3. To improve quality, maintenance operations are always supervised by supervisor groups. Therefore, the performance of the supervisor groups can affect that of the PMET. However, previous studies have not examined such factors.

Methods

Preventive maintenance of electrical equipment is crucial to maintain the stability of a power system and extend the service life of the equipment [26], and human factors strongly affect the efficiency of the maintenance task [18]. This paper proposes a method for analyzing the human error caused by the transmission maintenance team according to the algorithm shown in Fig. 1. The power system of Fars Electricity Maintenance Company (FEMC) has been selected as a case study.

Fig. 1 Framework of the proposed method for the analysis of human error caused by maintenance team of power transmission system

Identification of high-risk teams

The activities and duties of all the maintenance workgroups that can lead to human error are recognized, and Fig. 2 shows the personnel chart of the FEMC maintenance teams. In the FEMC power grid, which has 15,000 km of transmission lines and 250 substations, more than 75 maintenance groups are involved in maintenance every day, and most of these groups have three members.

Fig. 2 The structure of maintenance teams affecting human error

Execution teams are responsible for maintenance in transmission substations and lines over 63 kV. Headquarters teams support the executive teams in scientific, financial, and administrative aspects. In this group, two subgroups directly affect system protection, i.e., the protection relay setting team and the spare parts team. If the personnel in these two subgroups do not perform as required, unplanned interruptions of the power grid may be caused by incorrect setting of the protection relays or the purchase of poor-quality equipment. The other subgroups in the headquarters group can also indirectly affect the performance of the executive groups. For example, should the financial subgroup not provide financial resources well, it could cause personnel dissatisfaction and misconduct.

As shown in Fig. 3, human error is the primary cause of power grid outages of the Fars Regional Electricity Company due to maloperations in the protection sector over the 5 years from 2012 to 2017. Fig. 3 also shows that the human error of the maintenance teams is divided into three categories, of which the error of the RPMT is the largest cause of power outages. Therefore, the RPMT is identified as the high-risk work team.

Fig. 3 The causes of power grid outage of Fars Regional Electricity Company due to maloperations in the protection sector from 2012 to 2017

HRA method selection

There is no large labeled dataset available regarding the roots of human error [33]. Thus, it is crucial to develop a systematic method that can detect the main causes of human error and classify them by using small labeled datasets. Reference [4] studies the HRA methods and shows that the THERP and CREAM methods are the most common ones in power system application. However, these methods belong to the older generations of HRA [14], and are time consuming, and do not have a clear procedure for error detection [4].

In 1990, Reason presented a model for identifying human error in air accidents, but no corrective solutions were proposed [33]. Shappell and Wiegmann introduced a model called HFACS that was developed based on the Reason model to identify human error [34]. HFACS was argued by Dekker in 2002 to be one of the most powerful tools for examining different types of incidents [35]. The method is structured into four levels: organizational influence, unsafe supervision, precondition for unsafe acts, and unsafe acts of the operator. In this paper, the HFACS model is used to summarize and categorize the roots of human error for the following reasons:

  1. It supports extensive analysis of human error, considering the multiple causes of human failure [36].

  2. It provides a framework for identifying supervision-related causes.

  3. Its general terms and descriptors allow the HFACS method to be used for a wide range of industries and activities.

  4. It belongs to the latest generation of HRA methods.

Data collection and classification

The steps of identifying, collecting, and classifying the causes of human error in this paper have been done according to the framework of Fig. 4, for which more details will be given in Section 4.

Fig. 4 Framework of identifying, collecting, and classifying the causes of human error

Identifying an assessment technique

Since controlling and reducing the 60 causes of human error identified in Section 4 are difficult and time-consuming, these causes are ranked to prioritize the important and key problematic factors to avoid wasting time and money.

The calculation of the probability of the occurrence of basic causes (roots) from the perspective of maintenance personnel will be described in Section 5.

Ranking and solution

Ranking, analysis, and proposing solutions for high-priority error roots are expressed in Section 6.

Roots of human error of maintenance teams

The following procedure is considered to identify and classify the causes of error based on the proposed framework in Fig. 4:

  • Studying papers and research on human error, especially in the field of maintenance.

  • Reviewing the steps of maintenance implementation.

  • Investigating the history of electric power transmission industry events due to human error.

  • Interviewing maintenance experts and technicians who have made mistakes.

  • Interviewing skilled staff in the maintenance department.

  • Interviewing expert supervisors on maintenance personnel.

  • Interviewing safety experts.

  • Obtaining the findings and comments and questionnaire design for human error description in the power system through the HFACS framework.

  • Selecting a sample using Cochran's formula with a 5% margin of error (approximately 132 people) and reviewing the questionnaire comments.

  • Finalizing the basic causes of human error at the four levels of error in the HFACS method.
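
The sample-size step in the procedure above can be illustrated with a minimal sketch of Cochran's formula with finite-population correction. The source gives only the 5% error and the resulting sample of approximately 132 people; the confidence level (z = 1.96), the proportion p = 0.5, and the population size of 200 used below are assumptions chosen for illustration, not values reported by the authors.

```python
import math
from typing import Optional


def cochran_sample_size(
    population: Optional[int] = None,
    margin_of_error: float = 0.05,  # 5% error, as stated in the text
    z: float = 1.96,                # assumed: about 95% confidence
    p: float = 0.5,                 # assumed: most conservative proportion
) -> int:
    """Return the Cochran sample size, rounded up to a whole person.

    If `population` is given, the finite-population correction is applied.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1.0 - p) / (margin_of_error ** 2)
    if population is None:
        return math.ceil(n0)
    # Finite-population correction.
    n = n0 / (1.0 + (n0 - 1.0) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # 200 is a hypothetical population size, not a figure from the source.
    print(cochran_sample_size(population=200))
```

Under these assumed inputs the function returns 132, which is of the same order as the sample reported above; the actual population behind the reported figure is not stated in the source.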

Unsafe acts level

A maintenance executive team is made up of two or three people, one of whom is the team leader with more experience, expertise, and skills. However, the analysis of the FEMC events in 2016 and 2017 makes it clear that most of the human error events were caused by the more experienced persons. The investigation shows that the team leaders, because of performing repetitive tasks over many years, refrain from performing the tasks properly, including preliminary study of the protection plan, step-by-step follow-up of the instructions and checklists, etc.

In addition, misunderstanding or devaluing the instructions and the test sheets can also lead to human error. Twelve basic causes are identified at the intermediate error level (errors and violations) as follows:

  • Errors: Skill-based errors

    • UA1: Complex and varied electrical network equipment.

  • Errors: Decision error

    • UA2: Instructions and settings not revised precisely, and mistakes being repeated several times.

    • UA3: Existence of viewpoints that some checklist items are important and should be checked but the rest are not needed.

    • UA4: Insufficient knowledge of the cause and performance of each item in the checklist or settings.

    • UA5: Tasks being beyond a person’s ability.

    • UA6: Wrong method being chosen for maintenance.

  • Violations: Routine violation

    • UA7: Not using the drawing.

    • UA8: Not using the instructions.

    • UA9: Working in a hurry.

    • UA10: Using a mobile phone during work.

    • UA11: Violation of maintenance instructions.

  • Violations: Exceptional violation

    • UA12: Personnel not ready for maintenance work for various reasons.

Precondition for unsafe acts

This level of error includes environmental factors, operator conditions, and individual factors.

Since transmission substations and lines are usually built in suburbs, the executive groups are centralized in nearby cities so that the network can be properly maintained. On average, each group covers 6000 km², and in the event of something happening to the network, the teams can check and correct the network in the shortest possible time. However, this causes the maintenance groups to be continuously on missions outside their workplace. In addition, the large volume of work has led to insufficient staff relaxation and insufficient time with the family, resulting in physical and mental tiredness of the staff.

The shortage of backup technicians is one of the base causes of human error due to the intensification of the executive group activities. This is the most significant cause of human error in the human resource management sublevel from individual factors. Ten roots of the human error have been detected at the precondition for unsafe acts level as follows:

  • Operator conditions: Adverse mental state

    • UP1: Pride for various reasons including experience, age, specialty, etc.

    • UP2: Psychasthenia.

    • UP3: Not spending enough time with family can make for staff discomfort and fatigue.

    • UP4: Workload.

  • Operator conditions: Adverse physiological state

    • UP5: Physical tiredness.

  • Individual factors: Communication coordination and planning

    • UP6: Lack of backup technician.

    • UP7: Having a second task.

    • UP8: Continuous missions away from the workplace.

    • UP9: Not paying attention to the mistakes expressed in previous operations and corrections by the relevant subgroups.

  • Environmental factors: Physical environment

    • UP10: Not having suitable environmental conditions (heat, cold, weather, etc.) can affect performance.

Unsafe supervision error level

The maintenance of power transmission networks in Iran is carried out by non-governmental contractors that are supervised by the regional electric companies. Therefore, the supervisory groups of the employer and of the contractors’ headquarters indirectly affect the performance of the executive groups. The roots of the errors of the employer’s supervisory groups are defined at the three levels of inadequate supervision, supervisory violation, and failure to correct a known problem, and eight causes are identified under these intermediate error levels as follows:

  • Inadequate supervision

    • US1: Supervisor not following up with the completion of the drawing defects.

    • US2: Inappropriate knowledge of equipment instruction due to lack of proper transfer of prior experience.

    • US3: Limited knowledge or experience of the supervisor.

    • US4: Supervisor emphasizing the full implementation of the maintenance operation without prioritizing tasks.

    • US5: Employer or supervisor having the view that the appearance and cleanliness of the work are less important than the correct performance of the equipment.

  • Failed to correct a known problem

    • US6: Lack of follow-up for the fixing of defects by the supervisor.

  • Supervisory violation

    • US7: The supervisor’s work announcement out of the maintenance program.

    • US8: Improper honoring of personnel.

Limited knowledge and experience of the supervisors can result in irrational and non-normative comments that can disrupt the operation of the technical contractors. For example, it has been planned to replace old and inefficient protective relays at the time of maintenance to prevent re-outage of equipment (lines, transformers, etc.), but such changes sometimes take too long to complete due to the workload, incorrect prediction of the relay replacement time at the same time as maintenance, insufficient knowledge of the executive team, and so on. All of these are indirect factors that affect the performance of the executive personnel.

Since the use of protective drawings or relay catalogs is necessary for accurate and speedy maintenance, these drawings must be modified after any change in the protection circuits or equipment. Supervisors are responsible for the update, but sometimes due to lack of effective follow-up, executive groups experience shortcomings or contradictions.

The subgroup on scheduling maintenance operation and the program coordinator at the headquarters of the contractor company cause human error at planned operations with nine identified causes. Studies show that the roots of inappropriate scheduling can be expressed by the following factors:

  • US9: Maintenance personnel not having the proper time to rest and upgrade their knowledge.

  • US10: Maintenance operation being carried out at an inappropriate time (such as from midnight to 6 am or during holidays).

  • US11: Setting the maintenance schedule regardless of the environmental and power network conditions.

  • US12: Highly demanding maintenance operations accompanied by repetitive actions.

  • US13: Trying to fix mistakes related to the existing data, settings, spare parts, etc. by executive teams at runtime.

  • US14: Maintenance operation lasting more than the working hours.

  • US15: Employer’s request for maintenance operation outside the rules or guidelines.

  • US16: Synchronization of corrective or defective projects with maintenance operation.

  • US17: Unsuitable appointment of personnel for sensitive tasks.

Organizational influence error level

The behaviors and decisions of the managerial level directly affect the mental conditions and activities of the operating groups such that even the smallest incorrect decision can cause disturbance and distrust in the whole organization. This level of error with 21 identified causes is the main reason for human error in terms of staff surveys and event roots. This has been reported as follows:

  • Resource management: Human resources

    • O1: Having no proper motivation.

    • O2: Lack of proper specialized training.

  • Resource management: Funds

    • O3: Low salary.

    • O4: In terms of salary, there is not much difference between people with and without responsibility.

    • O5: Lack of budget to provide new grid equipment or update existing equipment.

  • Resource management: Equipment or facility resources

    • O6: Development in power grids.

    • O7: Lack of sufficient and up-to-date equipment.

  • Organizational climate: Structure

    • O8: Information, instructions, results of meetings, etc. from the directors or heads of departments to the personnel not being properly transmitted.

    • O9: Opaque and inappropriate personnel promotion criteria.

    • O10: Lack of proper communication between the subgroups and feedback from each other.

    • O11: Transmission of stress and work collisions from upstream to downstream.

  • Organizational climate: Culture

    • O12: Lack of financial and mental attention to personnel by organization.

    • O13: Wrong belief that personnel should work continuously.

  • Organizational climate: Politics

    • O14: Lack of specific policy to delegate authority and trust to less experienced personnel.

    • O15: Delay in employing expert personnel.

    • O16: Not enough attention is paid to employment factors based on merit and professional qualifications.

    • O17: Lack of analysis of the human error events and efforts to eliminate them.

  • Organizational process: Operation

    • O18: Inappropriate planning for use of staff and facilities.

  • Organizational process: Methods

    • O19: Failure to update test instructions.

    • O20: Incomplete job description and organizational structure.

    • O21: Lack of clear instructions for penalties and encouragement/incentives.

Human resources are the most important assets of any maintenance organization. Therefore, strong motivational strategies can retain specialized and experienced personnel in the organization and attract new expert staff.

Three resources, namely financial resources, testing equipment, and human resources, can help improve activities. When financial resources are sufficient, personnel may receive appropriate salaries and the testing equipment can also be updated along with the development of electrical equipment.

Clarity in job descriptions and instructions can prevent staff confusion. For example, should there be no step-by-step instruction in the differential transformer relay test, each executive group would have to perform the relay setting based on the related experience and knowledge, which could cause unstable operation of the relay and inappropriate performance during the operation of the transformer. Such human error does not result from the mistakes of the executive groups, but is rather due to weakness in the organization.

Determine the probability of occurrence of the base causes of errors

Since there is no documented information from past events, especially on the basic causes of errors, calculating the probability of the occurrence of the basic causes is not possible. Hence, the probability of the occurrence of the events is calculated using questionnaire surveys among the experts. The probability of the occurrence of the error is classified into five categories: frequent, probable, occasional, very low, and unlikely. The results of the survey are obtained from experts in both qualitative and linguistic forms, which are converted into numerical scores of 5 to 1 for subsequent calculations.
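
The conversion of linguistic answers into numerical scores can be sketched as a simple lookup. The five category names and the 5-to-1 range come from the text; the one-to-one pairing in the order listed and the helper name `to_score` are assumptions made for illustration.

```python
# Assumed pairing: categories are scored 5 to 1 in the order the text lists them.
LIKELIHOOD_SCORES = {
    "frequent": 5,
    "probable": 4,
    "occasional": 3,
    "very low": 2,
    "unlikely": 1,
}


def to_score(label: str) -> int:
    """Convert a linguistic likelihood label into its numerical score."""
    # Normalize case and whitespace so that " Very  Low " matches "very low".
    key = " ".join(label.lower().split())
    if key not in LIKELIHOOD_SCORES:
        raise ValueError(f"Unknown likelihood label: {label!r}")
    return LIKELIHOOD_SCORES[key]


if __name__ == "__main__":
    answers = ["Frequent", "very low", "Occasional"]
    print([to_score(a) for a in answers])
```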

Thirty power system protection maintenance experts, who act as group heads, supervisors, technicians, or workers, are selected according to Table 1. Since the experts differ in education, work experience, age, and organizational level, relative weighting factors based on the scores in Table 1 are applied to each expert's point of view. The relative weighting factor Wk is obtained as:

$$ {W}_k=\frac{\sum_{i=1}^{4}{S}_{ki}}{\sum_{j=1}^{ne}\sum_{i=1}^{4}{S}_{ji}} $$
(1)

where Wk is the relative weight of the expert k, Ski is the score of the expert k on the four criteria, Sji is the score of the expert j on the four criteria, and ne is the number of experts.

Table 1 Scoring criteria for experts
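
As a minimal sketch of Eq. (1), the function below sums each expert's scores over the four criteria (education, work experience, age, and organizational level) and divides by the grand total over all experts. The score values in the example are hypothetical, since the scoring scale of Table 1 is not reproduced here, and the function name is illustrative.

```python
from typing import List, Sequence

N_CRITERIA = 4  # education, work experience, age, organizational level


def relative_weights(scores: Sequence[Sequence[float]]) -> List[float]:
    """Compute W_k for every expert from an (experts x criteria) score table."""
    for row in scores:
        if len(row) != N_CRITERIA:
            raise ValueError("each expert needs exactly four criterion scores")
    # Numerator of Eq. (1): total score of each expert k.
    totals = [sum(row) for row in scores]
    # Denominator of Eq. (1): total score over all experts.
    grand_total = sum(totals)
    if grand_total <= 0:
        raise ValueError("the total score must be positive")
    return [t / grand_total for t in totals]


if __name__ == "__main__":
    # Hypothetical scores for three experts (not taken from Table 1).
    example_scores = [
        [3, 4, 2, 3],
        [2, 2, 1, 1],
        [4, 3, 3, 2],
    ]
    print(relative_weights(example_scores))
```

By construction the weights sum to one, so an expert with higher total scores contributes proportionally more to the consensus.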

The experts’ perspectives about the probability of the occurrence of the causes of the errors are calculated and presented in the form of consensus for each cause of the error as:

$$ {M}_n=\frac{\sum_{k=1}^{P}{W}_k{A}_{nk}}{P\ast {W}_m} $$
(2)

where Mn is the consensus expert opinion on the probability of error n, Ank is the opinion of expert k on the probability of error n, Wm is the average relative weight of experts, and P is the number of people who have commented on the probability of error n.
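
Eq. (2) can be sketched as follows. The text does not say whether Wm is averaged over all experts or only over the P experts who commented on error n; this sketch assumes the latter, so that the denominator equals the sum of the commenting experts' weights and Mn stays on the 1-to-5 scale. The use of `None` for a missing opinion and the function name are also illustrative choices.

```python
from typing import Optional, Sequence


def consensus_score(
    weights: Sequence[float],
    opinions: Sequence[Optional[float]],
) -> float:
    """Compute M_n for one error cause from expert weights and opinions.

    `opinions[k]` is A_nk, or None if expert k did not comment on this cause.
    """
    if len(weights) != len(opinions):
        raise ValueError("weights and opinions must have the same length")
    # Keep only the P experts who commented on this cause.
    rated = [(w, a) for w, a in zip(weights, opinions) if a is not None]
    if not rated:
        raise ValueError("no expert commented on this cause")
    p = len(rated)
    # Assumption: W_m is the average weight of the commenting experts.
    mean_weight = sum(w for w, _ in rated) / p
    weighted_sum = sum(w * a for w, a in rated)
    return weighted_sum / (p * mean_weight)


if __name__ == "__main__":
    # Hypothetical weights and opinions; the second expert gave no answer.
    print(consensus_score([0.4, 0.2, 0.4], [5, None, 3]))
```

When every expert comments, both readings of Wm give the same result, namely the weight-averaged opinion. Sorting the causes by this score in descending order would then give a ranking of the kind discussed in the following sections.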

Results and discussion

There are approximately 7600 relays installed in the Fars electricity network to protect the power grid in the case of fault events and to prevent catastrophic outages. The RPMT of FEMC must inspect and repair the protection circuits, obtain their health status, and calibrate settings of these relays throughout the year.

Under the influence of various root causes, any member of the team may carry out any of these maintenance operations erroneously, inaccurately, or negligently.

This negligence may result in the failure of the same or other equipment during or after a maintenance operation. Our survey shows that 70% of the equipment failures caused by human error occur after the completion of maintenance operations, because the maintenance operators did not work precisely. For instance, if the operator in the protection subgroup applies a wrong setting or configuration to a relay, the relay may malfunction during operation. Another example is the unplanned outage of transformers. If the maintenance group does not seal the mechanical relays (Buchholz relay or thermometer) precisely during transformer maintenance, they will misoperate because of water penetration.

Studies were conducted on the FEMC protection teams in 2017 over approximately twenty 3-h sessions. The results of the study and of the expert opinion survey for predicting the roots of RPMT errors are analyzed in terms of their occurrence probability and are as follows:

  1) Sixty underlying causes of error, which affect the performance of personnel during maintenance, have been identified from the protection experts’ opinions. Organizational factors and unsafe supervision factors act as external stimuli, and unsafe act factors act as internal stimuli; both kinds affect the work of the maintenance personnel. These causes are classified into 20 proposed subcategories of the HFACS method. The relationship between the introduced error levels and the consequence is shown in the conceptual model of Fig. 5. As shown, the level of organizational factors affects the other levels, and the level of unsafe act factors is affected by the other error levels.

Fig. 5 Relationship between the error levels

The results of this study show that, from the viewpoint of the RPMT and according to Table 2, the intermediate error causes under organizational influences, especially financial resource management, have the greatest impact on human error, while the intermediate error causes of supervisory violations and exceptional violations have the lowest impact.

  2) Table 3 shows the results of the specialized surveys on the probability of the causes of human error for the maintenance teams on power system protection. According to the collected data, no cause of error falls in the “frequent” or “unlikely” category.

Table 2 Probability of human error according to expert viewpoints in the framework of the HFACS method
Table 3 Views from experts on 60 basic cases of error in 5 probability categories

Seven percent of the identified basic causes of error affect the performance of the RPMT with ‘probable’ probability. Figure 6 shows the ranking of the causes, where red indicates basic causes with ‘probable’ probability, blue those with ‘occasional’ probability, and green those with ‘very low’ probability. The two basic causes of error with ‘probable’ probability that affect the performance of all teams are as follows:

  • Paying attention to personnel finance issues, such as salary reform based on rank and job position, can boost the teams’ morale.

  • Not employing a sufficient number of personnel puts a heavy burden on the working groups, leaving little time for rest and recovery. Personnel also experience a great deal of physical fatigue because of exposure to environmental and atmospheric factors. This could lead to errors during maintenance work.

Fig. 6 The ranking of the causes of human error
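A ranking of this kind can be sketched as follows, assuming each cause already has a consensus score on the 1 to 5 scale. The rule that assigns a score to the nearest category is an assumption made for illustration only, since the exact category boundaries are not stated; the cause names and scores are hypothetical placeholders, not survey results.

```python
SCORE_TO_CATEGORY = {
    5: "frequent",
    4: "probable",
    3: "occasional",
    2: "very low",
    1: "unlikely",
}


def categorize(m_n):
    """Assign a consensus score to the nearest category (assumed rule)."""
    nearest = min(5, max(1, int(m_n + 0.5)))  # round half up, clamp to 1..5
    return SCORE_TO_CATEGORY[nearest]


def rank_causes(consensus):
    """Sort causes by consensus score, highest first, and label each one."""
    ordered = sorted(consensus.items(), key=lambda item: item[1], reverse=True)
    return [(name, score, categorize(score)) for name, score in ordered]


# Hypothetical consensus scores for three placeholder causes.
example = {"cause A": 2.4, "cause B": 3.8, "cause C": 3.1}
ranking = rank_causes(example)
```

Sorting in descending order places the causes that most need managerial attention at the top of the list, which is the reading order used for the ranking figure.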

To eliminate or reduce the identified errors, accurate planning and management are required, e.g.:

  • Personnel assessment and ranking.

  • Paying attention to the knowledge of experienced maintenance personnel.

  • Accelerating staff recruitment.

  • Prioritizing maintenance checklist items.

  • Prohibiting mobile phone use while working.

  • Paying a reasonable salary.

The results from this study can provide a good reference for the following future studies:

  1. Use of virtual environments to identify and analyze human factors instead of surveys.

  2. Use of virtual environments to calculate the probability of underlying causes of human error.

  3. Studying the logical relationship between the causes and consequences of human error.

  4. Ways to quantify and improve the human reliability of relay protection maintenance personnel.

  5. The relationship between human error and the profitability of maintenance contractors, and cost-benefit analysis of reducing human error.

Conclusion

The analysis of recent events caused by human error in the Fars Electricity Maintenance Company shows that, among the executive maintenance teams, the protection subgroup caused most of the human error-induced events, accounting for 62% of the errors.

In this study, 60 basic causes of human error are identified and predicted using the 4 levels of error in the HFACS method. These causes can be used to control and increase the human reliability of maintenance teams and in electrical industry research.

Analyses and surveys from the RPMT in the FEMC show that 7% of the 60 basic causes have a high probability (i.e., are probable) of resulting in events. According to the results, the salary system, the inadequacy of test equipment, the shortage of personnel, and their tiredness due to high workload are the main factors affecting the behavior and performance of the executive personnel. Also, decisions at the top of the organization directly affect the performance at the lower levels.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

HRA: Human Reliability Analysis

HFACS: Human Factors Analysis and Classification System

FEMC: Fars Electricity Maintenance Company

RPMT: Relay Protection Maintenance Team

PMET: Personnel of Maintenance Executive Team

CREAM: Cognitive Reliability and Error Analysis Method

kV: Kilovolt

THEA: Technique for Human Error Assessment

PHEA: Predictive Human Error Analysis

THERP: Technique for Human Error Rate Prediction

LOLP: Loss of Load Probability

EPNS: Expected Power not Supplied

References

  1. NERC’s state of reliability 2018 report, Atlanta, GA. https://www.nerc.com/pa/RAPA/PA/Performance%20Analysis%20DL/NERC_2018_SOR_06202018_Final.pdf.

  2. Iran power transmission network events analysis report-2017. Iran Grid Management Company annual report. https://www.igmc.ir/Documents/EntryId/275890 (in Persian).

  3. Bao, Y., Ch, G., et al. (2018). Impact analysis of human factors on power system operation reliability. Journal of Modern Power Systems and Clean Energy, 6(1), 27–39.

  4. Torres, E. S., Celeita, D., & Ramos, G. (2018). State of the art of human factors analysis applied to industrial and commercial power systems. In 2018 2nd IEEE International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES). https://doi.org/10.1109/ICPEICES.2018.8897323.

  5. Pasquale, V., Iannone, R., et al. (2013). An Overview of Human Reliability Analysis Techniques in Manufacturing Operations, (pp. 221–240). Rijeka: Operations management, InTech.

  6. Tang, J., Bao, Y., et al. (2013). A Bayesian network approach for human reliability analysis of power system. In 2013 IEEE PES Asia-Pacific Power and Energy Engineering Conference (APPEEC). https://doi.org/10.1109/APPEEC.2013.6837128.

  7. Zhou, C., & Kou, X. (2010). Method of Estimating Human Error Probabilities in Construction for Structural Reliability Analysis Based on Analytic Hierarchy Process and Failure Likelihood Index Method. Journal of Shanghai Jiaotong University (Science), 15(3), 291–296. https://doi.org/10.1007/s12204-010-1005-3.

  8. Yi, X., Dong, H., et al. (2016). Human Reliability Analysis Method on Armored Vehicle System Considering Error Correction. Journal of Shanghai Jiaotong University (Science), 21(4), 472–477.

  9. Salmon, P. M., Cornelissen, M., et al. (2012). Systems-based accident analysis methods: A comparison of Accimap, HFACS and STAMP. Safety Science, 50, 1158–1170.

  10. Baysari, M. T., Caponecchia, C., et al. (2009). Classification of errors contributing to rail incidents and accidents: A comparison of two human error identification techniques. Safety Science, 47, 948–957.

  11. DeMott, D. L. (2016). Tailoring a human reliability analysis to industry needs. In 2016 Annual Reliability and Maintainability Symposium (RAMS). https://doi.org/10.1109/RAMS.2016.7448030.

  12. Veloza, O. P., & Santamaria, F. (2016). Analysis of major blackouts from 2003 to 2015: Classification of incidents and review of main causes. The Electricity Journal, 29(7), 42–49.

  13. Alhelou, H. H., Hamedani-golshan, M. E., et al. (2019). A survey on power system blackout and cascading events: research motivations and challenges. Energies, 12(4), 1–28.

  14. Bao, Y., Guo, J., et al. (2014). Analysis of power system operation reliability incorporating human errors. In 2014 17th International Conference on Electrical Machines and Systems (ICEMS), (pp. 1052–1056). https://doi.org/10.1109/ICEMS.2014.7013625.

  15. Frank, C. J., Bendarik, R. A., et al. (1985). Dispatch Control Center Human Factors – Revisited. IEEE Transactions on Power Apparatus and Systems, PAS-104(6), 1294–1300.

  16. Wang, A., Tu, Y., & Liu, P. (2011). Quantitative Evaluation of Human-Reliability Based on Fuzzy-Clonal Selection. IEEE Transactions on Reliability, 60(3), 517–527.

  17. Ajukumar, V. N., Gandhib, M. S., & Gandhic, O. P. (2015). Identification and assessment of factors influencing human reliability in maintenance using fuzzy cognitive maps. Quality and Reliability Engineering International, 31(2), 169–118.

  18. Peach, R., Ellisl, H., & Visserl, J. K. (2016). A maintenance performance measurement framework that includes maintenance human factors: a case study from the electricity transmission industry. South African Journal of Industrial Engineering, 27(2), 177–189.

  19. Sheikhalishahi, M., Azadeh, A., et al. (2017). Human factors effects and analysis in maintenance: a power plant case study. Quality and Reliability Engineering International, 33(4), 895–903.

  20. Stojilkovice, E., Janackovtic, G., et al. (2016). Development and application of a decision support system for human reliability assessment – a case study of an electric power company. Quality and Reliability Engineering International, 32(4), 1581–1590.

  21. Sh, L., Li, C., et al. (2018). Risk identification and analysis for new energy power system in China based on D numbers and decision-making trial and evaluation laboratory (DEMATEL). Journal of Cleaner Production, 180, 81–96. https://doi.org/10.1016/j.jclepro.2018.01.153.

  22. Wu, Z., Pan, X., et al. (2019). The task demands resources method: A new approach to human reliability analysis from a psychological perspective. Quality and Reliability Engineering International, 35(4), 1200–1218.

  23. Hai-bo, L., et al. (2013). A quantitative method for human reliability in power system based on CREAM. Power System Protection and Control, 41(05), 37–42.

  24. Bao, Y., et al. (2013). Analysis of Human Reliability in Power System Switching Operation considering Dependency of Operators. In 2013 IEEE PES Asia-Pacific Power and Energy Engineering Conference (APPEEC). https://doi.org/10.1109/APPEEC.2013.6837155.

  25. Abrishami, S., Khakzad, N., et al. (2020). BN-SLIM: A Bayesian Network methodology for human reliability assessment based on Success Likelihood Index Method (SLIM). Reliability Engineering System Safety, 193, 106647. https://doi.org/10.1016/j.ress.2019.106647.

  26. Bao, Y., Wang, Y., et al. (2015). Impact of human error on electrical equipment preventive maintenance policy. In 2015 IEEE Power & Energy Society General Meeting. https://doi.org/10.1109/PESGM.2015.7285939.

  27. Teive, R. C. G., Neto, E. A. C. A., et al. (2017). Intelligent system for automatic performance evaluation of distribution system operators. In 2017 19th International Conference on Intelligent System Application to Power Systems (ISAP). https://doi.org/10.1109/ISAP.2017.8071399.

  28. Song, B., Wang, Z., et al. (2018). A multidimensional workload assessment method for power grid dispatcher. In Engineering Psychology and Cognitive Ergonomics, (pp. 55–68). Cham: Springer. https://doi.org/10.1007/978-3-319-91122-9_5.

  29. Marinez, C., Sampedro, C., et al. (2018). The Power Line Inspection Software (PoLIS): A versatile system for automating power line inspection. Engineering Applications of Artificial Intelligence, 71, 293–314. https://doi.org/10.1016/j.engappai.2018.02.008.

  30. Apostolov, A. (2017). Efficient maintenance testing in digital substations based on IEC 61850 edition 2. Protection and Control of Modern Power Systems, 2(37). https://doi.org/10.1186/s41601-017-0054-0.

  31. Lavrov, E., Pasko, N., & Siyk, O. (2020). Information technology for assessing the operators working environment as an element of the ensuring automated systems ergonomics and reliability. In 2020 IEEE 15th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET). https://doi.org/10.1109/TCSET49122.2020.235497.

  32. Jain, T., Ghosh, D., & Mohanta, D. K. (2019). Augmentation of situational awareness by fault passage indicators in distribution network incorporating network reconfiguration. Protection and Control of Modern Power Systems, 4, 26. https://doi.org/10.1186/s41601-019-0140-6.

  33. Diller, T., Helmarich, G., et al. (2014). The Human Factors Analysis Classification System (HFACS) Applied to Health Care. American Journal of Medical Quality, 29, 181–190.

  34. Shappell, S. A., & Wiegmann, D. A. (2001). Applying reason: the human factors analysis and classification system (HFACS). Florida: Human Factors and Aerospace Safety.

  35. Dekker, S. W. A. (2002). Reconstructing human contributions to accidents: the new view on error and performance. Journal of Safety Research, 33(3), 371–385. https://doi.org/10.1016/S0022-4375(02)00032-4.

  36. Alexander, T. M. (2019). A case based human reliability assessment using HFACS for complex space operations. Journal of Space Safety Engineering, 6(1), 53–59. https://doi.org/10.1016/j.jsse.2019.01.001.

Acknowledgments

Not applicable.

Authors’ information

Not applicable.

Funding

Not applicable.

Author information

Contributions

All authors collected and analyzed the human reliability data regarding the power transmission system protection personnel. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Mehdi Nafar.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Tavakoli, M., Nafar, M. Human reliability analysis in maintenance team of power transmission system protection. Prot Control Mod Power Syst 5, 26 (2020). https://doi.org/10.1186/s41601-020-00176-6

Keywords

  • Power transmission system protection
  • Maintenance teams
  • Human reliability
  • HFACS method