 Original research
 Open access
Graph representation learning-based residential electricity behavior identification and energy management
Protection and Control of Modern Power Systems volume 8, Article number: 28 (2023)
Abstract
It is important to achieve an efficient home energy management system (HEMS) because of its role in promoting energy saving and emission reduction for end-users. Two critical issues in an efficient HEMS are identification of user behavior and energy management strategy. However, current HEMS methods usually assume perfect knowledge of user behavior or ignore the strong correlations among the usage habits of different appliances. This can lead to an insufficient description of behavior and suboptimal management strategy. To address these gaps, this paper proposes non-intrusive load monitoring (NILM) assisted graph reinforcement learning (GRL) for intelligent HEMS decision making. First, a behavior correlation graph incorporating NILM is introduced to represent the energy consumption behavior of users and a multi-label classification model is used to monitor the loads. Thus, efficient identification of user behavior and description of state transition can be achieved. Second, based on the online updating of the behavior correlation graph, a GRL model is proposed to extract information contained in the graph. Thus, reliable strategy under uncertainty of environment and behavior is available. Finally, the experimental results on several datasets verify the effectiveness of the proposed model.
1 Introduction
The energy crisis is a matter of current concern all over the world. The energy consumption of residents and business end-users accounts for more than 40% of the total, and continues to rise [1]. In this context, improving energy efficiency on the demand side is particularly critical to the sustainable development of both economy and society [2, 3]. A home energy management system (HEMS) is one of the most important technologies for energy saving and emission reduction. It achieves the maximum benefit on the demand side by promoting flexible loads to participate in demand response efficiently [4].
An efficient HEMS is built on two critical issues, i.e., identification of user behavior and energy management strategy. Behavior identification provides accurate input to the optimization model. Thus, developing a practical method which can accurately capture and describe home energy usage is critical to behavior identification. Then, an HEMS strategy can be developed after grasping user behavior. It is desirable that this intelligent strategy can not only deal with the uncertainty of exogenous information, but also achieve rapid self-adaptation for different users.
Existing research either assumes that the usage behavior is known in advance, or additional intrusive devices are used to obtain user behavior. However, considering the dynamic changes of the behavior and the heavy deployment cost of intrusive devices, these methods are not practical [5]. Using non-intrusive load monitoring (NILM) to assist behavior identification is a feasible alternative. It does not require additional investment and equipment transformation [6]. However, traditional NILM methods only disaggregate the load, but cannot realize behavior identification. In addition, current NILM methods face low disaggregation accuracy and high equipment requirement [7,8,9]. Thus, developing a practical and accurate online energy behavior identification method for HEMS input is still a challenging task.
For energy management strategy, some studies assume that behavior is known without specifying the source of behavioral information. Thus, these studies fail to effectively consider the dynamic uncertainty of user behavior and result in suboptimal management strategy. Reference [10] describes user satisfaction according to the time difference between strategy decisions and habits. However, the optimization decision-making process of each appliance is independent of each other in this study. This may cause unsatisfactory decisions that do not meet the expectations of users. Reference [11] considers the interdependence of specific appliances, such as the dependence between washing machines and washer dryers. However, this dependence cannot reflect the correlation between all electrical appliances. While there is a certain correlation between different usage habits of appliances, this kind of behavior correlation contains complex and unstructured data. Traditional methods, e.g., LSTM or classic Q-learning, cannot make full use of such unstructured data to make effective decisions. Existing methods for exploiting behavioral information in NILM are limited [12]. Reference [13] proposes a graph-based representation of the temporal features of appliance activities, but it fails to capture the dependencies among electricity usage patterns. In [14], label correlation is incorporated into behavior recognition, but it relies on the time series signal of appliances to capture their correlation. This may introduce errors or miss some information.
To compensate for the above-mentioned shortcomings, including the insufficient use of behavior information, inefficiency of behavior identification, and inadaptability of strategy, NILM-assisted graph reinforcement learning (GRL) is proposed for an intelligent HEMS strategy. The main contributions of this study can be summarized as follows.

(1) A behavior correlation graph is constructed to represent the complex behavior correlation. The dynamically updated behavior correlation graph can effectively represent the dynamic habits of users, and directly provide the necessary behavior information for an intelligent decision by the HEMS.

(2) A behavior identification method based on multi-label NILM technology is proposed. NILM is regarded as a multi-label classification problem, which can effectively reduce the model scale. This behavior identification method can not only accurately realize the function of load disaggregation, but also realize the online update of the user behavior correlation graph. The generated behavioral features are combined with electrical features to effectively improve the performance of behavior identification.

(3) A GRL-based adaptive HEMS is proposed for extracting information from the behavior correlation graph and for providing energy management strategy. The GRL model can be adjusted based on dynamic habits and uncertain exogenous information. It continuously adapts to the changes of both internal and external factors and makes decisions that are compatible with user expectations.
2 HEMS framework
The proposed HEMS framework is shown in Fig. 1. The left part of Fig. 1 is the learning process of user behavior, and is based on the multi-label NILM model, namely, a multi-label subtask gated network (MLSGN). The learning process of user behavior comprises four parts: construction of the behavior correlation graph, model training, appliance disaggregation, and behavior update. First, in order to represent the users’ initial behavior, a subgraph is extracted from the prior graph based on users’ appliance information. Next, after data preprocessing such as normalization, the graph and the data are provided to the model training proportionally. Finally, when the disaggregation results are obtained, the behavior graph is updated and continuously provides the latest behavior information for accurate disaggregation and effective energy management.
The right part of Fig. 1 shows the learning process of the HEMS strategy according to the updated and learned behavior graph. The strategy generates on–off commands on the basis of load state and optimization objectives, and is a process of exploration. Subsequently, all states, actions, and rewards are sent to the replay buffer, which provides data for the learner to update strategy.
The practical deployment of the proposed method is shown in the middle part of Fig. 1. The MLSGN model uses the aggregated data provided by an outdoor electricity meter for load disaggregation and behavior identification. The HEMS strategy generates on–off commands based on the objective function, load states, and environment states to guide users to manage home energy consumption. Specifically, the load state refers to the online state of loads, the environment state refers to the exogenous information, e.g., outdoor temperature and electricity price, and the correlation state refers to the correlation information in the graph.
3 Online behavior monitoring method
3.1 Behavior representation and updating method
In this study, a graph is used to represent users’ usage behavior. Essentially, behavior information consists of the usage habits of appliances in each period of time and the correlations among the usage habits of different appliances [15]. The usage habits in different periods can be expressed by the use probability in each period, and the behavior correlation can be expressed by the probability that an appliance is used after other appliances. Behavior correlation is a kind of complex unstructured data, which is represented by graphs in this paper. Generally, graph data consists of nodes and weighted edges. It can describe the association relationship more intuitively, and is a promising way of dealing with complex data relations.
The nodes of the behavior correlation graph represent the appliances, the edges represent the correlations between the appliances, and the weights of the edges reflect the strength of the correlations. The weight matrix is called the behavior correlation matrix in this study, which can be calculated as:
\[{\text{p}}_{{\text{i, j}}} = \frac{{\text{N}}_{{\text{i, j}}}}{{\text{N}}_{{\text{j}}}} \tag{1}\]
where \({\text{p}}_{{\text{i, j}}}\) represents the probability that appliance \({\text{a}}_{{\text{i}}}\) works after appliance \({\text{a}}_{{\text{j}}}\). \({\text{N}}_{{\text{j}}}\) is the total number of times that appliance \({\text{a}}_{{\text{j}}}\) is on, and \({\text{N}}_{{\text{i, j}}}\) represents the number of times that appliance \({\text{a}}_{{\text{i}}}\) works after \({\text{a}}_{{\text{j}}}\).
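As a rough illustration of how such a matrix could be estimated from logged on/off states, the following Python sketch counts the two quantities defined above and takes their ratio N_{i,j}/N_j. The function name `correlation_matrix`, the `window` parameter, and the reading of "works after" as "is on within `window` periods after a period in which a_j is on" are assumptions made for this sketch, not details given in the paper.

```python
import numpy as np

def correlation_matrix(on_states, window=1):
    """Estimate p[i, j] = N_ij / N_j from a binary on/off log.

    on_states: array of shape (T, M); on_states[t, m] is 1 if appliance m
    is on in period t. "Works after" is interpreted (an assumption) as
    "is on within `window` periods after a period in which a_j is on".
    """
    states = np.asarray(on_states, dtype=bool)
    num_periods, num_appliances = states.shape
    n_j = np.zeros(num_appliances)                       # N_j
    n_ij = np.zeros((num_appliances, num_appliances))    # N_ij
    for t in range(num_periods - window):
        # Which appliances are on in the `window` periods following t.
        following = states[t + 1 : t + 1 + window].any(axis=0)
        for j in np.flatnonzero(states[t]):
            n_j[j] += 1
            n_ij[:, j] += following
    # Column j is divided by N_j; appliances that never ran keep p = 0.
    p = np.divide(n_ij, n_j, out=np.zeros_like(n_ij), where=n_j > 0)
    np.fill_diagonal(p, 0.0)                             # ignore self-transitions
    return p

# Toy log: columns are (washing machine, dryer).
log = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 0]]
p = correlation_matrix(log)
```

In the toy log the dryer follows the washing machine in two of its three recorded runs, while the washing machine follows the dryer every time, so the resulting matrix is not symmetric.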
A prior behavior correlation matrix is used to represent the habits of mass users, and the corresponding graph is the users’ initial behavior correlation graph. The prior behavior correlation graph can be derived from other sources of data, such as publicly available or institutionally collected large-scale user datasets. In order to avoid the influence of signal noise and the overfitting of the prior correlation matrix, \({\text{p}}\) is smoothed by threshold \(\uptau\), as:
Clearly, \({\text{A}}\) is not a symmetric matrix and the behavior correlation graph is directed, as shown in Fig. 2.
The behavior correlation matrix is updated by the prior behavior correlation matrix and the posterior behavior correlation matrix, as:
where the posterior matrix p refers to the correlation matrix calculated based on the online behavior data of specific users. This comes from the results of load disaggregation. \(\uplambda\) is the retention ratio of the historical behavior at each iteration.
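The smoothing and update steps can be sketched in Python as follows. The exact functional forms are assumptions for illustration: `smooth` zeroes entries below τ while keeping the remaining weights, and `update` is a convex combination in which λ weights the retained history. The default values τ = 0.25 and λ = 0.95 are the settings reported later in the experiments; the function names are hypothetical.

```python
import numpy as np

def smooth(p, tau=0.25):
    """Suppress weak correlations: entries below tau become 0 (assumed form)."""
    p = np.asarray(p, dtype=float)
    return np.where(p >= tau, p, 0.0)

def update(a_history, p_posterior, lam=0.95):
    """Blend retained history with the posterior matrix (assumed convex form).

    lam is the retention ratio of the historical behavior at each iteration.
    """
    a_history = np.asarray(a_history, dtype=float)
    p_posterior = np.asarray(p_posterior, dtype=float)
    return lam * a_history + (1.0 - lam) * p_posterior

prior = smooth([[0.0, 0.1], [0.6, 0.0]])          # weak 0.1 edge is removed
updated = update(prior, [[0.0, 1.0], [1.0, 0.0]])  # one online update step
```

With a retention ratio close to 1, a single batch of posterior observations shifts the graph only slightly, which is one way to keep the representation stable against disaggregation errors.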
3.2 NILM-based behavior identification model
The usage probability of appliances in each period and behavior correlation can be obtained by behavior identification. Different from the traditional NILM, behavior identification needs not only to identify the type of the appliance, but also to extract the behavior of appliance usage.
Early NILM studies do not consider the information contained in user habits [16, 17]. In order to make efficient use of behavior correlation information, recent research uses multi-label classification technology to consider label correlation [18, 19]. However, the traditional multi-label classification methods are either not competent for the analysis of unstructured behavior data, or the extracted behavior data cannot be effectively used for decision-making in an HEMS. Thus, this study improves the single-label NILM model of [7], the subtask gated network (SGN), and proposes a high-precision multi-label behavior identification model, MLSGN.
The structure of MLSGN is shown in Fig. 3. The network takes the aggregated power sequence and the behavior correlation graph as input and outputs the online probability of each appliance through a sigmoid function. At the same time, from the output load disaggregation results, the behavior information of specific users is learned and updated to ensure that the dynamic behavior can be described accurately.
To make sure that the correlation of appliance behavior can be effectively learned during feature extraction, the model provides the behavior information extracted by a Graph Convolutional Network (GCN) layer to each process of electricity feature extraction. The extraction method in the orange dotted frame in Fig. 3 is the feature extraction layers in SGN, which consist of one-dimensional convolutional layers and dense layers.
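For readers unfamiliar with GCN layers, the following NumPy sketch shows one graph-convolution step over the behavior correlation graph. It uses the commonly cited propagation rule with self-loops and symmetric degree normalization followed by ReLU; whether MLSGN uses this normalization (particularly given that the graph is directed) is not stated in the text, so the rule, the function name `gcn_layer`, and the feature dimensions are assumptions.

```python
import numpy as np

def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj:      (M, M) non-negative behavior correlation matrix
    features: (M, F_in) one feature row per appliance node
    weight:   (F_in, F_out) trainable projection
    """
    adj = np.asarray(adj, dtype=float)
    a_hat = adj + np.eye(adj.shape[0])        # self-loops keep each node's own features
    deg = a_hat.sum(axis=1)                   # >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)

# With no edges the layer reduces to ReLU(H W).
w = np.array([[1.0, -1.0], [2.0, 3.0]])
out = gcn_layer(np.zeros((2, 2)), np.eye(2), w)
```

The resulting node embeddings would then be combined with the electrical features produced by the convolutional and dense layers, as described above.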
4 Management strategy based on GRL
4.1 Problem formulation of HEMS
In general, residential load is divided into thermostatically controlled loads (TCL), interruptible loads (IL), transferable loads (TL), uncontrollable loads (UL), and distributed photovoltaic (PV) [20]. TCL mainly includes appliances where short-term interruptions have almost no impact on the comfort of users, e.g., air conditioners and water heaters. IL includes appliances where users often have no usage needs but still consume electricity, e.g., water dispensers. TL includes washing machines, washer dryers, dishwashers, and cookers, whose task can be delayed. UL includes lighting, etc., and unknown types of appliances also belong to UL. Although the energy consumption of these appliances cannot be adjusted, their information reveals user behavior and this can help manage the energy of other appliances more efficiently.
Thermostatically controlled load (TCL) For air conditioners and water heaters, an equivalent thermodynamic model is used to reflect the state transfer process. The equivalent thermodynamic model of air conditioners [21] and water heaters [22] can be expressed as:
The temperature of air conditioners and water heaters should be controlled within specific ranges, as:
For \(\forall {\text{m}} \in {\text{M}}_{{{\text{TCL}}}}\), the closer the indoor or water temperature \({\text{T}}_{{\text{n}}}\) to the expected temperature \({\text{T}}_{{{\text{set}}}}\), the higher the comfort, as:
Interruptible load (IL) For IL, user comfort is related to the difference between management strategy and usage habits. The greater the difference, the lower the comfort, and it can be formulated as:
where \({\text{m}} \in {\text{M}}_{{{\text{IL}}}}\). The first item of (10) represents the difference between the decisions on which appliances are used in each period of time and the habits of users, and the second item represents the behavior correlation difference between the strategy and the previous habit.
Transferable load (TL) The usage comfort of TL can also be calculated by the difference between management strategy and usage habits according to (10). The work of these appliances is expected to be finished within the set time. Once the appliance is turned on, it cannot be interrupted until the work is finished. These requirements are formulated respectively, as:
where \({\text{m}} \in {\text{M}}_{{{\text{TL}}}}\).
In this paper, the binary decision vector \({\text{x}}_{{\text{m, n}}}\) is used to represent the working condition of appliances. \({\text{x}}_{{\text{m, n}}} { = 1}\) indicates that the appliance is on, otherwise it is off. Therefore, a set of decision variables are defined as:
4.2 Design of state
The state variables of TCL, IL, and TL are formulated respectively as follows:
where these variables mainly include the working status of the appliance itself, user comfort, and corresponding usage habit information. The state of temperature comfort \({\text{s}}_{{\text{n}}}^{{{\text{TC}}}}\) and the completion progress of work \({\text{s}}_{{\text{n}}}^{{{\text{CP}}}}\) can be calculated as:
In addition to the state of various loads, system state \({\text{S}}_{{\text{n}}}\) also includes the state of the associated load \({\text{S}}_{{\text{n, cor}}}\), the predicted power of PV \({\text{P}}_{{\text{n}}}^{{{\text{PV}}}}\), the electricity price \(\rho_{{\text{n}}}\), and the outdoor temperature \({\text{T}}_{{\text{n, env}}}\), as:
\({\text{S}}_{{\text{n, cor}}}\) includes the correlation information of the two appliances with the largest correlation coefficient in controllable loads and the correlation information of the uncontrollable loads, as:
where \({\text{i}} \in {\text{M}}_{{{\text{IL}}}} \cup {\text{M}}_{{{\text{TL}}}}\), and \({\text{j}} \in {\text{M}}_{{{\text{UL}}}}\). All this correlation information includes the corresponding appliance’s id, correlation coefficient, and on–off state of the previous period.
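The selection of correlated appliances described above can be sketched as follows. The function name `correlation_state`, the tuple layout, and the use of row i of the matrix (the probability that appliance i works after each other appliance) are illustrative assumptions.

```python
def correlation_state(a, i, candidates, prev_on, k=2):
    """Correlation information for appliance i.

    Returns (id, coefficient, previous on/off state) for the k appliances in
    `candidates` with the largest a[i][j], i.e. those that appliance i most
    often follows.
    """
    others = [j for j in candidates if j != i]
    ranked = sorted(others, key=lambda j: a[i][j], reverse=True)[:k]
    return [(j, float(a[i][j]), int(prev_on[j])) for j in ranked]

# Hypothetical 4-appliance home; appliance 1 (dryer) mostly follows appliance 0.
A = [
    [0.0, 0.1, 0.0, 0.2],
    [0.8, 0.0, 0.3, 0.1],
    [0.0, 0.0, 0.0, 0.5],
    [0.1, 0.0, 0.4, 0.0],
]
state = correlation_state(A, i=1, candidates=[0, 1, 2, 3], prev_on=[1, 0, 0, 1])
```

Keeping only a fixed number of neighbors gives every agent a state vector of the same length regardless of how many appliances the home contains.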
4.3 Design of reward function
The optimization target of the HEMS is to minimize the cost of energy consumption with respect to user comfort and the constraints of appliance operation. Therefore, the reward function includes three parts: the cost of energy consumption, user comfort, and the penalty for violating the constraints, which is formulated as
where \(E_{{\text{n}}}\), \({\text{C}}_{{\text{n}}}\), and \({\text{F}}_{{\text{n}}}\) denote the energy consumption cost, comfort, and penalty, respectively. α is the energy consumption coefficient, \({\text{R}}_{{\text{o}}}\) is a positive reward offset, and \({\text{S}}_{{\text{c}}}\) is the feasible region that satisfies the constraints.
The electricity cost can be expressed as:
Comfort \({\text{C}}_{{\text{n}}}\) includes temperature comfort and usage comfort, in accordance with (9) and (10), respectively. If any of the constraints (7)–(8) and (11)–(13) is violated, a penalty of \({\text{F}}_{{\text{n}}}\) is added to the reward function.
4.4 Design of GRL model
In order to effectively use the behavior information in the behavior correlation graph, an HEMS strategy based on GRL is proposed. The designed GRL model structure is shown in Fig. 4, where each agent corresponds to one controllable appliance in the home and manages its optimal decision. The input of the model is the state of each appliance, and the output is the Q value of each action. The observation encoder layer consists of two dense layers. Since the states of different appliances are different, independent dense layers are employed to code for different types of appliances. The convolutional layer and the Q network are also made up of two dense layers, which try to collect the features of appliances and obtain the action value. Since the features have been extracted by the encoder, the parameters of the convolutional layer and Q network can be shared among different appliances. The sharing of parameters ensures the model size will not surge with an increase in the number of loads.
To ensure the effective exploration of the action to obtain strategy improvement, a random decision is performed with probability ε, and the optimal decision \(\hat{X}_{{\text{n}}}\) of the model is performed with probability 1 − ε. The exploration rate decreases \(\Delta {\upvarepsilon }\) after each epoch of training until it reaches the lowest value \({\upvarepsilon }_{{\text{d}}}\). The optimal decision \(\hat{X}_{{\text{n}}}\) can be obtained by:
where \({\text{G(}} \cdot {)}\) represents the GRL model.
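The exploration scheme can be illustrated with a short Python sketch of ε-greedy selection and its per-epoch decay. The function names are hypothetical, and the default values Δε = 0.02 and ε_d = 0.02 are the settings reported in the experiments; an initial ε of 0.65 is also taken from there.

```python
import random

def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

def decay(epsilon, delta=0.02, floor=0.02):
    """Lower the exploration rate by delta after an epoch, never below floor."""
    return max(floor, epsilon - delta)

epsilon = 0.65
for _ in range(40):          # after enough epochs epsilon stays at the floor
    epsilon = decay(epsilon)
greedy = select_action([0.1, 0.9], epsilon=0.0)   # no exploration: argmax
```

For a single appliance with on and off actions, `q_values` would hold the two Q values output by the model for that agent.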
The goal of model training is to minimize the value of the loss function. The loss function of the model training is formulated as:
where \({\text{L(}} \cdot {)}\) denotes the loss function, and \({\text{G}}^{\prime}{(} \cdot {)}\) is the target network. \({\uptheta }\) and \(\uptheta ^{\prime}\) are the parameters of each network. \({\text{S}}_{{\text{m, n}}}\) and \({\text{r}}_{{\text{m, n}}}\) represent the observation and reward values of the appliance m, respectively. \({\upgamma }\) denotes the discount rate of the reward.
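A minimal NumPy sketch of a temporal-difference loss that is consistent with these symbols is given below. It assumes the standard deep Q-learning target (reward plus discounted maximum of the target network's Q values) and a mean squared error averaged over agents; the paper's exact loss may differ, and the function and argument names are hypothetical.

```python
import numpy as np

def td_loss(q_online, q_target_next, actions, rewards, gamma=0.95):
    """Mean squared TD error over appliance agents (assumed DQN-style target).

    q_online:      (M, A) Q values G(S_{m,n}; theta) for the current observations
    q_target_next: (M, A) Q values G'(S_{m,n+1}; theta') from the target network
    actions:       (M,) index of the action each agent took
    rewards:       (M,) reward r_{m,n} received by each agent
    """
    q_online = np.asarray(q_online, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    agent_idx = np.arange(len(actions))
    q_taken = q_online[agent_idx, actions]                 # Q of the chosen actions
    target = rewards + gamma * q_target_next.max(axis=1)   # bootstrapped target
    return float(np.mean((target - q_taken) ** 2))

loss = td_loss([[1.0, 2.0]], [[0.0, 4.0]], actions=[1], rewards=[1.0], gamma=0.5)
```

In training, gradients would flow only through `q_online`, while the target network parameters are copied from the online network periodically.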
5 Performance evaluation
5.1 Datasets
To provide a fair comparison between the proposed behavior identification method and existing methods, REDD [23] and REFIT [24] datasets are used in the experiments.
The REDD dataset provides energy consumption data of six houses. To avoid the influence of insufficient samples, the data of house 1 and house 3, which is relatively sufficient, is selected in this study. To realize a reasonable comparison, the same preprocessing method in the SGN model [7] is used for the REDD dataset. The appliances in house 1 include dishwasher, fridge, microwave, and washer dryer, while those in house 3 are electronic load, dishwasher, electric furnace, fridge, microwave, and washer dryer.
The REFIT dataset provides electric power measurements from 20 households. The experiments use the first 10 houses from the official preprocessed version. Each house contains energy consumption data of 9 appliances.
To distinguish the houses in the two datasets, house 1 and house 3 in REDD are abbreviated as B1 and B3, and the first 10 houses in REFIT are denoted by H1–H10, respectively.
For energy management strategy, because neither of these two datasets contains all types of appliances studied in this paper, the behavior information is constructed by the combination of dataset extraction and behavior customization. In order to ensure the authenticity of behavior information, energy consumption data with the house id of "1240" in the Pecan Street dataset [25] is chosen to extract the behavior. This contains the data of all the transferable loads in this study for 6 months. Additionally, the usage probability of a water dispenser in each period is customized based on the habits of most users. The lighting and appliances in the bedroom are regarded as the uncontrollable loads in this study, while it is assumed that water heaters and air conditioners will not be turned off without an HEMS. In addition, it is also necessary to consider the uncertainties of PV output, electricity price, outdoor temperature, and user demand. Thus, some disturbances are added according to [20].
5.2 Data preprocess
For the MLSGN model, the remaining data of REDD and REFIT are used to construct a prior behavior correlation graph as shown in Fig. 5. The threshold \({\uptau }\) for the process of construction is set to 0.25. The labels in the dataset and their abbreviations are shown in Table 1.
The initial behavior correlation graph for behavior identification is a subgraph extracted from the prior graph, and the extraction method is to set to 0 the edges related to appliances that the user does not have. To compare with the SGN model more fairly, the processing methods of SGN in data preprocessing and some parameter selections are followed. The inputs of MLSGN are the power sequence with length of 512 and the behavior correlation graph. The retention ratio \({\uplambda }\) is set to 0.95. The output of the model is the on–off state of each appliance at the midpoint of the sequence. The working power threshold of each appliance is set as 15 W. Additionally, the aggregated data is normalized by the Z-score method before training [26], and the appliance data is normalized by the max–min method [27]. The model is trained by the Adam algorithm [28], and the loss function can be expressed as:
where \({\text{o}}_{{\text{m}}}\) and \({\hat{\text{o}}}_{{\text{m}}}\) represent the on–off state of appliance m and the predicted working probability, respectively.
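The preprocessing and training objective described here can be sketched in NumPy as follows. The window length of 512 and the 15 W threshold come from the text; the helper names, the use of a strict "greater than" comparison, and the choice of binary cross-entropy as the multi-label loss are assumptions for illustration.

```python
import numpy as np

def z_score(x):
    """Z-score normalization of the aggregated power sequence."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def midpoint_labels(appliance_power, window=512, threshold_w=15.0):
    """On/off label of every appliance at the midpoint of each sliding window.

    appliance_power: (T, M) per-appliance power in watts.
    Returns an array of shape (T - window + 1, M).
    """
    power = np.asarray(appliance_power, dtype=float)
    first_mid = window // 2
    last_mid = power.shape[0] - window + window // 2
    mids = np.arange(first_mid, last_mid + 1)
    return (power[mids] > threshold_w).astype(int)

def bce(o, o_hat, eps=1e-7):
    """Binary cross-entropy averaged over labels (assumed loss form)."""
    o = np.asarray(o, dtype=float)
    o_hat = np.clip(np.asarray(o_hat, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(o * np.log(o_hat) + (1.0 - o) * np.log(1.0 - o_hat)))

# Tiny example with a window of 4 samples instead of 512.
power = np.array([[0, 0], [20, 0], [30, 5], [0, 100], [0, 120], [0, 0]])
labels = midpoint_labels(power, window=4)
loss = bce([1, 0], [0.5, 0.5])
```

Each row of `labels` is the multi-label target for one input window, so a single model output covers all appliances at once.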
For the GRL model, time step \({\Delta t}\) is equal to 1 h and N is 24. The behavior correlation graph of GRL and the usage probability of each appliance are shown in Figs. 6 and 7, respectively. These are calculated from the data in the Pecan Street dataset.
The parameters of loads such as the air conditioner and water heater are shown in Table 2, while Table 3 shows the parameters of the transferable loads, in accordance with [20, 29]. The curves of PV output, temperature, water demand, and electricity price used in the simulation are plotted in Fig. 8.
The exploration rate ε is 0.65, \(\Delta {\upvarepsilon }\) is 0.02, \({\upvarepsilon }_{{\text{d}}}\) is 0.02, and the discount rate \({\upgamma }\) is 0.95. These parameters are the optimal values selected after multiple experiments and comparative analyses. The injected water temperature is 8 °C. The temperature ranges of air conditioner and water heater are between 23–28 °C and 54–70 °C, respectively. The volume of the water heater tank V is 40 gallons, the average rated power of lighting and bedroom appliances are 0.1 kW and 0.8 kW, respectively. The penalty value \({\text{F}}_{{\text{n}}}\) is 10, and the reward offset \({\text{R}}_{{\text{o}}}\) is 10.
5.3 Evaluation method for behavior identification
Hamming loss (HL), accuracy (Acc), and F1-score are used to evaluate the MLSGN model [30]. Hamming loss \({\text{L}}_{{{\text{HL}}}}\) is a classical evaluation metric for multi-label classification that reflects the misclassification rate of the model, and can be calculated as:
where \({\text{N}}_{{\text{H}}}\) is the number of samples. The effect of \({1(} \cdot {)}\) is the logical judgement of whether \({\hat{\text{o}}}_{{\text{n}}} \ne {\text{o}}_{{\text{n}}}\). If it is true, it equals 1, otherwise it is 0. \({\text{L}}_{{{\text{HL}}}}\) is the error rate, and the smaller the value, the more accurate the prediction.
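As a small worked example, Hamming loss can be computed in NumPy as the fraction of appliance labels that are predicted incorrectly. The 0.5 cutoff used to turn predicted probabilities into on/off decisions and the function name are assumptions of this sketch.

```python
import numpy as np

def hamming_loss(o_true, o_prob, cutoff=0.5):
    """Fraction of (sample, appliance) labels that are misclassified."""
    o_true = np.asarray(o_true)
    o_pred = (np.asarray(o_prob) >= cutoff).astype(int)
    return float(np.mean(o_pred != o_true))

# Two samples, two appliances: one of the four labels is wrong.
hl = hamming_loss([[1, 0], [0, 1]], [[0.9, 0.2], [0.6, 0.7]])
```

Here the first appliance is wrongly predicted as on in the second sample, giving an error rate of 0.25.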
Acc and F1-score are commonly used in single-label classification. Both values range from 0 to 1, and the larger the value, the better the performance.
5.4 Behavior identification result
To demonstrate that the proposed MLSGN model not only outperforms the traditional multi-label classification methods, but also has stronger recognition ability than the original single-label model, it is compared with two classical multi-label classification models, the multi-label k-nearest neighbor algorithm (MLKNN) [31] and the random k-label sets algorithm (RAKEL) [32], and with the single-label model SGN. To further assess its performance, it is also compared with the load disaggregation with attention model (LDWA) [8], which is a more advanced model built on SGN using an attention technique.
Figure 9 shows the behavior correlation graph of the 12 households obtained when the experiments are performed on the latest data. The thicker edges in the graph represent the greater weights. The experimental results of the proposed behavior identification model on the 12 houses are shown in Table 4, where the best performance of each result is highlighted in bold.
The experimental results show that the performance of the SGN, LDWA and MLSGN models is significantly better than that of MLKNN and RAKEL. Moreover, except for B3 of REDD and H10 of REFIT, the experimental results of the other houses indicate that the improved MLSGN model achieves better results in terms of Hamming loss, accuracy and F1-score. The average recognition accuracy of MLSGN reaches 93.2%. From the experimental results of B3 and H10, it can be inferred that considering the behavior correlation of appliances does not always improve the recognition accuracy and may even degrade performance. This is attributed to the overfitting of correlation features caused by the small number of appliances in these two houses.
5.5 Results evaluation of GRL
This section presents numerical simulation results to evaluate the performance of the NILM-based HEMS. Users’ energy consumption is simulated considering uncertainties of environment and usage behavior. The convergence of average Q value and constraints violations in training are shown in Fig. 10. As seen, each agent can converge to the maximum Q value after training. At the beginning of the training, the average Q value of the agent is negative because it is easy to violate the constraints (7)–(8) and (11)–(13). After continuous exploration, the agent gradually learns how to produce more proper actions, and the average Q value gradually increases from negative to positive. Finally, it converges to the maximum Q value, while the violation of constraints also gradually disappears and user comfort is ensured.
After the model converges, the optimal management strategy of appliances can be carried out. To evaluate the proposed method’s performance on user comfort, both the proposed method and a double deep Q-network (DDQN) [33] are applied to an HEMS with varying energy coefficients. The comfort can be calculated by (9), (10) and (22). For a more intuitive analysis of comfort level, the electricity cost \(E_{n}\) is set to 0. Figure 11 shows the comfort results. It can be concluded that the proposed method outperforms DDQN on user comfort across all energy coefficients, and user comfort declines with increasing energy coefficient, indicating that users can trade off comfort for lower energy consumption.
Similarly, the energy consumptions of the proposed method and DDQN under different energy coefficients are compared, as shown in Fig. 12. It can be inferred that the two methods have comparable energy consumption under different energy coefficients. Compared with the case without HEMS, the daily cost of applying the proposed method is significantly reduced, by 15.9%, 18.3%, and 18.7%, respectively. Additionally, the larger the energy coefficient, the smaller the daily cost. The effect of α is to balance energy saving and comfort, and greater energy coefficient means that users are more concerned about saving energy than comfort. Therefore, the energy coefficient can be set according to users’ usage preference. From the aforementioned experimental results, it can be concluded that the proposed method balances comfort and energy consumption better under uncertainty.
The total energy consumption of all loads in three different scenarios is compared when α is set to 0.5, as shown in Fig. 13. The three scenarios are: using the proposed HEMS method, using the DDQN-based HEMS method, and without HEMS. The time-of-use price for the day is also shown in the figure. As shown in Fig. 13a, when the HEMS method proposed in this work is deployed, the loads consume more energy when the price is low, and reduce the demand when the price is high. Specifically, the transferable loads avoid working at peak times, and their energy demand is postponed to the period of moderate electricity price between 10:00 and 16:00. Thermostatically controlled loads are not only expected to reduce the energy cost, but also to ensure that the temperature is kept within the comfort range. The variation curves of the indoor temperature and water temperature of the water heater are shown in Fig. 14. It can be concluded that under the influence of various uncertain factors, the temperature can be maintained within the set range, and the energy consumption during peak times can also be effectively controlled. In contrast, in the case without HEMS, as shown in Fig. 13c, most transferable loads work at peak times, which results in higher cost. In addition, the proposed method also ensures the rationality of the management. Comparing with Fig. 13b, it can be seen that after applying the proposed method, the clothes dryer always works after the washing machine, and the dishwasher generally works after the cooker. This validates that the proposed method combined with behavior correlation can better deal with energy management than the method without incorporating correlations.
6 Conclusion
In this study, a novel method for residential electricity behavior identification and energy management based on graph representation learning is presented. The proposed method constructs and updates a graph that captures users’ electricity usage habits, and leverages an improved multi-label NILM method to identify their behavior. Moreover, the method proposes an HEMS strategy based on GRL, one which addresses the anomaly management problem arising from ignoring appliance correlation in conventional methods. The proposed method can adapt to users’ changing behavior by online updating of the graph, and assist them in continuous energy management.
The proposed method is evaluated through simulations which demonstrate its superior performance in behavior identification and HEMS. The proposed method has two main advantages over existing ones. First, it achieves a high average recognition accuracy of 93.2% in the experiments, demonstrating its effectiveness in behavior identification. Second, it reduces the average electricity cost for users by 18.3%, while maintaining a high level of user comfort and satisfaction, and making management decisions that match user preferences. Therefore, the method balances user comfort and energy cost better than other methods.
In future work, we will continue to tackle the overfitting caused by the few-shot learning problem to further improve the generalization performance of behavior identification. It is also important to migrate the proposed method to practical software and hardware systems.
Availability of data and materials
Not applicable.
Abbreviations
 AC/WH/PV:

Air conditioner/water heater/photovoltaic
 DDQN:

Double deep Q-network
 GRL:

Graph reinforcement learning
 HEMS:

Home energy management system
 IL/TL/UL:

Interruptible/transferable/uncontrollable load
 LDWA:

Load disaggregation with attention
 MLSGN:

Multi-label subtask gated network
 MLKNN:

Multi-label k-nearest neighbor algorithm
 NILM:

Non-intrusive load monitoring
 RAKEL:

Random k-labelsets algorithm
 SGN:

Subtask gated network
 TCL:

Thermostatically controlled load
 \({\text{A}}_{{{\text{m}}^{\prime}{\text{, m}}}}\) :

Probability that appliance m works after appliance m'
 \({\text{C}}_{{{\text{AC}}}} {\text{/C}}_{{{\text{WH}}}}\) :

Equivalent thermal capacity of AC/WH
 \({\text{C}}_{{\text{m, n}}}\) :

Comfort value of appliance m in the nth period
 \(E_{{\text{n}}} {\text{/C}}_{{\text{n}}} {\text{/F}}_{{\text{n}}}\) :

Electricity cost/comfort/penalty
 M :

Set of all loads
 \({\text{M}}_{{{\text{UL}}}} {\text{/M}}_{{{\text{TCL}}}}\) :

Set of UL/TCL
 \({\text{M}}_{{{\text{IL}}}} {\text{/M}}_{{{\text{TL}}}}\) :

Set of IL/TL
 N :

Total number of management steps in one day
 \({\text{P}}_{{\text{m}}}\) :

Rated power of appliance m
 \({\text{P}}_{{{\text{AC}}}} {\text{/P}}_{{{\text{WH}}}}\) :

Rated power of AC/WH
 \({\text{P}}_{{\text{n}}}^{{{\text{PV}}}}\) :

Predicted power of PV
 \({\text{R}}_{{{\text{AC}}}} {\text{/R}}_{{{\text{WH}}}}\) :

Equivalent thermal resistance of AC/WH
 \({\text{S}}_{{\text{n}}}^{{{\text{TCL}}}} /{\text{S}}_{{\text{n}}}^{{{\text{IL}}}} /{\text{S}}_{{\text{n}}}^{{{\text{TL}}}}\) :

Self state of TCL/IL/TL
 \({\text{S}}_{{\text{n, cor}}}\) :

State of the correlated load
 \({\text{S}}_{{\text{n, i, cor}}}^{{(1)}}\) :

Correlation information of appliance with the largest correlation coefficient
 \({\text{S}}_{{\text{n, i, cor}}}^{{(2)}}\) :

Correlation information of appliance with the second largest correlation coefficient
 \({\text{S}}_{{\text{n, i, cor}}}^{{{\text{UL}}}}\) :

Correlation information of appliance with the largest correlation coefficient in UL
 \({\text{T}}_{{\text{n}}}^{{{\text{AC}}}} {\text{/T}}_{{\text{n, env}}}\) :

Current indoor/outdoor temperature
 \({\text{T}}_{{\text{n}}}^{{{\text{WH}}^{\prime}}}\) :

Current water temperature without considering the injected water
 \({\text{T}}_{{\text{n}}}^{{{\text{WH}}}} {\text{/T}}_{{\text{n, inject}}}^{{{\text{WH}}}}\) :

Current heated/injected water temperature
 \({\text{T}}_{{{\text{min}}}}^{{{\text{AC}}}} {\text{/T}}_{{{\text{max}}}}^{{{\text{AC}}}}\) :

Lower/upper bounds of indoor temperature
 \({\text{T}}_{{{\text{min}}}}^{{{\text{WH}}}} {\text{/T}}_{{{\text{max}}}}^{{{\text{WH}}}}\) :

Lower/upper bounds of water temperature
 \({\text{V/V}}_{{\text{n, demand}}}\) :

Volume of water heater tank/water demand
 \({\text{c}}\) :

Working times required in one day
 \({\text{p}}_{{\text{n}}}\) :

Usage probability in the nth period
 \({\text{s}}_{{\text{n}}}^{{{\text{MP}}}}\) :

State of management permission (1/0)
 \({\text{s}}_{{\text{n}}}^{{{\text{CP}}}}\) :

Completion progress of work in one day
 \({\text{s}}_{{\text{n}}}^{{{\text{TC}}}}\) :

State of temperature comfort
 \({\text{s}}_{{\text{n}}}^{{{\text{RM}}}}\) :

Remaining controllable times in one day
 \({\text{t}}_{{\text{m, start}}}\) :

Start time of appliance m
 \({\text{t}}_{{\text{m, min}}} /{\text{t}}_{{\text{m, max}}}\) :

Lower/upper bounds of controllable time
 \({\text{x}}_{{\text{n}}}\) :

On/off decision in the nth period (1/0)
 \({\text{x}}_{{\text{m, n}}}\) :

On/off decision in the nth period of appliance m (1/0)
 \(\rho_{{\text{n}}}\) :

Time-of-use price in the nth period
 \(\Delta {\text{t}}\) :

Length of time step
 \(\Delta {\text{t}}_{{\text{m}}}\) :

Length of working time required in one day
References
Pérez-Lombard, L., Ortiz, J., & Pout, C. (2008). A review on buildings energy consumption information. Energy & Buildings, 40(3), 394–398.
Zhang, D., Yao, L., & Ma, W. (2013). Development strategies of smart grid in China and abroad. Proceedings of the CSEE, 31(31), 2–14.
Liu, S., Zhou, C., Guo, H., Shi, Q., Song, T. E., Schomer, I., & Liu, Y. (2021). Operational optimization of a building-level integrated energy system considering additional potential benefits of energy storage. Protection and Control of Modern Power Systems, 6(1), 1–10.
Zhixin, Fu., Ziyan, Li., Junpeng, Z., & Yue, Y. (2022). Multi-user multi-timescale power packages and home energy optimization strategies. Power Systems Protection and Control, 50(11), 21–31.
Çimen, H., Çetinkaya, N., & Vasquez, J. C. (2021). A microgrid energy management system based on non-intrusive load monitoring via multi-task learning. IEEE Transactions on Smart Grid, 12(2), 977–987.
Lin, Y. H., & Tsai, M. S. (2017). An advanced home energy management system facilitated by non-intrusive load monitoring with automated multi-objective power scheduling. IEEE Transactions on Smart Grid, 6(4), 1839–1851.
Shin, C., Joo, S., Yim, J., Lee, H., & Rhee, W. (2019). Subtask gated networks for non-intrusive load monitoring (Vol. 33, pp. 1150–1157).
Piccialli, V., & Sudoso, A. M. (2021). Improving non-intrusive load disaggregation through an attention-based deep neural network. Energies, 14, 847.
Xiu, Y., An, Li., Gaiping, S., et al. (2022). Non-invasive load monitoring based on an improved GMM-CNN-GRU combination. Power Systems Protection and Control, 50(14), 65–75.
Lu, R., Hong, S. H., & Yu, M. (2019). Demand response for home energy management using reinforcement learning and artificial neural network. IEEE Transactions on Smart Grid, 10(6), 6629–6639.
Berk, C., Robin, R., Siddharth, S., David, B., & Abdellatif, M. (2017). Electric energy management in residential areas through coordination of multiple smart homes. Renewable and Sustainable Energy Reviews, 80, 260–275.
Zhai, S., Zhou, H., Wang, Z., et al. (2020). Analysis of dynamic appliance flexibility considering user behavior via non-intrusive load monitoring and deep user modeling. CSEE Journal of Power and Energy Systems, 6(1), 41–51.
Peng, B., Pan, Z., Yu, T., et al. Graph data modeling and graph representation learning methods and their application in non-intrusive load monitoring problem. Proceedings of the CSEE (in Chinese).
Nalmpantis, C., & Vrakas, D. (2020). On time series representations for multi-label NILM. Neural Computing and Applications, 32, 17275–17290.
Kong, W., Dong, Z. Y., Hill, D. J., Ma, J., Zhao, J. H., & Luo, F. J. (2016). A hierarchical hidden Markov model framework for home appliance modeling. IEEE Transactions on Smart Grid, 9, 3079–3090.
He, D., Lin, W., Liu, N., & Harley, R. G. (2013). Incorporating non-intrusive load monitoring into building level demand response. IEEE Transactions on Smart Grid, 4(4), 1870–1877.
Lam, H. Y., Fung, G., & Lee, W. K. (2007). A novel method to construct taxonomy of electrical appliances based on load signatures. IEEE Transactions on Consumer Electronics, 53(2), 653–660.
Tabatabaei, S. M., Dick, S., & Xu, W. (2017). Toward non-intrusive load monitoring via multi-label classification. IEEE Transactions on Smart Grid, PP(1), 1–1.
Singhal, V., Maggu, J., & Majumdar, A. (2018). Simultaneous detection of multiple appliances from smart-meter measurements via multi-label consistent deep dictionary learning and deep transform learning. IEEE Transactions on Smart Grid, 10, 2969–2978.
Su, Y., Zhou, Y., & Tan, M. (2020). An interval optimization strategy of household multi-energy system considering tolerance degree and integrated demand response. Applied Energy, 260, 114.
Lei, Y. U., Tang, Q., & Zhang, J. (2015). Optimal operation for residential microgrids based on load resources classification modelling and heuristic strategy. Power System Technology, 39, 2180–2187.
Du, P., & Ning, L. (2012). Appliance commitment for household load scheduling. In Transmission & distribution conference & exposition. IEEE.
Kolter, J. Z., & Johnson, M. J. (2011). REDD: A public data set for energy disaggregation research. In Artificial intelligence (Vol. 25).
Murray, D. (2015). A data management platform for personalised real-time energy feedback. In Proc. 8th int. conf. energy efficiency domestic appl. lighting (EEDAL) (pp. 1–15).
Pecan Street Inc. Dataport. https://dataport.pecanstreet.org/data.
Al Shalabi, L., Shaaban, Z., & Kasasbeh, B. (2006). Data mining: A preprocessing engine. Journal of Computer Science, 2(9), 735–739.
Xia, M., Liu, W., Wang, K., Zhang, X., & Xu, Y. (2019). Non-intrusive load disaggregation based on deep dilated residual network. Electric Power Systems Research, 170, 277–285.
Kingma, D., & Ba, J. (2014). Adam: A method for stochastic optimization. In ICLR 2015.
Wang, J., Li, Y., & Zhou, Y. (2016). Interval number optimization for household load scheduling with uncertainty. Energy & Buildings, 130(Oct), 613–624.
Lin, W. Z., Fang, J. A., Xiao, X., et al. (2013). iLoc-Animal: A multi-label learning classifier for predicting subcellular localization of animal proteins. Molecular BioSystems, 9(4), 634–644.
Zhang, M. L., & Zhou, Z. H. (2006). Multi-label neural networks with applications to functional genomics and text categorization. IEEE Transactions on Knowledge and Data Engineering, 18(10), 1338–1351.
Tsoumakas, G., Katakis, I., & Vlahavas, I. (2010). Random k-labelsets for multi-label classification. IEEE Transactions on Knowledge and Data Engineering, 23(7), 1079–1089.
Hasselt, H. V., Guez, A., & Silver, D. (2015). Deep reinforcement learning with double Q-learning. Computer Science.
Acknowledgements
Not applicable.
Author information
Xinpei Chen
received the B.Eng. degree in electrical engineering from the South China University of Technology, Guangzhou, China, in 2020, where he is currently pursuing the M.Eng. degree with the School of Electric Power Engineering. His research interests include artificial intelligence techniques and their application in smart grid.
Tao Yu
received the B.Eng. degree in electrical power system from Zhejiang University, Hangzhou, China, in 1996, the M.Eng. degree in hydroelectric engineering from Yunnan Polytechnic University, Kunming, China, in 1999, and the Ph.D. degree in electrical engineering from Tsinghua University, Beijing, China, in 2003. He is currently a Professor with the College of Electric Power, South China University of Technology, Guangzhou, China. His research interests include nonlinear and coordinated control theory, artificial intelligence techniques, and operation of power systems.
Zhenning Pan
received the B.Eng. and Ph.D. degrees in electrical engineering from the South China University of Technology, Guangzhou, China, in 2016 and 2021, respectively. His major research interests include intelligent operation and optimization of smart grid, and demand response.
Zihao Wang
received the B.Eng. degree in electrical engineering from Hunan University, Changsha, China, in 2020, where he is currently pursuing the M.Eng. degree with the School of Electric Power Engineering. His research interests include intelligent terminals and topology identification of low-voltage distribution networks.
Shengchun Yang
received the B.S. degree from Huazhong University of Science and Technology, Wuhan, China, in 1995, the M.S. degree from Nanjing Automation Research Institute, Nanjing, China, in 1998, and the Ph.D. degree from Huazhong University of Science and Technology, Wuhan, China, in 2016. He is currently an associate director at the China Electric Power Research Institute. His research interests include demand response and AI applications in power system operations with high penetration of flexible load and renewable generation.
Funding
This work is supported by State Grid Corporation of China Project “Research on Coordinated Strategy of Multi-type Controllable Resources Based on Collective Intelligence in an Energy” (5100202055479A0000).
Contributions
XC carried out the theoretical analysis and performed the simulations and experiments to verify the proposed method. TY and ZP offered help in theory and practice, and read the paper and put forward suggestions. ZW contributed to the electrical model simulation experiment. SY guided and assisted the manuscript revision and improvement. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chen, X., Yu, T., Pan, Z. et al. Graph representation learning-based residential electricity behavior identification and energy management. Prot Control Mod Power Syst 8, 28 (2023). https://doi.org/10.1186/s41601-023-00305-x
DOI: https://doi.org/10.1186/s41601-023-00305-x