An explainable framework for load forecasting of a regional integrated energy system based on coupled features and multi-task learning

To extract strong correlations between different energy loads and improve the interpretability and accuracy for load forecasting of a regional integrated energy system (RIES), an explainable framework for load forecasting of an RIES is proposed. This includes the load forecasting model of RIES and its interpretation. A coupled feature extracting strategy is adopted to construct coupled features between loads as the input variables of the model. It is designed based on multi-task learning (MTL) with a long short-term memory (LSTM) model as the sharing layer. Based on SHapley Additive exPlanations (SHAP), this explainable framework combines global and local interpretations to improve the interpretability of load forecasting of the RIES. In addition, an input variable selection strategy based on the global SHAP value is proposed to select input feature variables of the model. A case study is given to verify the effectiveness of the proposed model, constructed coupled features, and input variable selection strategy. The results show that the explainable framework intuitively improves the interpretability of the prediction model.


Introduction
A regional integrated energy system (RIES) is defined as a region with various forms of energy supply, conversion and storage devices, where the consumption of different energies such as electricity, cooling, and heating are coupled with each other. An RIES is generally a unified management area, such as industrial parks, commercial buildings, campuses, and other multi-energy joint operation systems, areas which allow for unified access and analysis of different energy data. Accurate and reliable load forecasting is the foundation for optimal dispatching and energy management of an RIES, as it is used to provide data support for developing energy trading plans, energy system security assessment, and demand-side management. Therefore, an effort to improve the accuracy of load forecasting is meaningful for the development of an RIES.
Because of the strong correlation between different types of energy consumption, the planning and operation of an RIES are more complex than a traditional single type of energy system [1,2]. Two types of correlations exist in an RIES: intra-coupled relation (the correlations between load and its own power) and inter-coupled relation (the correlations between two different energy loads). This means that the demand for different types of loads is not only related to their own historical loads, but also to the coupling relations (the correlations and influences of two or more elements or systems) between other energy type loads. This brings new challenges and ideas for load forecasting of different energy types. For RIES load forecasting, traditional methods are mainly based on the forecasting methods of electric load such as wavelet neural networks [3], auto-regressive integrated moving average (ARIMA) [3], Bayesian regression model [4], and Grey model [5], etc. However, these studies predict cold, heat, and electric loads separately, and lack the exploration and utilization of the coupling relations between different load forecasting tasks. Therefore, the impact that changes in one type of energy demand exert on other types of energy demand cannot be determined, and this has a negative impact on the accuracy of load forecasting for the RIES. In recent years, deep learning has been widely used in load forecasting with its strong learning capability and data feature extraction ability [6], with techniques such as convolutional neural networks (CNNs) [7,8], long short-term memory (LSTM) [9,10], recurrent neural networks (RNNs) [11], and deep belief networks (DBNs) [12], etc.
To extract the coupling relations between load features, reference [8] uses a CNN to excavate and fuse the load features in high-dimensional space, and feeds the fused features into LSTM in a time-series manner. A dynamic time series feature extraction method based on LSTM is proposed to excavate the relevance among the series in different times by reconstructing the input sequence through the encoder-decoder model in [10]. Experimental results of the methods proposed in [8] and [10] show that the prediction accuracy can be improved by using coupling information between different energy loads. However, these methods usually build the model based on original features (OFRs) obtained from the historical load data directly as the input of the load forecasting model, but rarely consider the internal coupling relations between OFRs. The load forecasting model is then unable to make full use of the intrinsic relations at the feature level to obtain better performance. To fully extract and quantify the coupling relations, a method of constructing coupled feature representation (CFR) is proposed to quantify the global dependence of continuous features in [13]. In [14], coupled features of different image data based on CFR are proposed for multi-modal Alzheimer's disease diagnosis. This achieves higher diagnostic accuracy than existing methods. Inspired by these studies and considering the continuity of loads data and strong correlations between cold, heat, and electric loads, this paper introduces CFR to extract the intrinsic relations between OFRs.
To extract coupling relations between different load forecasting tasks, attention is mainly paid to MTL, which puts multiple related tasks together to learn and realize information sharing to make use of the coupling relations between different tasks. Both [15] and [16] achieve good prediction results based on MTL, and this proves the advantage of MTL in the field of multi-type load prediction. However, deep learning's opaque structure might hinder further analysis or improvement of the model.
The planning and operation of an RIES highly depend on accurate and reliable load forecasting. Interpretability means finding out the key features that affect the prediction results and the basis of prediction decision. This can then provide a reference for the review of the prediction results and model improvement. Previous work on the interpretability of the deep learning model include sensitivity analysis [17], surrogate model [18], and SHAP [19]. The sensitivity analysis and surrogate model are local interpretation methods, which can only interpret the output results of an individual sample. Thus, they prevent analysis of the importance of input features to the model based on global samples. In addition, the effect of LIME is limited by the surrogate model itself, and the similarity between the surrogate model and target model is also difficult to measure. SHAP, proposed in [19], is an additive feature attribution method based on game theory, and can be used for local and global interpretation of arbitrary models. Some studies on the interpretability of load forecasting have been carried out. For example, based on SHAP, an explainable framework is built in [20] to explain the random forest prediction model for failure modes of reinforced concrete members, while [21] establishes an attribution analysis framework for transient voltage stability evaluation. Experimental results show that the SHAP value can quantify the relationships between input features and output results, and facilitate the understanding of model decisions and complex nonlinear relationships in the model [20][21][22].
In this paper, for better performance and interpretability of the load forecasting model of an RIES, an explainable framework is proposed which consists of two main parts: the load forecasting model and its interpretation. To extract the strong coupling relations between different energy loads at the feature level, CFR proposed in [13] is introduced. Its main idea is to quantify the coupling relations between OFRs by constructing the representation containing intra-coupled relation and inter-coupled relation as part of the input variables of the prediction model. The MTL model with LSTM as sharing layer (LSTM-MTL) is adopted as the base model to excavate coupling relations between different energy loads at the forecasting task levels. To improve the interpretability of the load forecasting model, SHAP is introduced to provide global and local interpretation. The global interpretation can reveal the relationships between the values of features and prediction results, and measure the contribution of features to the different load forecasting tasks. This can provide a basis for selecting input features of the load forecasting model. For the local interpretation, it is used to provide details of the contributions of input feature variables to the load prediction of an individual sample. Experiments based on load datasets provided by Arizona State University Integrated Energy System validate the feasibility of the framework designed in this paper.
The main contributions of this paper are as follows: (1) By introducing the calculation representation of the coupled features as the model input variables and constructing the LSTM-MTL model based on hard parameter sharing, the coupling relations between different energy loads can be excavated at the feature level and the forecasting task level, respectively. (2) Global and local interpretation based on SHAP is proposed to analyze the difference in the contribution of each feature to prediction results and the relationships between input features and prediction results, so as to improve the interpretability of the load forecasting model of an RIES. (3) An input variable selection strategy based on the global SHAP value is proposed to select input feature variables. This helps eliminate the overfitting and over-computation of the model by removing the input features that contribute less to the prediction model.
The rest of this paper is organized as follows. In Sect. 2, the details of the CFR construction method and LSTM-MTL load forecasting model are provided. The interpretation method of the LSTM-MTL model and its input variable selection method based on SHAP are provided in Sect. 3, while Sect. 4 summarizes features selection and the experimental process of the explainable framework. Results of the proposed explainable framework are presented in Sect. 5, where the implementation of the framework is discussed. Section 6 concludes the paper and discusses future directions.

Proposed RIES load forecasting model
In order to extract the coupling relations between different energy loads in an RIES from the perspectives of input features and forecasting tasks, a coupled feature construction method based on CFR is introduced to extract the feature-level intrinsic relations between input loads. It combines the representation of intra-coupled and inter-coupled relations as part of the input variables of the load forecasting model. The load prediction model based on LSTM-MTL is built to excavate the coupling relations between different forecasting tasks. Given that the LSTM model has a strong dynamic feature extraction ability of time series and the fully connected network is suitable for the final fitting of the model after feature extraction, LSTM is adopted as the sharing layer, and the fully connected network is used for each subtask in the LSTM-MTL model as shown in Fig. 1.

Coupled features representation (CFR)
CFR can capture the global dependence of continuous features by constructing intra-coupled relations and inter-coupled relations [13]. In this paper, the coupled features between cold, heat, and electric loads are constructed based on CFR. The process of constructing coupled features is listed as follows (the superscripts C, H, and P represent cold, heat, and electric loads, respectively).  where z iC , z iH , and z iP represent the cold, heat, and electric load values of the ith sample, respectively. z e ij represents the eth (e = 1, 2, · · ·, E) power of z ij , e.g., z 2 ij represents the square of z ij .E is the maximal expansion number.
(2) Calculate the intra-coupled relation of cold, heat, and electric loads, and the intra-coupled relation matrix is defined as (taking the cold load as an example): where , z e 2C , · · · , z e MC ] T is the eth column vector of expansion matrix Z. Suppose that there are M data pairs x i , y i . The PCC is calculated as: The closer the value of ρ 2 is to 1, the stronger the correlations between the two variables, and the positive and negative values of ρ represent positive and negative correlation, respectively.
(3) Calculate the inter-coupled relation between cold, heat, and electric loads, and the inter-coupled relation matrix is defined as (taking cold load as an example): where HC e 1 e 2 is the PCC between Z e 1 iH and Z e 2 iC (e 1 , e 2 = 1, 2, · · ·, E) of expansion matrix Z. (1) For the cold load in the i th sample, its intra-coupled extension matrix is defined as: (5) Its inter-coupled extension matrix with heat and electric loads is defined as: The coefficient matrix w is introduced to set the weights of PCC between different vectors. For the i th sample, w = [(1/1!), (1/2!), · · · , (1/E!)] and the CFR of cold load can be defined as: where ⊙ is Hadamard product. u i (C) uses a Taylorlike series to quantify the global dependency, since any analytic function can be approximated by a Taylor polynomial [13]. Although the above steps are illustrated for cold load, they can also be applied directly to heat load and electric load to get u i (H ) and u i (P). (7) The CFR of the ith sample can be represented as: where u i includes 3 × E coupled feature variables (CF). u i (C) , u i (H) , and u i (P) represent the CFR of cold load, heat load, and electric load, respectively. The coupled feature construction algorithm has the following characteristics: (1) The algorithm is an unsupervised learning method with strong generalization ability; (2) The robustness of the algorithm can be improved using global data; (3) The size of the power can be set flexibly according to the actual situation to extract the coupling relations; (4) The coupled features have physical interpretability and can be quantified to provide data support for subsequent interpretability research.
Previous study has shown that the maximal expansion number E = 3 or 4 is large enough to capture the global couplings of attributes [13]. Thus, in the following experiments, the extended power E is taken as 3 to avoid excessive computation. A total of 9 coupled features are extracted, called CFR1-9. CFR1-3 mainly describe the coupling relations between cold load and other loads, while CFR4-6 and CFR7-9 describe the coupling relations of heat load and electric load, respectively.

LSTM-MTL model
A load forecasting model based on LSTM-MTL is proposed to learn the complex coupling information between cold, heat, and electric load forecasting tasks. The LSTM-MTL model adopts a hard parameter sharing mechanism, which is less prone to overfitting in the case of strong coupling of the loads in the RIES prediction model with many model parameters and complex structure [23]. The core concept of LSTM is gate structure, which decides whether data should be discarded, controls information to be added to the LSTM unit, and outputs unit state information by introducing forget gate f t , input gate i t , and output gate o t , respectively [24]. The formulas involved are: where the meanings of the relevant variables in (9)(10)(11)(12)(13)(14) are described in [24].
In the LSTM-MTL load forecasting model, the input vector is considered as X = (x 1 , x 2 , · · ·, x t ) , and is transmitted to the LSTM sharing layer, where the coupling relations among the three load prediction subtasks are captured through sharing underlying parameters. Then, the output vector H = (h 1 , h 2 , · · ·, h t ) is connected to the linear layer of the three subtasks and the output result is defined as: where Relu is the activation function, W is the weight matrix, and b is the bias. y k i is the kth (k = 1, 2, and 3 represents cold, heat, and electric load forecasting task, respectively) subtask output result, and the different toplevel parameters can be obtained through the linear layer of each subtask.
Finally, the loss function of the kth subtask is defined as L k (θ share , θ k ) , where θ share and θ k represent the sharing parameters among subtasks and the unique parameters of the k th subtask, respectively. The integral optimization loss function of LSTM-MTL is defined as: where T = 3 represents that the model has three subtasks in total, and a k is the weight coefficient of the k th subtask.
Finding the suitable value of a k to balance the contribution of each task is the key to MTL. The three prediction (9) tasks of cold, heat, and electric are strongly coupled, and the goal of all the three prediction tasks is to minimize the prediction error with the same evaluation index MAE, which makes the three prediction tasks similar. In the following experiments, the goal is to minimize the overall prediction error of the three tasks, and we preset the a k of each task to 0.4, 0.2, and 0.4, respectively, according to the peak ratio of the three loads in the RIES. In addition, the prediction errors are compared by setting different weights as shown in Table 6 in the Appendix. By training the model, the optimal parameters θ share and θ k are obtained, which helps extract the coupling relations between different subtasks and learn the differences between subtasks, respectively.
To summarize, CFR is introduced to extract the feature-level intrinsic relations between input loads, and the prediction model based on LSTM-MTL is built to determine the complex coupling information between cold, heat, and electric load forecasting tasks. Therefore, the coupling relations between loads can be extracted from the perspectives of input features and forecasting tasks to improve the model performance.

Explainable framework
The LSTM-MTL model of RIES load forecasting described above is a complex integrated model, and its lack of interpretability can lead to user distrust in model prediction results and hinder further analysis or improvement of the model. To improve the interpretability of the prediction model, this section uses SHAP to provide an explainable framework for the LSTM-MTL prediction model, and an input variable selection strategy based on the global SHAP value is proposed to avoid overfitting and over-computation.

SHAP (Shapley additive explanation)
SHAP was proposed by Lundberg and Su-In Lee in 2017 based on cooperative game theory to describe the where X is the set of coalition vector x ′ .
There may be more than one valid feature in the input feature x ′ . This will affect the evaluation of the contribution of an individual feature. In order to reduce this impact, the weight of loss function L is defined as: where x ′ represents the number of valid input features in x ′ .
The details of calculating the SHAP value of input features x are shown below.
(1) Create the training data set X in random ways, X is the set of coalition vectors x ′ k ∈ {0, 1} M , k ∈ {1, 2, · · ·, K } . K represents the number of coalition vectors x ′ in X.
(2) Convert each x ′ k to original feature space using x k =h(x ′ k ), and obtain the prediction using (3) Calculate the weight π x (x ′ k ) of each x k using (19). (4) Fit the explanation model g x ′ by optimizing the loss function L. (5) Return φ i (the SHAP value of the ith feature in x).
Compared with other methods, SHAP has a solid theoretical foundation, is fast, and has the advantage of local accuracy and consistency [29]. This indicates that the SHAP value of each feature is uniquely determined and can correctly reflect the contribution of features to the model. References [25][26][27] prove that the SHAP value is the only attribution method that satisfies all the above properties at the same time.

Principles of interpretation
Applying SHAP to the interpretation of the cold, heat, and electric load forecasting model and fitting the load forecasting model with an interpretable linear model, the contribution of each feature variable to cold, heat, and electric load prediction results can be obtained based on the SHAP value. Using (17)(18)(19), the SHAP values in the kth (k = 1, 2, and 3 represents cold, heat, and electric load forecasting tasks, respectively) prediction subtask can be obtained as: where y k i is the prediction value of the ith sample in the kth prediction subtask, y k base is the average of the kth prediction subtask, and M is the number of input features. x ij is the jth feature of the ith sample and f k (x ij ) is its SHAP value, which indicates the contribution of the feature to the kth prediction subtask. f k (x ij ) > 0 indicates that the feature increases the prediction value, while f k (x ij ) < 0 indicates that it decreases the prediction value. (20)  As shown in Fig. 3, the explainable framework of the load forecasting model for RIES includes two parts: loads prediction and interpretation. The left side of Fig. 3 shows the relationships between the SHAP model and the LSTM-MTL model, while the SHAP value calculation is built on the basis of the LSTM-MTL model. The right side of Fig. 3 shows the interpretability application based on SHAP, where the typical daily sample and the global sample are input into the SHAP model to obtain local interpretability and global interpretability, respectively.

Global interpretation
The global interpretation is generated by analyzing the average contribution of input feature variables to the prediction results based on the global samples and can be obtained by calculating the average SHAP value of each feature variable with global samples. Therefore, the SHAP value of the feature variables in the cold, heat, and electric load forecasting tasks can be calculated separately with global samples to measure the importance of each feature variable in different forecasting tasks. By weighting and summing the SHAP values of cold, heat, and electric loads at a certain proportion, the importance of feature variables to the whole prediction model can be obtained, and an input variable selection strategy based on the global SHAP value is proposed to improve the prediction model by removing the input features that contribute less to the prediction model.

Local interpretation
The local interpretation is generated by analyzing the contribution of input feature variables to the load prediction of an individual sample, and the local interpretation of the ith sample can be obtained using (20). In this paper, typical daily samples in winter and summer are selected to analyze the relationships between the SHAP distribution of the feature variables and the load prediction results in different seasons. This provides a further explanation of the decision basis of the prediction model and enhances the credibility of the load forecasting model.

Experimental set-up
In this section, the details of prediction and interpretation are shown, including the selection of input variables for the LSTM-MTL prediction model, the evaluation method of the prediction model performance, and the main experimental process of interpretable analysis and application for load forecasting of RIES.

Input variables of load forecasting model
Selecting rational feature variables is the key to obtaining accurate RIES load prediction results. In addition to the historical loads and the constructed coupled feature variables (CF), load demand is also affected by other external factors such as season, week, hour, etc. Table 1 shows the feature variables selected in this paper. The time window length of the LSTM-MTL model is a very important parameter in deciding the size of input features. Considering the time series characteristics of the load, the time windows from 24 h, 48 h, 72 h, and 96 h are chosen, as shown in Table 7 in the Appendix. According to the method shown in Fig. 4, the highest prediction accuracy is achieved when the time window is 72 h, and thus, it is set as the time window of the LSTM-MTL model, which predicts the load at the current moment based on the input variables 72 h before.

Evaluation metric for forecasting accuracy
Mean absolute percentage error (MAPE) and root mean square error (RMES) are used as error evaluation metrics to evaluate the prediction method described above. The relevant formulas are: where x i and y i are the true value and prediction value, respectively, and n is the number of samples. In this paper, the RIES studied is located in the hot Phoenix, where the cold load and electric load demand is higher than the heat load demand. Thus, the importance of cold, heat, and electric loads is weighted to 0.4, 0.2, and 0.4, respectively, according to the ratio of cold, heat, and electric peak loads. The overall errors of the RIES load forecasting model are:

Prediction and interpretable analysis process
This section introduces the main experimental process of cold, heat, and electric load prediction of RIES and its interpretable analysis. Since the training time of the LSTM-MTL model is long, the experimental process is divided into two parts: offline training and online application as shown in Fig. 4.

Offline training
Step 1 Collect historical data of cold, heat, and electric loads and other external data in Table 1 as part of the input variable set.
Step 3 Build the LSTM-MTL load forecasting model, and determine some hyperparameters such as learning rate, optimizer, time window, and weight based on experience and task characteristics.
Step 4 Train the model until the number of iterations reaches the preset value or the accuracy improvement is less than the preset value.
Step 5 Repeat Step 4 and update the hyperparameters in step 3 according to the prediction accuracy.
Step 6 Fit the explanation model and obtain the SHAP value to provide local interpretation, global interpretation, and input variable selection strategy for the prediction model.

Online application
Step 7 Obtain the RIES real-time data and calculate the coupled features.
Step 8 Input the feature variables into the improved model to obtain the load prediction results of RIES and provide explainable analysis for the prediction results.

Data set
The load data is collected by the campus of Arizona State University in Tempe, USA, where the minimum daily average heat load in summer is about 60% of the annual average heat load, and the minimum daily average cold load in winter is 24% of the annual average cold load. The units of cold load, heat load, and electric load are Ton/h, mmBtu/h, and kW, respectively, and their unit conversion formula is: The temporal granularity of this data set is 1 h, i.e., the daily load curve includes 24 points and spans from January 1, 2019 to July 15, 2020, totaling 562 days. For the modeling part, the training, validation, and test datasets account for 70, 15 and 15%, respectively. The temperature data is obtained from the actual data at the Phoenix weather station [28].  Table 2 shows the CF extraction results and the PCC between coupled features and cold, heat, and electric loads. The prediction results of the LSTM-MTL model with CF (CF-LSTM-MTL) for RIES load prediction are shown in Fig. 5, and the distribution of the hourly MAPE values is shown in Fig. 6. By comparing the prediction errors of the prediction model with or without CF and MTL, the advantages of the coupled features and prediction model constructed in this paper are verified as shown in Table 3. As shown in Fig. 6, most MAPE values of the CF-LSTM-MTL prediction model for cold, heat, and electric loads are less than 5%. From Table 3, the prediction accuracy of both heat and electric loads are improved significantly after adding CF into the LSTM model (CF-LSTM) and the prediction accuracy of all the three loads is significantly improved after introducing MTL and CF together. From the results, we can conclude:

Coupled feature extraction and load prediction results
(1) The construction of coupled features and the multitask learning model can improve the accuracy of the prediction model and avoid modeling the three loads simultaneously. This greatly improves the practical value. (2) Nine coupled features are extracted and the PCC between coupled features and loads are very high (PCC > 0.8), which satisfies the requirements of the traditional input variable selection method based on correlation. Excess coupled features may lead to overfitting and over-computation in load prediction and reduce the prediction accuracy. An input variable selection strategy based on SHAP value is proposed in a later experiment to improve the prediction model.

Global interpretation
The results of the global SHAP value distribution of RIES load forecasting and the contribution of different features to the prediction of cold, heat, and electric loads are shown in this section. The global SHAP values of feature variables are ranked from large to small as shown in Fig. 7, where each point in the scatter plot represents a sample, the horizontal coordinate represents the SHAP value, and the darker color of the sample point represents the larger feature value.
As shown in Fig. 7, the feature variables have different contributions in different forecasting tasks, and the top six important features of cold, heat, and electric load forecasting are shown in Table 4. The following conclusions can be drawn: the SHAP values of 'small' months (January, February, etc.) are negative. These decrease cold load prediction results. These are consistent with the actual situation that local cold load level is low in cold months and high in hot months. (4) In heat load prediction, SHAP values of temperature and hour are small and many external factors have no significant effect on heat load prediction. This is caused by the fact that Arizona's overall temperature is high and its heat load remains stable.
In summary, global SHAP value distribution can visualize the relationships between the distribution of SHAP value and input features. These are consistent with objective understanding and laws, and validate the effectiveness of the constructed coupled features.

Local interpretation
The global interpretation mainly shows the contribution of each feature to different forecasting tasks based on the global sample. In this section, we further explain the basis of an individual sample prediction and analyze the contributions of different features to different prediction samples. Considering the limited space available in the paper, typical daily samples in summer and winter are selected for comparison and analysis. Figures 8, 9 and 10 show the local interpretation results of the cold, heat, and electric loads, respectively. In these figures, the length of arrow represents the absolute SHAP value of the corresponding feature, the red arrow represents a positive SHAP value that drives predictions higher, and the blue arrow represents a negative SHAP value that makes predictions lower.
Local SHAP value distribution shows that there are significant seasonal differences in the contribution of each feature to the prediction results of cold, heat, and electric loads. Taking temperature and historical load as examples, we make the following analysis: (1) For the influence of historical load in the prediction of cold load, the SHAP value of heat load is a large negative value in both summer and winter as shown in Fig. 8. Similarly, in the prediction of heat load, the SHAP value of cold load is also a large negative value as shown in Fig. 9. These indicate that there is a strong negative correlation between the cold and heat loads, which is consistent with the existence of the strong complementarity between the energy use curves of cold and heat loads. (2) For the influence of temperature in summer, the SHAP value of temperature is positive in the prediction of both cold and electric loads as shown in Figs. 8a and 10a. This indicates that higher temperature can contribute to a higher load prediction value. In winter, the SHAP of temperature is negative in both heat and electric load predictions, which indicates that higher temperature can contribute to lower load prediction value. In terms of physical significance, high summer temperatures lead to higher cooling load demand and frequent use of electric cooling equipment, while low winter temperatures lead to higher heat load demand and frequent use of electric heating equipment.
In summary, local SHAP value distribution can visualize the relationships between input features and individual prediction results, and these visualizations are consistent with objective understanding and laws and improve the transparency of the prediction model and the credibility of the prediction results.

Features selection
According to the ratio of cold, heat, and electric peak loads, the global SHAP value of features in cold, heat, and electric load predictions are weighted to 0.4, 0.2, and 0.4, respectively, and the sum is shown in Fig. 11. It can be seen that the SHAP values of hour, historical load, CFR8, and CFR4 are at high levels, while the SHAP values of CFR7, CFR3, and CFR5 are at very low levels, which indicates that the three coupled features have less contribution to the prediction results. Then, an input variable selection strategy is applied to improve the model by removing the three coupled features from the input features, and the errors of the original model and improved model are shown in Table 5. By comparison, it can be seen that MAPE and RMSE are reduced after applying the input variable selection strategy, which shows the strategy is effective and can improve the prediction accuracy of the model. In addition, the strategy can reduce the data dimension, which helps reduce the modeling effort and increase the practical value of the model.

Conclusion
In this paper, a coupled feature construction method is introduced to construct CFR between loads. It constructs the representation of intra-coupled and inter-coupled relations, and integrates the representation by a Taylorlike expansion as part of the input feature variables. It can extract and quantify the intrinsic relations between input loads. An LSTM-MTL loads prediction model is proposed to determine the coupling relations between different load forecasting tasks based on a hard parameter sharing mechanism. The coupling relations between loads in the RIES are then fully used from the perspectives of input features and forecasting tasks. In order to improve the interpretability of the load forecasting model, global and local interpretations are provided. The global interpretation can show the important features of the prediction model, and be used to analyze the relationships between the SHAP value and input features. Local interpretation is used to interpret the load prediction results of an individual sample, and show the prediction basis of the prediction model. In addition, an input variable selection strategy based on the global interpretation results is proposed to simplify the prediction model and avoid overfitting and over-computation by selecting input features according to their importance.
The case study has shown that the proposed model and input variable selection strategy can effectively improve the accuracy of load forecasting in an RIES, and these methods can be extended to the application of the prediction model in different scenarios. From the global and local SHAP value distributions of cold, heat, and electric load prediction, we can intuitively obtain the relationships among input features, SHAP value, and prediction results. The experiments show that the interpretation results are consistent with the characteristics of the load prediction. They can improve the credibility of the prediction results and provide a basis for users to undertake comprehensive energy optimization scheduling and management.
With the continuous development of RIES, main energy users will be extended to commercial areas, residential communities, office buildings, etc. As the objects of energy use become more diversified, the subsequent research needs more refined data support, including on the flow of customers in commercial areas, daily sales, residential area user types, etc. The data helps achieve a more threedimensional portrayal of the coupling relations between loads. Therefore, based on the proposed method in this paper, the load forecasting model and interpretation analysis can be further extended to these areas in the future.