An information entropy-based risk assessment method for multiple-media gathering pipelines

Unrefined and highly corrosive upstream petroleum resources and complex operating environments pose a significant threat to the integrity and safety of gathering pipelines. The present study proposes a novel method to perform risk assessment for gathering pipelines. Historical failure data were used to develop a fishbone diagram model of hazard factors. The risk index system was developed based on the KENT method, including failure likelihood and failure consequence coefficient models. Information entropy theory was used to determine the weight of each indicator. Combined with the area-level safety design coefficient, The Welding Institute (TWI) method was improved to perform risk classification for different areas. The proposed method was applied to 81 gathering pipelines. Results demonstrated that the proposed method could meet the actual conditions of gathering pipelines, improving upstream energy security.


Introduction
Gathering pipelines are the primary energy transmission infrastructure for upstream oil and gas fields [1]. Compared with long-distance pipelines, the unrefined transport media can cause more serious internal corrosion [2][3][4]. Besides, the highly uncertain operating environment can cause gathering pipeline failure, seriously affecting upstream production and leading to environmental pollution and even casualties [5][6][7][8]. Although this is well known in the industry, statistical data show that failure accidents of gathering pipelines are rising [9]. Pipeline owners implement risk-based integrity management to prevent such accidents as much as possible [10,11]. The accuracy and adaptability of risk assessment are crucial for predicting risks and reducing accidents [12].
Extensive studies have been conducted to mitigate pipeline risk [13][14][15]. However, those methods were developed for the risks faced by long-distance pipelines. The increasing number of accidents indicates that those methods do not apply to gathering pipelines [9]. This may be because some critical properties of gathering pipelines were ignored, including the transport of multiple corrosive and high-temperature media, small outer diameter, small wall thickness, and low operating pressure [4]. It is therefore necessary to sort out all risk factors of gathering pipelines. For long-distance pipelines, the semi-quantitative KENT method developed a comprehensive index system for pipeline risk assessment, including indicators of failure likelihood and failure impacts [16]. Many variants have been derived from the KENT method, such as fault tree-based and Bayesian network-based models [17,18]. However, the KENT method can only provide a subjective expert-based evaluation. Besides, the weight of all major categories of indicators is the same, which is not enough to reflect the pipeline risk characteristics. The information entropy method can compensate for the lack of information by dynamically determining the weights of indicators according to actual pipeline conditions, which helps reduce the subjectivity of the assessment. Therefore, this method has been widely used in risk assessment [19][20][21].
*Correspondence: yihuan.wang@swpu.edu.cn
The present study developed a novel information entropy-based risk assessment method for multiple-media gathering pipelines, including a risk calculation model and a risk classification method. The historical accidents of gathering pipelines were systematically analyzed to develop a fishbone diagram model for sorting out the risk indicators. A risk evaluation index system was developed for multiple-media gathering pipelines. The weight of each index was determined by the information entropy method. Risk classification was implemented using the modified TWI method. The applicability and accuracy of the proposed method were illustrated through a case study.

Statistical analysis of gathering pipeline accidents
Statistical analysis of accidents is the premise of risk assessment. The main risk factors are sorted out through root cause analysis to develop a practical risk analysis method [22]. The Pipeline and Hazardous Materials Safety Administration (PHMSA) collected and analyzed the failure causes of gathering pipelines in the US over the past 20 years, and the statistical results are shown in Fig. 1(a) [23]. The Alberta Energy Regulator (AER) compiled the failure causes of Canadian crude oil and gas pipelines, and the statistical results are shown in Fig. 1(b) [24]. Fig. 1(c) shows the statistics of failure causes of gathering pipelines in China from 2011 to 2016 [25]. From Fig. 1, corrosion is the primary failure factor of pipeline accidents in the US, accounting for 55.5% of general accidents, 28.6% of serious accidents, and 57% of major accidents. In Canada, internal corrosion caused 69% of crude oil pipeline failures and 53.2% of gas pipeline failures. In China, the corrosion contribution to accidents of crude oil pipelines, gas pipelines, water pipelines, and steam pipelines was 69.5%, 73.43%, 70.60%, and 69.43%, respectively. It can be seen that corrosion is responsible for more than 40% of gathering pipeline failures, of which internal corrosion-induced failure accounts for over 24% and is the leading factor. Thus, internal corrosion is the primary hazard factor of gathering pipeline failure.

Identification and quantitative analysis of failure factors
The fishbone diagram is an analysis method for capturing the root cause of an incident. It has been widely used in engineering failure analysis because it is intuitive and can trace the causes of an accident in depth [26]. Figure 2 shows the developed fishbone diagram model of gathering pipeline failure factors. Specifically, the causal factors fall into four categories, namely third-party damage, corrosion, design, and misoperation. These involve the human-machine-environment system and can directly affect the safety status of gathering pipelines [5]. From the statistical analysis of accidents, corrosion is the main reason for the failure of gathering pipelines, especially internal corrosion. The mixed transportation of multiple corrosive media can produce different corrosion effects. To identify the corrosion hazard factors specifically, this work considers four different corrosive media for internal corrosion factors, i.e., crude oil, gas, water, and steam.
Quantitative analysis of the main failure indexes regarding internal corrosion can reduce the subjectivity of risk assessment and improve accuracy. The quantitative analysis indicators include pressure, sulfur content, temperature, chloride ion, and salinity. The failure data were collected from various oil and gas fields in Northwest China (Table 1) to determine the functional relationships between these indicators and the failure rate (Fig. 3).
Given the statistics of failure accidents, the fishbone diagram model, and the quantitative analysis results, combined with the analytic hierarchy process, a risk evaluation index system for gathering pipelines is developed for the different transportation media, as shown in Tables 2, 3, 4, 5 and 6 [3,4,15,27].

Overview of KENT method
According to the KENT method, pipeline risk assessment includes the likelihood and consequences of pipeline failure [16], as expressed by Eq. (1), where $R$ is the pipeline risk value; $P$ is the failure likelihood score; $C$ is the consequence score. The risk assessment model for failure likelihood in the KENT method can be written as

$$P = P_{\text{third party}} + P_{\text{corrosion}} + P_{\text{design}} + P_{\text{misoperation}} \tag{2}$$

where $P_{\text{third party}}$, $P_{\text{corrosion}}$, $P_{\text{design}}$ and $P_{\text{misoperation}}$ are the scores of the third-party damage indicator, the corrosion indicator, the design indicator, and the misoperation indicator, respectively. The failure consequence calculation model is given by Eq. (3), where $K_w$ is the hazard of the product; $LV$ is the leakage volume; $D$ is the diffusion coefficient; $S$ is the receptor coefficient.
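As a minimal sketch of how these scores combine, the snippet below sums the four indicator scores as in Eq. (2) and then forms a risk value. The product R = P * C used here is an assumption made only for illustration, since the combining rule of Eq. (1) is not reproduced in this text. The function names and sample scores are hypothetical.

```python
def failure_likelihood(third_party, corrosion, design, misoperation):
    """Eq. (2): unweighted sum of the four indicator scores."""
    return third_party + corrosion + design + misoperation


def risk_value(likelihood, consequence):
    """ASSUMPTION: risk is taken as likelihood * consequence (illustrative only)."""
    return likelihood * consequence


# Hypothetical indicator scores and consequence score, not taken from the paper.
P = failure_likelihood(10.0, 25.0, 8.0, 7.0)
R = risk_value(P, 0.5)
print(P, R)
```

Keeping the likelihood and consequence steps as separate functions makes it easy to swap in the weighted form of the likelihood model without touching the risk calculation.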

An information entropy-based method of failure likelihood for gathering pipelines
When evaluating the failure likelihood, different causal factors have individual effects on pipeline safety. Therefore, the weight of each factor needs to be determined. Then, Eq. (2) can be rewritten as

$$P = L_1 P_{\text{third party}} + L_2 P_{\text{corrosion}} + L_3 P_{\text{design}} + L_4 P_{\text{misoperation}} \tag{4}$$

where $L_1$, $L_2$, $L_3$, and $L_4$ are the weights of third-party damage, corrosion, design, and misoperation, respectively. It should be noted that $L_1$, $L_2$, $L_3$, and $L_4$ are dynamically determined based on the actual situation. Ignoring this will reduce the accuracy of the risk assessment. In this work, the dynamic weights are determined by the information entropy method combined with failure frequency [28,29], following these steps:
a. According to the actual situation of the oil and gas field, the information matrix AT is developed by the experts.
b. Define the membership function $\mu(t_{ij})$, where k = n is the conversion parameter; γ (1, 2,…, n) is the adjustment coefficient; $t_{ij}$ is the recommendation trust degree of the i-th recommended entity for the j-th attribute index.
c. Develop the membership matrix B.
d. Determine the initial weight by Eqs. (8)-(11), where the initial weight is $G = [g_1, g_2, \ldots, g_n]$. There are four indexes in this work (third-party damage, corrosion, design, and misoperation), so n is 4 and the initial weights are $g_1$, $g_2$, $g_3$ and $g_4$, respectively; $t_j$ is the average recommendation, representing the recommending entities' consistent views on the attribute indicators; $b_{ij}$ is the degree of membership of trust $t_{ij}$; m is the number of recommended subjects; $\zeta_j$ is the recommendation blindness, i.e., the uncertainty due to differences in recommendations.
Subsequently, according to the failure rate of the various indexes from different oil and gas fields, the dynamic weight can be determined by Eq. (12), where $F_i$ (i = 1, 2, 3, 4) is the number of accidents caused by third-party damage, corrosion, design, and misoperation, respectively; $F$ is the number of failures of gathering pipelines; $a_i$ (i = 1, 2, 3, 4) are additional weights determined by the evaluator for third-party damage, corrosion, design, and misoperation, respectively, with $a_1 + a_2 + a_3 + a_4 = 0$.
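The general idea of entropy-based weighting can be illustrated with the textbook Shannon entropy weight method. This is a generic stand-in, not the membership-function and recommendation-blindness formulation of steps a to d. The expert score matrix below is hypothetical. Indexes on which the experts' scores differ more carry more information and therefore receive larger weights.

```python
import math


def entropy_weights(matrix):
    """Generic Shannon entropy weights for the columns of a score matrix.

    matrix[i][j] is the positive score given by expert i to index j.
    NOTE: textbook entropy weight method, NOT the paper's formulation.
    Requires at least two experts (rows).
    """
    m = len(matrix)       # number of experts (rows)
    n = len(matrix[0])    # number of indexes (columns)
    divergences = []
    for j in range(n):
        column = [matrix[i][j] for i in range(m)]
        total = sum(column)
        shares = [x / total for x in column]
        # Normalised Shannon entropy of column j, in [0, 1].
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        # Clamp tiny negative values caused by floating-point rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    d_sum = sum(divergences)
    if d_sum < 1e-12:
        # Experts agree on every index: fall back to equal weights.
        return [1.0 / n] * n
    return [d / d_sum for d in divergences]


# Hypothetical trust scores: 3 experts x 4 indexes
# (third-party damage, corrosion, design, misoperation).
scores = [
    [0.2, 0.9, 0.1, 0.3],
    [0.2, 0.7, 0.1, 0.1],
    [0.2, 0.8, 0.1, 0.6],
]
weights = entropy_weights(scores)
print([round(w, 4) for w in weights])
```

In this toy matrix the experts agree exactly on the first and third indexes, so those columns receive almost no weight, while the fourth column, where opinions diverge most, receives the largest.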

Failure consequence assessment model
The failure consequences of gathering pipelines can be assessed by medium harmfulness and receptors. The KENT method-based failure consequence assessment model is given by Eq. (13), where $K_w$ is the medium hazard score; $S$ is the receptor score; $K_{w\,\text{sum}}$ is the total score of medium hazards; $S_{\text{sum}}$ is the total score of the receptors. The failure consequence coefficient is within [0.3117, 1].

Risk classification
According to the population density of different areas and the Chinese standard GB 50251-2015: Code for the design of gas transmission pipeline engineering [30], the surroundings can be defined as level 1 first-class area, level 1 second-class area, level 2 area, level 3 area, and level 4 area.
Backfilling method (6): Both process and method are correct (2); Process is correct but the method is incorrect (4); Both backfill process and method are incorrect (5); None (6).
(7); None (10)
Safety measures (10): Safety responsibility system is sound and strictly implemented (3); There is a safety responsibility system but not implemented (7); None

Total 100
Page 10 of 17 Qin et al. J Infrastruct Preserv Resil (2022) 3:19
factors. The primary failure indicators are quantitatively analyzed to ensure the objectivity of the evaluation indexes. Further, this section develops a novel risk assessment method for gathering pipelines that combines the KENT method and information entropy theory. Also, a risk classification method is proposed to judge the pipeline's safety status in different regions. The proposed risk assessment framework for gathering pipelines is shown in Fig. 5.

Site description and pipeline selection
The proposed method is implemented on the gathering pipelines of an oil and gas field in Northwest China. As shown in Fig. 6, the operating environment of the gathering pipelines is complex and highly uncertain, including deserts, Gobi terrain, railway crossings, highways, national highways, woods, rivers, and scenic spots. Three operating regions are selected and marked in blue. The selected pipelines pass through the World Devil City Scenic spot (see pink label), farmland (see white label), and a river (see black label). The roads running through the operation area (see yellow label) and the main traffic road inside the oil field (see green label) are also marked. In total, 81 double-high pipelines (a term that denotes high failure likelihood and high failure consequences) transporting four different media, namely crude oil, gas, water, and steam, are selected for the case study. The characteristics of the chosen pipelines are shown in Table 7.

Method implementation
Each pipeline can be assessed according to the risk index system developed in Section 2.2. Then, the score of pipeline failure likelihood can be determined using the method in Section 2.3.2. Further, the pipeline trust recommendation matrix AT can be established according to the pipeline inspection data and the opinions of oil and gas field experts.
Table 6 Consequences of failure
a The range of influence is within 200 m; b Gathering pipeline connects the two stations or facilities, e.g., well-metering station means that the well and the metering station are connected by a pipeline
Then, the failure likelihood can be determined by

$$P = 0.0803\,P_{\text{third party}} + 0.8744\,P_{\text{corrosion}} + 0.011\,P_{\text{design}} + 0.0343\,P_{\text{misoperation}} \tag{14}$$

There is no internal anti-corrosion method for pipeline laying. The pipeline crosses the aisle, the traffic flow is large, the marking pile along the line is incomplete, and the marking content is unclear. The insulation layer of the pipeline has fallen off.
The failure consequence coefficient can be determined by Eq. (13). Eq. (1) can then be used to assess the risk of each pipeline.
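A minimal sketch of Eq. (14) is given below. The four weights are those reported for the case-study field. The indicator scores, the dictionary keys, and the function name are hypothetical and serve only to show the arithmetic.

```python
# Weights reported in Eq. (14) for the case-study field.
WEIGHTS = {
    "third_party": 0.0803,
    "corrosion": 0.8744,
    "design": 0.011,
    "misoperation": 0.0343,
}


def weighted_likelihood(scores, weights=WEIGHTS):
    """Eq. (14): weighted sum of the four indicator scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical indicator scores for one pipeline (not from the paper).
scores = {
    "third_party": 40.0,
    "corrosion": 50.0,
    "design": 20.0,
    "misoperation": 30.0,
}
P = weighted_likelihood(scores)
print(round(P, 4))
```

The four weights sum to 1, and corrosion alone carries about 87% of the total, so the likelihood score of a pipeline in this field is driven almost entirely by its corrosion indicator.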

Results and discussion
Results show that the lowest risk value is 19.765, while the highest is 49.085. The risk grade boundary can be determined by the safety design coefficients of different areas. The risk level boundary of the level 1 first-class area is between 25.39444 and 31.96212. The risk level boundary of the level 1 second-class area is between 24.831496 and 30.742408. The risk level boundary of the level 2 area is between 23.98708 and 28.91284. The risk level boundary of the level 3 area is between 23.2834 and 27.3882. The risk level boundary of the level 4 area is between 22.57972 and 25.86356. The risk matrix is shown in Fig. 7.
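The area-dependent classification can be sketched as a simple lookup. The boundary values are those listed above. The mapping of the two boundaries to low, medium, and high risk (below the lower value, between the two, above the upper value) is an assumption about how the risk matrix is read, and the function and dictionary names are hypothetical.

```python
# (lower, upper) risk-level boundaries per area level, as reported above.
BOUNDARIES = {
    "level 1 first-class": (25.39444, 31.96212),
    "level 1 second-class": (24.831496, 30.742408),
    "level 2": (23.98708, 28.91284),
    "level 3": (23.2834, 27.3882),
    "level 4": (22.57972, 25.86356),
}


def classify(risk, area):
    """ASSUMPTION: below the lower bound is low, above the upper bound is
    high, otherwise medium; values exactly on a boundary count as medium."""
    lower, upper = BOUNDARIES[area]
    if risk < lower:
        return "low"
    if risk > upper:
        return "high"
    return "medium"


# The same risk value can fall into different levels in different areas.
print(classify(26.0, "level 1 first-class"))
print(classify(26.0, "level 4"))
```

Under this reading, a pipeline with a risk value of 26.0 is medium risk in a level 1 first-class area but high risk in a level 4 area, which reflects the stricter thresholds applied to more densely populated surroundings.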
The risk level of each pipeline is determined using the proposed risk matrix, as shown in Table 8. The risk values of the selected pipelines are between 19.765 and 49.085: 8 pipelines are low risk, accounting for 9.88%; 34 pipelines are medium risk, accounting for 41.96%; and 39 pipelines are high risk, accounting for 48.16%. Among the pipelines for the four kinds of transmission medium, water transmission pipelines have the lowest risk values: most are at medium risk, a few are at low risk, and none is at high risk. The gas transmission pipeline with the highest risk value is located in a densely populated area and has serious failure consequences. 72% of thin oil pipelines are high risk, 28% are medium risk, and none is low risk; 36% of heavy oil pipelines are rated as high risk, 56% as medium risk, and 8% as low risk.
The proposed method (i.e., method 1) is compared with a previous method (i.e., method 2), see Appendix [30], as shown in Fig. 8. The previous risk assessment method used in the oil and gas field mainly refers to a Chinese code (GB-32167-2015). The gathering pipelines are scored using the semi-quantitative risk assessment index system to determine the failure likelihood and consequence scores. The semi-quantitative failure likelihood index and failure consequence index are shown in Table 1 A and Table 2.
Results show that high, medium and low risk account for 48.16%, 41.96% and 9.88%, respectively, using method 1. Meanwhile, the results using method 2 show that all pipelines are at high risk, with the lowest risk value being 6.4827 and the highest 185.8968, demonstrating that the risk thresholds of the two methods are quite different.
Further, the 81 double-high pipelines were investigated on-site to examine the accuracy of the two methods. The results show that some pipelines are not actually at high risk, e.g., thin oil pipeline #9, heavy oil pipeline #33, gas pipeline #52, water pipeline #64, and steam pipeline #79. They all have a 5 cm thick concrete protective layer, intact pipe embankment, and marker piles. Staff patrol the pipelines daily, and the pipelines are equipped with a real-time monitoring system. Also, heavy oil pipeline #41, gas pipeline #46, and water pipeline #69 have a 10 cm protective layer. For these pipelines, staff patrol daily, inject corrosion inhibitor and carry out pigging regularly, operate real-time monitoring and automatic cut-off systems, receive regular training, and install protection devices where the pipelines pass through densely populated and scenic areas. Therefore, method 2 cannot accurately reflect the actual risk status of the pipelines; it leads to overly conservative protective measures and increases maintenance costs.

Conclusions
In this work, a comprehensive risk assessment framework was proposed to help avoid the environmental hazards and economic losses caused by the failure of gathering pipelines. A risk index system for multiple-media gathering pipelines was developed based on the KENT method. The information entropy method was used to determine the weights of the failure likelihood indicators to improve the accuracy and applicability of the method. An improved risk classification method for different regions was proposed by introducing the safety design coefficient, reflecting the actual risk status in different areas. The proposed risk assessment method was applied to a case study. Results showed that high-risk pipelines account for 48.16%, medium-risk pipelines account for 41.96%, and low-risk pipelines account for 9.88%, consistent with the pipelines' actual operating conditions. The case study also demonstrated that the proposed method could guide operators in improving the effectiveness of risk management. However, the proposed method is essentially an expert-based system and retains some subjectivity. A Bayesian network model could be established based on the index system proposed in this work to perform a more accurate quantitative risk assessment.