Evaluating the impact of factors in vehicle based pavement sensing implementation: sensor placement, pavement temperature, speed, and threshold

The purpose of the paper is to improve the efficiency of vehicle based sensing technology in highway pavement condition assessment by evaluating the effect of four factors (sensor placement, pavement temperature, drive speed, and threshold for pavement distress classification) and by providing suggestions to improve the accuracy of pavement condition detection and minimize interruptions to the pavement sensing operation. Two I-10 corridors in the Phoenix region were selected for vibration data collection and analysis. A series of statistical analyses was performed to determine whether each factor has a significant impact on pavement distress detection. The results of Analysis of Variance (ANOVA) and Analysis of Covariance (ANCOVA) tests show that sensor placement has a significant effect on pavement condition assessments. Significant differences occurred between sensors placed on the same side of the vehicle, as well as between sensors on either the front wheels or the rear wheels. The effect of pavement temperature on vehicle based sensing is significant, while the mean drive speed is not a significant factor in the pavement condition survey. Two thresholds were determined to select points of interest (POI; cracks, potholes) for pavement distress classification, and these POIs are in good agreement with international roughness index (IRI) data on an ArcGIS map. The findings of the paper can be used to improve the computing algorithms of vehicle based sensing techniques.


Introduction
Evaluating road surface performance requires costly equipment and highly skilled staff. Traditionally, a roadway profiler has been used by some government agencies for pavement condition surveys. The survey results provide longitudinal roughness data that highway authorities can use in decisions on pavement maintenance and repair. Generally, the annual cost of obtaining roughness data and condition surveys can exceed one hundred thousand dollars [1], which is not affordable for most highway agencies that need pavement condition surveys but have limited budgets. In pursuit of a low-cost and efficient monitoring system, highway agencies and institutes have given attention to the use of vibration data, instead of a pavement roughness index, in road condition assessment through different methodologies. Two methods are commonly used to collect vibration data: the first is smartphone based sensing, which is a convenient and easy way to obtain acceleration data, and the second is vehicle mounted sensors, which collect pavement sensing patterns and signals according to the user's needs. Many research studies have shown that there is a correlation between acceleration data and pavement roughness [2][3][4][5][6]. Thus, accelerometers in a smartphone or mounted on a vehicle have been widely used in pavement condition surveys.
*Correspondence: chunhsing.ho@unl.edu
Ho et al. [7] and Zhang et al. [8] conducted a year-long pavement condition survey using vehicle based pavement sensing techniques on the I-10 corridors in the Phoenix region. Their findings strongly support the use of vehicle mounted sensors in pavement condition assessment. According to their study, several factors could still affect the accuracy of pavement condition detection, such as tire pressure, drive speed, placement of sensors, and the threshold for pavement distress classification. Thus, further investigation of how these factors influence the implementation of vehicle based sensing work is needed.

Literature review
Road roughness and pavement condition index are two common parameters in pavement condition evaluation. Usually, vehicle-mounted sensors are applied in a pavement condition survey to obtain a roughness index through a specified automated methodology [9], which, however, requires costly equipment and highly skilled staff. A report also studied the relation between the roughness index and the pavement condition index (PCI) via visual inspection of different types of road deterioration [10]. However, field observation and rating rely heavily on human labor, and the results might not be consistent among field raters. Because of limited budgets and labor, highway agencies have been searching for an affordable, cost-effective pavement condition survey method, such as accelerometers or real-time images. A study by Yan et al. indicated that deeper and longer cracks in the pavement have a significant effect on the acceleration signals [11]. A report by Alavi et al. stated that vertical acceleration data from a smartphone can be used in assessing overall airport pavement condition [12]. Moreover, Douangphachanh et al. indicated that acceleration vibration data have a good linear relationship with PCI values [3,6] and are correlated with the roughness index [2]. These studies are evidence that the use of accelerometers in pavement condition assessment has increased dramatically.
To precisely analyze vibration signals and translate the results into pavement distress classifications, a variety of computing algorithms have been used in the determination of pavement distress, such as numerical analysis, supervised machine learning, and image processing through MATLAB software [11][12][13][14][15][16][17][18]. For example, the Fast Fourier Transform (FFT) and Short-Time Fourier Transform (STFT) have been used to derive displacement data from the vertical acceleration for evaluating road performance [5,17]. Their results indicated that higher displacement values, which suggest poor road conditions, visually matched the observed road conditions. Through numerical analysis of vertical acceleration, Yan et al. found that most cracks have positive vertical acceleration in pavement condition assessments [11], and that cracks can be identified by integrating the differential intensity and height changes from the data [19]. Additionally, polynomial approximation is another way to determine the locations of potholes in the condition assessment [20]. More recently, machine learning based techniques have become widely used in road condition assessment. For example, k-nearest neighbor (KNN) is one of the common and simple machine learning approaches for classifying pavement conditions. Du et al. used the KNN method to distinguish abnormal pavement types such as bumps and potholes, and the results showed a recognition accuracy of more than 90%, which indicates that the vibration data are appropriate for processing in the condition assessment [13]. Additionally, Artificial Neural Networks (ANN), fuzzy theory, and random forest regression have been broadly applied by numerous scholars to recognize road surface conditions for maintenance and rehabilitation purposes [14,16,21,22].
Recent studies have suggested that vertical accelerations from a single device or sensor are not sufficient to facilitate pavement distress classification [7,8,23,24]. Thus, multiple devices or sensors have been used for vibration data collection in several studies. For instance, Staniek indicated that three-dimensional analysis of acceleration vibrations could lead to a better and more precise calculation of pavement condition indices [24]. To recognize abnormal pavement performance, Chuang et al. used a deep machine learning technique with vertical and lateral acceleration [23], and their results were successfully validated on the road network of Taipei city, Taiwan, with an accuracy of 98%. Additionally, bagged trees classification and robust regression analysis were applied in pavement monitoring through three-dimensional rotations, and the detection results exhibited an accuracy of more than 90% [25]. Ho et al. [7] and Zhang et al. [8] used magnitude values, which combine the accelerations in all three directions, to monitor pavement conditions, and their results were effectively validated with IRI segments.
Given the above studies, it is clear that the use of accelerometers is becoming popular among highway agencies and institutes because of its affordable cost and reasonable results in pavement condition classification. However, the factors that could influence the accuracy of pavement distress detection have not yet been well studied. To address these issues and better support the implementation of vehicle based sensing work, the paper evaluates the effect of four factors (sensor placement in the vehicle, pavement temperature, drive speed, and thresholds for pavement distress classification) on the implementation of vehicle based sensing techniques using statistical analyses. The objectives of the paper are to: 1. evaluate the effect of each individual factor on the pavement sensing operation and provide suggestions to minimize its disruption of pavement condition assessment and 2. support the currently used vehicle based sensing techniques in the pavement condition survey.
This paper presents statistical analyses using both analysis of variance and analysis of covariance to evaluate the effect of sensor placement, pavement temperature, speed, and threshold in pavement condition assessments based on a year-long data set.

Data acquisition
Two road sections in Phoenix, Arizona were selected for pavement condition surveys, as shown in Fig. 1. Road section 1 (51st Ave to 27th Ave) is 3 miles long, and two eastbound (EB) lanes were selected. Similarly, two northbound (NB) lanes were selected in road section 2 (Chandler Blvd to Baseline Rd), and each lane is 5 miles long. All pavement sensing patterns and signals were collected monthly over one year through multiple sensors mounted on a 2016 Honda Accord. As shown in Fig. 2, sensors M1 to M4 were placed on the top of each control arm of the vehicle, and M5 was placed inside the vehicle, to gather sensing patterns simultaneously while travelling on the two road sections. A GoPro was attached to the front of the vehicle to record real-time pavement conditions such as cracks, detection loops, and construction joints. The video is used, along with the IRI file, to validate pavement deterioration for further pavement condition classification. Table 1 is a sample of the pavement sensing files, which include acceleration vibrations in three directions (x, y, and z), GPS coordinates, time, driving speed, and magnitude values from all five sensors. The details of the data collection process are explained in references [7,8]. During the pavement condition assessment, the driving speed was maintained between 50 and 55 miles per hour except when traffic congestion occurred. The pavement temperature was also recorded with an infrared thermometer during each pavement condition survey.

Data application
To meet the normality assumptions of the statistical tests, all magnitude values were transformed to a logarithmic scale. Figure 3 illustrates the density plots of the transformed data (logM). The distributions of sensor 5 are clearly different from those of the other sensors because of its placement inside the vehicle.
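The transformation can be sketched as follows. This is a minimal illustration, assuming that the magnitude is the Euclidean norm of the three acceleration components and that the natural logarithm is used; neither detail is stated in this section, and the sample readings are hypothetical.

```python
import math

def magnitude(ax, ay, az):
    """Combine three acceleration components (in g) into one magnitude.

    Assumption: Euclidean norm; the paper only says the three
    directions are combined.
    """
    return math.sqrt(ax * ax + ay * ay + az * az)

def log_magnitude(ax, ay, az):
    """Log-scale magnitude (logM). Assumption: natural logarithm."""
    return math.log(magnitude(ax, ay, az))

# Hypothetical (x, y, z) readings in g, for illustration only.
samples = [(0.0, 0.0, 1.0), (0.3, 0.4, 1.2), (0.6, 0.8, 0.0)]
log_m = [log_magnitude(*s) for s in samples]
```

The log transform compresses the long right tail that vibration magnitudes typically have, which is why it is a common first step before tests that assume normal errors.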

Data analysis
This section introduces the methodologies that were used to evaluate the effect of sensors in the pavement sensing work and to determine the significant variables such as pavement temperature and mean speed that have an impact on the pavement condition assessment.

Effect of sensor placements in pavement condition assessment
To further investigate the effect of sensors in the pavement sensing work, the paper performs ANOVA tests based on the cell means model [26]:

$$y_{ij} = \mu_i + e_{ij} \qquad (1)$$

where $y_{ij}$ = magnitude in log scale of the $j$th month from the $i$th sensor; $\mu_i$ = mean of all magnitudes from the $i$th sensor; $e_{ij}$ = random error.
In the study, the random errors were assumed to be independent and identically distributed according to a normal distribution with zero mean and constant variance in order to provide valid statistical inference.
Tukey's test was applied for pairwise comparisons to determine where the significant differences occur, as indicated by small p-values (less than 0.05, or less than 0.10). In the paper, the TukeyHSD function [27] in R was used to construct 95% confidence intervals and to investigate significant differences between groups based on simultaneous pairwise comparisons.
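As a rough illustration of the ANOVA step, the F statistic for the cell means model can be computed by hand. The sketch below uses only the Python standard library and hypothetical log-magnitude values; it is not the R workflow used in the paper, and it stops at the F statistic and degrees of freedom (the p-value and the Tukey pairwise intervals would come from a statistics package).

```python
from statistics import mean

def one_way_anova(groups):
    """One-way ANOVA for {sensor label: [log-magnitude values]}.

    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    grand = sum(sum(v) for v in groups.values()) / n_total
    # Between-sensor sum of squares: spread of the sensor means.
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    # Within-sensor sum of squares: spread around each sensor mean.
    ss_within = sum((y - mean(v)) ** 2 for v in groups.values() for y in v)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values, for illustration only.
example = {"M1": [1.0, 2.0, 3.0], "M2": [2.0, 3.0, 4.0], "M3": [3.0, 4.0, 5.0]}
f_stat, df_b, df_w = one_way_anova(example)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom corresponds to the small p-values reported in the tables.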

Effect of pavement temperature and drive speed in pavement condition assessment
A factorial treatment design was used to investigate the relationships among several types of treatments and various conditions [26]. In the paper, Analysis of Covariance (ANCOVA) tests were performed based on the factorial design to examine the interaction between sensors and pavement temperature, or between sensors and mean speed, in pavement condition surveys. Three statistical models were built, namely the simple regression model, the main effects model, and the interaction model, to assess the interaction effects. The simple regression model [28] can be written as:

$$y = \beta_0 + \beta_1 X + \varepsilon$$

where $y$ = mean magnitude in log scale; $\beta_0$ = intercept; $\beta_1$ = regression coefficient, also known as the slope; $X$ = input; $\varepsilon$ = error.
The main effects model and the interaction model were constructed for the ANOVA and ANCOVA tests in the paper. The main effects model can be written as follows [28]:

$$y = \beta_0 + \sum_{i} \beta_i X_i + \varepsilon$$

where $y$ = mean magnitude in log scale; $\beta_i$ = estimated coefficients from the model; $X_i$ = inputs that consist of sensor and pavement temperature.
To test the interaction of sensor and pavement temperature, or of sensor and speed, the model can be written as:

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$$

where $y_{ijk}$ = mean magnitude in log scale of the $k$th month with the $j$th pavement temperature (or mean speed) from the $i$th sensor; $\mu$ = overall mean; $\alpha_i$ = fixed effect of the $i$th sensor; $\beta_j$ = fixed effect of the $j$th pavement temperature (or mean speed); $(\alpha\beta)_{ij}$ = interaction effect of the $i$th sensor and the $j$th temperature (or mean speed); $e_{ijk}$ = experimental error.
In the paper, the experimental errors are assumed to be independent and identically distributed according to a normal distribution with zero mean and constant variance [26].
Additionally, a hypothesis test must be constructed along with the ANCOVA tests to test the interaction of two variables, in the following form:

$$H_0: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j \quad \text{versus} \quad H_a: (\alpha\beta)_{ij} \neq 0 \ \text{for some } i, j, \qquad F = \frac{MS(AB)}{MSE}$$

where $(\alpha\beta)_{ij}$ = interaction effect of the $i$th sensor and the $j$th temperature (or mean speed); $MS(AB)$ = mean square of the sensor × pavement temperature (or mean speed) interaction; $MSE$ = mean square of error. The null hypothesis (no sensor × pavement temperature interaction effects) is rejected if the p-value is small (less than 0.10), in which case it is concluded that interaction effects exist. When interaction effects exist, the results of the main effects are not discussed in detail. If the interaction effects are not significant, the results of the main effects model are discussed, and the predictors (sensor, pavement temperature, or speed) are concluded to have an effect on the magnitudes if the p-value is small (less than 0.10).
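The interaction test statistic F = MS(AB)/MSE can be illustrated with a small hand computation. The sketch below assumes a balanced, fully crossed design in which temperature (or speed) is grouped into discrete levels, as the indexed model suggests; the level labels and values are hypothetical, and the p-value lookup is left to a statistics package.

```python
from statistics import mean

def interaction_f(cells):
    """F statistic for the sensor x level interaction.

    cells: {(sensor, level): [replicate values]}, equal replicate
    counts in every cell. Returns (F, df_ab, df_error).
    """
    a_levels = sorted({i for i, _ in cells})
    b_levels = sorted({j for _, j in cells})
    n = len(next(iter(cells.values())))
    if len(cells) != len(a_levels) * len(b_levels) or any(
        len(v) != n for v in cells.values()
    ):
        raise ValueError("balanced, fully crossed design required")
    cell_mean = {k: mean(v) for k, v in cells.items()}
    row_mean = {i: mean([cell_mean[(i, j)] for j in b_levels]) for i in a_levels}
    col_mean = {j: mean([cell_mean[(i, j)] for i in a_levels]) for j in b_levels}
    grand = mean(list(cell_mean.values()))
    # Interaction sum of squares: what the additive model cannot explain.
    ss_ab = n * sum(
        (cell_mean[(i, j)] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in a_levels
        for j in b_levels
    )
    # Error sum of squares: replicate scatter inside each cell.
    ss_e = sum((y - cell_mean[k]) ** 2 for k, v in cells.items() for y in v)
    df_ab = (len(a_levels) - 1) * (len(b_levels) - 1)
    df_e = len(a_levels) * len(b_levels) * (n - 1)
    return (ss_ab / df_ab) / (ss_e / df_e), df_ab, df_e

# Hypothetical log-magnitude replicates, for illustration only.
example_cells = {
    ("M1", "cool"): [1.0, 3.0],
    ("M1", "hot"): [3.0, 5.0],
    ("M2", "cool"): [2.0, 4.0],
    ("M2", "hot"): [6.0, 8.0],
}
f_ab, df_ab, df_e = interaction_f(example_cells)
```

If temperature is instead treated as a continuous covariate, the same question is usually answered by comparing regression fits with and without the sensor × temperature term.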

The effect of thresholds in pavement condition assessment
Following the authors' previous work [7,8], the threshold values that classify the pavement condition were obtained from distribution fitting and percentile analysis. Figure 4 illustrates the process of obtaining threshold values for eastbound lane 1 in road section 1 (51st Ave. to 27th Ave). The fitdistrplus package in R was used to find the best distribution models for the magnitude values from the five sensors (M1 to M5). The 99.9th percentile was then computed from each fitted model, and the corresponding values are defined as thresholds. The remaining 0.1 percent of the data would indicate pavement deterioration. Additionally, the paper analyzed a single sensor (M5) individually for condition assessments. The concept of this new method is to construct 95% confidence intervals on standardized data to identify untypical points, which would indicate pavement deterioration in the condition assessments.
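Both threshold ideas can be sketched in a few lines. This is an illustration only: it assumes a lognormal model for the magnitudes (the paper selects the best-fitting distribution with fitdistrplus, which may differ) and interprets the 95% interval on standardized data as |z| > 1.96 under a normal model. The function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def percentile_threshold(magnitudes, p=0.999):
    """Threshold at the p-th quantile of a fitted distribution.

    Assumption: lognormal fit, i.e. a normal fit to log(magnitude).
    """
    logs = [math.log(m) for m in magnitudes]
    fitted = NormalDist(mean(logs), stdev(logs))
    return math.exp(fitted.inv_cdf(p))

def untypical_points(values, level=0.95):
    """Indices whose standardized value falls outside the central
    `level` interval of a standard normal (about |z| > 1.96 at 95%)."""
    mu, sd = mean(values), stdev(values)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return [i for i, v in enumerate(values) if abs((v - mu) / sd) > z_crit]

# Hypothetical magnitudes in g, for illustration only.
threshold = percentile_threshold([1.0, 2.0, 4.0])
flagged = untypical_points([0.0] * 20 + [10.0])
```

Points whose magnitude exceeds `threshold`, or whose index appears in `flagged`, would be the candidates for pavement deterioration under these assumptions.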

Results and discussion
This section presents and discusses the results obtained from the proposed methodologies for assessing pavement conditions. The results for sensor effects in condition assessments and for the determination of threshold values are shown in Tables 2 to 8, together with Figs. 5 to 6.

Results in sensor placement effect
With sensor 5 excluded, ANOVA tests and Tukey's tests were conducted based on Eq. (1), and the results are shown in Tables 2 and 3. The small p-values indicate that the means of magnitudes in log scale differ among the four sensors (M1 to M4) (Table 2). The pairwise comparisons from Tukey's tests show that the mean of sensor 1, which is mounted on the front left side of the vehicle, differs from the means of the other sensors for both lanes in road section 2 and for eastbound lane 1 in road section 1, since the p-values are less than 0.05 (Table 3). Moreover, ANOVA tests were performed again to determine whether there is a significant difference between sensors mounted on the same side of the vehicle, grouped by front (or rear) wheels and by left (or right) side. As shown in Table 4, the small p-values (less than 0.05) indicate that the mean magnitudes in log scale differ between two sensors mounted on the same side of the vehicle (e.g. M1 and M3), as well as between the sensors on either the front wheels or the rear wheels. Therefore, the paper concludes that sensor placement has a significant effect on the measurement of road performance, and that sensor 1, mounted at the left front wheel, differs most significantly from the other sensors (M2, M3, and M4) in pavement condition surveys.

Results in interaction effects of sensor, pavement temperature, and mean speed
The simple regression model, main effects model, and interaction model were compared in the paper, as indicated in Table 5. For both road sections, the interaction models of sensors and pavement temperature are significant at the 0.10 significance level, since the p-values are less than 0.10. Thus, the paper concludes that the effect of sensors depends on the pavement temperature in the pavement condition surveys. However, the p-values from the interaction models and main effects models are not small (greater than 0.10) when testing the interaction of sensors and mean speed. Therefore, the result indicates that the mean speed is not an important factor in the pavement condition assessment. Additionally, the p-values for the interaction of sensor 4 and pavement temperature are 0.419 and 0.384 for road section 1 and 0.076 and 0.060 for road section 2 (Table 6), which indicates that the interaction effect of sensor 4 and pavement temperature is less significant than that of the other groups and is associated with lower magnitude values in pavement condition surveys.
The paper aims to classify pavement conditions based on a preselected threshold value. However, since the means of vibration responses from the five sensors vary with pavement temperature, it is difficult to determine an appropriate threshold value from a year-long data set. Thus, the paper used data collected in the winter season, from October to the following February, to determine a threshold for pavement condition assessment. In this case, the ANOVA and ANCOVA tests were performed again to test the interaction of sensor and pavement temperature in the winter season. As shown in Table 7, all p-values are greater than 0.10, which indicates that the effect of pavement temperature on the sensing data is not significant in the winter season (October to the following February). Figure 5 shows all threshold values that were used to classify pavement conditions in the two road sections. The pavement surface temperature is also shown in the figure, and it was noticed that the threshold values determined by the statistical analysis vary from month to month. As shown in Fig. 5, the maximum threshold values occurred in June and August in both road sections, which is attributed to higher pavement surface temperature. At the same time, the trend of threshold values from all sensors is approximately flat in both sections in the winter (October to February). Therefore, the paper suggests that the winter season might be a good time to collect vibration data, so that pavement temperature effects are avoided and a constant threshold value can be used for pavement distress classification. Table 8 shows all thresholds from sensors 1 to 4, which were placed on the control arms of the vehicle. Following the above analysis, the paper analyzed only the thresholds in the winter season (October to the following February) to calculate an average threshold value for determining poor pavement condition in the two road sections, using the following equations:

$$EB_{threshold} = \frac{1}{5} \sum_{i} ET_i, \qquad NB_{threshold} = \frac{1}{5} \sum_{i} NT_i$$

where $EB_{threshold}$ = mean threshold in road section 1; $NB_{threshold}$ = mean threshold in road section 2; $ET_i$, $NT_i$ = threshold values computed from the fitted distribution models; $i$ = 10, 11, 12, 1, and 2, representing the months in the data collection period.

Results in threshold values in pavement distress classification
As shown in Table 8

Comparison of proposed threshold values with IRI data
The paper used the mean threshold values of 1.79 g and 1.28 g to identify pavement distress points (known as points of interest, POI) in the raw data for the winter season (October to the following February) in the two road sections. The selected POI were imported into ArcGIS software and are graphically illustrated in Fig. 6. IRI data were obtained from ADOT for comparison with the selected POI, and the correlation between the selected POI and IRI values is shown in Fig. 7. In this case, segments with IRI values exceeding 95, which are identified as being in fair to poor condition, were selected and plotted on a GIS map [29]. As can be seen for road section 1 in Fig. 6, the poor IRI segments match well with the POI selected using the proposed threshold value. For road section 2, there is a segment, circled in blue, where selected POI are displayed without any poor IRI segment. This difference in the identification of pavement distress required further verification. The GoPro video was retrieved, and the team was able to locate the area and take several snapshots that show the pavement surface conditions (Fig. 6c). The images show deteriorated road surface conditions, which supports the selected POI on the map. It can be concluded that the proposed threshold values are valid and effective in pavement distress classification, based on the comparison with IRI and the verification with field images.
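The selection and comparison steps can be sketched as simple filters. This is an illustration only: the paper performs the comparison visually on a GIS map, whereas the sketch below matches POI to IRI segments by a hypothetical milepost field, and all record layouts and values are made up for the example.

```python
def select_poi(records, threshold_g):
    """Keep sensing records whose magnitude exceeds the threshold (in g)."""
    return [r for r in records if r["magnitude"] > threshold_g]

def poor_iri_segments(segments, iri_limit=95.0):
    """Keep IRI segments above the fair-to-poor limit used in the text."""
    return [s for s in segments if s["iri"] > iri_limit]

def poi_in_segments(pois, segments):
    """POI whose (hypothetical) milepost falls inside any given segment."""
    return [
        p
        for p in pois
        if any(s["start"] <= p["milepost"] < s["end"] for s in segments)
    ]

# Hypothetical records, for illustration only.
records = [
    {"milepost": 0.2, "magnitude": 1.50},
    {"milepost": 0.7, "magnitude": 1.90},
    {"milepost": 1.4, "magnitude": 2.10},
]
segments = [
    {"start": 0.5, "end": 1.0, "iri": 120.0},
    {"start": 1.0, "end": 1.5, "iri": 80.0},
]
pois = select_poi(records, threshold_g=1.79)
matched = poi_in_segments(pois, poor_iri_segments(segments))
unmatched = [p for p in pois if p not in matched]
```

POI left in `unmatched` correspond to the kind of location that the text resolves by checking the recorded video.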

Conclusions
The paper evaluated four factors (placement of sensors, temperature, mean speed, and threshold values) that influence the accuracy of pavement condition detection and draws the following conclusions from the proposed methodologies:
1. The ANOVA tests show that the means of magnitude values in log scale from sensors 1 to 4 differ in the pavement condition surveys (Table 2). Among the four sensors mounted on the vehicle, sensor 1 differs most significantly from the other sensors based on Tukey's tests (Table 3).
2. The differences between sensors on the same side of the vehicle (e.g. left or right side, front or rear wheels) are significant in pavement condition surveys, as indicated by the small p-values from the ANOVA tests (Table 4).
3. The interaction effects of sensors, pavement temperature, and mean speed in condition assessments were investigated through ANCOVA tests. The effect of pavement temperature on pavement distress detection is significant, based on the small p-values of the interaction models (Table 5) in the two road sections. The paper also suggests collecting vibration data during the winter season to reduce the effect of pavement temperature on pavement distress detection.
4. The mean speed is not an important factor in pavement distress detection, given the larger p-values obtained from the main effects model and the interaction model (Table 5).
5. Two thresholds (1.79 g for the section from 27th Ave. to 51st Ave and 1.28 g for the section from Baseline Rd. to Chandler Blvd.) were determined using statistical analysis to select POIs. Based on GIS mapping, these POIs are in good agreement with IRI data. The results indicate that the threshold values are valid and effective in the detection of deteriorated pavement conditions.
6. The findings of the paper can be used to improve the computing algorithms of vehicle based pavement sensing techniques.