Application of TRMM in the Hydrological Analysis of Upper Bengawan Solo River Basin

Rainfall is a major water resource with a significant role in growth, environmental concerns, and sustainability. Many human activities demand an adequate water supply for drinking, agricultural, domestic, and commercial use. The accuracy of any hydrologic study depends heavily on the availability of good-quality precipitation estimates. Many countries are unable to provide sufficient climatic data, including rainfall and observed discharge records. This scarcity is a major obstacle to conducting thorough hydrologic studies over a given period. Indonesia, as an archipelagic country, has long faced data availability problems. For this reason, the Tropical Rainfall Measuring Mission (TRMM), developed by NASA, has become an alternative solution to rainfall data limitations. However, to be applied in hydrologic investigations, TRMM data require proper evaluation and adjustment. The aim of this study was to evaluate the quality of TRMM rainfall data and its application in determining design flood and water availability. The correction process divides the data into several groups based on rainfall magnitude and multiplies each group by a correction coefficient. Objective functions, including false alarm ratio (FAR), probability of detection (POD), and root mean square error (RMSE), were applied to determine the correction. Rainfall-runoff modeling and design storm analysis at Delingan dam were used to study the performance of the TRMM correction. Based on the analysis, corrected TRMM agreed more closely with ground station data than uncorrected TRMM. Model calibration and verification using corrected TRMM data provided satisfactory model parameters compared to those derived from ground station data. The catchment response, as expressed by the derived rainfall-runoff model parameters, was also closer to the ground station results for corrected TRMM than for uncorrected TRMM. Furthermore, the design storm calculated from corrected TRMM reflects an improvement over uncorrected TRMM data.


Background
Hydrologic models simplify a real-world system in order to understand, predict, and manage water resources. These models represent the transformation of precipitation into streamflow in mathematical form, with complexity that varies with user requirements and data availability (Kite, 1996). The models are useful in several types of analysis, including flood, water availability, and waterworks design. The development of more complicated, physically realistic, distributed hydrologic models has significantly increased the demand for spatial data. Consequently, data collection agencies are under pressure to expand their conventional ground-based data networks to provide wider coverage and improved data resolution. Remote sensing technologies are often considered an innovative means of obtaining data at reduced cost (Koblinsky et al., 1992; Hardika, 2017).
Indonesia has long been hampered by hydrologic data limitations due to its geographical features. Remote sensing technologies make these data easier to acquire, which can lead to more accurate models. With data provided by satellite measurements, distributed model formulation becomes possible despite its greater demand for data. Satellite measurements can supply data for rural areas with no rainfall station or with terrain too difficult for land surveys.
Remote sensing permits the detection of spatio-temporal patterns of hydrologic data across large territories that would otherwise be inaccessible, and provides useful information on critical components of the hydrologic cycle, including precipitation, soil moisture, snow cover, and evapotranspiration (Immerzeel et al., 2009; Tang et al., 2009). Satellite data offer quick, near-real-time access and are spatially and temporally distributed, but they require verification and evaluation against ground station data. One reason validation is needed is that satellites measure rainfall using infrared and microwave radiation, and these waves are occasionally disturbed in the atmosphere (Wijaya et al., 2018; Willy et al., 2020). The errors between TRMM and ground station data need to be minimized before conducting further hydrologic analysis. This study introduces an approach to correct the TRMM data for improved model performance.

Problem Identification
Surface rainfall observations are point measurements: each represents environmental characteristics only at a single location and its immediate surroundings (Rozante et al., 2010). Satellite rainfall measurement, such as that of the Tropical Rainfall Measuring Mission (TRMM), provides area-averaged, remotely sensed precipitation on regularly spaced grid points. The satellite covers a large territory regardless of geographical features, including locations where ground rainfall measurement is difficult (Kidd and Huffman, 2011). Because the two data sets are collected by different methods, satellite rainfall cannot be applied directly. It first needs to be evaluated against ground station data, which are considered "the truth".
In this study, an approach to calibrate daily TRMM rainfall using ground station data is formulated, and the TRMM data are modified by linear regression. The result is validated using a hydrologic model at a location where ground data are assumed unavailable. This research is expected to yield a method for correcting TRMM data in areas with sparse or no rainfall stations.

Upper Bengawan Solo River Basin
The upper catchment of the Bengawan Solo River has a total area of 6,164.98 km² and contains 11 ground rainfall stations, of which 10 are used to obtain the correction coefficient for TRMM data, while the remaining one is reserved for verification. The 10 stations are Baturetno, Colo dam, Kalijambe, Klaten, Nepen, Pabelan, Parang Joho, Purwantoro, Tawangmangu, and Nawangan dam, with daily rainfall data available from 1998 to 2018. Delingan station, located at Delingan (Tirtomarto) dam, has a shorter record, with daily rainfall data available from 2012 to 2018 and recorded annual maximum daily rainfall from 1994 to 2018. This dam is selected for verification purposes because daily inflow data are available from 2015 to 2019. Delingan dam thus represents a site where hydrologic modeling is needed but observed rainfall data are treated as unavailable (observation data at Delingan are not included in determining the correction).
The TRMM grid size is 0.25° × 0.25°, approximately 28 km × 28 km. As a result, 18 grids are included in the catchment. Figure 1 shows the 11 rainfall stations and the TRMM grid for the catchment area, including the Delingan station, located in grid number 23. Corrected TRMM data at grid 23, obtained by applying the correction coefficients, are used in a rainfall-runoff model to calculate dependable flow for the Delingan Reservoir. The inflow calculated by the hydrologic model is then compared to observed inflow data.

Indicator of Accuracy
Accuracy indicators including root mean square error (RMSE), probability of detection (POD), and false alarm ratio (FAR) were applied. RMSE quantifies the error between the TRMM and ground station (GS) data sets, with lower values preferred. POD and FAR are objective functions that measure how accurately TRMM "predicts" a rainfall event. These indicators are also used to evaluate the correction coefficient on a daily basis (Beaufort, Gibier and Palany, 2019).
The root mean square error (RMSE) has been used as a standard statistical metric to measure model performance in meteorology, air quality, and climate studies. RMSE gives more weight to errors with larger absolute values than to errors with smaller absolute values, and so penalizes variance (Chai and Draxler, 2014). This makes RMSE better at revealing large anomalies. It is calculated using Equation (1).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^{2}} \tag{1}$$
where $n$ is the number of data and $e_i$ is the error at data point $i$. The false alarm ratio (FAR) is the number of incorrect TRMM predictions of a precipitation event divided by the total number of precipitation events predicted. A larger FAR indicates poorer TRMM performance, while a lower value indicates better performance. FAR is calculated using Equation (2).

$$\mathrm{FAR} = \frac{N_{\mathrm{Falsealarm}}}{N_{\mathrm{hits}} + N_{\mathrm{Falsealarm}}} \tag{2}$$
The probability of detection (POD) is the number of correct TRMM predictions of precipitation divided by the total number of observed events. In contrast to FAR, a higher POD indicates better TRMM performance at detecting precipitation events. POD is calculated using Equation (3).

$$\mathrm{POD} = \frac{N_{\mathrm{hits}}}{N_{\mathrm{hits}} + N_{\mathrm{misses}}} \tag{3}$$
where Nhits is the number of days on which TRMM correctly predicted precipitation, NFalsealarm is the number of days classified as precipitation by TRMM when no precipitation was detected at the rainfall station, and Nmisses is the number of days with precipitation that TRMM failed to detect.
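The three indicators can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names, the wet-day threshold of 0 mm, and the sample values are assumptions made for the example.

```python
import math

def rmse(errors):
    """Root mean square error of a sequence of errors e_i."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def contingency_counts(trmm, gauge, threshold=0.0):
    """Count hits, false alarms and misses for paired daily rainfall (mm).

    A day counts as wet when rainfall exceeds `threshold` (assumed value).
    """
    hits = false_alarms = misses = 0
    for t, g in zip(trmm, gauge):
        trmm_wet, gauge_wet = t > threshold, g > threshold
        if trmm_wet and gauge_wet:
            hits += 1
        elif trmm_wet:
            false_alarms += 1
        elif gauge_wet:
            misses += 1
    return hits, false_alarms, misses

def far(hits, false_alarms):
    """False alarm ratio: false alarms over all TRMM-predicted wet days."""
    return false_alarms / (hits + false_alarms)

def pod(hits, misses):
    """Probability of detection: hits over all observed wet days."""
    return hits / (hits + misses)

# Hypothetical six-day sample (mm/day), for illustration only.
trmm = [0.0, 2.0, 5.0, 0.0, 12.0, 1.0]
gauge = [0.0, 0.0, 7.0, 3.0, 20.0, 0.0]

hits, false_alarms, misses = contingency_counts(trmm, gauge)
sample_far = far(hits, false_alarms)
sample_pod = pod(hits, misses)
sample_rmse = rmse([t - g for t, g in zip(trmm, gauge)])
```

Days that are dry in both data sets do not enter FAR or POD, which is why only three counts are tracked.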

TRMM Correction
For daily TRMM correction, the data are classified into five groups based on rainfall magnitude, each with its own correction coefficient for the Upper Solo River basin. The division is needed because the error differs between low and high precipitation values: smaller TRMM daily precipitation tends to overestimate ground station data, while heavy rain is underestimated (As-Syakur et al., 2011). This difference occurs because TRMM measures rainfall using its sensors once the rain becomes visible in the atmosphere, and the TRMM value is the average rainfall over its grid.
Based on the studies of Mamenun (2014) and Wijaya (2018), a simplified regression equation is applied in TRMM correction: the TRMM value is multiplied by a correction coefficient derived from the rainfall station data. Previously, monthly TRMM correction in 6 Indonesian provinces representing 3 rainfall patterns was conducted using linear regression (Mamenun, Pawitan and Sophaheluwakan, 2014). Wijaya (2018) extended the Mamenun (2014) approach to daily TRMM data by dividing the rainfall data into several groups according to magnitude. In this research, daily TRMM data are corrected using the linear regression method with the rainfall data divided into several groups. Corrected TRMM is calculated using Equation (4).

$$R_{\mathrm{corr}} = C \cdot R_{\mathrm{TRMM}} \tag{4}$$
where $R_{\mathrm{TRMM}}$ is the uncorrected TRMM data (in mm), $R_{\mathrm{corr}}$ is the corrected TRMM data (in mm), and $C$ is the correction coefficient.
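The group-wise multiplication can be illustrated with a short Python sketch. The group boundaries and coefficients below are hypothetical placeholders chosen only to show the mechanics (coefficients below 1 for light rain, above 1 for heavy rain); they are not the values derived in this study.

```python
from bisect import bisect_right

# Hypothetical upper bounds (mm/day) separating five magnitude groups,
# and one hypothetical correction coefficient per group.
UPPER_BOUNDS = [5.0, 20.0, 50.0, 100.0]
COEFFICIENTS = [0.0, 0.8, 1.0, 1.2, 1.5]

def correct_trmm(rain_mm, bounds=UPPER_BOUNDS, coefficients=COEFFICIENTS):
    """Return corrected daily TRMM rainfall: coefficient of its group times the value.

    A value equal to a bound falls into the higher group.
    """
    group = bisect_right(bounds, rain_mm)
    return coefficients[group] * rain_mm

# Apply the correction to a hypothetical daily series.
daily_trmm = [3.0, 10.0, 35.0, 120.0]
daily_corrected = [correct_trmm(r) for r in daily_trmm]
```

Using a sorted list of bounds with `bisect_right` keeps the lookup independent of the number of groups, so the same function works if the grouping is changed.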

Correction Determination
To determine the TRMM correction, 10 of the 11 available rainfall stations across the Upper Bengawan Solo River basin, each with 21 years of daily data, were used. Delingan station data were not used in determining the correction but were reserved for TRMM validation. This arrangement was chosen because Delingan has a shorter daily rainfall record (2012 to 2018) than the other 10 rainfall stations.
Because daily rainfall events are random, RMSE was applied to the probability of occurrence rather than to individual daily values, in order to eliminate the impact of randomness. Rainfall data from both TRMM and ground stations were used to calculate the probability of occurrence, and the error due to the difference in probability was then evaluated using RMSE. Figure 2 shows that TRMM and ground station rainfall data vary insignificantly below 40 mm. The probability error at near-zero rainfall was caused by TRMM false alarms: TRMM predicted light rainfall on days when no precipitation was recorded at the ground station. TRMM correction using FAR, POD, and RMSE is a means of bringing the probability at low rainfall magnitudes into line with the ground stations.
TRMM shows two distinct errors compared to the ground stations: a lower probability of occurrence for small rainfall and a higher probability of occurrence for medium rainfall. By dividing the TRMM data into several groups, the correction coefficient changes smoothly between groups and yields sufficient correction. For the lowest magnitude group, POD and FAR were used simultaneously to obtain the optimum range. Large magnitudes correspond to heavy rain and are used for design storm calculation, so the largest group is defined as rainfall exceeding the 5-year return period value (larger than R5). Correcting daily TRMM causes NFalsealarm to decline, because the low TRMM values that produce false alarms are reduced. However, POD also falls as some correctly predicted TRMM precipitation is reduced to zero (Nhits become Nmisses). Therefore, the lower range is chosen by combining the FAR improvement and the POD reduction, expressed as the ratio of their differences before and after correction (ΔFAR/ΔPOD), with RMSE.
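The two quantities used in this step can be sketched as follows. The sketch assumes that "probability of occurrence" means the relative frequency of daily rainfall within magnitude bins, and that ΔFAR and ΔPOD are absolute differences before and after correction; both readings, along with the bin edges and sample data, are assumptions made for illustration.

```python
import math

def occurrence_probability(rain, bin_edges):
    """Relative frequency of daily rainfall in each bin [edge_k, edge_k+1)."""
    counts = [0] * (len(bin_edges) - 1)
    for r in rain:
        for k in range(len(counts)):
            if bin_edges[k] <= r < bin_edges[k + 1]:
                counts[k] += 1
                break
    return [c / len(rain) for c in counts]

def probability_rmse(trmm, gauge, bin_edges):
    """RMSE between the TRMM and ground station occurrence probabilities."""
    p_trmm = occurrence_probability(trmm, bin_edges)
    p_gauge = occurrence_probability(gauge, bin_edges)
    squared = sum((a - b) ** 2 for a, b in zip(p_trmm, p_gauge))
    return math.sqrt(squared / len(p_trmm))

def far_pod_ratio(far_before, far_after, pod_before, pod_after):
    """FAR improvement divided by POD reduction (assumed absolute differences)."""
    return (far_before - far_after) / (pod_before - pod_after)

# Hypothetical data (mm/day) and bin edges, for illustration only.
edges = [0.0, 1.0, 10.0, 40.0, math.inf]
trmm = [0.0, 0.5, 3.0, 12.0, 45.0, 0.0, 2.0, 8.0]
gauge = [0.0, 0.0, 0.0, 15.0, 60.0, 0.0, 4.0, 0.0]

prob_error = probability_rmse(trmm, gauge, edges)
ratio = far_pod_ratio(0.50, 0.40, 0.80, 0.75)
```

Comparing binned frequencies rather than day-by-day values means a rain event that TRMM places one day early or late does not count as an error, which is the sense in which randomness is removed.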

Water Availability Model
HBV-96 was applied as a hydrologic model to verify the TRMM correction. The resulting flow was compared to the flow simulated using ground station data and to the observed inflow to the Delingan Reservoir in order to assess the performance of the correction. In addition, the HBV-96 parameters obtained from ground station, uncorrected TRMM, and corrected TRMM data were compared to analyze the effect of using TRMM in place of ground station data at a location where such data are unavailable, and to assess how the correction changes TRMM performance. Discharge calculated using the HBV-96 model was then applied in the dependable flow calculation.

The HBV (Hydrologiska Byråns Vattenbalansavdelning) model is a rainfall-runoff model initially developed at the Swedish Meteorological and Hydrological Institute (SMHI) in the early 1970s. After various modifications, the result is the HBV-96 model (Zhang and Lindström, 1997); Figure 3 shows the model structure. HBV-96 uses a response function to control the dynamics and timing of runoff. The response function is divided into three boxes: the soil box, the upper response box, and the lower response box. These are governed by three recession parameters (K4, Kf, and α), the evaporation limit (LP), maximum soil moisture (FC), capillary flux rate (CFLUX), percolation rate (PERC), and beta coefficient (β).
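To make the role of the eight parameters concrete, the following Python sketch advances the three boxes by one daily time step. It is a simplified illustration under stated assumptions, not the model configuration used in this study: it omits the snow routine and the routing step of the full model, applies the fluxes in a fixed sequential order, and uses hypothetical parameter values and initial storages.

```python
def hbv96_step(precip, pet, state, par):
    """Advance simplified HBV-96 storages by one day (all quantities in mm).

    state: dict with soil moisture SM, upper box UZ, lower box LZ.
    par:   dict with FC, LP, BETA, CFLUX, PERC, KF, ALFA, K4.
    Returns (runoff, actual_evapotranspiration, new_state).
    """
    sm, uz, lz = state["SM"], state["UZ"], state["LZ"]
    fc = par["FC"]

    # Soil box: the wetter the soil, the larger the share of rain passed on.
    recharge = precip * (sm / fc) ** par["BETA"]
    sm += precip - recharge
    if sm > fc:  # the soil cannot hold more than FC; the excess also recharges
        recharge += sm - fc
        sm = fc

    # Actual evapotranspiration is reduced when soil moisture is below LP * FC.
    aet = min(sm, pet * min(1.0, sm / (par["LP"] * fc)))
    sm -= aet

    # Upper response box: receives recharge, loses capillary flux and percolation.
    uz += recharge
    capillary = min(uz, par["CFLUX"] * (1.0 - sm / fc))
    uz -= capillary
    sm += capillary
    percolation = min(uz, par["PERC"])
    uz -= percolation
    lz += percolation

    # Quick flow is nonlinear in UZ; base flow is linear in LZ.
    quick = min(uz, par["KF"] * uz ** (1.0 + par["ALFA"]))
    uz -= quick
    base = par["K4"] * lz
    lz -= base

    return quick + base, aet, {"SM": sm, "UZ": uz, "LZ": lz}

# Hypothetical parameters and initial storages, for illustration only.
params = {"FC": 200.0, "LP": 0.7, "BETA": 2.0, "CFLUX": 1.0,
          "PERC": 1.5, "KF": 0.05, "ALFA": 0.5, "K4": 0.02}
state = {"SM": 100.0, "UZ": 10.0, "LZ": 50.0}
runoff, aet, new_state = hbv96_step(20.0, 4.0, state, params)
```

Every flux is capped by the storage it draws from, so water is conserved across the step: rainfall equals runoff plus evapotranspiration plus the change in storage.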
Sensitivity analysis was conducted prior to calibrating the model with each rainfall data set (ground station, uncorrected TRMM, and corrected TRMM). HBV-96 has 8 model parameters representing the physical condition of the catchment area. Reasonable parameter values from previous HBV studies were used to aid the calibration process (Booij, 2005), which was performed using 2017 data for Delingan dam, with 2018 data used for verification. Table 3 lists the model parameters obtained using ground station data, uncorrected TRMM, and corrected TRMM for Delingan. Calibration with the three data sets produced slightly different parameter values. However, corrected TRMM yielded values closer to the ground station derived parameters than uncorrected TRMM. Before correction, 6 parameter values obtained from TRMM calibration differed from the ground station values. After correction, the difference was reduced to only two parameters, with values closer to the ground station derived parameters. The discharges calculated from uncorrected and corrected TRMM do not visibly differ much, but corrected TRMM gave a better relative volume error (RVE) than uncorrected TRMM. Figure 4 shows the duration curve for Delingan dam.
Daily TRMM data gave unsatisfactory results in the water availability model, with poorer correlation coefficient and Nash-Sutcliffe values.
In calculating dependable flow, TRMM tends to overestimate results compared to ground station rainfall data. At Delingan, dependable flow derived using TRMM was overestimated at higher confidence intervals. Dependable flow calculated using both uncorrected and corrected TRMM differed from the observation data; however, TRMM, particularly corrected TRMM, showed a similar catchment response in the HBV-96 model. Table 4 gives the Nash-Sutcliffe and correlation values for the calculation results, where parameters obtained from the calibration phase are used for the verification phase.
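The performance measures and the dependable flow named above can be sketched as follows. The Nash-Sutcliffe and RVE formulas are the standard definitions; the dependable flow function assumes an empirical duration curve with the Weibull plotting position, which may differ from the procedure used in this study. All sample values are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def relative_volume_error(observed, simulated):
    """RVE in percent: positive when the model overestimates total volume."""
    return 100.0 * (sum(simulated) - sum(observed)) / sum(observed)

def dependable_flow(flows, reliability):
    """Flow equalled or exceeded for the given fraction of time (0 to 1).

    Assumes the Weibull plotting position m / (n + 1) on flows ranked
    from largest to smallest, without interpolation.
    """
    ranked = sorted(flows, reverse=True)
    rank = int(reliability * (len(ranked) + 1) + 1e-9)
    rank = max(1, min(rank, len(ranked)))
    return ranked[rank - 1]

# Hypothetical inflow series (m3/s), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [2.0, 3.0, 4.0, 5.0, 6.0]
nse = nash_sutcliffe(observed, simulated)
rve = relative_volume_error(observed, simulated)
q80 = dependable_flow([float(q) for q in range(1, 10)], 0.80)
```

A simulation can track the shape of the hydrograph well and still carry a volume bias, which is why the efficiency and the volume error are reported separately.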

Design Storm Analysis
Design storm was calculated from annual maximum daily rainfall obtained from ground station data and TRMM data. The calculation used the generalized extreme value (GEV) probability distribution with return periods ranging from 2 to 1000 years. TRMM grid number 23, where Delingan dam is located, also contains two other rainfall stations, Colo Weir and Pabelan. The Delingan design storm was compared to both the uncorrected and corrected TRMM design storms and to those of the two other available rainfall stations. The calculated design storm is presented in Table 5 and in Figure 5.
TRMM values were underestimated for high-intensity rain events. This error was caused by the difference in measurement between TRMM and the ground station, since TRMM gives an area-averaged value. Correcting TRMM by multiplying large-magnitude rainfall by a correction coefficient greater than 1 significantly improved the design storm calculation and reduced the error. Apart from the Delingan dam rainfall station, the design storm calculated using TRMM also corresponded to the other rainfall stations in the same grid. By correcting TRMM data, the absolute error declined from 18.7% to 10.7%.
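Once the GEV parameters have been fitted, the design storm for a given return period follows from the inverse of the distribution function. The sketch below shows that step and the percentage error used to compare two estimates. Parameter fitting is omitted, and the location, scale, and shape values are hypothetical, not those fitted in this study.

```python
import math

def gev_quantile(return_period, loc, scale, shape):
    """Rainfall depth (mm) with the given return period (years) under a GEV.

    Assumed convention: F(x) = exp(-(1 + shape * (x - loc) / scale) ** (-1 / shape)),
    with shape = 0 giving the Gumbel distribution.
    """
    # y is minus the log of the annual non-exceedance probability 1 - 1/T.
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(shape) < 1e-12:
        return loc - scale * math.log(y)
    return loc + scale / shape * (y ** (-shape) - 1.0)

def absolute_percent_error(estimate, reference):
    """Absolute error of an estimate relative to a reference value, in percent."""
    return 100.0 * abs(estimate - reference) / abs(reference)

# Hypothetical GEV parameters (mm) for annual maximum daily rainfall.
LOC, SCALE, SHAPE = 80.0, 20.0, 0.1
design_storms = {t: gev_quantile(t, LOC, SCALE, SHAPE)
                 for t in (2, 5, 10, 25, 50, 100, 1000)}
```

Software packages differ in the sign convention for the shape parameter, so fitted values should be checked against the convention stated in the docstring before use.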

CONCLUSION
TRMM and ground station data differ due to variations in measurement method. Correcting TRMM based on ground station data can minimize this error. The correction improved RMSE by 58.18% (from 0.033 to 0.014) and FAR by 3.43% (from 0.459 to 0.443), while POD decreased by 7.19% (from 0.777 to 0.721), with ΔFAR/ΔPOD at 0.056. The corrected TRMM data were applied in two analyses: dependable flow and design storm. Rainfall-runoff model calibration and verification using corrected TRMM data produced satisfactory model parameters compared to ground station derived parameters. The design storm calculated from corrected TRMM showed a significant improvement over uncorrected TRMM data.