Regional Frequency Analysis of Rainfall Using L-Moment Method as A Design Rainfall Prediction

Frequency analysis is a method for predicting the probability of future hydrological events from historical data. Frequency analysis of rainfall and discharge data is generally carried out using the moment method, but the moment method has a large bias, variance, and skewness, so it can produce inaccurate hydrological design magnitudes. The L-moment method is a linear combination of Probability Weighted Moments (PWM), which processes data concisely and linearly. This research applies the L-moment method to obtain a regional probability distribution and design rainfall, which serve as a basis for hydrological planning in anticipation of disasters. The Mount Merapi area was chosen as the study location in order to predict more accurately the maximum rainfall that could trigger cold lava flows there, and so reduce the risk of loss to the people living around Mount Merapi. The regional L-moment ratios are τ2 = 0.203, τ3 = 0.166, and τ4 = 0.169. The homogeneity and heterogeneity tests show that all rainfall stations are homogeneous, and the discordance test excluded no data. The growth factor increases with the return period of the design rainfall. The regional probability distribution suitable for the study area is the Generalized Logistic distribution, for which a design rainfall equation has been formulated. The regional design rainfall can be used to predict rainfall events in the Mount Merapi area. The model test showed a minimum RBias of 0.45%, a maximum RBias of 41.583%, a minimum RRSME of 0.45%, and a maximum RRSME of 71.01%. The stability of the L-moment method is shown by a model test error between 1.64% and 16.60%. The larger errors at longer return periods show that, although the L-moment method can reduce bias, it has limitations at high return periods.


INTRODUCTION
The probability of hydrological events can be predicted from historical data by means of frequency analysis. Frequency analysis serves as the basis for hydrological planning to anticipate events that may occur in the future. It is used to predict extreme events, such as design rainfall or design floods, that can cause flooding, so that flood protection structures can be planned to minimize risk. When data samples of the same variable are available at several sites in one region, a regional frequency analysis is required. Frequency analysis of both rainfall and discharge data is generally performed using the moment method. The moment method involves a nonlinear transformation of the data: the higher the sample moment, the more unstable it is, and the method has considerable bias, variance, and skewness. Nonlinear transformations can cause deviations and wrong parameter estimates because they give excessive weight to values that lie far from the majority of the data.
The L-moment method processes data concisely and linearly. Nonlinear data transformations are avoided because the data are not squared or cubed, so the bias often found in the moment method is reduced.
The L-moment method has an advantage over conventional moments in that it can characterize a wider range of distributions, and L-moments estimated from a sample are more accurate. L-moments and probability weighted moments (PWM) are used to summarize the theoretical probability distribution that will be used in the regional frequency analysis. The moment and L-moment methods need to be assessed for their applicability to regional rainfall frequency analysis for estimating design rainfall at specific return periods. The L-moment study is expected to provide more accurate results for the Mount Merapi research area. The research location was chosen in order to predict more accurately the maximum rainfall that could cause debris flows in the area. The design rainfall can then serve as the basis for a discharge analysis used to plan flood protection works that reduce the risk to people living in the Mount Merapi area.

Study Area and Data
The study area is the Mount Merapi area, where rainfall stations record daily rainfall. In this study, 21 rainfall stations were selected based on the distribution of their locations and the availability of rainfall data. The distribution of rainfall stations in the study area is shown in Figure 1.
This study uses daily rainfall records from 1980 to 2018 at 21 rainfall stations, with complete record lengths of 16 to 37 years. Years with incomplete data are not used in the analysis. Data are selected as an annual maximum series, that is, the maximum daily rainfall in each year. The Sorasan rainfall station is not included in the frequency analysis because it is used as the reference station in the model test. Sorasan was selected as the reference because it has the most complete and continuous records and therefore the most stable data.
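The annual maximum series selection described above can be sketched as follows. This is only a minimal illustration, not the processing actually used in the study: the record layout (date, rainfall in mm), the function name `annual_maximum_series`, and the completeness threshold `min_days` are assumptions introduced here.

```python
from collections import defaultdict
from datetime import date


def annual_maximum_series(records, min_days=365):
    """Return {year: maximum daily rainfall} for sufficiently complete years.

    records  : iterable of (date, rainfall_mm); rainfall_mm may be None
               when the observation is missing.
    min_days : hypothetical completeness rule; years with fewer valid
               observations are dropped from the series.
    """
    by_year = defaultdict(list)
    for day, rain in records:
        if rain is not None:
            by_year[day.year].append(rain)
    return {
        year: max(values)
        for year, values in sorted(by_year.items())
        if len(values) >= min_days
    }


# Made-up data, with the threshold relaxed to 2 observations per year.
sample = [
    (date(1980, 1, 1), 12.0),
    (date(1980, 1, 2), 85.5),
    (date(1981, 1, 1), 40.0),
    (date(1981, 1, 2), None),  # 1981 keeps only one valid value and is dropped
]
ams = annual_maximum_series(sample, min_days=2)
print(ams)
```

In practice the threshold would be chosen to match whatever definition of an incomplete year the data provider uses.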

Study Frameworks
The moment method gives quantitative measures of the geometric properties of a probability distribution. Moments are usually used to describe the stability of a sample; higher moments are less stable and require additional information (Takeleb, 2010). To estimate the discharge or design rainfall for a given return period, the analysis is carried out through a statistical approach (Soewarno, 1995). The statistical approach includes moment parameters such as the mean, standard deviation, coefficient of variation, coefficient of skewness, and coefficient of kurtosis. The most widely used probability distributions in hydrology are the Normal, Log-Normal, Log Pearson III, and Gumbel distributions (Chow V.T., 2010). The goodness-of-fit tests used with the moment method are the Chi-Square and Smirnov-Kolmogorov tests (Sri Harto, 2009). The homogeneity test measures the uniformity of data in an area; it was developed for flood analysis but can also be applied to rainfall data (Darlymple, 1960). L-moments are linear combinations of order statistics that, like conventional moments, summarize the statistical parameters of a probability function or of a data set. The L-moment method is analogous to the ordinary moment method, with L-moments estimated as linear combinations of Probability Weighted Moments (PWM). Regional frequency analysis with L-moments is applied because the data samples analyzed are observations of the same variable at sites in one region (Hosking and Wallis, 2009).
Distribution selection with the L-moment method compares the theoretical L-skewness (L-Cs) and L-kurtosis (L-Ck) of each candidate distribution with the L-moment ratios of the observed data on an L-moment diagram. The L-moment parameters are identified for each theoretical probability distribution drawn in the diagram in order to find the distribution that matches the data. The theoretical curves are derived from the inverse of the cumulative distribution function (cdf) of each distribution: Uniform, Exponential, Logistic, Normal, Log-Normal, Generalized Pareto (GPA), Generalized Logistic (GLO), Generalized Extreme Value (GEV), Gumbel, and Pearson 3.
The L-moment parameters are based on PWM theory (Hosking and Wallis, 2009) (Maidment, 1993) (Malekinezhad, 2014):
$$\lambda_1 = \beta_0, \quad \lambda_2 = 2\beta_1 - \beta_0, \quad \lambda_3 = 6\beta_2 - 6\beta_1 + \beta_0, \quad \lambda_4 = 20\beta_3 - 30\beta_2 + 12\beta_1 - \beta_0$$
where β is the probability weighted moment of the data population and λ is the L-moment parameter. The L-moment ratios are based on the equations (Hosking and Wallis, 2009):
$$\tau_2 = \frac{\lambda_2}{\lambda_1}, \quad \tau_3 = \frac{\lambda_3}{\lambda_2}, \quad \tau_4 = \frac{\lambda_4}{\lambda_2}$$
Data screening determines whether the data at one location are discordant with the data of the region as a whole. The screening is based on the difference between the L-moment ratios of a location and the average L-moment ratios of all locations in the region. The equations used are (Hosking and Wallis, 2009):
$$u_i = \left[\tau_2^{(i)}, \tau_3^{(i)}, \tau_4^{(i)}\right]^T, \quad \bar{u} = \frac{1}{N}\sum_{i=1}^{N} u_i, \quad A = \sum_{i=1}^{N} (u_i - \bar{u})(u_i - \bar{u})^T, \quad D_i = \frac{1}{3} N (u_i - \bar{u})^T A^{-1} (u_i - \bar{u})$$
where u_i is the vector of sample L-moment ratios at location i, N is the number of locations, ū is the unweighted regional mean of the L-moment ratios, A is the cross-product matrix, and D_i is the discordance test value at location i. Location i is considered discordant if D_i exceeds the critical value (Di-critic).
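The step from probability weighted moments to L-moments and L-moment ratios can be illustrated with a short sketch. It assumes the usual unbiased sample PWM estimator and the standard linear relations between PWMs and the first four L-moments; the function name `sample_l_moments` and the dictionary keys are hypothetical and are not taken from the paper.

```python
def sample_l_moments(data):
    """Return sample L-moments and L-moment ratios from unbiased PWM estimates.

    b_r = (1/n) * sum_j [ (j-1)(j-2)...(j-r) / ((n-1)(n-2)...(n-r)) ] * x_(j)
    with x_(1) <= ... <= x_(n) the ordered sample.
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least 4 observations are needed for the 4th L-moment")

    b = [0.0, 0.0, 0.0, 0.0]
    for j, xj in enumerate(x, start=1):
        weight = 1.0
        for r in range(4):
            if r > 0:
                weight *= (j - r) / (n - r)  # builds the product term by term
            b[r] += weight * xj
    b0, b1, b2, b3 = (value / n for value in b)

    # Linear combinations of PWMs
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    if l1 == 0 or l2 == 0:
        raise ValueError("ratios are undefined when l1 or l2 is zero")

    return {
        "l1": l1, "l2": l2, "l3": l3, "l4": l4,
        "t2": l2 / l1,  # L-CV
        "t3": l3 / l2,  # L-skewness (L-Cs)
        "t4": l4 / l2,  # L-kurtosis (L-Ck)
    }


# Evenly spaced made-up data: symmetric, so L-skewness should vanish.
result = sample_l_moments([3, 1, 5, 2, 4])
print(result)
```

Because every step is a weighted sum of the ordered observations, no value is squared or cubed, which is the property the text relies on when it contrasts L-moments with conventional moments.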
The heterogeneity test assesses whether the rainfall gauge station locations can be treated as a uniform region. The equations used are (Hosking and Wallis, 2009):
$$V = \left[\frac{\sum_{i=1}^{N} n_i \left(\tau_2^{(i)} - \tau_2^{R}\right)^2}{\sum_{i=1}^{N} n_i}\right]^{1/2}, \quad H = \frac{V - \mu_V}{\sigma_V}$$
where τ2^R is the regional L-moment ratio, n_i is the length of data at location i, τ2^(i) is the L-moment ratio at location i, and μ_V and σ_V are obtained from Kappa distribution methods (Karian Z.A, 2010) (Hasby, 2014). The criteria established by Hosking and Wallis (2009) for assessing the heterogeneity of a region are: if H < 1 the area is homogeneous, if 1 ≤ H < 2 the area is possibly homogeneous, and if H ≥ 2 the area is heterogeneous.
The regional probability distribution for extreme rainfall is selected using the regional statistics τ3^R and τ4^R, which are compared with the L-moment diagram (Vogel and Fennesey, 1993). After a candidate distribution has been selected by visual observation, it is tested with the goodness-of-fit test using the equation (Hosking and Wallis, 2009) (Vogel and Fennesey, 1993):
$$Z^{DIST} = \frac{\tau_4^{DIST} - \tau_4^{R}}{\sigma_4}$$
where τ4^DIST is the L-kurtosis of the fitted distribution and σ4 is the standard deviation of τ4^R. The at-site quantile is then determined by matching all data in the selected area with the distribution. The equation used is (Hosking and Wallis, 2009):
$$Q_i(F) = \lambda_1^{(i)} \, x(F)$$
where Q_i(F) is the at-site quantile at location i, λ1^(i) is the average at location i, F is the non-exceedance probability, and x(F) is the regional growth factor.
The model test uses the relative bias (RBias) and the relative root mean square error (RRSME) (Hosking and Wallis, 2009) (Maidment, 1993). Both are computed at each location i from the at-site quantile of the moment method and the at-site quantile of the L-moment method.
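As an illustration of these two error measures, the sketch below uses their common textbook definitions: the mean of the relative deviations for RBias, and the square root of the mean squared relative deviation for the relative root mean square error (abbreviated RRSME in this paper). Whether the study averaged over stations, return periods, or simulations is not stated, so the averaging set, the function names, and the sample numbers are all assumptions.

```python
import math


def relative_bias(estimates, references):
    """Mean relative deviation of estimates from references, in percent."""
    ratios = [(e - r) / r for e, r in zip(estimates, references)]
    return 100.0 * sum(ratios) / len(ratios)


def relative_rmse(estimates, references):
    """Root mean square of the relative deviations, in percent."""
    ratios = [(e - r) / r for e, r in zip(estimates, references)]
    return 100.0 * math.sqrt(sum(q * q for q in ratios) / len(ratios))


# Made-up quantiles (mm): L-moment estimates against moment-method references.
l_moment_quantiles = [110.0, 90.0]
moment_quantiles = [100.0, 100.0]
rbias = relative_bias(l_moment_quantiles, moment_quantiles)
rrmse = relative_rmse(l_moment_quantiles, moment_quantiles)
print(rbias, rrmse)
```

The made-up example shows why both measures are reported: deviations of opposite sign cancel in the bias but not in the root mean square error.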

Homogeneity Test
The homogeneity test is used to check the uniformity of data in the study area. The first step is to determine the rainfall for return periods of 2.33 and 10 years, R2.33 and R10, then to compute the ratio of R2.33 and R10 at each rainfall station and the average of these ratios. Next, the average ratio is multiplied by R10 to find the return period T at each rainfall station. The results are plotted on a semi-log graph with the record length as abscissa and T (years) as ordinate, together with the 95% confidence limit curves drawn from the confidence limit values of Darlymple. Finally, the plot is inspected visually: if any points fall outside the confidence limits, the data are heterogeneous. The homogeneity test result is shown in Figure 2.
Figure 2 shows that all rainfall stations in the Mount Merapi area have homogeneous data, so the data from every rainfall station can be used in the analysis. Data falling outside the confidence limits would have been excluded from the next stage of analysis.

Moment Method
Statistical parameters are used as the basis for determining the probability distribution that fits the available data. The statistical measures most often used in hydrological data analysis are measures of central tendency and measures of dispersion. The measure of central tendency is the mean, which is taken as the central value of the distribution. Dispersion is the degree to which the data are spread around the mean. Measures of dispersion include the standard deviation, coefficient of variation, coefficient of skewness, and coefficient of kurtosis. The standard deviation (S) is derived from the second moment about the mean and shows the spread of the data: the greater the standard deviation, the more scattered the data.
The coefficient of variation (Cv) is the ratio of the standard deviation to the mean. The coefficient of skewness (Cs) is based on the third moment about the mean and shows the degree of asymmetry of the distribution. The coefficient of kurtosis (Ck) is based on the fourth moment about the mean and measures the peakedness of the distribution curve. These measures of central tendency and dispersion are known as moment parameters in the conventional statistical method. The moment parameter values at each rainfall station are given in Table 1.
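The moment parameters can be computed as in the following sketch. It assumes the sample-corrected estimators that are common in hydrology textbooks (division by n - 1 for S, and the (n - 1)(n - 2) and (n - 1)(n - 2)(n - 3) corrections for Cs and Ck); the paper does not state which estimator variants it used, and the function name `moment_parameters` is hypothetical.

```python
import math


def moment_parameters(data):
    """Return (mean, S, Cv, Cs, Ck) using sample-corrected estimators."""
    n = len(data)
    if n < 4:
        raise ValueError("at least 4 observations are needed for Ck")
    mean = sum(data) / n
    dev = [x - mean for x in data]

    s = math.sqrt(sum(d ** 2 for d in dev) / (n - 1))  # standard deviation
    cv = s / mean                                       # coefficient of variation
    cs = n * sum(d ** 3 for d in dev) / ((n - 1) * (n - 2) * s ** 3)
    ck = n ** 2 * sum(d ** 4 for d in dev) / ((n - 1) * (n - 2) * (n - 3) * s ** 4)
    return mean, s, cv, cs, ck


# Made-up, evenly spaced sample: symmetric, so Cs should be zero.
mean, s, cv, cs, ck = moment_parameters([1.0, 2.0, 3.0, 4.0, 5.0])
print(mean, s, cv, cs, ck)
```

The third and fourth powers of the deviations are what make Cs and Ck sensitive to a few extreme values, which is the instability of higher moments that motivates the L-moment approach.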
The moment parameter values vary between stations: the average mean is 104.500 mm, the average standard deviation is 39.502, the average Cv is 0.389, the average Cs is 0.885, and the average Ck is 4.736.
The moment method frequency analysis was carried out with a frequency analysis program (Luknanto, D., 2019). Of the rainfall stations analyzed, 9 follow the Log Pearson III distribution, 4 the Log-Normal distribution, 4 the Gumbel distribution, and 3 the Normal distribution.
The moment method design rainfall is obtained for return periods of 2, 5, 10, 20, 50, and 100 years based on the distribution selected at each rainfall station. The moment method design rainfall is plotted in Figure 3.  Maron is 70.335, and Ngepos is 52.936. The standard deviation shows the spread of the data; a large standard deviation indicates that the maximum rainfall data at those three stations are spread over a wide range.

L-Moment Method
Regional frequency analysis using the L-moment method consists of the following steps: determining the L-moment parameters and ratios, screening the data with the discordance test, checking the uniformity of the data with the heterogeneity test, selecting the probability distribution with the goodness-of-fit test, determining the growth factors, and predicting the design rainfall. The L-moment parameters are based on probability weighted moment (PWM) theory and are calculated using the equations formulated by Hosking and Wallis. The L-moment parameter values, given in Table 2, are then used to determine the L-moment ratios used for selecting the regional probability distribution.
In Table 2, the average L-moment parameters are λ1 = 105.685, λ2 = 21.012, λ3 = 3.539, and λ4 = 3.517. The maximum λ1 is 154.350 at Gn. Maron station and the minimum is 63.504 at Puncanganom station. The maximum λ2 is 39.329 at Gn. Maron station and the minimum is 15.297 at Deles station. The maximum λ3 is 9.038 at Gn. Maron station and the minimum is -1.850 at Kaliurang station.
The regional ratios are then obtained by averaging the L-moment ratios of the rainfall stations, weighted by the amount of data at each station, which gives the regional values τ2 = 0.203, τ3 = 0.166, and τ4 = 0.169.
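The regional ratio calculation can be sketched as a record-length weighted average, which is the usual Hosking and Wallis convention and is assumed here to be what the text means by using the amount of data at each station. The function name and the station values below are made up for illustration and are not taken from Table 2.

```python
def regional_average(ratios, record_lengths):
    """Record-length weighted mean of an at-site L-moment ratio."""
    if len(ratios) != len(record_lengths):
        raise ValueError("one record length is needed per station")
    total_years = sum(record_lengths)
    weighted_sum = sum(n * t for n, t in zip(record_lengths, ratios))
    return weighted_sum / total_years


# Two hypothetical stations: 30 years with L-CV 0.20, 10 years with L-CV 0.30.
tau2_regional = regional_average([0.20, 0.30], [30, 10])
print(tau2_regional)
```

Weighting by record length gives stations with longer records more influence, since their sample ratios are less variable.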
Data screening in regional frequency analysis checks that the data represent the region, so that all data can be described by the same probability distribution. The discordance test identifies whether the data at one location differ from the data of the region as a whole, based on the difference between the L-moment ratios of that location and the regional ratios. A rainfall station is considered discordant if its Di value exceeds the critical value (Di-critic). Di is calculated at each location, and the critical value depends on the number of locations, as determined by Hosking and Wallis.
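A minimal sketch of the discordance calculation follows. It assumes the Hosking and Wallis form of the measure, D_i = (N/3)(u_i - ū)ᵀ A⁻¹ (u_i - ū), where u_i holds the three L-moment ratios of station i and A is the sum of cross-products of the deviations. The helper `solve_linear`, the function `discordance`, and the station values are hypothetical; a linear algebra library could replace the small solver.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("cross-product matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def discordance(sites):
    """sites: list of (t2, t3, t4) per station. Returns the list of D_i."""
    n_sites = len(sites)
    mean = [sum(u[k] for u in sites) / n_sites for k in range(3)]
    dev = [[u[k] - mean[k] for k in range(3)] for u in sites]
    # A = sum over stations of (u_i - mean)(u_i - mean)^T
    a = [[sum(d[r] * d[c] for d in dev) for c in range(3)] for r in range(3)]
    values = []
    for d in dev:
        y = solve_linear(a, d)  # y = A^-1 (u_i - mean)
        values.append(n_sites / 3.0 * sum(d[k] * y[k] for k in range(3)))
    return values


# Made-up L-moment ratios for six hypothetical stations.
stations = [
    (0.20, 0.15, 0.17), (0.22, 0.18, 0.16), (0.19, 0.10, 0.20),
    (0.25, 0.20, 0.12), (0.18, 0.12, 0.19), (0.30, 0.05, 0.30),
]
d_values = discordance(stations)
flagged = [i for i, d in enumerate(d_values) if d > 3.0]  # critical value 3 from the text
print(d_values, flagged)
```

With this definition the D_i values always average to 1, which gives a quick sanity check on an implementation before comparing against the critical value.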
The analysis covers 20 locations, so the critical value is 3. In Table 2, all Di values are less than 3, so no rainfall station is excluded from the analysis. The heterogeneity test determines the non-uniformity of the region based on the difference between the ratios at each location and the regional ratio. The Kappa distribution is used, with location parameter (ξ), scale parameter (α), and shape parameters (h, k). The coefficients are determined by the formula of Karian and Dudewicz. Interpolating the Kappa distribution coefficients at the regional ratios of the study area gives A = 0.815753, B = 0.428046, h = -0.55604, and k = 0.11634. The heterogeneity test formula of Hosking and Wallis then gives H = 0.46. Since H < 1, all rainfall station locations are homogeneous and no rainfall station is excluded.
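The last step of the heterogeneity test can be sketched as below, assuming the Hosking and Wallis definitions: V is the record-length weighted standard deviation of the at-site L-CV values, and H standardizes V with the mean and standard deviation of V from Kappa-distribution simulations. The simulation itself is not reproduced; `mu_v` and `sigma_v` are treated as given inputs, and all names and sample numbers are hypothetical.

```python
import math


def weighted_lcv_dispersion(lcv_values, record_lengths):
    """V: record-length weighted standard deviation of at-site L-CV."""
    total = sum(record_lengths)
    regional = sum(n * t for n, t in zip(record_lengths, lcv_values)) / total
    spread = sum(n * (t - regional) ** 2 for n, t in zip(record_lengths, lcv_values))
    return math.sqrt(spread / total)


def heterogeneity(v_observed, mu_v, sigma_v):
    """H = (V - mu_V) / sigma_V, with mu_V and sigma_V from simulation."""
    return (v_observed - mu_v) / sigma_v


def classify(h):
    """Apply the thresholds H < 1, 1 <= H < 2, H >= 2."""
    if h < 1.0:
        return "homogeneous"
    if h < 2.0:
        return "possibly homogeneous"
    return "heterogeneous"


# Hypothetical stations; mu_v and sigma_v below are placeholders, not study values.
v = weighted_lcv_dispersion([0.18, 0.22, 0.20], [20, 20, 30])
h = heterogeneity(v, mu_v=0.012, sigma_v=0.006)
print(v, h, classify(h))
print(classify(0.46))  # the H value reported for the study region
```

A small or negative H means the observed between-site spread is no larger than what sampling variability alone would produce in a homogeneous region.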
Distribution selection with the L-moment method compares the theoretical L-Cs and L-Ck of each distribution with those of the observed data. The L-moment diagram shows the relationship between the L-moment ratios of the data and the L-moment ratios of several distributions. The diagram depicts ten probability distributions: Uniform, Exponential, Logistic, Normal, Log-Normal, Generalized Pareto (GPA), Generalized Logistic (GLO), Generalized Extreme Value (GEV), Gumbel, and Pearson 3. Each curve is based on the cdf of the distribution, whose parameters have been formulated by Hosking, Wallis, and Maidment. The regional weighted averages τ3 and τ4 are plotted on the diagram to see which probability distributions the data tend toward. Fig. 4 shows visually that the weighted average lies close to the GEV and GLO distribution curves, so these two distributions are carried forward to the goodness-of-fit test. The visual observation must be confirmed with the goodness-of-fit test, which is based on L-moment statistics; a distribution is accepted if |Z| ≤ 1.64.
Applying the Hosking and Wallis formula to the data gives |Z GEV| = 0.295 and |Z GLO| = 0.294. The Z value of the GLO distribution is lower than that of the GEV distribution, so the distribution selected for the Mount Merapi area is the Generalized Logistic (GLO) distribution. The regional equation is then determined from the L-moment parameters and the inverse cdf of the GLO probability distribution (Hosking, 2009):
$$x(F) = \xi + \frac{\alpha}{k}\left[1 - \left(\frac{1-F}{F}\right)^{k}\right]$$
The regional probability distribution equation is obtained from this form with the regional parameters, where x(F) is the growth factor of design rainfall and F is the probability for return period T.
Based on the regional distribution equation, the growth factor is determined for each return period. The growth factors for return periods of 2, 5, 10, 20, 50, and 100 years are shown in Table 3. The design rainfall at each rainfall station in the Mount Merapi area is obtained by multiplying the L-moment parameter λ1 of the station by the growth factor for each return period of 2, 5, 10, 20, 50, and 100 years, as presented in Fig. 5.
The regional design rainfall is compared with the design rainfall of the conventional method to determine the trend of the rainfall data. In Fig. 6, the L-moment design rainfall is plotted as regional rainfall (Rregional) on the ordinate and the moment method design rainfall as at-site rainfall (Rat-site) on the abscissa. The plot shows that the points are densely clustered at low return periods and spread out at high return periods. This is caused by deviations in the data at several rainfall stations where the standard deviation or the λ2 parameter is quite high. The spread occurs because the historical data at several rainfall stations vary widely.
A sample calculation of the model test equations yields 14.47%. The results of the RBias and RRSME calculations are shown in Figures 6 and 7. A model test was then performed at Sorasan Station as the reference rainfall station, because it has the most complete and continuous records and therefore the most stable data.
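The growth factor and index-rainfall steps can be illustrated with the sketch below. It assumes the Hosking parameterization of the GLO distribution, with k = -τ3, α = λ2 sin(kπ)/(kπ), ξ = λ1 - α(1/k - π/sin(kπ)), and the regional distribution scaled so that λ1 = 1 and λ2 = τ2. The paper's own fitted equation is not available here, so the function names are hypothetical and the output is not claimed to reproduce Table 3.

```python
import math


def glo_parameters(l1, l2, t3):
    """Return (xi, alpha, k) of the GLO distribution from L-moments."""
    k = -t3
    if k == 0.0:
        raise ValueError("k = 0 is the logistic limit, not handled in this sketch")
    alpha = l2 * math.sin(k * math.pi) / (k * math.pi)
    xi = l1 - alpha * (1.0 / k - math.pi / math.sin(k * math.pi))
    return xi, alpha, k


def glo_quantile(f, xi, alpha, k):
    """Inverse cdf: x(F) = xi + alpha/k * (1 - ((1 - F)/F)**k)."""
    return xi + alpha / k * (1.0 - ((1.0 - f) / f) ** k)


def growth_factor(return_period, tau2, tau3):
    """Dimensionless regional quantile for a return period T (F = 1 - 1/T)."""
    f = 1.0 - 1.0 / return_period
    xi, alpha, k = glo_parameters(1.0, tau2, tau3)
    return glo_quantile(f, xi, alpha, k)


def design_rainfall(station_l1, return_period, tau2, tau3):
    """Index-rainfall estimate: station lambda_1 times the growth factor."""
    return station_l1 * growth_factor(return_period, tau2, tau3)


# Regional ratios quoted in the text; lambda_1 = 105.685 is the regional average.
periods = (2, 5, 10, 20, 50, 100)
factors = [growth_factor(t, 0.203, 0.166) for t in periods]
rainfall = [design_rainfall(105.685, t, 0.203, 0.166) for t in periods]
for t, g, r in zip(periods, factors, rainfall):
    print(t, round(g, 3), round(r, 1))
```

Because the growth factor is dimensionless, the same set of factors is reused for every station and only λ1 changes, which is what makes the regional approach compact.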
The statistical parameters of the Sorasan rainfall station are: X̄ = 85.689 mm, S = 46.218 mm, Cv = 0.539, Cs = 2.227, and Ck = 10.33. The Sorasan rainfall data follow the Gumbel distribution, so the design rainfall for return periods of 2, 5, 10, 20, 50, and 100 years is 78, 119, 146, 172, 205, and 231 mm, respectively.
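The Gumbel design rainfall at the reference station can be checked with a short sketch. It assumes the method-of-moments frequency factor for an infinitely long sample, K_T = -(√6/π)[0.5772 + ln ln(T/(T - 1))]; the paper does not state which variant of the Gumbel frequency factor it used, so this is one plausible reading. Under this assumption the results agree with the values quoted above to within 1 mm.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def gumbel_frequency_factor(return_period):
    """K_T for the Gumbel distribution (method of moments, large-sample form)."""
    inner = math.log(return_period / (return_period - 1.0))
    return -(math.sqrt(6.0) / math.pi) * (EULER_GAMMA + math.log(inner))


def gumbel_design_rainfall(mean, std, return_period):
    """Design value x_T = mean + K_T * std."""
    return mean + gumbel_frequency_factor(return_period) * std


# Sorasan station statistics quoted in the text (mm).
SORASAN_MEAN = 85.689
SORASAN_STD = 46.218

periods = (2, 5, 10, 20, 50, 100)
design = [gumbel_design_rainfall(SORASAN_MEAN, SORASAN_STD, t) for t in periods]
for t, value in zip(periods, design):
    print(t, round(value, 1))
```

A finite-sample frequency factor based on the reduced mean and reduced standard deviation for the record length would give somewhat different values, so the choice of variant matters when comparing against other software.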
The design rainfall of the reference station is compared with the regional design rainfall in Fig. 9. The regional design rainfall is calculated by multiplying the mean rainfall of the reference station by the growth factor. The error is then calculated with the Sorasan design rainfall as the reference. Example for the 2-year return period:
R Sorasan = 78 mm
Growth factor for the 2-year return period = 1.063
R regional (L-moment) = 85.689 mm x 1.063 = 91 mm
|Error| = (91 mm - 78 mm) / 78 mm x 100% = 16.60%
Figure 9. Model Test of Rregional to Rat-site
From Fig. 8, the model test gives a minimum error of 1.64% and a maximum error of 16.60%. As can be seen in the chart, the higher errors occur at the low return period and at very high return periods. The regional design rainfall from the L-moment method therefore needs further development for analysis at high return periods, even though the method can reduce bias in the data.
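The error calculation for the 2-year return period can be retraced as follows. The sketch assumes that the regional estimate is the station mean (85.689 mm) multiplied by the growth factor, which is the reading consistent with the quoted 91 mm; the function name is hypothetical. With the rounded values 91 mm and 78 mm the result is about 16.7%, close to the reported 16.60%, and the small difference presumably comes from rounding.

```python
def absolute_relative_error(regional, reference):
    """|regional - reference| / reference, in percent."""
    return abs(regional - reference) / reference * 100.0


station_mean = 85.689      # Sorasan mean rainfall (mm), from the text
growth_factor_2yr = 1.063  # 2-year growth factor, from the text
reference_2yr = 78.0       # Sorasan 2-year design rainfall (mm), from the text

regional_2yr = station_mean * growth_factor_2yr
error_unrounded = absolute_relative_error(regional_2yr, reference_2yr)
error_rounded = absolute_relative_error(round(regional_2yr), reference_2yr)
print(round(regional_2yr), error_unrounded, error_rounded)
```

Repeating the same calculation for each return period gives the range of errors summarized in the model test.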

CONCLUSIONS
Based on the analysis and discussion, the following conclusions are obtained:
1. The homogeneity and heterogeneity tests show that all rainfall stations in the study area are homogeneous. The discordance test shows that Di at each location is less than the critical value, so no data are excluded. All rainfall station data in the Mount Merapi area are used in every step of the analysis.
2. The parameters of the moment method and the L-moment method are needed to determine the appropriate probability distribution and the design rainfall at each return period.
3. The regional L-moment ratios in the Mount Merapi area are τ2 = 0.203, τ3 = 0.166, and τ4 = 0.169; these values determine the weighted average plotted on the L-moment diagram.
4. The regional probability distribution suitable for the study area according to the L-moment method is the Generalized Logistic (GLO) distribution, with x(F) the growth factor of design rainfall and F the probability for the return period.
5. A higher standard deviation in the rainfall station data results in a higher design rainfall, because the historical rainfall records are widely scattered.
6. The model test showed a minimum RBias of 0.45%, a maximum RBias of 41.583%, a minimum RRSME of 0.45%, and a maximum RRSME of 71.01%. The stability of the L-moment method is shown by a model test error between 1.64% and 16.60%. The larger errors at longer return periods indicate that, although the L-moment method can reduce the bias that occurs in the moment method, it is less adequate at high return periods.