Impact of Sunshine Duration and Clearness Index on Diffuse Solar Radiation Estimation in Mountainous Climate

In this paper, measured solar radiation data were used to develop forty-three (43) empirical models for estimating monthly average diffuse solar radiation, using the clearness index, sunshine duration, and a combination of the two as predictors. The data cover a two-year period from May 2015 to April 2017 and were measured at Mehran University of Engineering and Technology, Hyderabad, Pakistan. Through a comprehensive statistical performance analysis, the 43 developed models were tested to identify the most accurate regression model for predicting the monthly mean daily diffuse solar radiation in Hyderabad, Pakistan. On the whole, model 42, a hybrid of sunshine duration and clearness index predictors of diffuse fraction, outperformed the remaining models proposed in this study. Model 42 was then compared with five models and five measured datasets of diffuse solar radiation available in the literature and the NASA database, using statistical indicators such as MBE, MPE, RMSE, RRMSE, R² and GPI. Through this analysis, model 42 was selected as the most appropriate model. The study concluded that the proposed hybrid model can serve as a baseline for the design of photovoltaic systems and can estimate the monthly mean daily diffuse solar radiation on a horizontal surface for Hyderabad, Pakistan and other locations with similar local climate conditions.

Citation: Nwokolo, S.C. and Otse, C.Q. (2019). Impact of Sunshine Duration and Clearness Index on Diffuse Solar Radiation Estimation in Mountainous Climate. Trends in Renewable Energy, 5, 307-332. DOI: 10.17737/tre.2019.5.3.00107


Introduction
Since the beginning of the 19th century, the exploitation of conventional fuels has increased steadily with industrialization and the modern lifestyle. This has resulted in health hazards, environmental pollution, disruption of ecosystems (including crop and animal diversity), increased global warming, and many other effects that drive the Earth towards a dark future. Thus, the world needs a smart energy source that is unlimited in reserve and can be used without major contributions to atmospheric pollution and the greenhouse effect.
As reported in the literature [1], the Earth is already showing numerous signs of global climate change as follows. NASA Goddard Institute for Space Studies estimating and understanding the diurnal fluctuations in multiple solar radiation parameters such as direct normal irradiance, diffuse horizontal irradiation, ground reflected solar radiation, evaporation and reference evapotranspiration.
Although governments, scientists, researchers and investors have explored solar energy as a renewable source using the various technologies mentioned above, its potential has not been fully utilized [3]. For example, the energy emitted by the sun is so large that if only 0.1% of the solar energy reaching the ground were converted into electricity at only 10% efficiency, the power output would be 17,300 GW, which is 7 of the global average instantaneous power consumption in 2012 [6][7][8].
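The arithmetic behind the 17,300 GW figure can be checked with a short sketch. The total solar power reaching the ground is not stated in the text; the value used below (about 1.73 × 10^17 W) is an assumption chosen to be consistent with the quoted result, and the function name is hypothetical.

```python
# Assumed total solar power reaching the ground, in watts.
# This value is NOT given in the paper; it is back-calculated from the
# quoted 17,300 GW and is close to commonly cited estimates.
TOTAL_SOLAR_POWER_W = 1.73e17


def converted_power_gw(total_w, captured_fraction, efficiency):
    """Electrical power (GW) from a captured fraction at a given efficiency."""
    return total_w * captured_fraction * efficiency / 1e9


# 0.1% of the incoming power, converted at 10% efficiency.
power_gw = converted_power_gw(TOTAL_SOLAR_POWER_W, 0.001, 0.10)
print(f"{power_gw:,.0f} GW")
```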
Although a significant amount of solar radiation reaches the Earth's surface, measurement of solar radiation and its components, such as diffuse solar radiation and direct normal irradiance, is limited because very few standard weather stations are equipped to measure them. Data for these parameters are routinely unavailable at the site of interest. However, other meteorological and atmospheric variables such as ambient temperature, cloud cover, rainfall and relative humidity are measured routinely at most weather stations across the globe because of their direct application in agricultural sciences and meteorology.
Owing to the cost, maintenance and expertise required for ground-based and satellite-derived measurement of solar radiation (especially in rural areas and developing countries), solar energy researchers have turned to mathematical models to predict solar radiation at a particular location. Mathematical modeling serves as an alternative way of generating data on solar radiation and its components without the instrumentation network that would otherwise be needed.
Some researchers have stressed that accurate determination of diffuse solar radiation is important in the design and performance analysis of solar energy projects, such as the design and sizing of photovoltaic systems as a future alternative energy source [9][10][11]. For instance, Khorasanizadeh et al. [12] reported that diffuse solar radiation contributes nearly 20% of the annual solar energy in Tabass, Iran.
It has been observed that at many locations across the globe, ground measurements of diffuse solar radiation are either scarce or absent, whereas ground measurements of global solar radiation and weather parameters such as sunshine hours and precipitation are often available because of their traditional use in the building and construction industries, agriculture, and meteorology. By applying mathematical correlations, diffuse solar radiation can be obtained wherever global solar radiation and other commonly measured meteorological parameters are available. For this reason, solar energy researchers have developed numerous empirical models, mostly for metropolitan cities, because most meteorological and weather stations are situated in these locations. Since the mid-20th century, solar energy researchers have developed various empirical models for estimating diffuse solar radiation from commonly measured variables. These variables include minimum and maximum temperature, sunshine hours and relative humidity [13][14][15][16][17][18]. Several researchers have also developed regression models for estimating the monthly mean daily diffuse solar radiation using the clearness index [4,6,9,13,19-31], the sunshine fraction [28,30,32-36], or a combination of the two [18,37-41]. Despotovic et al. [42] observed that empirical models using both the clearness index and sunshine duration offer better estimation of diffuse solar radiation in the five main climate zones of the Köppen-Geiger climate classification.

In spite of the vast number of studies on empirical models for estimating solar radiation across the globe, there is no recorded study for Hyderabad, Pakistan. The main objectives of this study were to (1) develop forty-three models for estimating diffuse solar radiation using sunshine duration, the clearness index, and both predictors together; (2) identify the best performing model using statistical indicators, namely mean bias error (MBE), mean percentage error (MPE), root mean square error (RMSE), relative root mean square error (RRMSE), coefficient of determination (R²) and global performance indicator (GPI); and (3) compare the selected best model with five models from the literature, and with five ground-measured diffuse solar radiation datasets from the literature together with satellite data obtained from the NASA database, for estimating diffuse solar radiation in Hyderabad, Pakistan.

Study Area
Hyderabad, lying along the Indus River, is the second largest city of Sindh province and the 8th largest city in Pakistan (Fig. 1). It has a relatively mountainous climate which is slightly more pleasant than that of other parts of Central Sindh throughout the year. Summer and winter are the two main seasons, while spring and autumn are very short. The period from mid-April to late June is the hottest time of the year, with temperatures as high as 48.5 °C. Winters are usually warm, around 25 °C during the daytime and often below 10 °C at night, and are the best time to visit the city. The highest temperature ever recorded in Hyderabad was 48.5 °C in 1991, while the lowest was 1 °C in 2012 [43,44].

In the present study, global solar radiation and its components (diffuse and direct solar radiation), together with other meteorological parameters, were measured by the Energy Sector Management Assistance Program of The World Bank Group at Mehran University of Engineering and Technology (M-UET), Hyderabad, Pakistan [45]. The measurements were performed over a period of two years (May 2015 - April 2017) to determine regional solar radiation and other meteorological variables. However, the monthly mean daily sunshine hours (used for the sunshine fraction in Equation 4) were based on the 30-year period (1981-1990) for the same geographical coordinates as M-UET and were obtained from the International Water Management Institute (IWMI) website [46]. The characteristics and specifications of the solar and meteorological instruments used are provided in Table 1. The raw data (10-minute summarization interval values) were post-processed to obtain daily values of global, diffuse and direct solar radiation and of other meteorological parameters such as air temperature, relative humidity, wind speed and wind direction. The daily data were then averaged over each month to calculate the monthly mean values. The measured monthly mean daily values thus obtained, together with the sunshine hours, are shown in Table 2.
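The post-processing described above (10-minute values to daily totals to monthly mean daily values) could look roughly like the following sketch. It assumes each raw record is a timestamp paired with a mean irradiance in W m⁻² over a 10-minute interval; the paper does not specify the raw format, and the function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_totals_mj(records, interval_s=600):
    """Sum 10-minute mean irradiance (W/m^2) into daily energy (MJ/m^2/day)."""
    totals = defaultdict(float)
    for timestamp, irradiance_w_m2 in records:
        # Energy in one interval: W/m^2 * s = J/m^2, then convert to MJ/m^2.
        totals[timestamp.date()] += irradiance_w_m2 * interval_s / 1e6
    return dict(totals)


def monthly_mean_daily(daily):
    """Average the daily totals within each (year, month)."""
    groups = defaultdict(list)
    for day, value in daily.items():
        groups[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Synthetic example (not measured data): two days with six hours of
# constant irradiance each, sampled every 10 minutes.
start = datetime(2015, 5, 1, 9, 0)
records = [(start + timedelta(minutes=10 * i), 500.0) for i in range(36)]
records += [(start + timedelta(days=1, minutes=10 * i), 250.0) for i in range(36)]

daily = daily_totals_mj(records)
monthly = monthly_mean_daily(daily)
print(monthly)
```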

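The fundamental quantities required for predicting diffuse solar radiation are the maximum possible sunshine hours ($S_o$) and the extraterrestrial solar radiation on a horizontal surface ($H_o$). Following Yaniktepe and Genc [47], $H_o$ is expressed as:

$$H_o = \frac{24}{\pi} I_{SC} \left[1 + 0.033\cos\left(\frac{360\,n}{365}\right)\right] \left[\cos\phi\cos\delta\sin\omega_s + \frac{\pi\,\omega_s}{180}\sin\phi\sin\delta\right] \quad (1)$$

where the mean sunrise hour angle ($\omega_s$) can be evaluated as:

$$\omega_s = \cos^{-1}\left(-\tan\phi\tan\delta\right) \quad (2)$$

The solar declination ($\delta$) is given by Yaniktepe and Genc [47] as:

$$\delta = 23.45\sin\left[\frac{360\,(284 + n)}{365}\right] \quad (3)$$

where $I_{SC}$ is the solar constant, $\phi$ is the latitude and $n$ is the day number of the year counted from 1 January. The maximum possible sunshine duration is calculated as:

$$S_o = \frac{2}{15}\,\omega_s \quad (4)$$

where other symbols retain their usual meaning.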
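As an illustration, the extraterrestrial radiation and maximum possible sunshine duration can be computed with the standard textbook relations. The sketch below assumes a solar constant of 1367 W m⁻² and a latitude of about 25.4° N for Hyderabad; neither value is stated in the text, and the function names are hypothetical.

```python
import math

I_SC = 1367.0  # assumed solar constant in W/m^2 (not stated in the paper)


def declination_deg(n):
    """Solar declination (degrees) for day number n (1 = 1 January)."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + n) / 365.0))


def sunset_hour_angle_deg(lat_deg, n):
    """Sunrise/sunset hour angle (degrees) at the given latitude."""
    phi = math.radians(lat_deg)
    delta = math.radians(declination_deg(n))
    return math.degrees(math.acos(-math.tan(phi) * math.tan(delta)))


def max_sunshine_hours(lat_deg, n):
    """Maximum possible sunshine duration S_o (hours)."""
    return 2.0 / 15.0 * sunset_hour_angle_deg(lat_deg, n)


def extraterrestrial_mj(lat_deg, n):
    """Daily extraterrestrial radiation H_o on a horizontal surface (MJ/m^2/day)."""
    phi = math.radians(lat_deg)
    delta = math.radians(declination_deg(n))
    ws = math.radians(sunset_hour_angle_deg(lat_deg, n))
    eccentricity = 1.0 + 0.033 * math.cos(math.radians(360.0 * n / 365.0))
    # 24 h * 3600 s/h converts W/m^2 to J/m^2 per day; ws is in radians here,
    # so the pi/180 factor of the degree-based formula is not needed.
    h0_joules = (24.0 * 3600.0 / math.pi) * I_SC * eccentricity * (
        math.cos(phi) * math.cos(delta) * math.sin(ws)
        + ws * math.sin(phi) * math.sin(delta)
    )
    return h0_joules / 1e6


LATITUDE_DEG = 25.4  # assumed latitude of the study site
print(max_sunshine_hours(LATITUDE_DEG, 172), extraterrestrial_mj(LATITUDE_DEG, 172))
```

The sunshine fraction then follows as the measured sunshine hours divided by `max_sunshine_hours`, and the clearness index as the measured global radiation divided by `extraterrestrial_mj`.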

Statistical Modeling
Estimation of the diffuse component of global solar radiation involves modeling the monthly mean diffuse fraction or diffuse coefficient as a function of the monthly mean sunshine fraction, the clearness index, or a combination of the two. Monthly means are used because monthly mean values of solar radiation fluctuate less from one month to another than daily values do [22]. Hence, monthly mean models show better estimation capacity [22].
Researchers have stressed that validating a model on the same dataset used for training can lead to only partially validated results [21,48]; thus, an independent validation dataset, whose patterns have not previously been used for training, is often employed. Because the measurement period in this study was short (2 years), the dataset from May 2015 to April 2016 was used to develop the models for the station, while the dataset from May 2016 to April 2017 was used to test them. This measure was taken to prevent overfitting and to determine the estimation capacity of the developed models.
In diffuse solar radiation estimation, an empirical model relates the diffuse fraction (Hd/H) or the diffuse coefficient (Hd/Ho) to other easily measurable parameters. Since the pioneering work of Liu and Jordan [49], which estimated the mean diffuse solar radiation, numerous solar energy researchers have proposed models that elaborate on the functional form of the Liu and Jordan model. The relationships representing diffuse radiation are classified into three main classes: (1) sunshine duration-based models, (2) clearness index-based models, and (3) sunshine and clearness index-based models [50]. Following this classification, diffuse fraction (Hd/H) and diffuse coefficient (Hd/Ho) correlations were used to estimate the diffuse solar radiation in Hyderabad, Pakistan.

Clearness Index-Based Models
According to Nwokolo and Ogbulieze [50], the monthly mean diffuse fraction (Hd/H) and the diffuse coefficient (Hd/Ho) are modeled as functions of the clearness index ($K_T = H/H_o$), such that

$$\frac{H_d}{H} = f(K_T), \qquad \frac{H_d}{H_o} = f(K_T) \quad (5)$$

The proposed models under this class are shown in Table 3a.
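A minimal sketch of how one model in this class might be fitted is given below. It assumes a linear form, Hd/H = a + b·K_T, fitted by ordinary least squares; the actual functional forms are those listed in Table 3a, and the data and coefficients here are synthetic and purely illustrative.

```python
def fit_linear(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def diffuse_radiation(a, b, clearness_index, global_radiation):
    """Estimate Hd from the fitted diffuse fraction model and measured H."""
    return (a + b * clearness_index) * global_radiation


# Synthetic monthly clearness indices and diffuse fractions (not measured data),
# generated from hypothetical coefficients a = 1.0 and b = -1.1.
kt = [0.45, 0.50, 0.55, 0.60, 0.65, 0.70]
diffuse_fraction = [1.0 - 1.1 * k for k in kt]

a, b = fit_linear(kt, diffuse_fraction)
print(a, b, diffuse_radiation(a, b, 0.60, 20.0))
```

In the setting of this study, the coefficients would be fitted on the May 2015 - April 2016 data and the resulting estimates compared with the May 2016 - April 2017 measurements.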

Sunshine Duration-Based Models
Numerous models use the sunshine fraction (S/So) to estimate the ratio of diffuse to global solar radiation, often expressed as the diffuse fraction (Hd/H), and the monthly average diffuse coefficient (Hd/Ho). Linear, quadratic, logarithmic and exponential functional forms are applied in this study, such that

$$\frac{H_d}{H} = f\left(\frac{S}{S_o}\right), \qquad \frac{H_d}{H_o} = f\left(\frac{S}{S_o}\right) \quad (6)$$

where S is the monthly mean daily hours of sunshine. The developed models under this class are shown in Table 3b.

Sunshine Duration and Clearness Index-Based Models
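Under this class, the monthly mean diffuse fraction (Hd/H) and the diffuse coefficient (Hd/Ho) are functions of both the clearness index ($K_T = H/H_o$) and the sunshine fraction, such that

$$\frac{H_d}{H} = f\left(K_T, \frac{S}{S_o}\right), \qquad \frac{H_d}{H_o} = f\left(K_T, \frac{S}{S_o}\right) \quad (7)$$

The developed models under this class are shown in Table 3c.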
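A two-predictor model of this kind can be fitted by multiple linear regression. The sketch below assumes the simplest hybrid form, Hd/H = a + b·K_T + c·(S/So), and solves the least-squares normal equations directly; the actual forms are those in Table 3c, and the data and coefficients are synthetic.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_hybrid(clearness, sunshine_fraction, target):
    """Least-squares fit of target = a + b * clearness + c * sunshine_fraction."""
    rows = [[1.0, k, s] for k, s in zip(clearness, sunshine_fraction)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic monthly values (not measured data) generated from the
# hypothetical coefficients a = 0.9, b = -0.8, c = -0.2.
kt = [0.45, 0.50, 0.55, 0.60, 0.65, 0.58]
sf = [0.60, 0.72, 0.65, 0.80, 0.70, 0.85]
frac = [0.9 - 0.8 * k - 0.2 * s for k, s in zip(kt, sf)]

coefficients = fit_hybrid(kt, sf, frac)
print(coefficients)
```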

Comparison of Models
To assess the accuracy of the estimated data against the measured data in this study, several statistical indicators are applied [22-24,42]. These metrics include mean bias error (MBE), mean percentage error (MPE), root mean square error (RMSE), relative root mean square error (RRMSE) and coefficient of determination (R²), as presented in equations (8)-(12).

To select the best model out of the 43 models developed in this study, a global performance indicator (GPI) was applied. The GPI, recently introduced by Bailek et al. [37] and Despotovic et al. [51], was applied to the 43 models to reveal the best one. The best model was then compared with five measured datasets and five diffuse solar radiation models reported in the literature for different locations across the globe, as presented in Tables 4-5. This was done to check whether the accuracy and applicability of the best model are limited to the site for which it was developed, since researchers have long reported that diffuse solar radiation depends on local climate and geographical location [22-24, 42, 50]. The GPI was used both for selecting the best performing model out of the 43 models and for the comparison with the literature because (1) it combines the advantages of the statistical indicators presented in equations (8)-(12), and (2) it yields a single value that reflects short- and long-term statistical performance together with the linearity of the models. Bailek et al. [37] stressed that the GPI is a relative, unbounded value and that a higher GPI implies better statistical performance and modeling quality. According to Despotovic et al. [51] and Jamil and Akhtar [22], the values of all selected statistical indicators need to be scaled so that they lie between 0 and 1. These scaled values are then subtracted from the median value of the corresponding scaled statistical indicator.

In the tables, Lat. represents latitude (positive north/south) in degrees, Lon. stands for longitude (positive east/west) in degrees, Ele. denotes elevation in meters, and the monthly mean diffuse solar radiation values obtained from the literature and NASA (same geographical information as the study site) are all in MJ m⁻² day⁻¹.
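The indicators and the GPI ranking can be sketched as follows, using the commonly used definitions of MBE, MPE, RMSE, RRMSE and R² (errors taken as calculated minus measured, so that a negative MBE indicates underestimation). The GPI follows the description above: each indicator is min-max scaled across models and subtracted from its median. The weight of -1 for R² (so that a higher R² raises the GPI) and the use of raw, signed MBE and MPE values are assumptions of this sketch; taking absolute values first is a common alternative.

```python
import math
from statistics import mean, median


def mbe(calc, meas):
    """Mean bias error; negative values indicate underestimation."""
    return mean(c - m for c, m in zip(calc, meas))


def mpe(calc, meas):
    """Mean percentage error (%)."""
    return mean((c - m) / m * 100.0 for c, m in zip(calc, meas))


def rmse(calc, meas):
    """Root mean square error."""
    return math.sqrt(mean((c - m) ** 2 for c, m in zip(calc, meas)))


def rrmse(calc, meas):
    """Relative RMSE (%), normalized by the mean of the measured values."""
    return rmse(calc, meas) / mean(meas) * 100.0


def r_squared(calc, meas):
    """Coefficient of determination."""
    meas_mean = mean(meas)
    ss_res = sum((m - c) ** 2 for c, m in zip(calc, meas))
    ss_tot = sum((m - meas_mean) ** 2 for m in meas)
    return 1.0 - ss_res / ss_tot


def gpi(table):
    """GPI per model; table holds one dict of indicator values per model."""
    scores = [0.0] * len(table)
    for name in ("MBE", "MPE", "RMSE", "RRMSE", "R2"):
        values = [row[name] for row in table]
        lo, hi = min(values), max(values)
        scaled = [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]
        med = median(scaled)
        weight = -1.0 if name == "R2" else 1.0  # assumed weighting
        for i, s in enumerate(scaled):
            scores[i] += weight * (med - s)
    return scores


# Hypothetical indicator values for two models (not taken from Table 6).
table = [
    {"MBE": 0.1, "MPE": 1.0, "RMSE": 0.2, "RRMSE": 10.0, "R2": 0.95},
    {"MBE": 0.3, "MPE": 3.0, "RMSE": 0.5, "RRMSE": 25.0, "R2": 0.80},
]
print(gpi(table))
```

The model with the highest GPI is ranked first.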

Results and Discussion
In this section, the measured data from the study site are compared with (1) the 43 models developed in this study, (2) five measured datasets obtained from the literature together with satellite data from the NASA database, and (3) five models obtained from the literature and the best performing model (model 42). The comparisons are presented in Figs. 2-5, and the corresponding statistical indicators are presented in Tables 6-7.

Analysis of Monthly and Yearly Solar Radiation Ground Observation
The monthly and annual averages of solar radiation and meteorological variables at the study site, together with the corresponding aggregate mean values over the two years of measurement, are presented in Table 2.
In Hyderabad, Pakistan, the summer season runs from March to August, whereas the winter season starts in October and ends in January. The winter months are characterized by overcast skies, heavy rain clouds, heavy fog, high relative humidity, low temperature, low wind speed and direction, and the highest ambient air pressure, as shown in Table 2. This gives rise to minimum values of 14.33 MJ m⁻² day⁻¹ and 14.75 MJ m⁻² day⁻¹ for global and direct solar radiation, respectively, in January, whereas the minimum diffuse solar radiation of 5.02 MJ m⁻² day⁻¹ was recorded in December, as shown in Table 2.
The summer months, in contrast, are characterized by clear skies, high temperature and wind speed, and low relative humidity and ambient air pressure. This results in high values of global solar radiation and its components. The maximum global solar radiation (24.84 MJ m⁻² day⁻¹) occurred in April, the maximum direct normal irradiance (20.78 MJ m⁻² day⁻¹) in February, and the maximum diffuse solar radiation (13.28 MJ m⁻² day⁻¹) in July. These results are comparable to those reported by Jamil and Akhtar [22][23][24] for the humid-subtropical climatic region of India.
The  Table 2. This implies that the site is more favorable for the installation of photovoltaic technology or flat-plate solar collectors, as the direct normal irradiance is below the threshold of 7200 MJ m⁻² year⁻¹ in January, June, July, August, November and December, as presented in Table 2. Concentrated solar power should therefore not be considered a favorable technology at this station.
In Hyderabad, the global solar radiation is significantly higher than the direct normal irradiance because aerosols and water vapor attenuate direct irradiance more strongly than the diffuse solar radiation component [54,62]. Hyderabad is situated by the coast and is strongly affected by high loads of sea salt, water-droplet aerosols and water vapor; in addition, the station is located in a university setting with thousands of people and many building structures. As a result, atmospheric particles absorb light beams of specific wavelengths. These particles convert electromagnetic radiation into heat and eventually into diffuse solar radiation. The direct normal irradiance ($H_b$) is therefore obtained from the relation:

$$H_b = \frac{H - H_d}{\cos z} \quad (14)$$

where $z$ is the zenith angle and other variables retain their usual meaning. From equation (14), it follows that as the diffuse solar radiation component increases, the direct normal irradiance decreases while the global solar radiation remains at the same level.
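The relation between the three components can be illustrated with the usual closure equation, in which the direct normal irradiance equals the difference between global and diffuse horizontal radiation divided by the cosine of the zenith angle. The sketch below assumes this standard form; the function name and the numerical values are hypothetical.

```python
import math


def direct_normal(global_horizontal, diffuse_horizontal, zenith_deg):
    """Direct normal irradiance from the closure relation (H - Hd) / cos(z)."""
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0.0:
        raise ValueError("zenith angle must be below 90 degrees")
    return (global_horizontal - diffuse_horizontal) / cos_z


# For a fixed global value, a larger diffuse component gives a smaller
# direct normal component (illustrative numbers only).
print(direct_normal(20.0, 8.0, 60.0))
print(direct_normal(20.0, 12.0, 60.0))
```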

Performance Evaluation
The models developed under the three classes (clearness index-based models, sunshine duration-based models, and models combining the two predictors), using either the diffuse fraction or the diffuse coefficient, are now evaluated, and the results are shown in Fig. 2.
In the clearness index-based class, diffuse fraction and diffuse coefficient models are developed with the clearness index as the only predictor. Fourteen models are developed, with each input predictor restricted to second order because higher-order equations increase complexity. The models are presented in Table 3a. The statistical indicators MBE, MPE, RMSE, RRMSE and R² have been evaluated for the models in this class, and the results are presented in Table 6. MBE values lie in the range of -0.01057 to -0.00994 MJ m⁻² day⁻¹, with the minimum value observed for model 6 (-0.01057 MJ m⁻² day⁻¹). As observed from Table 6, models 1-14 all recorded negative MBE values, which implies that the models underestimated the measured data. However, the underestimation is small because the magnitude of MBE for this class is close to zero. MPE values lie in the range of -0.08687% to -0.04886%, with the minimum value observed for model 13 (-0.08687%). RMSE values are small for all the models in this class, ranging from 0.177211 to 0.194205 MJ m⁻² day⁻¹, with model 6 registering the minimum value (0.177211 MJ m⁻² day⁻¹). Accordingly, RRMSE values lie in the range of 25.50065% to 28.04482%, with the minimum value (25.50065%) registered for model 6. The coefficient of determination (R²) has values in the range of 0.931-0.937, representing a good fit to the measured data. The highest value of R² was recorded for model 5 and model 13.
Under the sunshine duration-based class, diffuse fraction and diffuse coefficient models are developed with sunshine duration as the only predictor. Seventeen models are proposed, with each input predictor restricted to second order because higher-order equations increase complexity. The models proposed under this class are presented in Table 3b, and the results of the statistical indicators are presented in Table 4. MBE values lie in the range of -0.0043 to 0.003714 MJ m⁻² day⁻¹, with the minimum value of -0.0043 MJ m⁻² day⁻¹ registered for model 26. Models 15, 17, 22, 25, 28, 29 and 30 recorded positive MBE values, indicating overestimation, while the remaining models recorded negative values, implying underestimation. However, both the overestimation and the underestimation are small, since the MBE values for the proposed models are reasonably close to zero. This trend was also observed for models developed in the humid-subtropical climate region of India [22]. MPE values lie in the range of -0.07466% to 0.005151%, with the minimum value of -0.07466% recorded for model 29.

Under the sunshine duration and clearness index-based class, diffuse fraction and diffuse coefficient models are proposed with two predictors. Twelve models are developed, with each input predictor restricted to third order because higher-order equations increase complexity. The models are presented in Table 3c. From the statistical results, MBE, MPE and RMSE showed the same trends in sign as in the clearness index-based models, but with varying magnitude. A similar trend was reported for models developed in India [21], indicating that models employing the clearness index and those combining the clearness index and sunshine duration exhibit similar diurnal fluctuation.
In this section, five (5) empirical models often used by researchers in the literature, and four (4) measured datasets obtained from the literature together with data from the NASA website, were employed to check the applicability of the best of the 43 proposed models. The results of the statistical indicators evaluated in this section are presented in Table 7. MBE values lie in the range of 0.25264 to 0.063076 for the five models obtained from the literature, while the range of MBE for the measured data and the satellite data from the NASA website is -0.25168 to -0.08497. From the resulting metrics, only model 48 [52] has a positive value of MBE, which indicates overestimation, while the remaining models and measured data have negative values, indicating underestimation. However, both the overestimation and the underestimation are small, since the MBE values are reasonably close to zero. MPE values lie in the ranges -1.03147 to 2.799933 and 0.445126 to 2.248962, with minimum values of -1.03147 and 0.445126, for the models from the literature (models 44-48) and for the measured data obtained from the literature together with NASA data, respectively. Based on measured data obtained from the literature together with  [22], respectively. Also, the coefficient of determination (R²) recorded values in the ranges 0.891 to 0.956 and 0.202 to 0.939 for the models from the literature and for the measured data together with NASA data, respectively, with maximum values of 0.956 for model 47 [28] and 0.939 for the NASA data.

Global Performance Indicator and Ranking of Models
From the statistical indicators, it can be seen that different models from different classes, as well as the literature models, the measured data and the satellite data obtained from the NASA database, each perform best under different indicators. Thus, to resolve this variability and obtain a single ranking from the statistical analyses, the global performance indicator (GPI) is applied.
As presented in Table 6, the GPI values of the 43 models proposed in this study, classified as sunshine duration-based models, clearness index-based models and models combining both, are in the range of -2.2996 to -1.0208. The maximum GPI (-1.0208), and hence the top rank, was recorded for model 42, which is a hybrid of sunshine duration and clearness index predictors of the diffuse fraction. It can also be observed in Table 6 that the hybrid class of sunshine duration and clearness index predictors ranked best overall. This indicates that model 42 and the hybrid class are the best performing model and class, respectively, in Hyderabad, Pakistan. Similar results were obtained in the literature [6,20,24,38-42].
To achieve the objective of the study, the best model (model 42) selected using the GPI was compared with five (5) models and five (5) measured datasets obtained from the literature and the NASA database. This was done to check whether the accuracy and applicability of the best model are limited to the site for which it was developed, as researchers have reported that diffuse solar radiation and other components of global solar radiation depend on local climate and regional geography.
As presented in Table 7, the GPI values and rankings of the five (5) models and five (5) measured datasets from the literature, together with satellite data obtained from the NASA database, are compared with the best performing model for the Hyderabad, Pakistan station (i.e., model 42). Based on the statistical indicators, GPI and ranking of models (Table 7), model 42 can be employed for estimating diffuse solar radiation in Kerman, Iran and Aligarh, India, while the UTA and LR stations located in Chile require local calibration of model 42 to fit the measured data. In general, model 42 is best suited to fit the data at Aligarh, India, followed by Kerman, Iran. However, model 42 did not fit the calculated data from the NASA database, even though the data used to develop model 42 and the NASA data share the same geographical information. This could be attributed to the fact that NASA data are estimated, with errors of under 20%, from existing models in the literature developed for locations different from the study site. Also, comparing model 42 with the five models in the literature revealed that model 42 reproduced the values obtained from Iqbal's model [13] and Maduekwu & Chendo's model [52], while other models such as Liu & Jordan [49], Page [25], and Ibrahim [28] required local calibration to fit the values of model 42. On the whole, Iqbal's model is best suited to fit the values of the best performing model in Hyderabad (model 42), followed by Maduekwu & Chendo's model.

CONCLUSIONS
Knowledge of diffuse solar radiation is important for the design and development of solar energy systems. In this study, forty-three models were analyzed, which

ACKNOWLEDGMENTS
The author is grateful to the Energy Sector Management Assistance Program of the World Bank Group for providing the data used in this research. My thanks also go to all the authors cited in this paper, whose research works have made this research possible.

CONFLICTS OF INTEREST
The author declares that there is no conflict of interests regarding the publication of this paper.