Artificial Intelligence for Sustainable River Ecosystem Management.

Title: Prediction of Suspended Sediment Concentration in the Padma River Using Artificial Intelligence Approach

Problem Statement

Rivers carry suspended sediments along with their flow. These sediments are deposited at different places depending on the discharge and course of the river. Their deposition affects environmental health, agricultural activities and potable water sources, and it reduces the flow area, which hinders the movement of aquatic life and can ultimately change the river course. Variations in the suspended sediment concentration (SSC) of rivers lead to changes in river morphology. Alluvial river channels are also classified with respect to the total sediment load delivered to the channel. Different authorities require forecasts of suspended sediment in the river to operate various hydraulic structures properly.

According to Collins and Walling (2004), SSC is required to estimate and predict soil erosion and sediment transport caused by changes in land use patterns. Several hydrological variables, such as bed-form geometry, flow rate, friction factor and discharge, have been used to develop different models for predicting sediment concentration in rivers (Karim & Kennedy, 1990; Lopes & Ffolliott, 1994).

Direct analysis of the suspended sediment concentration (SSC) and the sediment rating curve (SRC) method are two widely used tools for obtaining the suspended sediment load. Although direct analysis is the most reliable method, it is costly, time-consuming and, in many cases, problematic at inaccessible sections, especially during severe storm events, and it cannot be conducted at all river gauge stations (Bayram et al., 2013).

On the other hand, the sediment rating curve (SRC), which uses regression analysis to establish a relationship between sediment and river discharge, is a conventional and standard means of predicting the suspended sediment load (SSL) (Talebi & Taşar, 2017). At the Mawa station on the Padma River, SSC sampling is infrequent, and this lack of continuous SSC information can result in substantial errors when SSC is estimated with the conventional SRC and RM methods. This motivates the use of artificial intelligence models for more accurate prediction (Kisi et al., 2012).
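As an illustration of the regression behind a sediment rating curve, the sketch below fits a power law SSC = a * Q^b by least squares on log-transformed values. The power-law form is a common choice for rating curves but is an assumption here, since the text does not state which functional form is used. The function names and the demonstration values are hypothetical and synthetic, not BWDB measurements.

```python
import math

def fit_rating_curve(discharge, ssc):
    """Fit SSC = a * Q**b by ordinary least squares in log-log space.

    Assumes all discharge and SSC values are strictly positive.
    """
    xs = [math.log(q) for q in discharge]
    ys = [math.log(c) for c in ssc]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space (exponent)
    a = math.exp(y_mean - b * x_mean)   # intercept back-transformed (coefficient)
    return a, b

def predict_ssc(a, b, discharge):
    """Evaluate the fitted rating curve for a list of discharges."""
    return [a * q ** b for q in discharge]

# Synthetic demonstration data generated from a known curve (illustrative only)
q_demo = [5000.0, 10000.0, 20000.0, 40000.0]
ssc_demo = [0.02 * q ** 0.9 for q in q_demo]
a_hat, b_hat = fit_rating_curve(q_demo, ssc_demo)
```

Because the curve is fitted to whatever samples exist, sparse or unrepresentative sampling carries directly into the fitted coefficients, which is the weakness the paragraph above describes.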

For this reason, researchers have turned to artificial intelligence (AI) and its subfields, machine learning (ML) and deep learning (DL). Over the last two decades, artificial intelligence techniques have been used to estimate and predict various hydrological phenomena (Tachi, 2017).

This study therefore aims to develop efficient ML (ANFIS and ANN) and DL (LSTM) models for predicting the SSC in the Padma River and to compare their results with one another. The models will be based on the available data for the input variables (discharge, water level and flow velocity) and the output variable (SSC).

Objectives of the Study

  1. To develop a model to predict the suspended sediment concentration using artificial intelligence modelling approaches.
  2. To identify performance indicators to evaluate the model performance and determine the model efficacy in the field of SSC prediction.

Activities performed to achieve the specific objectives

  • Data collection (including river discharge, water level, SSC, water flow velocity) for the site over the period from 1990 to 2010.
  • Data classification and preprocessing using the Grubbs test for multiple outliers.
  • To develop an SSC predictive model using the ElasticNet LR model (with combinations of inputs including river discharge, water level and flow velocity).
  • To develop an SSC predictive model using the ANN model (with combinations of inputs including river discharge, water level and flow velocity).
  • To develop an SSC predictive model using the ANFIS model (with combinations of inputs including river discharge, water level and flow velocity).
  • To develop an SSC predictive model using the LSTM model (with combinations of inputs including river discharge, water level and flow velocity).
  • To compare the accuracy of the individual models in order to select the most accurate predictive model, using different statistical measures (coefficient of determination R2, MAE, RMSE, RSE and NSE) computed between the measured and predicted data.
  • To compare the performance of the most accurate model from each approach with the others to select the best predictive model among all models using the same statistical measures.
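The outlier screening step in the list above can be sketched as follows. This is a minimal illustration of the two-sided Grubbs statistic, not the procedure actually used in the study. The function names are hypothetical, and the Student's t quantile needed for the critical value is assumed to be supplied by the caller (for example from a statistical table), because the Python standard library does not provide it.

```python
import math

def grubbs_statistic(values):
    """Return (G, index), where G = max|x_i - mean| / s and index locates that point."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator)
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    idx = max(range(n), key=lambda i: abs(values[i] - mean))
    return abs(values[idx] - mean) / s, idx

def grubbs_critical(n, t_quantile):
    """Two-sided Grubbs critical value.

    t_quantile is assumed to be the upper alpha/(2n) quantile of Student's t
    with n - 2 degrees of freedom, supplied by the caller.
    """
    t2 = t_quantile ** 2
    return (n - 1) / math.sqrt(n) * math.sqrt(t2 / (n - 2 + t2))

def most_extreme_is_outlier(values, t_quantile):
    """Flag the most extreme point if its G exceeds the critical value."""
    g, idx = grubbs_statistic(values)
    return g > grubbs_critical(len(values), t_quantile), idx

# Synthetic values for illustration only (not measured data)
demo = [10.0, 11.0, 9.0, 10.5, 9.5, 30.0]
g_demo, idx_demo = grubbs_statistic(demo)
```

To screen for multiple outliers, one possible approach is to remove a flagged point and repeat the test on the remaining values with an updated quantile until no point is flagged.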

Data and Methodology

Data Input

One of the main tasks in machine learning for prediction is to choose input variables that have an impact on the output. Finding a suitable model requires a good understanding of the underlying process and a statistical analysis of the inputs and outputs. Previous studies found that sediment is affected by hydro-morphological variables such as discharge, water level and flow velocity. In this study, four types of hydro-morphological data are collected from the Bangladesh Water Development Board (BWDB) through the appropriate acquisition procedure. The data types are:

  • Sediment (ppm)
  • Discharge (cumec)
  • Water level (m)
  • Flow Velocity (m/s)

The methodology of the study covers four artificial intelligence approaches that have been widely used by scientists and researchers in the literature to simulate suspended sediment concentration. The methods are divided into conventional ML and deep learning methods. In this study, we selected several baseline ML methods to compare with our proposed LSTM model. The methods are:

  • ElasticNet Linear Regression (ElasticNet LR)
  • Artificial Neural Networks (ANN)
  • Adaptive Neuro-Fuzzy Inference System (ANFIS)
  • Long Short-Term Memory (LSTM)

Model Performance Evaluation

In this part, five standard evaluation metrics are used to indicate how well the results of each model compare with the measured data.

  1. Coefficient of determination (R2)
  2. Mean Absolute Error (MAE)
  3. Root Mean Square Error (RMSE)
  4. Relative Squared Error (RSE)
  5. Nash–Sutcliffe Efficiency (NSE)

A larger value of R2 indicates better prediction performance. However, R2 alone is not enough to determine whether the predictions are biased. Other metrics, namely MAE, RMSE, RSE and NSE, are therefore used to further investigate whether a regression model provides a good fit to the data and to quantify the difference between the actual and predicted outcomes. Smaller values of MAE, RMSE and RSE indicate better prediction performance. NSE can range from minus infinity to one, where a value of 1.0 indicates a perfect match between modeled results and observed records.
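The five metrics can be computed as in the sketch below. The text does not give the formulas, so the definitions here are assumptions: R2 is taken as the squared Pearson correlation between observed and predicted values, RSE as the sum of squared errors divided by the sum of squared deviations of the observations from their mean, and NSE as one minus that ratio. The function name `evaluate` and the sample values are hypothetical.

```python
import math

def evaluate(observed, predicted):
    """Return R2, MAE, RMSE, RSE and NSE for paired observed/predicted values."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    pairs = list(zip(observed, predicted))
    sse = sum((o - p) ** 2 for o, p in pairs)            # sum of squared errors
    sst = sum((o - o_mean) ** 2 for o in observed)       # spread of observations
    spp = sum((p - p_mean) ** 2 for p in predicted)      # spread of predictions
    sop = sum((o - o_mean) * (p - p_mean) for o, p in pairs)
    return {
        "R2": sop ** 2 / (sst * spp),   # squared Pearson correlation (assumed definition)
        "MAE": sum(abs(o - p) for o, p in pairs) / n,
        "RMSE": math.sqrt(sse / n),
        "RSE": sse / sst,
        "NSE": 1.0 - sse / sst,
    }

# Synthetic SSC values in ppm, for illustration only
obs_demo = [100.0, 200.0, 300.0, 400.0]
scores = evaluate(obs_demo, [110.0, 190.0, 310.0, 390.0])
# Predictions shifted by a constant 50 ppm: perfectly correlated but biased
biased = evaluate(obs_demo, [150.0, 250.0, 350.0, 450.0])
```

Under these definitions the constant-offset case gives R2 = 1 while MAE is 50 and NSE drops to 0.8, which illustrates why R2 is reported together with the error metrics.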