\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[a4paper,left=2.9cm,right=2cm,top=2cm,bottom=2.25cm]{geometry}
\title{Response to reviewers' comments}
\date{}
\usepackage{setspace}
\usepackage{amssymb}
\usepackage{subcaption}
\usepackage{float}
\usepackage{xcolor}
\captionsetup{compatibility=false}
\usepackage{url}
\usepackage{tabularx}
\usepackage{lineno}
\usepackage{booktabs}
\usepackage{multirow}
\usepackage{graphicx}
\usepackage{natbib}
\begin{document}
\onehalfspacing
\maketitle
\textbf{Manuscript Title:} {Predicting Total Sediment Load Transport in Rivers using Regression Techniques, Extreme Learning and Deep Learning Models} \\
\textbf{Dear editors and reviewers:}
We appreciate your valuable comments and suggestions, which were very helpful in improving the quality of our work. We have revised our paper carefully according to the comments and suggestions provided to us. The summary of the changes is as follows: \\
\textbf{Response to Reviewer 2:}
\textbf{General Comment:} The paper focuses on the application of various machine learning models to predict total sediment load transport in rivers. The manuscript is clearly written. As far as I am concerned, the methods are not novel in themselves but have been culled from a few existing papers. I think that the authors would do well to introduce the specific novelty in this paper. My comments are as follows:\\
\textbf{Q2.1) The methods seem sound and well-explained. As far as I am concerned, the methods are not novel in themselves but have been culled from a few existing papers. I think the authors would do well to make their debt to these papers clearer (as well as the specific novelty introduced in this paper).}
\textbf{Answer:} We thank the reviewer for reviewing our manuscript and providing valuable comments for improving it. We first describe the research gap that we aim to address, which in turn establishes the novelty of this research.
\textbf{Research gap}: The following research gaps exist in the current studies on total sediment load prediction.
1) \textbf{Usage of limited data or a specific environment for predicting total sediment load:} A major limitation of the majority of existing studies on total sediment load prediction is that most of them have focused on utilizing ML algorithms to develop predictive models for only one hydrological station or river, or have used data collected from experiments performed on a laboratory flume. As the magnitude and behavior of the total sediment load differ for each river, the suitability of a given ML algorithm for total sediment load prediction may vary depending on the river under consideration. Some ML algorithms may perform well and produce good predictions of total sediment load for a hydrological station at a particular river, but may not perform well for a different river, due to variation in anthropogenic and natural factors. Further, most of the studies concentrate on either the bed load or the suspended sediment load, so models that are tuned to perform well on suspended sediment load might under-perform when applied to predict bed load, and vice versa.
2) \textbf{Empirical Equations:} Current studies on total sediment load transport employ empirical methods derived either from individual laboratory experiments or from specific rivers under a particular environment. Depending on the data collection conditions, the same empirical formula may yield completely different results when the underlying environment changes. As a result, it is challenging for a researcher to select an appropriate formula for a given river \citep{vanoni1975river, yang1996sediment}.
\begin{table}[!h]
\caption{The individual datasets used in this study, all compiled by Brownlie \citep{brownlie1981compilation}. The motivation of our research is to develop a generic predictive total sediment load model trained on data from both flume and field studies. The undertaken study includes 11 flume experiments and 6 field experiments, demonstrating a good mix of flume and field data.}
\label{tab:brownlie-studies}
\resizebox{\textwidth}{!}{%
\begin{tabular}{@{}lllll@{}}
\toprule
\textbf{Sl} & \textbf{Author/Agency} & \textbf{Type} & \textbf{River Body / Flume} & \textbf{Citation/Agency} \\
\midrule
1 & U. S. Bureau of Reclamation & Field Data & Colorado River, US & U. S. Bureau of Reclamation \\ \midrule
2 & Einstein, H. A. & Field Data & Mountain Creek & \cite{einstein1944bed} \\ \midrule
3 & Mahmood & Field Data & ACOP Canal & \cite{mahmood1979selected} \\ \midrule
4 & Milhous, R. T. & Field Data & Oak Creek & \cite{milhous1973sediment} \\ \midrule
5 & Nordin and Beverage & Field Data & Rio Grande River & \cite{nordin1965sediment} \\ \midrule
6 & Simons & Field Data & American Canal & \cite{simons1957theory} \\ \midrule
7 & Guy et al. & Lab Data & CSU Data (Experimental) & \cite{guy1966summary} \\ \midrule
8 & Einstein and Chien & Lab Data & Experimental Flume & \cite{einstein1955effect} \\ \midrule
9 & Gilbert and Murphy & Lab Data & Experimental Flume & \cite{gilbert1914transportation} \\ \midrule
10 & Meyer-Peter, E., and Muller, R. & Lab Data & Experimental Flume & \cite{meyer1948formulas} \\ \midrule
11 & Paintal, A. S. & Lab Data & Experimental Flume & \cite{paintal1971concept} \\ \midrule
12 & Satoh, S., et al. & Lab Data & Experimental Flume & \cite{satoh1958research} \\ \midrule
13 & Soni, J. P. & Lab Data & Experimental Flume & \cite{soni1980short} \\ \midrule
14 & Straub, L. G. & Lab Data & Experimental Flume & \cite{straub1954terminal}, \cite{straub1958experiments} \\ \midrule
15 & Taylor, B. D. & Lab Data & Experimental Flume & \cite{taylor1971temperature} \\ \midrule
16 & USWES & Lab Data & Experimental Flume & \begin{tabular}[c]{@{}l@{}}U.S. Army Corps of Engineers\\ Waterways Experiment Station\end{tabular} \\ \midrule
17 & Vanoni and Hwang & Lab Data & Experimental Flume & \cite{vanoui1967relation} \\ \bottomrule
\end{tabular}
}
\end{table}
3) \textbf{Effect of combinations of different characteristics affecting total sediment load:} Most of the existing studies consider all available variables for predicting the total sediment load. Total sediment load transport depends on three groups of characteristics, namely \textit{Sediment}, \textit{Geometry}, and \textit{Dynamic}. However, it is important to ascertain the effect of each of these characteristics individually, as well as in combination, on the prediction of total sediment load. This matters because many studies may contain measurements for only one of the \textit{Sediment}, \textit{Geometry}, and \textit{Dynamic} characteristics, or for a subset of them.
This creates a noteworthy research gap for our study, wherein lies the \textbf{novelty} of this research work. We analyze in depth whether there is a model or algorithm capable of producing accurate total sediment load predictions for a dataset comprising multiple different rivers and laboratory flume data. The present study contributes towards addressing this research gap through the development of predictive models for total sediment load based on the dataset compiled by Brownlie \citep{brownlie1981compilation}. Brownlie's dataset comprises observations for both laboratory and field conditions. In our study, we have used 17 unique datasets: 11 from lab data and 6 from field data (see Table \ref{tab:brownlie-studies}). Moreover, Brownlie's dataset comprises both bed load and suspended sediment load data. So, the eventual model tuned to Brownlie's dataset can be considered more robust, as it covers multiple different rivers, laboratory flume data, and measurements pertaining to bed load as well as suspended sediment load.
The usage of such a comprehensive dataset consisting of data from heterogeneous sources allows ML and DL methods to produce a robust model for the prediction of total sediment load. Thus, usage of Brownlie's dataset helps to overcome the research gaps identified in points 1 and 2 above. For addressing the research gap highlighted in point 3, we consider various combinations of the characteristics, i.e., \textit{Sediment}, \textit{Geometry}, and \textit{Dynamic}, that might affect the total sediment load transport. To the best of our knowledge, we are the first to analyze the impact of the various combinations of these characteristics on total sediment load transport. The combinations are as follows:
\begin{enumerate}
\item $C = f(Sediment) = f(d_{50}, C_g, G_s)$
\item $C = f(Geometry) = f(y, BF)$
\item $C = f(Dynamic) = f(Q, \tau_b, Sf)$
\item $C = f(Sediment, Geometry) = f(d_{50}, C_g, G_s, y, BF)$
\item $C = f(Geometry, Dynamic) = f(y, BF, Q, \tau_b, Sf)$
\item $C = f(Sediment, Dynamic) = f(d_{50}, C_g, G_s, Q, \tau_b, Sf)$
\item $C = f(Sediment, Geometry, Dynamic) = f(d_{50}, C_g, G_s, y, BF, Q, \tau_b, Sf)$
\end{enumerate}
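For concreteness, the seven combinations can be sketched as feature subsets; the identifier names below are illustrative only and are not taken from our actual code.

```python
# Illustrative sketch of the seven input-variable combinations.
# Variable names are assumptions for illustration, not our code.
SEDIMENT = ["d50", "Cg", "Gs"]   # sediment characteristics
GEOMETRY = ["y", "BF"]           # geometry characteristics
DYNAMIC = ["Q", "tau_b", "Sf"]   # dynamic characteristics

COMBINATIONS = {
    "sediment": SEDIMENT,
    "geometry": GEOMETRY,
    "dynamic": DYNAMIC,
    "sediment+geometry": SEDIMENT + GEOMETRY,
    "geometry+dynamic": GEOMETRY + DYNAMIC,
    "sediment+dynamic": SEDIMENT + DYNAMIC,
    "sediment+geometry+dynamic": SEDIMENT + GEOMETRY + DYNAMIC,
}
```

Each entry selects the columns fed to a model, so every algorithm can be trained once per combination under identical conditions.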
The specific novelties introduced in this work are:
\begin{enumerate}
\item Usage of Brownlie's dataset, which comprises multiple different rivers, laboratory flume data, and measurements pertaining to bed load as well as suspended sediment load, helping us to develop a more robust prediction model.
\item We consider the \textit{Sediment}, \textit{Geometry}, and \textit{Dynamic} characteristics and their various combinations, and analyze their impact on total sediment load transport prediction.
\item We compare and contrast deep learning and ML models. We have compared our proposed DNN model with extreme learning machine (ELM), support vector regression (SVR), linear regression (LR), and existing empirical equations. We conclude that DNN models are more effective than ML models and empirical models for total sediment load prediction.
\end{enumerate}
We agree with the reviewer's point that the methods used in the research are standard ones, but to the best of our knowledge, they have never been compared and contrasted in the manner we have done in this work. In addition, methods like ELM, SVR, and DNN are recommended for fitting data that exhibit non-linearity. Total sediment load transport is a quite complex phenomenon, as it involves a large number of variables (e.g., $d_{50}$, $C_g$, $G_s$, $Q$, $\tau_b$, $Sf$, etc.) that often have non-linear relationships between them, making our proposed methods a viable choice to fit the data and provide robust predictions. (Pages 3, 4)\\
\textbf{Q2.2) The dataset used in the paper was published in 1981 by Brownlie et al. The authors should use the most recent dataset to demonstrate the usefulness of the algorithm.}
\textbf{Answer:} We thank the reviewer for the critical comments. One of the major issues in the field of sediment transport is the lack of data sharing in the open literature, as well as the lack of comprehensive datasets. This prompted us to make use of the dataset compiled by Brownlie \citep{brownlie1981compilation}. Brownlie's dataset comprises observations for both laboratory and field conditions. Moreover, it comprises both bed load and suspended sediment load data. In our study, we have used 17 unique datasets: 11 from lab data and 6 from field data (see Table \ref{tab:brownlie-studies}). So, the eventual model tuned to Brownlie's dataset can be considered more robust, as it covers multiple different rivers, laboratory flume data, and measurements pertaining to bed load as well as suspended sediment load. The ranges of the variables used in this study are shown in Table \ref{Data_description}. It can be seen from these ranges that the dataset is comprehensive in nature and can aid in the development of a more robust model for the prediction of total sediment load. However, if the reviewer can provide us with pointers to a more recent dataset on sediment transport, we shall be happy to evaluate our models on that too.\\
\begin{table}[ht]
\centering
\caption{Statistical description of \citet{brownlie1981compilation}'s dataset used in this study. Notations: $y$ (flow depth), ${BF}$ (bed form of the channel), $Q$ (channel discharge), $Sf$ (friction/energy slope), ${\tau_b}$ (bed shear stress), $d_{50}$ (median diameter of sediment particles), $C_g$ (gradation coefficient of the sediment particles), $G_s$ (specific gravity) and $C$ (total sediment load).}
\label{Data_description}
\resizebox{\textwidth}{!}{%
\begin{tabular}{@{}lllllllllll@{}}
\toprule
\multirow{4}{*}{} & & $Q$ & $y$ & $\tau_b$ & $Sf$ & $d_{50}$ & $C_g$ & $G_s$ & $BF$ & $C$ \\ \midrule
& Mean & 15.0735 & 0.2989 & 4.4480 & 0.0045 & 0.0016 & 1.3973 & 2.6484 & 2.3340 & 3087.6026 \\ \cmidrule(l){2-11}
& Standard deviation & 66.1484 & 0.6200 & 5.2332 & 0.0048 & 0.0034 & 0.4057 & 0.0327 & 2.2096 & 5861.1493 \\ \cmidrule(l){2-11}
Overall set & Minimum & 0.0006 & 0.0079 & 0.2600 & 0 & 0.0002 & 1 & 2.2500 & 0 & 0.0010 \\ \cmidrule(l){2-11}
& Maximum & 486.8233 & 4.2977 & 51.0500 & 0.0275 & 0.0270 & 3.8500 & 2.6800 & 8 & 52238 \\ \cmidrule(l){2-11}
& Count & 1880 & 1880 & 1880 & 1880 & 1880 & 1880 & 1880 & 1880 & 1880 \\ \midrule
\multicolumn{11}{l}{} \\ \midrule
\multirow{3}{*}{}
& Mean & 14.5515 & 0.2943 & 4.516 & 0.0045 & 0.0016 & 1.3962 & 2.6483 & 2.3218 & 3158.5306 \\ \cmidrule(l){2-11}
& Standard deviation & 65.443 & 0.6105 & 5.2688 & 0.0048 & 0.0035 & 0.4083 & 0.0333 & 2.212 & 6054.0704 \\ \cmidrule(l){2-11}
Training set & Minimum & 0.0006 & 0.0079 & 0.26 & 0 & 0.0002 & 1 & 2.25 & 0 & 0.001 \\ \cmidrule(l){2-11}
& Maximum & 486.8233 & 4.2977 & 51.05 & 0.0275 & 0.027 & 3.85 & 2.68 & 8 & 52238 \\ \cmidrule(l){2-11}
& Count & 1504 & 1504 & 1504 & 1504 & 1504 & 1504 & 1504 & 1504 & 1504 \\ \midrule
\multicolumn{11}{l}{} \\ \midrule
\multirow{3}{*}{}
& Mean & 17.1618 & 0.3175 & 4.1758 & 0.0044 & 0.0014 & 1.4015 & 2.6489 & 2.383 & 2803.8908 \\ \cmidrule(l){2-11}
& Standard deviation & 68.9481 & 0.6572 & 5.0857 & 0.0047 & 0.0031 & 0.3957 & 0.0298 & 2.2023 & 5013.0451 \\ \cmidrule(l){2-11}
Testing set & Minimum & 0.0011 & 0.0133 & 0.3171 & 0.0001 & 0.0002 & 1 & 2.25 & 0 & 0.004 \\ \cmidrule(l){2-11}
& Maximum & 412.2933 & 3.6576 & 47.4954 & 0.0247 & 0.026 & 3.46 & 2.68 & 7 & 27200 \\ \cmidrule(l){2-11}
& Count & 376 & 376 & 376 & 376 & 376 & 376 & 376 & 376 & 376 \\ \bottomrule
\end{tabular}%
}
\end{table}
\textbf{Q2.3) In the DNN model, the number of neurons in each layer is chosen as 256, 256, 256, 64, and 512, respectively; would the model have better prediction results with other values?}
\textbf{Answer:} The comment is well taken. There are no fixed rules for choosing the number of hidden layers and the number of neurons; the choice is typically made by trial and error. We have used hyperparameter tuning to optimize the hyperparameters systematically and have experimented with various configurations of the network architecture. Infinitely many combinations of hyperparameters are possible, and it is intractable to test all configurations. By analyzing the bias-variance trade-off of the network on the available dataset, we arrived at this network configuration. Increasing the network capacity further leads to poor generalization on unseen data and poor prediction quality, as shown in Figure \ref{dnn_comb}.
In addition, we conducted 20 independent runs to inspect the reproducibility of the results. We tried different numbers of neurons in the DNN model for different input variable combinations, as shown in Table \ref{tab:results_neurons}. It can be seen from Table \ref{tab:results_neurons} that, among all the neuron combinations, the proposed structure performs the best: one input layer with seven neurons; five hidden layers with 256, 256, 256, 64, and 512 neurons, respectively; one output layer with one neuron; 100 epochs with batch size one; the \textit{`adam'} optimizer with learning rate 0.01; and the `relu' activation function. Hence, we chose the number of neurons in each hidden layer as 256, 256, 256, 64, and 512, respectively.\\
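The chosen architecture can be sketched as follows. This is an illustrative numpy forward pass through the 7-256-256-256-64-512-1 network with ReLU hidden activations; the weights here are random placeholders, and the sketch is not our training code.

```python
import numpy as np

# Illustrative forward pass through the chosen 7-256-256-256-64-512-1
# architecture with ReLU hidden activations and a linear output for
# regression. Weights are random here; this is not the trained model.
rng = np.random.default_rng(0)
sizes = [7, 256, 256, 256, 64, 512, 1]
weights = [rng.standard_normal((m, n)) * 0.01 for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ W + b, 0.0)   # ReLU hidden layers
    return x @ weights[-1] + biases[-1]  # linear output (one neuron)

y = forward(rng.standard_normal((4, 7)))  # batch of 4 samples -> shape (4, 1)
```

In our actual experiments the weights are learned with the `adam' optimizer (learning rate 0.01, batch size one, 100 epochs), as stated above.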
\begin{figure}[!h]
\centering
\includegraphics[scale=.45]{dnn_justification.pdf}
\caption{Number of hidden layers with respect to R$^2$ in the DNN model for all input variables. It can be seen that the model performs the best with 5 hidden layers.}
\label{dnn_comb}
\end{figure}
\begin{table}[ht]
\centering
\caption{Performance of the DNN model using different numbers of neurons when we consider a combination of all characteristics, i.e., Sediment, Geometry, Dynamic. \textbf{Bold} indicates the best performance.}
\label{tab:results_neurons}
\begin{tabular}{@{}lllllll@{}}
\toprule
Number of neurons & I$_d$ & PCC & MSE & NSE & RMSE & R$^2$ \\ \midrule
128-128-128-64-512-1 & 0.975 & 0.959 & 0.065 & 0.911 & 0.255 & 0.920 \\ \midrule
512-512-512-64-512-1 & 0.984 & 0.969 & 0.045 & 0.939 & 0.211 & 0.939 \\ \midrule
\begin{tabular}[c]{@{}l@{}}256-256-256-64-512-1\\ (Our proposed model)\end{tabular} & \textbf{0.989} & \textbf{0.979} & \textbf{0.042} & \textbf{0.958} & \textbf{0.204} & \textbf{0.959} \\ \bottomrule
\end{tabular}%
\end{table}
\textbf{Q2.4) Figure 6 in the paper is too blurry; the authors should have used a high resolution to accommodate the readers.}
\textbf{Answer:} We thank the reviewer for the valuable comments. We have incorporated your suggestion and made the figure sharper. (\textcolor{black}{Figure 6}, Page 21)\\
\clearpage
\textbf{Q2.5) The authors should give corresponding time complexity and space complexity analysis to demonstrate the usefulness of the model.}
\textbf{Answer:} We thank the reviewer for the valuable suggestion. The time complexity and space complexity of each model are shown in Table \ref{tab:complexities}, and we explain each of them in detail below.
\begin{table}[htpb]
\centering
\caption{Summary of time and space complexity of each model used in this study.}
\label{tab:complexities}
\begin{tabular}{@{}ccc@{}}
\toprule
Models & Time Complexity & Space Complexity \\ \midrule
DNN & $\mathcal{O}(ne(ij + jk + kp + pq + qr + rl))$ & $\mathcal{O}(z)$ \\
ELM & $\mathcal{O}(n(ij+jk))$ & $\mathcal{O}(z)$ \\
SVR & $\mathcal{O}(n^2)$ & $\mathcal{O}(k)$ \\
Linear Regression & $\mathcal{O}(n)$ & $\mathcal{O}(n)$ \\ \bottomrule
\end{tabular}%
\end{table}
\section{DNN}
\subsection{Time Complexity}
\begin{itemize}
\item \textbf{Time complexity of matrix multiplication:}
Training a DNN using back-propagation is usually implemented with matrices. The time complexity of the matrix multiplication $M_{ij}*M_{jk}$ is simply $\mathcal{O}(i*j*k)$.
\item \textbf{Feed-forward propagation algorithm:}
The feed-forward propagation algorithm is as follows. To go from layer $i$ to layer $j$, we compute
\begin{center}
$S_j=W_{ji}*Z_i$.
\end{center}
Then we apply the activation function
\begin{center}
$Z_j=f(S_j)$
\end{center}
where $S_{j}$ is the intermediate feature after applying the weights $W_{ji}$, and $Z_i$ is the activation of layer $i$. If the network has $N$ layers (including the input and output layers), this step runs $N-1$ times.\\
This study computes the time complexity of the forward pass for a deep neural network with 7 layers (including the input and output layers): $i$ neurons in the input layer, $j$, $k$, $p$, $q$, and $r$ neurons in the first to fifth hidden layers, respectively, and $l$ neurons in the output layer.
Since there are 7 layers, we need 6 matrices to represent the weights between these layers. Let us denote them by W$_{ji}$, W$_{kj}$, W$_{pk}$, W$_{qp}$, W$_{rq}$, and W$_{lr}$, where W$_{ji}$ is a matrix with $j$ rows and $i$ columns (W$_{ji}$ thus contains the weights going from layer $i$ to layer $j$).
Assume $t$ training examples. For propagating from layer $i$ to $j$, we first have
\begin{center}
$S_{jt}=W_{ji}*Z_{it}$
\end{center}
and this operation (i.e., matrix multiplication) has $\mathcal{O}(j*i*t)$ time complexity. Then we apply the activation function
\begin{center}
$Z_{jt}=f(S_{jt})$
\end{center}
and this has $\mathcal{O}(j*t)$ time complexity, because it is an element-wise operation.
So, in total, we have
\begin{center}
$\mathcal{O}(j*i*t+j*t)$ = $\mathcal{O}(j*t*(i+1))$ = $\mathcal{O}(j*i*t)$
\end{center}
Using the same logic, going $j \rightarrow k$ costs $\mathcal{O}(k*j*t)$, and so on for the remaining layers. In total, the time complexity of feed-forward propagation is
\begin{center}
$\mathcal{O}(i*j*t + j*k*t + k*p*t + p*q*t + q*r*t + r*l*t)$ = $\mathcal{O}(t(ij + jk + kp + pq + qr + rl))$
\end{center}
\item \textbf{Back-propagation algorithm:}
For brevity, we illustrate back-propagation on a chain of layers $i \rightarrow j \rightarrow k \rightarrow l$; the same reasoning extends to all seven layers. Starting from the output layer $l \rightarrow k$, we compute the error signal E$_{lt}$, a matrix containing the error signals for nodes at layer $l$
\begin{center}
$E_{lt}={f}'(S_{lt})\odot (Z_{lt}-O_{lt})$
\end{center}
where $\odot$ denotes element-wise multiplication and $O_{lt}$ contains the target outputs. Note that $E_{lt}$ has $l$ rows and $t$ columns: each column is the error signal for one training example.\\
We then compute the `delta weight', $D_{lk}\in \mathbb{R}^{l*k}$ (between layer $l$ and layer $k$)
\begin{center}
$D_{lk} = E_{lt} * Z_{tk}$
\end{center}
where $Z_{tk}$ is the transpose of $Z_{kt}$.\\
We then adjust the weights,
\begin{center}
$W_{lk} = W_{lk}-D_{lk}$
\end{center}
For $l\rightarrow k$, we thus have the time complexity $\mathcal{O}(lt+lt+ltk+lk) = \mathcal{O}(l*t*k)$.\\
Now, going back from $k \rightarrow j$, we first have
\begin{center}
$E_{kt} = f'(S_{kt})\odot (W_{kl}*E_{lt})$
\end{center}
where $W_{kl}$ is the transpose of $W_{lk}$. Then
\begin{center}
$D_{kj} = E_{kt}*Z_{tj}$
\end{center}
And then
\begin{center}
$W_{kj} = W_{kj}-D_{kj}$
\end{center}
For $k \rightarrow j$, we have the time complexity
\begin{center}
$\mathcal{O}(kt+klt+ktj+kj) = \mathcal{O}(k*t(l+j))$
\end{center}
And finally, for $j \rightarrow i$, we have $\mathcal{O}(j*t(k+i))$. In total, we have
\begin{center}
$\mathcal{O}(ltk+tk(l+j)+tj(k+i)) = \mathcal{O}(t*(lk+kj+ji))$
\end{center}
which is the same as for the feed-forward pass. Since they are the same, the total complexity of one epoch for this chain is
\begin{center}
$\mathcal{O}(t*(ij+jk+kl))$,
\end{center}
and for the full 7-layer network it is $\mathcal{O}(t*(ij + jk + kp + pq + qr + rl))$. This time complexity is then multiplied by the number of epochs, giving
\begin{center}
$\mathcal{O}(e*t*(ij + jk + kp + pq + qr + rl))$
\end{center}
\end{itemize}
Therefore,
\textit{Time Complexity} = $\mathcal{O}(ne(ij + jk + kp + pq + qr + rl))$\\
where $n$ is the number of data points (i.e., $t=n$), $e$ is the number of epochs, $i$ is the number of input layer neurons, $j$, $k$, $p$, $q$, and $r$ are the numbers of neurons in the first to fifth hidden layers, respectively, and $l$ is the number of neurons in the output layer.
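The derived bound can be checked numerically. The sketch below (illustrative helper, not from our code) counts the multiply-accumulate operations of one forward pass, which is exactly the $t(ij + jk + kp + pq + qr + rl)$ term above.

```python
# Multiply-accumulate count of one feed-forward pass for t examples,
# matching the O(t*(ij + jk + kp + pq + qr + rl)) term derived above.
# forward_macs is an illustrative helper, not part of our codebase.
def forward_macs(layer_sizes, t):
    return t * sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Our configuration: 7 inputs, hidden layers 256-256-256-64-512, 1 output.
macs = forward_macs([7, 256, 256, 256, 64, 512, 1], t=1)
```

Multiplying by the number of data points and epochs recovers the stated overall bound.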
\subsection{Space Complexity}
The space complexity of the DNN model depends on the number of inputs, because the number of inputs determines the number of weights in the first layer, which need to be stored in memory.\\
If gradient descent (GD) and back-propagation (BP) are used to train the model, then at each training iteration (i.e., a GD update) we need to store all the matrices that represent the parameters (weights) of the model, as well as the gradients and the learning rate (and other hyper-parameters). Let us denote the vector that contains all parameters of the model as $\theta \in \mathbb{R}^z$, so it has $z$ components. The gradient vector has the same dimensionality as $\theta$, so we need to store at least $2z+1$ values, and $2z+1$ = $\mathcal{O}(z)$.\\
\textit{Space Complexity} = $\mathcal{O}(z)$, where $z$ is the total number of parameters (weights and biases).
In this study, we have $n=1880$ data points, $e=100$ epochs, $i=7$ input layer neurons, $j=256$ neurons in the first hidden layer, $k=256$ neurons in the second hidden layer, $p=256$ neurons in the third hidden layer, $q=64$ neurons in the fourth hidden layer, $r=512$ neurons in the fifth hidden layer, and $l=1$ neuron in the output layer, as we need to predict only one variable, i.e., total sediment load transport.
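Interpreting $z$ as the number of stored parameters (weights and biases), the concrete size for this configuration can be sketched as follows (an illustrative count, not output of our code):

```python
# Parameter (weights + biases) count for the 7-256-256-256-64-512-1
# network; this is the z that dominates the O(z) space bound.
sizes = [7, 256, 256, 256, 64, 512, 1]
z = sum(a * b + b for a, b in zip(sizes, sizes[1:]))  # weights + biases
```

Training with gradient descent stores roughly $2z$ values (parameters plus gradients), which remains $\mathcal{O}(z)$.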
\section{ELM}
\subsection{Time Complexity}
\textit{Time Complexity} = $\mathcal{O}(n(ij+jk))$
\subsection{Space Complexity}
\textit{Space Complexity} = $\mathcal{O}(z)$\\
where $n$ is the number of observations, $i$ is the number of input layer neurons, $j$ is the number of neurons in the hidden layer, $k$ is the number of neurons in the output layer, and $z$ is the total number of parameters.
This study uses $n$ = 1880 observations, $i$ = 7 input neurons, $j$ = 90 hidden neurons, and $k$ = 1 neuron at the output layer.
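A minimal ELM sketch on synthetic data (an assumed textbook form, not our implementation) makes the cost structure explicit: the input-to-hidden weights are random and fixed, and only the hidden-to-output weights are solved by least squares.

```python
import numpy as np

# Minimal ELM sketch on synthetic data (assumed form, not our code):
# random fixed input->hidden weights; output weights by least squares.
rng = np.random.default_rng(1)
n, i, j, k = 200, 7, 90, 1           # observations, input, hidden, output
X = rng.standard_normal((n, i))      # synthetic inputs
Y = rng.standard_normal((n, k))      # synthetic targets

W_in = rng.standard_normal((i, j))   # random, never trained
b = rng.standard_normal(j)
H = np.tanh(X @ W_in + b)            # hidden activations: the O(n*i*j) term
W_out, *_ = np.linalg.lstsq(H, Y, rcond=None)  # hidden->output weights

Y_hat = H @ W_out                    # predictions: the O(n*j*k) term
```

Because only `W_out` is fitted, training reduces to a single linear solve, which is why ELM is far cheaper than iterative back-propagation.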
\section{SVR}
\subsection{Time Complexity}
\textit{Time Complexity} = $\mathcal{O}(n^2)$
\subsection{Space Complexity}
\textit{Space Complexity} = $\mathcal{O}(k)$\\
where $n$ is the number of data points and $k$ is the number of support vectors.
\section{Linear Regression}
\subsection{Time Complexity}
The linear regression coefficients are computed as
\begin{center}
$A = (X^{T}X)^{-1}X^{T}Y$
\end{center}
If $X$ is an $(n \times k)$ matrix and $Y$ is an $(n \times 1)$ vector, then
\begin{enumerate}
\item $(X^{T}X)$ takes $\mathcal{O}(n*k^2)$ time and produces a $(k \times k)$ matrix.
\item The matrix inversion of a $(k \times k)$ matrix takes $\mathcal{O}(k^3)$ time.
\item $(X^{T}Y)$ takes $\mathcal{O}(n*k)$ time and produces a $(k \times 1)$ vector.
\item The final matrix multiplication of the $(k \times k)$ and $(k \times 1)$ matrices takes $\mathcal{O}(k^2)$ time.
\end{enumerate}
So, \textit{Time Complexity} = $\mathcal{O}(k^2*n + k^3 + k*n + k^2)$ = $\mathcal{O}(k^2*n + k^3)$ = $\mathcal{O}(k^2(n+k))$.
% \textit{Testing Time Complexity} = $\mathcal{O}(n)$
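The four steps above can be sketched with numpy on synthetic data (illustrative only; with noiseless linear data the known coefficients are recovered exactly):

```python
import numpy as np

# Normal-equation solve on synthetic data, following the four steps above.
rng = np.random.default_rng(2)
n, k = 100, 8
X = rng.standard_normal((n, k))
true_coef = np.arange(1.0, k + 1.0)  # known coefficients for checking
Y = X @ true_coef                    # noiseless linear target

XtX = X.T @ X                        # step 1: O(n*k^2), gives (k x k)
XtX_inv = np.linalg.inv(XtX)         # step 2: O(k^3)
XtY = X.T @ Y                        # step 3: O(n*k), gives (k x 1)
A = XtX_inv @ XtY                    # step 4: O(k^2); A recovers true_coef
```

In practice a pseudoinverse or QR-based solver is preferred over an explicit inverse for numerical stability, but the asymptotic cost is the same.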
\subsection{Space Complexity}
In linear regression, after training the model we obtain $W$ and $b$, where $W$ is a vector of dimension $k$. Given any new point, we have to perform
\begin{center}
Y = $W^T * X + b$
\end{center}
to predict the new value of $Y$ and check the accuracy of the model. As $b$ is independent of the input size, the space required to store $b$ is $\mathcal{O}(1)$, and evaluating $W^T * X + b$ takes $\mathcal{O}(k)$ space. Since $W$ is a vector of size $k$, its space complexity is $\mathcal{O}(k)$. \\
Therefore, \textit{Space Complexity} = $\mathcal{O}(k)$ for the fitted model, and $\mathcal{O}(nk)$ during training, when the design matrix $X$ must be held in memory. \\
\textbf{Q2.6) The workflow of the proposed model is more complicated and difficult to repeat. Can the authors make the code open source to help more learners?}
\textbf{Answer:} Thanks for this constructive suggestion. We have incorporated your suggestion, and the workflow of the proposed model is now much simpler to understand. The workflow is shown in this reviewer response so that the reviewer can check the new image here itself (see Figure \ref{fig:workflow}). We would love to make our code open source for the learning community; after the review process is complete, we shall do so for the benefit of the community. The figure can be found on Page 19 in the manuscript.\\
\begin{figure}[!t]
\centering
\includegraphics[scale=.68]{architecture_sediment_transport.pdf}
\caption{The overall flow of the proposed model for the prediction of total sediment load transport.}
\label{fig:workflow}
\end{figure}
\textbf{Q2.7) I would suggest adding a paragraph about the limitation of the proposed methodology in the Conclusion part. Also, future works need to be discussed in detail.}
\textbf{Answer:} We appreciate the reviewer's comment.
\textbf{Limitations:} Total sediment load transport is a quite complex phenomenon, as it involves a large number of variables (e.g., $y$ (flow depth), $BF$ (bed form of the channel), $Q$ (channel discharge), $Sf$ (friction slope), $\tau_b$ (bed shear stress), $d_{50}$ (median diameter of sediment particles), C$_g$ (gradation coefficient), $G_s$ (specific gravity), etc.) that often have non-linear relationships between them. However, it is certainly possible that parameters like the Froude number, viscosity, and water surface width might have an impact on the total sediment load prediction. Since Brownlie's dataset does not contain these parameters, our models are not tuned to incorporate their effect. In general, ML/DL models perform well on the data they are trained on. In our study, we have used Brownlie's dataset, which is a comprehensive dataset comprising both bed load and suspended sediment load data, in addition to flume and field data from various researchers. Despite this, it is possible that some datasets would have ranges of variables outside the range of values considered in this study, or a different data distribution. For such datasets, the proposed model may not perform well. However, this issue exists for all ML/DL models, which assume that the target data lie within the range of the training sample or follow a similar data distribution.
\textbf{Future Work:} Future work requires testing these predictions at an even larger field scale, investigating a larger range of input variables (e.g., $y$ (flow depth), $BF$ (bed form of the channel), $Q$ (channel discharge), $Sf$ (friction slope), $\tau_b$ (bed shear stress), $d_{50}$ (median diameter of sediment particles), C$_g$ (gradation coefficient), $G_s$ (specific gravity), etc.) in order to test the efficacy of the proposed model. Also, we wish to set up an in-house laboratory flume, undertake different sets of experiments, and collect the data in order to further test the performance of the proposed models. Although the current study uses ($d_{50}$, $C_g$, $G_s$, $Q$, $\tau_b$, $Sf$, etc.) as input variables for prediction, variables like the Froude number and viscosity may have an impact on total sediment load prediction. Thus, we would like to explore dataset(s) that include these variables, so that their effect on the total sediment load prediction can be ascertained. We also aim to build a web-based tool that can be used by researchers to predict total sediment load using various ML/DL techniques. (Lines 465-485, Page 13)\\
\textbf{Response to Reviewer 3:}
\textbf{General Comment:} The authors presented a study on total sediment load transport in rivers, which is challenging and complex. The paper provides a fluent read. Moreover, the paper is technically sound. With respect, I would like to figure out some points to improve the quality of the paper.\\
\textbf{Q3.1) The paper repeats many abbreviations such as PCC and NSE. Where an abbreviation is used for the first time, its full name should be given, and only once.}
\textbf{Answer:} The comment is well taken. We have corrected it.\\
- \textbf{\textbf{Q3.2)} The motivation of the study should be given in
- Introduction section.}
- \textbf{Answer:} We thank the reviewer for the careful review. In water resource planning and management, total sediment transport poses significant challenges, and the prediction of total sediment load transport is of significant importance in the area of hydraulics. The total sediment load varies as the underlying environment or the prevailing conditions change. Predicting total sediment load transport is a complex phenomenon as it involves a large number of variables (e.g., $y$ (flow depth), $BF$ (bed form of the channel), $Q$ (channel discharge), $Sf$ (friction slope), $\tau_b$ (bed shear stress), $d_{50}$ (median diameter of sediment particles), $C_g$ (gradation coefficient), $G_s$ (specific gravity), etc.), which are often non-linear and multi-dimensional in nature. Due to this complexity, non-linearity, and multi-dimensionality, the system is difficult to analyse analytically. In addition, these variables tend to take on values that are specific to individual field and flume investigations, so an assumption made for one particular environment may not hold true in another, making the prediction erroneous and unusable. The primary motivation of this study is to check the applicability of advanced ML and DNN models for the prediction of total sediment load transport, so that more accurate and generic sediment transport models can be built. The same can be found on Line 38-43, Page 1 and Line 50-54, Page 2.\\
- \textbf{\textbf{Q3.3)} Authors used linear regression, support vector
- regression, extreme learning machine, and DNN-based models
- for the prediction of the total sediment load transport. Are the sub-sets
- used in training and testing these models the same? This should be
- emphasized. It is also recommended to perform 5-fold or 10-fold
- cross-validation experiments.}
- \textbf{Answer:} Thank you for pointing this out. Yes, the sub-sets used
- for training and testing these models are the same. We have performed
- 5-fold and 10-fold cross-validation experiments for all models, and their
- results are shown in Table \ref{tab:results_with_rank}.
- From Table \ref{tab:results_with_rank}, we can see that the proposed DNN
- performs best without cross-validation. A similar observation can be made
- for the extreme learning machine (ELM) method. However, in the case
- of support vector regression (SVR) and linear regression (LR), 10-fold
- cross-validation performs best. Both SVR and LR are under-performing
- methods, though: even when they perform best with 10-fold
- cross-validation, their performance remains poor compared to the results
- obtained with the proposed DNN method. In our case, we show the results
- for the combination of all three characteristics only, i.e.,
- \textit{Sediment},
- \textit{Geometry},
- and \textit{Dynamic}, as this combination performed best for all
- methods. The other six combinations are omitted as they perform poorly
- compared to the combination of all three characteristics.\\
- \begin{table}[ht]
- \centering
- \caption{Comparison of all models with and without cross-validation.
- Rank 1 represents the best result, while rank 3 indicates the worst.}
- \label{tab:results_with_rank}
- \resizebox{\textwidth}{!}{%
- \begin{tabular}{@{}cclllllllllllllll@{}}
- \toprule
- Models &
- \multicolumn{3}{c}{DNN} &
- &
- \multicolumn{3}{c}{ELM} &
- &
- \multicolumn{3}{c}{SVR} &
- &
- \multicolumn{3}{c}{LR} \\ \midrule
- \multicolumn{1}{l}{Fold} &
- \multicolumn{1}{l}{5} &
- 10 &
- No &
- &
- 5 &
- 10 &
- No &
- &
- 5 &
- 10 &
- No &
- &
- 5 &
- 10 &
- No \\ \midrule
- \begin{tabular}[c]{@{}c@{}}I$_d$\\ Rank\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.946\\3 \end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.985\\2 \end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.989\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.933\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.963\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.970\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.961\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.981\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.967\\ 2\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.925\\3 \end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.964\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.927\\ 2\end{tabular}&\\ \midrule
- \begin{tabular}[c]{@{}c@{}}PCC\\ Rank\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.909\\ 3\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.977\\ 2\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.979\\1 \end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.874\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.932\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.943\\1 \end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.926\\3 \end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.964\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.943\\ 2\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.860\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.931\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.869\\ 2\end{tabular}& \\ \midrule
- \begin{tabular}[c]{@{}c@{}}MSE\\ Rank\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.141\\3 \end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.051\\2 \end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.042\\1 \end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.178\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.124\\2 \end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.111\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.098\\2 \end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.060\\1 \end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.115\\ 3\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.191\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.116\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.245\\ 3\end{tabular}& \\ \midrule
- \begin{tabular}[c]{@{}c@{}}NSE\\ Rank\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.790\\3 \end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.934\\ 2\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.958\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.735\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.841\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.889\\1 \end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.855\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.923\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.885\\ 2\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.715\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.852\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.755\\ 2\end{tabular}& \\ \midrule
- \begin{tabular}[c]{@{}c@{}}RMSE\\ Rank\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.376\\3 \end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.226\\ 2\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.204\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.422\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.353\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.333\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.312\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.245\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.339\\ 3\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.437\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.341\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.495\\ 3\end{tabular}& \\ \midrule
- \begin{tabular}[c]{@{}c@{}}R$^2$\\ Rank\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.827\\ 3\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.955\\ 2\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}0.959\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.765\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.869\\ 2\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.889\\ 1\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.857\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.929\\ 1\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.889\\ 2\end{tabular}&
- &
- \begin{tabular}[c]{@{}c@{}}0.740\\ 3\end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.867\\1 \end{tabular}&
- \begin{tabular}[c]{@{}c@{}}0.755\\ 2\end{tabular}& \\ \midrule
- Average &
- \multicolumn{1}{c}{3} &
- \multicolumn{1}{c}{2} &
- \multicolumn{1}{c}{\textbf{1}} &
- \multicolumn{1}{c}{} &
- \multicolumn{1}{c}{3} &
- \multicolumn{1}{c}{2} &
- \multicolumn{1}{c}{\textbf{1}} &
- &
- \multicolumn{1}{c}{3} &
- \multicolumn{1}{c}{\textbf{1}} &
- \multicolumn{1}{c}{2} &
- &
- \multicolumn{1}{c}{3} &
- \multicolumn{1}{c}{\textbf{1}} &
- \multicolumn{1}{c}{2} \\ \bottomrule
- \end{tabular}%
- }
- \end{table}
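For completeness, the $k$-fold evaluation described above (fold-averaged RMSE and NSE) can be sketched as follows. This is a minimal illustration using synthetic data and an ordinary least-squares stand-in model, not the actual Brownlie dataset or the tuned DNN/ELM/SVR models of the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (placeholder for the actual Brownlie dataset).
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0.0, 0.1, size=200)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 indicates a perfect fit."""
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kfold_cv(X, y, k):
    """Average RMSE and NSE of an ordinary least-squares fit over k folds."""
    folds = np.array_split(rng.permutation(len(y)), k)
    rmses, nses = [], []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit y = X w + b by least squares on the training folds.
        A = np.column_stack([X[train], np.ones(len(train))])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.column_stack([X[test], np.ones(len(test))]) @ coef
        rmses.append(np.sqrt(np.mean((y[test] - pred) ** 2)))
        nses.append(nse(y[test], pred))
    return float(np.mean(rmses)), float(np.mean(nses))

for k in (5, 10):
    rmse_k, nse_k = kfold_cv(X, y, k)
    print(f"{k}-fold: RMSE = {rmse_k:.3f}, NSE = {nse_k:.3f}")
```

The choice between 5 and 10 folds mainly trades bias against variance of the performance estimate; the sketch simply reports the fold-averaged metrics for both settings.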
- \textbf{\textbf{Q3.4)} Furthermore, what are the limitations of this study? Clarifying the limitations of a study allows the readers to understand better under which conditions the results should be interpreted.}
- \textbf{Answer:} Thank you for the feedback and for pointing this out.
- \textbf{Limitations:} Total sediment load prediction is a complex
- phenomenon as it involves a large number of independent variables
- (e.g.,
- $y$ (flow depth), $BF$ (bed form of the channel), $Q$ (channel
- discharge), $Sf$ (friction slope), $\tau_b$ (bed shear stress),
- $d_{50}$ (median diameter of sediment particles), $C_g$ (gradation
- coefficient), $G_s$ (specific gravity), etc.). However,
- it is certainly possible that parameters such as the Froude number and
- viscosity might have an impact on total sediment load prediction. Since
- Brownlie's dataset does not include these parameters, our models are not
- tuned to incorporate their effect. In general, ML/DL models perform
- well on data similar to that on which they are trained. In our study, we
- have used Brownlie's dataset, a comprehensive dataset comprising both
- bed load and suspended sediment load data, in addition to flume
- and field data from various researchers. Despite the comprehensiveness
- of this dataset, it is possible that some datasets have ranges of
- variables outside the range of values considered in this
- study or follow a different data distribution. For such datasets, the
- model may not perform well. However, this issue affects all ML/DL
- models, which assume that the target data lie within the range of the
- training sample or follow a distribution similar to that of the
- training sample. (Line 465-477, Page 13)\\
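The range-of-training-sample assumption noted above can be checked operationally before applying a trained model. The snippet below is a hypothetical illustration (the variable names and data are placeholders, not the study's actual pipeline): it flags test samples whose feature values fall outside the per-feature range seen during training, i.e., inputs for which a prediction would be an extrapolation:

```python
import numpy as np

def out_of_range_mask(X_train, X_test):
    """True for each test row with any feature outside the training range."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return np.any((X_test < lo) | (X_test > hi), axis=1)

# Placeholder data: two features (illustrative values only).
X_train = np.array([[0.2, 10.0], [0.8, 25.0], [0.5, 18.0]])
X_test = np.array([[0.4, 15.0],   # inside the training ranges
                   [0.9, 12.0],   # first feature above the training maximum
                   [0.3, 30.0]])  # second feature above the training maximum
print(out_of_range_mask(X_train, X_test))  # [False  True  True]
```

Flagged samples are not necessarily mispredicted, but their predictions should be interpreted with the caution described above.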
- \textbf{Response to Reviewer 4:}
- \textbf{General Comment:} The manuscript ``Predicting Total Sediment Load Transport in Rivers using Regression Techniques, Extreme Learning and Deep Learning Models'' applied four machine learning algorithms to predict the total sediment load transport and compared their performance with the empirical models in previous studies. Acceptable accuracy was obtained based on the systematic comparison between models or selected variables. This manuscript is recommended to be accepted after minor revisions.\\
- \textbf{\textbf{Q4.1)} Introduction: The first paragraph: ``This study
- focuses on the prediction of total sediment load transport, which involves
- a combination of bed load as well as suspended load.'' We typically explain
- the purpose of the study in the last paragraph, after describing the
- background of the study step by step in preceding paragraphs of the
- introduction.}
- \textbf{Answer:}
- We thank the reviewer for the critical comments and review. We have now re-arranged the introduction section to incorporate the changes as suggested by the reviewer. (Line 160-171, Page 4)\\
- \textbf{\textbf{Q4.2)} Introduction: The first paragraph: ``Sediment particles found on … those sediments that that are transported …'' ``that that''?}
- \textbf{Answer:} We thank the reviewer for pointing this out. Now, we have
- corrected it. (Line 35, Page 1)\\
- \textbf{\textbf{Q4.3)} Introduction: 4th to 12th paragraph: the review of the applications of machine learning in predicting sediment load transport is too lengthy and unfocused. While it is necessary to overview the application scenarios, types, and performance of the applied machine learning in literature, it is also necessary to compare and comment on the variables used in these studies and their performance, as the authors described these variables in 2nd paragraph, and stated that one of the innovations of this paper is the comparison of these variables in the last paragraph.}
- \textbf{Answer:} We thank the reviewer for the suggestion. As suggested,
- we have revised the introduction section and made it more succinct.
- With regards to comparing the variables used in this study with those in the literature,
- a direct performance
- comparison of the proposed methodology with existing studies is not possible. The reason is that, to the best of the
- authors' knowledge, no existing study has used the
- exact dataset used in this study. Researchers have used subsets of
- Brownlie's dataset, so a vis-\`a-vis comparison is not possible.
- In Table \ref{tab:aire:comparison}, we highlight a few studies that have worked on the prediction of sediment transport. As shown in Table \ref{tab:aire:comparison}, existing studies focus on data collected from a river or flume in a specific environment. In addition, they work on either bed load or suspended load. They also use all the available variables for prediction and do not analyze the effect of Sediment, Geometry, and Dynamic characteristics on prediction. Thus, the analysis using combinations of Sediment, Geometry, and Dynamic characteristics provides an innovative approach for determining their usefulness in the prediction of total sediment load. (Pages 3,4)\\
- \begin{table}[!t]
- \centering
- \caption{Comparison of the proposed method with previous studies. All of the studies here have worked explicitly on rivers or flumes, whereas the Brownlie dataset considered in this work comprises 11 flume
- experiments and 6 field experiments, a good mix of flume and
- field data. In addition, the datasets used in these studies contain only a few points compared to our dataset of 1880 points. None of the studies here analyzes the effect of variables grouped into Sediment, Dynamic, and Geometry characteristics. Notation: flow discharge ($Q$ $(m^3/s)$), flow velocity ($V$ $(m/s)$), water-surface width ($B$ $(m)$), flow depth ($Yo$ $(m)$), cross-sectional area of flow ($A$ $(m^2)$), hydraulic radius ($R$ $(m)$), channel slope ($So$), bed load ($Tb$ $(kg/s)$), suspended load ($Tt$ $(kg/s)$), total bed material load ($Tj$ $(kg/s)$), median sediment size ($d_{50}$ $(mm)$), Manning's $n$, daily stream flow ($Q$), daily mean concentration of suspended sediment ($C$), and daily suspended-sediment discharge or load ($SL$).}
- \label{tab:aire:comparison}
- \resizebox{\textwidth}{!}{%
- \begin{tabular}{@{}lllll@{}}
- \toprule
- \textbf{Paper} & \textbf{Methods} & \textbf{Variables Used} & \textbf{Dataset} & \textbf{Cons} \\ \midrule
- \cite{melesse2011suspended} & ANN MLR MNLR ARIMA
- & \begin{tabular}[c]{@{}l@{}} $P,Q,Q(t-1)$\\ $SL, SL(t-1) $ \end{tabular}
- &\begin{tabular}[c]{@{}l@{}} Mississippi, Missouri, \\ Rio Grande river \end{tabular}
- & \begin{tabular}[c]{@{}l@{}} Only three rivers data. \\ Only suspended load considered. \\No explicit split of variables into\\ Sediment, Dynamic and Geometry. \end{tabular} \\ \midrule
- \cite{ghani2011prediction} & Various ANN methods
- & \begin{tabular}[c]{@{}l@{}} $Q, V, B, Yo, A, R$ \\ $ Tb, Tt, Tj, d_{50}, n$ \end{tabular} & \begin{tabular}[c]{@{}l@{}} Kurau, Langat, \\ and Muda River \end{tabular}
- & \begin{tabular}[c]{@{}l@{}} Total 214 points in the dataset. \\ Only 3 rivers data considered. \\Only one column combination used. \\No explicit split of variables into\\ Sediment, Geometry, and Dynamic. \end{tabular}
- \\ \midrule
- \cite{chang2012appraisal} & FFNN, ANFIS and GEP
- & \begin{tabular}[c]{@{}l@{}} $Q, V, B, Yo, A, R$ \\ $ Tb, Tt, Tj, d_{50}, n$ \end{tabular} & \begin{tabular}[c]{@{}l@{}} Kurau, Langat, \\ and Muda River \end{tabular}
- & \begin{tabular}[c]{@{}l@{}} Total 214 points in the dataset. \\ Only 3 rivers data considered. \\No explicit split of variables into\\ Sediment, Dynamic and Geometry. \\GEP takes 48 hours for training. \end{tabular}
- \\ \midrule
- \cite{waikhom2017prediction} & Empirical Eq. of \cite{yang1973incipient}
- & \begin{tabular}[c]{@{}l@{}} $Q, Yo, B, d_{50}, So$ \end{tabular} & Shetrunji River data
- & \begin{tabular}[c]{@{}l@{}} Only one river data. \\ No explicit split of variables into\\ Sediment, Geometry, and Dynamic. \end{tabular}
- \\ \midrule
- \cite{khosravi2020bedload} &\begin{tabular}[c]{@{}l@{}} M5P, RT, RF, REPT, \\ BA-M5P, BA-RF, BA-RT\\ and BA-REPT \end{tabular}
- & \begin{tabular}[c]{@{}l@{}} $V, \tau, Q, V*, S, Y, d_{50}, RR$ \end{tabular} & Flume Experiments
- & \begin{tabular}[c]{@{}l@{}} Total 72 points in the dataset. \\ Only flume data considered. \\No explicit split of variables into\\ Sediment, Geometry, and Dynamic. \end{tabular}
- \\ \midrule
- \textbf{Proposed Method} & LR, SVR, ELM, DNN
- & \begin{tabular}[c]{@{}l@{}} $d_{50}, C_g, G_s, y, BF, Q,
- \tau_b, Sf$ \end{tabular} & \begin{tabular}[c]{@{}l@{}} 11 flume
- and \\ 6 field experiments \end{tabular}
- & \begin{tabular}[c]{@{}l@{}} 1880 data points. \\Analysis of variables into\\ Sediment, Geometry, and Dynamic. \end{tabular}
- \\ \bottomrule
- \end{tabular}%
- }
- \end{table}
- \textbf{\textbf{Q4.4)} 3.1 and the preceding part of 3.2 until 3.2.1: It's
- better to put them in the Methodology section; meanwhile, the authors have
- explained the seven models ($f_1$, $f_2$, $f_3$, \ldots) in the Introduction.} \\
- \textbf{Answer:} We thank the reviewer for the valuable comments. We have
- incorporated your suggestions and included subsections 3.1 and 3.2 in the
- Methodology section (Sections 2.4 and 2.5, Page 8). One of the novelties of
- this paper is that we have analyzed the impact of the parameters that
- affect total sediment load prediction both individually and in
- combination. To emphasize this, we briefly mention the seven
- models in the Introduction section. (Line 150-159, Page 3)
- \clearpage
- \bibliographystyle{cas-model2-names}
- \bibliography{all_references}
- \end{document}