Energy consumption is a major component of the overall cost of wastewater treatment plants (WWTPs). As the number of installed WWTPs increases worldwide, modeling and forecasting their energy consumption has become a critical factor in WWTP design for meeting environmental and economic requirements. Accurate and fast energy consumption forecasting soft sensors not only support daily electricity and financial budgeting by WWTP practitioners on the micro-scale, but also benefit local municipal operation and underpin regional environmental impact estimation on the macro-scale. Energy consumption in WWTPs is influenced by various biological and environmental factors, which makes building soft sensors complicated and challenging. This paper provides short-term forecasting of WWTP energy consumption based on data-driven soft sensors that use traditional time-series and deep learning methods. Ten data-driven soft sensors have been investigated and compared for WWTP energy consumption forecasting: ordinary least squares, exponential smoothing state space, local regression, auto-regressive integrated moving average (ARIMA), structural time series, Bayesian structural time series, non-linear auto-regressive, long short-term memory with and without updates, and gated recurrent units. Energy consumption time-series data from a membrane bioreactor-based WWTP in the Middle East are used to evaluate the performance of the proposed soft sensors. Results showed that ARIMA achieved slightly better performance than the other soft sensors. Adaptive deep learning-based soft sensors are expected to improve the ability of deep models to follow the trend of future data quickly and accurately.