The WAter Cycle Multi-mission Observation Strategy – EvapoTranspiration (WACMOS-ET) project aims to advance the development of land evaporation estimates on global and regional scales. Its main objective is the derivation, validation, and intercomparison of a group of existing evaporation retrieval algorithms driven by a common forcing data set. Three commonly used process-based evaporation methodologies are evaluated: the Penman–Monteith algorithm behind the official Moderate Resolution Imaging Spectroradiometer (MODIS) evaporation product (PM-MOD), the Global Land Evaporation Amsterdam Model (GLEAM), and the Priestley–Taylor Jet Propulsion Laboratory model (PT-JPL). The resulting global spatiotemporal variability of evaporation, the closure of regional water budgets, and the discrete estimation of land evaporation components or sources (i.e. transpiration, interception loss, and direct soil evaporation) are investigated using river discharge data, independent global evaporation data sets, and results from previous studies. In a companion article (Part 1), Michel et al. (2016) inspect the performance of these three models at local scales using measurements from eddy-covariance towers and include the Surface Energy Balance System (SEBS) model in the assessment. In agreement with Part 1, our results indicate that the Priestley–Taylor products (PT-JPL and GLEAM) perform best overall for most ecosystems and climate regimes. While all three evaporation products adequately represent the expected average geographical patterns and seasonality, PM-MOD tends to underestimate the flux in the tropics and subtropics. Overall, results from GLEAM and PT-JPL appear more realistic when compared to surface water balances from 837 globally distributed catchments and to separate evaporation estimates from ERA-Interim and the model tree ensemble (MTE).
Nonetheless, all products show large dissimilarities during conditions of water stress and drought, and deficiencies in the way evaporation is partitioned into its different components. This observed inter-product variability, even when common forcing is used, suggests that caution is necessary when applying any single data set in isolation for large-scale studies. The general finding that different models perform better under different conditions highlights the potential for considering biome- or climate-specific composites of models. Nevertheless, the generation of a multi-product ensemble, with weighting based on validation analyses and uncertainty assessments, is proposed as the best way forward in our long-term goal to develop a robust observational benchmark data set of continental evaporation.
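
The weighted multi-product ensemble proposed above can be illustrated with a minimal sketch. The abstract does not specify a weighting scheme, so the inverse-error-variance weights used here are an assumption, as are the function name `weighted_ensemble`, the product labels, and all numerical values, which are purely illustrative and not results from this study.

```python
import numpy as np


def weighted_ensemble(products, rmse):
    """Merge evaporation estimates with inverse-error-variance weights.

    products: dict mapping a product name to an array of evaporation values
              (all arrays must share the same shape).
    rmse:     dict mapping the same names to a validation error (> 0).
    The weighting rule w_i ~ 1 / rmse_i**2 is an assumed, illustrative choice.
    """
    names = sorted(products)
    stack = np.stack([np.asarray(products[n], dtype=float) for n in names])
    weights = np.array([1.0 / rmse[n] ** 2 for n in names])
    weights /= weights.sum()  # normalise so the weights sum to 1
    merged = np.tensordot(weights, stack, axes=1)  # weighted sum over products
    return merged, dict(zip(names, weights))


# Hypothetical monthly evaporation (mm) at one location for three products,
# with hypothetical validation errors; none of these numbers come from the study.
example_products = {
    "GLEAM": [40.0, 55.0, 70.0],
    "PT-JPL": [42.0, 58.0, 68.0],
    "PM-MOD": [30.0, 45.0, 60.0],
}
example_rmse = {"GLEAM": 10.0, "PT-JPL": 10.0, "PM-MOD": 20.0}
merged, weights = weighted_ensemble(example_products, example_rmse)
```

In this sketch a product with twice the validation error receives one quarter of the weight. Deriving the weights per biome or climate regime, rather than globally, would be one way to reflect the finding that different models perform better under different conditions.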