Efficient modelling methods for assessing soil organic carbon (SOC) stocks are of pivotal importance as inputs for global carbon cycle studies and decision-making processes. However, laboratory analyses of SOC field samples are costly and time-consuming. Global-scale estimates of SOC were recently made according to categorical variables, including land use and soil texture. Remote sensing (RS) data can contribute to better modelling of the spatial distribution of SOC stock at a regional scale. In the present study, we used Stochastic Gradient Treeboost (SGT) to estimate the topsoil (0–30 cm) SOC stock of a Mediterranean semi-arid area (Sicily, Italy, 25,286 km²). In particular, our study examined agricultural lands, which represent approximately 64% of the entire region. An extensive soil dataset (2202 samples, 1 profile/7.31 km² on average) was acquired from the soil database of Sicily. The georeferenced field observations were intersected with remotely sensed environmental data and other spatial data, including climatic data from WORLDCLIM, land cover from CORINE, soil texture, topography and derived indices. Finally, the SGT estimates were compared with published global estimates (GSOC) and data from the International Soil Reference and Information Centre (ISRIC) SoilGrids by comparing the pseudo-regressions of the SGT, GSOC and ISRIC estimates against soil observations. The mean SOC stock across the entire region estimated by GSOC and ISRIC was 3.9% lower and 46.2% higher, respectively, than that estimated by the SGT. The SGT efficiently predicted SOC stocks < 70 t ha⁻¹ (corresponding to the 90th percentile of the observed values). On average, the coefficient of variation of the SGT model was 3.6% when computed on the whole dataset and remained lower than 23% when computed on a distribution basis. The SGT mean absolute error was 14.84 t ha⁻¹, which was 18.4% and 36.3% lower than those of GSOC and ISRIC, respectively.
The mean annual rainfall, soil texture, land use, mean annual temperature and Landsat 7 ETM+ panchromatic Band 8 were the most important predictors of SOC stock. SOC stocks were also estimated for each land cover class. SGT predicted SOC stock better than GSOC and ISRIC for most data: the percentage of data falling within ±50% of the observed values was 71.4%, 65.8% and 50.7% for SGT, GSOC and ISRIC, respectively. This was reflected in a higher R² and a slope (β) closer to 1 for the pseudo-regression constructed with SGT than for those constructed with GSOC and ISRIC. In conclusion, the results of the present study showed that the integration of RS with climatic and soil texture spatial data could strongly improve SOC prediction in a semi-arid Mediterranean region. In addition, the panchromatic band of Landsat 7 ETM+ was more predictive than the conventionally used NDVI. This information is crucial for guiding decision-making processes, especially at a regional scale and/or in semi-arid Mediterranean areas. The model performance of the SGT could be further improved by adopting predictors with higher spatial resolution. The results of the present study yield valuable information, especially for assessing the effects of climate change or land use change scenarios on SOC stocks and their spatial distribution.
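The comparison criteria named above (mean absolute error, the share of predictions within ±50% of the observed value, and the slope β and R² of a pseudo-regression) can be illustrated with a minimal sketch. The function names, the choice to regress predicted on observed values, and the four sample values are assumptions made for illustration only; they are not the study's code or data.

```python
from statistics import mean


def mean_absolute_error(observed, predicted):
    """Average absolute difference between predicted and observed stocks."""
    return mean(abs(p - o) for o, p in zip(observed, predicted))


def share_within_tolerance(observed, predicted, tolerance=0.5):
    """Percentage of predictions within +/- tolerance * observed value."""
    hits = sum(
        1 for o, p in zip(observed, predicted) if abs(p - o) <= tolerance * o
    )
    return 100.0 * hits / len(observed)


def pseudo_regression(observed, predicted):
    """Least-squares fit predicted = alpha + beta * observed.

    Assumption: predicted is regressed on observed; the source does not
    state which variable is on which axis. Returns (beta, r_squared).
    """
    mo, mp = mean(observed), mean(predicted)
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    beta = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return beta, r_squared


# Made-up SOC stocks in t/ha, purely illustrative (not data from the study).
observed = [20.0, 40.0, 60.0, 80.0]
predicted = [25.0, 35.0, 50.0, 30.0]

mae = mean_absolute_error(observed, predicted)
share = share_within_tolerance(observed, predicted)
beta, r_squared = pseudo_regression(observed, predicted)
print(f"MAE={mae:.2f} t/ha, within 50%={share:.1f}%, "
      f"beta={beta:.3f}, R2={r_squared:.3f}")
```

Under these criteria, a better model shows a lower mean absolute error, a larger share of predictions within the tolerance band, and a pseudo-regression slope closer to 1 with a higher R².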