The radiative forcing of anthropogenic aerosol remains a key uncertainty in the understanding of climate change. This study quantifies the model spread in aerosol forcing associated with (i) variability internal to the atmosphere and (ii) differences in the model representation of weather. We do so by performing ensembles of atmosphere-only simulations with four state-of-the-art Earth system models, three of which will be used in the sixth phase of the Coupled Model Intercomparison Project (CMIP6; Eyring et al., 2016). In those models we reduce the complexity of the anthropogenic aerosol representation by prescribing the same annually repeating patterns of anthropogenic aerosol optical properties and associated effects on cloud reflectivity. We find that the model spread in the long-term averaged effective radiative forcing (ERF) is small compared to the overall possible range in annual ERF estimates associated with model-internal variability. This implies that identifying the true model spread in ERF associated with differences in the representation of meteorological processes and natural aerosol requires averaging over a sufficiently large number of annual estimates. We characterize the model diversity in clouds and use satellite products as benchmarks. Despite major inter-model differences in natural aerosol and clouds, all models show only a small change in the global-mean ERF in response to the substantial change in the global anthropogenic aerosol distribution between the mid-1970s and mid-2000s: the ensemble-mean ERF is −0.47 W m<sup>−2</sup> for the mid-1970s and −0.51 W m<sup>−2</sup> for the mid-2000s. This result suggests that intercomparing ERF changes between two periods, rather than absolute magnitudes relative to pre-industrial conditions, might provide a more stringent test of a model's ability to represent the evolution of climate.
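The need to average over many annual ERF estimates can be illustrated with a minimal sketch. It assumes that annual estimates are statistically independent, so that the standard error of an N-year mean scales as 1/sqrt(N). That assumption, the function names `standard_error` and `years_needed`, and the value of `SIGMA_ANNUAL` are all hypothetical and are not taken from the study.

```python
import math


def standard_error(sigma_annual: float, n_years: int) -> float:
    """Standard error of an n-year mean, assuming independent annual estimates."""
    if n_years < 1:
        raise ValueError("n_years must be at least 1")
    return sigma_annual / math.sqrt(n_years)


def years_needed(sigma_annual: float, target_error: float) -> int:
    """Approximate number of years needed for the standard error of the
    mean to fall to target_error or below (independent years assumed)."""
    if sigma_annual < 0 or target_error <= 0:
        raise ValueError("sigma_annual must be >= 0 and target_error > 0")
    return max(1, math.ceil((sigma_annual / target_error) ** 2))


# Hypothetical placeholder for the year-to-year standard deviation of
# annual ERF estimates (W m^-2). NOT a value reported by the study.
SIGMA_ANNUAL = 0.2

for n in (1, 10, 50):
    # The uncertainty of the mean shrinks as more annual estimates are averaged.
    print(n, round(standard_error(SIGMA_ANNUAL, n), 3))
```

Under these assumptions, halving the uncertainty of the mean ERF requires roughly four times as many simulated years, which is why a few annual estimates may not be enough to separate genuine model differences from internal variability.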