This paper considers a new approach to using Markov chain Monte Carlo (MCMC) in contexts where one may adopt multilevel (ML) Monte Carlo. The underlying problem is to approximate expectations w.r.t. an underlying probability measure that is associated to a continuum problem, such as a continuous-time stochastic process. It is then assumed that the associated probability measure can only be used (e.g. sampled) under a discretized approximation. In such scenarios, it is known that to achieve a target error, the computational effort can be reduced when using MLMC relative to i.i.d. sampling from the most accurate discretized probability. The ideas rely upon introducing hierarchies of the discretizations where less accurate approximations cost less to compute, and using an appropriate collapsing sum expression for the target expectation. If a suitable coupling of the probability measures in the hierarchy is achieved, then a reduction in cost is possible. This article focused on the case where exact sampling from such coupling is not possible. We show that one can construct suitably coupled MCMC kernels when given only access to MCMC kernels which are invariant with respect to each discretized probability measure. We prove, under verifiable assumptions, that this coupled MCMC approach in a ML context can reduce the cost to achieve a given error, relative to exact sampling. Our approach is illustrated on a numerical example.