We consider the application of domain decomposition techniques to the solution of sparse linear systems arising from implicit PDE discretizations on parallel computers. Representatives of two popular MIMD architectures, message passing (the Intel iPSC/2-SX) and shared memory (the Encore Multimax 320), are employed. We run the same numerical experiments on each, namely stripwise and boxwise decompositions of the unit square, using up to 64 subdomains and containing up to 64K degrees of freedom. We produce a tight-fitting complexity model for the former and discuss the difficulty of doing so for the latter. We also evaluate which of three types of domain decomposition preconditioners that have appeared in the literature of self-adjoint elliptic problems are most efficient in different regions of machine-problem parameter space. Some form of global sharing of information in the preconditioner is required for efficient overall parallel implementation in the region of most practical interest (large problem sizes and large numbers of processors); otherwise, an increasing iteration count inveighs against the gains of concurrency. Our results on a per iteration basis also hold for sparse discrete systems arising from other types of partial differential equations, but in the absence of a theory for the dependence of the convergence rate upon the granularity of the decomposition, the overall results are only suggestive for more general systems.