Engineering efforts are underway to build devices capable of recording neural activity from one million neurons in the brain. Part of this effort focuses on developing dense multi-electrode arrays, whose raw signals must be post-processed via ‘spike sorting’ to extract neural spike trains. Gathering data at this scale will enable fascinating science, but only if the spike sorting procedure and data pipeline are computationally scalable, match or exceed the accuracy of hand processing, and are scientifically reproducible. All of these challenges are amplified as data scale continues to increase. This review discusses recent efforts to address these challenges, which have focused primarily on improving accuracy and reliability while remaining computationally scalable. These goals are pursued by adding stages to the data processing pipeline and by adopting divide-and-conquer algorithmic approaches. These recent developments should prove useful to most research groups regardless of data scale, not only those operating cutting-edge devices.