We propose an approach for the automatic generation of building models by assembling a set of boxes under the Manhattan-world assumption. The method first aligns the point cloud with a per-building local coordinate system and then fits axis-aligned planes to the point cloud through an iterative regularization process. The refined planes partition the space into a set of compact cuboid cells (candidate boxes) that span the entire 3D extent of the input data. We then approximate the target building by assembling a subset of these candidate boxes, selected with a binary linear programming formulation whose objective function is designed to maximize both the point cloud coverage and the compactness of the final model. Finally, all selected boxes are merged into a lightweight polygonal mesh model, which is suitable for interactive visualization of large-scale urban scenes. Experimental results and a comparison with state-of-the-art methods demonstrate the effectiveness of the proposed framework.
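To make the space-partitioning step concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the axis-aligned planes have already been fitted and are given as sorted coordinate lists along each axis. The names `candidate_boxes`, `cell_index`, `count_points`, and `select_boxes` are hypothetical. Counting the points that fall inside each cell is only an assumed proxy for point cloud coverage, and the threshold-based selection is a simple stand-in for the binary linear program, which additionally accounts for compactness.

```python
from bisect import bisect_right
from itertools import product


def candidate_boxes(xs, ys, zs):
    """Enumerate the cells bounded by consecutive plane coordinates per axis.

    Returns a dict mapping a cell index (i, j, k) to its (min_corner, max_corner).
    """
    xs, ys, zs = sorted(xs), sorted(ys), sorted(zs)
    boxes = {}
    for i, j, k in product(range(len(xs) - 1), range(len(ys) - 1), range(len(zs) - 1)):
        boxes[(i, j, k)] = ((xs[i], ys[j], zs[k]), (xs[i + 1], ys[j + 1], zs[k + 1]))
    return boxes


def cell_index(planes, value):
    """Index of the interval of sorted `planes` containing `value`, or None if outside."""
    i = bisect_right(planes, value) - 1
    if i == len(planes) - 1 and value == planes[-1]:
        i -= 1  # a point lying exactly on the last plane belongs to the last cell
    return i if 0 <= i < len(planes) - 1 else None


def count_points(points, xs, ys, zs):
    """Count the points falling inside each cell (assumed proxy for coverage)."""
    xs, ys, zs = sorted(xs), sorted(ys), sorted(zs)
    counts = {}
    for x, y, z in points:
        idx = (cell_index(xs, x), cell_index(ys, y), cell_index(zs, z))
        if None not in idx:
            counts[idx] = counts.get(idx, 0) + 1
    return counts


def select_boxes(counts, min_points=1):
    """Stand-in for the binary linear program: keep cells with enough points."""
    return {idx for idx, n in counts.items() if n >= min_points}


# Toy example: three planes along x, two along y and z give two candidate boxes.
xs, ys, zs = [0.0, 1.0, 2.0], [0.0, 1.0], [0.0, 1.0]
points = [(0.5, 0.5, 0.5), (0.2, 0.1, 0.9), (1.5, 0.5, 0.5), (5.0, 5.0, 5.0)]
boxes = candidate_boxes(xs, ys, zs)
counts = count_points(points, xs, ys, zs)
selected = select_boxes(counts, min_points=2)
```

In this toy setting the point at (5, 5, 5) lies outside every cell and is ignored. A faithful selection step would replace `select_boxes` with a solver over one binary variable per candidate box.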