When major natural disasters hit dense metropolitan areas, inspection is typically conducted by teams of engineers tasked with labeling buildings according to their damage state: safe, needs further evaluation, or unsafe. The physical inspection process can take several days to weeks to complete. Automated assessment is an attractive alternative to manual inspection, but it requires deploying a dense network of sensors at the granularity of each structure. Such a network may seem impractical with respect to cost or deployment time. However, with the advent of the Internet of Things (IoT) era, a massive network of citizen-owned smart devices, such as tablets and smartphones that contain vibration and vision sensors, is already deployed and readily available. While prior work focused on using smartphones to provide early warning, we focus specifically on utilizing smartphone video capture to directly assess the structural health of buildings post-event, thus providing emergency personnel with immediate, actionable information regarding the state of the building. Because smartphone cameras are already located inside a given building, the proposed solution is insensitive to weather conditions and visibility range and does not require an off-structure reference point. Experimental results using shake tables show that the proposed technique can achieve sub-millimeter accuracy, demonstrating its suitability for structural health monitoring applications.