Data imputation aims to fill in missing attribute values in databases. Existing imputation approaches for non-quantitative string data fall roughly into two categories: (1) inferring-based approaches, and (2) retrieving-based approaches. Inferring-based approaches find substitutes or estimates for the missing values from the complete part of the data set. However, they typically fall short when a missing attribute value is unique, that is, when it does not appear anywhere in the complete part of the data set. Retrieving-based approaches instead turn to external resources: they formulate web search queries to retrieve web pages that contain the missing values, and then extract those values from the retrieved pages. This web-based retrieving approach achieves high imputation precision and recall, but it issues a large number of web search queries, which incurs substantial overhead. © 2016 IEEE.
|Original language||English (US)|
|Title of host publication||2016 IEEE 32nd International Conference on Data Engineering (ICDE)|
|Publisher||Institute of Electrical and Electronics Engineers (IEEE)|
|Number of pages||2|
|State||Published - Jun 25 2016|