Dongdaemun
SEOUL | KOREA
APRIL 18-21, 2017

SwiftTuna: Responsive and Incremental Visual Exploration of Large-scale Multidimensional Data

  • Jaemin Jo
    Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
  • Wonjae Kim
    Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
  • Seunghoon Yoo
    Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
  • Bohyoung Kim
    Division of Biomedical Engineering, Hankuk University of Foreign Studies, Yongin-si, Republic of Korea
  • Jinwook Seo
    Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea

Abstract

For interactive exploration of large-scale data, a preprocessing scheme (e.g., data cubes) has often been used to summarize the data and provide low-latency responses. However, such a scheme has two drawbacks: its memory footprint becomes prohibitively large as more dimensions are involved in querying, and it imposes a strong prerequisite that specific data structures be built from the data before querying. In this paper, we present SwiftTuna, a holistic system that streamlines the visual information seeking process on large-scale multidimensional data. SwiftTuna exploits an in-memory computing engine, Apache Spark, to achieve both scalability and performance without building precomputed data structures. We also present a novel interactive visualization technique, tailed charts, to facilitate large-scale multidimensional data exploration. To support responsive querying on large-scale data, SwiftTuna leverages an incremental processing approach, providing immediate low-fidelity responses (i.e., prompt responses) as well as delayed high-fidelity responses (i.e., incremental responses). Our performance evaluation demonstrates that SwiftTuna supports exploration of a real-world dataset with four billion records while keeping the latency between incremental responses within a few seconds.
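
The incremental processing idea in the abstract can be illustrated with a minimal sketch. This is not SwiftTuna's implementation and does not use Apache Spark; it assumes the data arrives as in-memory chunks, that the total row count is known in advance, and that the query is a simple count per category. The function name `incremental_histogram` and the sample data are hypothetical. Each yielded result is an estimate scaled up from the rows seen so far, so the first result plays the role of a prompt response and later results play the role of incremental responses.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def incremental_histogram(
    chunks: Iterable[List[str]], total_rows: int
) -> Iterator[Tuple[float, Dict[str, float]]]:
    """Yield (progress, estimated counts) after each processed chunk.

    Hypothetical helper: the estimate scales the partial counts by
    total_rows / rows_seen, which assumes chunks are roughly random samples.
    """
    counts: Counter = Counter()
    seen = 0
    for chunk in chunks:
        if not chunk:
            continue  # nothing new to report for an empty chunk
        counts.update(chunk)
        seen += len(chunk)
        scale = total_rows / seen
        estimate = {key: value * scale for key, value in counts.items()}
        yield seen / total_rows, estimate


# Illustrative data only: nine rows split into three chunks.
chunks = [["a", "b", "a"], ["b", "b", "c"], ["a", "c", "c"]]
responses = list(incremental_histogram(chunks, total_rows=9))

for progress, estimate in responses:
    print(f"{progress:.0%} processed: {estimate}")
```

The first response is available after one chunk and is only a rough estimate; the last response, produced once every row has been seen, matches the exact counts. A front end could redraw its chart on every yielded result instead of waiting for the full scan.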