HardBD & Active'18



Joint Workshop of HardBD (International Workshop on Big Data Management on Emerging Hardware)
          and Active (Workshop on Data Management on Virtualized Active Systems)

To be Sponsored by and Held in Conjunction with ICDE 2018

April 16, 2018 in Paris, France

  • Description
  • Topics
  • Submission
  • Important Dates
  • Program
  • Keynote
  • Organizers
  • PC Members
  • Sponsor

  Description


HardBD : Data properties and hardware characteristics are two key aspects for efficient data management. A clear trend in the first aspect, data properties, is the increasing demand to manage and process Big Data in both enterprise and consumer applications, characterized by the fast evolution of Big Data Systems, such as Key-Value stores, Document stores, Graph stores, Spark, MapReduce/Hadoop, Graph Computation Systems, Tree-Structured Databases, as well as novel extensions to relational database systems. At the same time, the second aspect, hardware characteristics, is undergoing rapid changes, imposing new challenges for the efficient utilization of hardware resources. Recent trends include massive multi-core processing systems, high performance co-processors, very large main memory systems, persistent main memory, fast networking components, big computing clusters, and large data centers that consume massive amounts of energy. Utilizing new hardware technologies for efficient Big Data management is of urgent importance.

Active : Existing approaches to solving data-intensive problems often require data to be moved near the computing resources for processing. These data movement costs can be prohibitive for large data sets. One promising solution is to bring virtualized computing resources closer to the data, whether the data is at rest or in motion. The premise of active systems is a new holistic view of the system in which every data medium and every communication channel becomes compute-enabled. The Active workshop aims to study different aspects of the active systems' stack, to understand the impact of active technologies (including but not limited to hardware accelerators such as SSDs, GPUs, FPGAs, and ASICs) on different application workloads over the lifecycle of data, and to revisit the interplay between algorithmic modeling, compilers and programming languages, virtualized runtime systems and environments, and hardware implementations, for effective exploitation of active technologies.

HardBD & Active'18 : Both HardBD and Active are interested in exploiting hardware technologies for data-intensive systems. The aim of this one-day joint workshop is to bring together researchers, practitioners, system administrators, and others interested in this area to share their perspectives on exploiting new hardware technologies for data-intensive workloads and big data systems, and to discuss and identify future directions and challenges in this area. The workshop aims at providing a forum for academia and industry to exchange ideas through research and position papers.


  Topics

 Topics of interest include but are not limited to:

  • Systems Architecture on New Hardware
  • Data Management Issues in Software-Hardware-System Co-design
  • Main Memory Data Management (e.g., CPU Cache Behavior, SIMD, Lock-Free Designs, Transactional Memory)
  • Data Management on New Memory Technologies (e.g., SSDs, NVMs)
  • Active Technologies (e.g., GPUs, FPGAs, and ASICs) in Co-design Architectures
  • Distributed Data Management Utilizing New Network Technologies (e.g., RDMA)
  • Novel Applications of New Hardware Technologies in Query Processing, Transaction Processing, or Big Data Systems (e.g., Hadoop, Spark, NoSQL, NewSQL, Document Stores, Graph Platforms)
  • Novel Applications of Low-Power Modern Processors in Data-Intensive Workloads
  • Virtualizing Active Technologies on Cloud (e.g., Scalability and Security)
  • Benchmarking, Performance Models, and/or Tuning of Data Management Workloads on New Hardware Technologies


  Submission Guidelines

     We welcome submissions of original, unpublished research papers that are not being considered for publication in any other forum. Papers should be prepared in the IEEE format and submitted as a single PDF file. The paper length should not exceed 6 pages. The submission site is https://cmt3.research.microsoft.com/HardBDActive2018 .


  Important Dates


Paper submission: January 15, 2018 (Monday) 11:59:00 PM PT, updated to January 21, 2018 (Sunday) 11:59:00 PM PT
Notification of acceptance: February 5, 2018 (Monday), updated to February 10, 2018 (Saturday)
Camera-ready copies: February 19, 2018 (Monday), updated to February 24, 2018 (Saturday)
Workshop: April 16, 2018 (Monday)


  Program


8:40-8:45 Welcome Messages

8:45-9:45 Session I: Keynote 1

9:45-10:30 Coffee Break

10:30-12:00 Session II: Research Presentation 1

12:00-13:30 Lunch

13:30-15:00 Session III: Research Presentation 2

15:00-15:30 Coffee Break

15:30-16:30 Session IV: Keynote 2

16:30-17:00 Session V: Industry Talk
  • X-DB: the Next Generation Database System of Alibaba Group.
    Tieying Zhang (Alibaba)


  Keynote Talks

Bitwise Dimensional Co-Clustering (BDCC):
Exploiting Fine-Grained Persistent Memories for OLAP


Peter Boncz (CWI & VU University Amsterdam)


With the current dominance of flash and the trend towards even more fine-grained non-volatile memories (NVM), a lot of attention has rightfully been given to the implications of fine-grained persistence for OLTP. This keynote, however, focuses on OLAP, and describes a new processing and storage framework called Bitwise Dimensional Co-clustering (BDCC) that exploits the low-latency, small-granularity access capabilities of such modern storage.

Analytical workloads in data warehouses often include heavy joins where queries involve multiple fact tables in addition to the typical star-patterns, dimensional grouping and selections. Deeply clustered storage is an optimizing database design that avoids replication and thus keeps updates fast, yet accelerates all foreign key joins, efficiently supports grouping, pushes down most dimensional selections, and sometimes eliminates dimensional joins. The framework is made up of sideways information passing operators at execution time, query optimization rules and costing, and an automated physical design algorithm that takes the schema as its only input.

While BDCC was designed for flash, its fine-grained philosophy matches up-and-coming NVM capabilities, and this talk is intended to inspire new research directions for analytics on these new memories.

Bio: Peter Boncz holds appointments as tenured researcher at CWI and full professor at VU University Amsterdam. His academic background is in core database architecture, with the architecture of MonetDB the main topic of his PhD thesis (MonetDB won the 2016 ACM SIGMOD Systems Award). This work focused on architecture-conscious database research, which studies the interaction between computer architecture and data management techniques. His specific contributions are in cache-conscious join methods, query and transaction processing in columnar database systems, and vectorized query execution. He has a track record in bridging the gap between academia and commercial application, receiving the Dutch ICT Regie Award 2006 for his role in the CWI spin-off company Data Distilleries. In 2008 he founded a new CWI spin-off company called Vectorwise, dedicated to state-of-the-art business intelligence technology. He is also the co-recipient of the 2009 VLDB 10 Years Best Paper Award, and in 2013 received the Humboldt Research Award. He also provides advice on high performance data architectures to Databricks (the creators of Apache Spark), who recently opened an Amsterdam R&D office to collaborate with CWI.


Caching in the Memory Hierarchy:
5 Minutes Ought to Be Enough for Everybody


Anastasia Ailamaki (EPFL)


In 1987, Jim Gray and Gianfranco Putzolu introduced the five-minute rule for trading memory to reduce disk I/O using the then-current price-performance characteristics of DRAM and Hard Disk Drives (HDD). Since then, the five-minute rule has gained widespread acceptance as an important rule of thumb in data engineering. In this talk, I will revisit the five-minute rule three decades after its introduction and use it to identify impending changes in today's multi-tier storage hierarchy given recent trends in the storage hardware landscape. I will investigate the impact of the five-minute rule, explicit or implicit, on the way we perform analytics. We will see that the rule applies both in the bottom tiers of the hierarchy, which are based on new Cold Storage Devices (CSD), and in main-memory databases, where researchers have been working on hot-cold data separation and on heterogeneity-aware caching techniques.
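
For readers unfamiliar with the rule, the following is a minimal sketch of the break-even computation as it is commonly stated: the interval equals (pages per MB of RAM / disk accesses per second) × (price per disk drive / price per MB of RAM). The function name and the input figures are illustrative assumptions, not numbers taken from the talk.

```python
def break_even_interval_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                                price_per_disk, price_per_mb_ram):
    """Return the reference interval (seconds) at which caching a page in
    memory costs the same as re-reading it from disk on every access."""
    # Technology ratio: how many pages fit in 1 MB of RAM versus how many
    # random accesses one disk can serve per second.
    technology_ratio = pages_per_mb_ram / accesses_per_sec_per_disk
    # Economic ratio: cost of one disk drive versus cost of 1 MB of RAM.
    economic_ratio = price_per_disk / price_per_mb_ram
    return technology_ratio * economic_ratio


# Illustrative inputs only (assumed): 8 KB pages (128 per MB), 64 accesses
# per second per disk, $2000 per disk, $15 per MB of RAM.
interval = break_even_interval_seconds(128, 64, 2000.0, 15.0)
print(f"break-even interval: {interval:.0f} s ({interval / 60:.1f} min)")
```

Under these assumed inputs, a page that is re-referenced more often than the computed interval is cheaper to keep in memory than to fetch from disk each time; changing the price or performance figures shifts the interval accordingly.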

Bio: Anastasia Ailamaki is a Professor of Computer and Communication Sciences at the Ecole Polytechnique Federale de Lausanne (EPFL) in Switzerland and the co-founder of RAW Labs SA, a Swiss company developing real-time analytics infrastructures for heterogeneous big data. Her research interests are in data-intensive systems and applications, and in particular (a) in strengthening the interaction between the database software and emerging hardware and I/O devices, and (b) in automating data management to support computationally demanding, data-intensive scientific applications. She has received an ERC Consolidator Award (2013), a Finmeccanica endowed chair from the Computer Science Department at Carnegie Mellon (2007), a European Young Investigator Award from the European Science Foundation (2007), an Alfred P. Sloan Research Fellowship (2005), eight best-paper awards in database, storage, and computer architecture conferences, and an NSF CAREER award (2002). She received a Ph.D. in Computer Science from the University of Wisconsin-Madison in 2000. She is an ACM Fellow, an IEEE Fellow, and an elected member of the Swiss National Research Council. She has served as a CRA-W mentor, and is a member of the Expert Network of the World Economic Forum.


  Organizers



  PC Members


  • Manos Athanassoulis, Harvard University
  • Sebastian Breß, DFKI GmbH
  • Bingsheng He, National University of Singapore
  • Peiquan Jin, University of Science and Technology of China
  • Qiong Luo, Hong Kong University of Science and Technology
  • Roger Moussalli, Two Sigma
  • Ilia Petrov, Reutlingen University
  • Eva Sitaridi, Amazon
  • Xiaodong Zhang, Ohio State University


  Sponsor


Alibaba
