|
Description
|
HardBD: Data properties and hardware characteristics are two
key aspects for efficient data management. A clear trend
in the first aspect, data properties, is the increasing
demand to manage and process Big Data in both enterprise
and consumer applications, characterized by the fast
evolution of Big Data Systems, such as Key-Value stores,
Document stores, Graph stores, Spark, MapReduce/Hadoop,
Graph Computation Systems, Tree-Structured Databases, as
well as novel extensions to relational database systems.
At the same time, the second aspect, hardware
characteristics, is undergoing rapid changes, imposing
new challenges for the efficient utilization of hardware
resources. Recent trends include massive multi-core
processing systems, high performance co-processors, very
large main memory systems, persistent main memory, fast
networking components, big computing clusters, and large
data centers that consume massive amounts of energy.
Utilizing new hardware technologies for efficient Big
Data management is of urgent importance.
Active: Existing approaches to solve data-intensive problems
often require data to be moved near the computing
resources for processing. These data movement costs can
be prohibitive for large data sets. One promising
solution is to bring virtualized computing resources
closer to data, whether it is at rest or in motion. The
premise of active systems is a new holistic
view of the system in which every data medium and every
communication channel becomes compute-enabled. The Active
workshop aims to study different aspects of the active
systems' stack, understand the impact of active
technologies (including but not limited to hardware
accelerators such as SSDs, GPUs, FPGAs, and ASICs) on
different application workloads over the lifecycle of
data, and revisit the interplay between algorithmic
modeling, compilers and programming languages,
virtualized runtime systems and environments, and
hardware implementations, for effective exploitation of
active technologies.
HardBD & Active'18: Both HardBD and
Active are interested in exploiting hardware
technologies for data-intensive systems. The aim of this
one-day joint workshop is to bring together researchers,
practitioners, system administrators, and others
interested in this area to share their perspectives on
exploiting new hardware technologies for data-intensive
workloads and big data systems, and to discuss and
identify future directions and challenges in this area.
The workshop aims at providing a forum for academia and
industry to exchange ideas through research and position
papers.
|
Topics
|
Topics of interest include but
are not limited to:
- Systems Architecture on New Hardware
- Data Management Issues in Software-Hardware-System
Co-design
- Main Memory Data Management (e.g., CPU Cache
Behavior, SIMD, Lock-Free Designs, Transactional
Memory)
- Data Management on New Memory Technologies (e.g.,
SSDs, NVMs)
- Active Technologies (e.g., GPUs, FPGAs, and ASICs)
in Co-design Architectures
- Distributed Data Management Utilizing New Network
Technologies (e.g., RDMA)
- Novel Applications of New Hardware Technologies in
Query Processing, Transaction Processing, or Big Data
Systems (e.g., Hadoop, Spark, NoSQL, NewSQL, Document
Stores, Graph Platforms)
- Novel Applications of Low-Power Modern Processors
in Data-Intensive Workloads
- Virtualizing Active Technologies on Cloud (e.g.,
Scalability and Security)
- Benchmarking, Performance Models, and/or Tuning of
Data Management Workloads on New Hardware Technologies
|
Submission Guidelines
|
We welcome submissions of
original, unpublished research papers that are not
being considered for publication in any other forum.
Papers should be prepared in the IEEE format and
submitted as a single PDF file. The paper length
should not exceed 6 pages. The submission site is
https://cmt3.research.microsoft.com/HardBDActive2018.
|
Important Dates
|
Paper submission: January 15, 2018 (Monday) 11:59:00 PM PT; January 21, 2018 (Sunday) 11:59:00 PM PT
Notification of acceptance: February 5, 2018 (Monday); February 10, 2018 (Saturday)
Camera-ready copies: February 19, 2018 (Monday); February 24, 2018 (Saturday)
Workshop: April 16, 2018 (Monday)
|
Program
|
8:40-8:45am Welcome Messages
8:45-9:45am Session I: Keynote 1
9:45-10:30am Coffee Break
10:30-12:00pm Session II: Research Presentation 1
- Exploiting Automatic Vectorization to Employ SPMD on SIMD Registers.
Stefan Sprenger (Humboldt-Universität zu Berlin), Steffen Zeuch (DFKI Berlin), Ulf Leser (Humboldt-Universität zu Berlin)
- Conflict Detection-based Run-Length Encoding -- AVX512-CD Instruction Set in Action.
Annett Ungethüm, Johannes Pietrzyk, Patrick Damme, Dirk Habich, Wolfgang Lehner (TU Dresden)
- Fused Table Scans: Using AVX-512 and JIT to Double the Performance of Consecutive Table Scans.
Markus Dreseler, Jan Kossmann, Johannes Frohnhofen, Matthias Uflacker, Hasso Plattner (Hasso Plattner Institute)
- An NVM-aware Storage Layout for Analytical Workloads.
Philipp Götze, Stephan Baumann, Kai-Uwe Sattler (TU Ilmenau)
12:00-1:30pm Lunch
1:30-3:00pm Session III: Research Presentation 2
3:00-3:30pm Coffee Break
3:30-4:30pm Session IV: Keynote 2
4:30-5:00pm Session V: Industry Talk
- X-DB: the Next Generation Database System of Alibaba Group.
Tieying Zhang (Alibaba)
|
Keynote Talks
|
Bitwise Dimensional Co-Clustering (BDCC): Exploiting Fine-Grained Persistent Memories for OLAP
Peter Boncz (CWI & VU University Amsterdam)
With the current dominance of flash and the trend towards even more fine-grained
non-volatile memories (NVM), rightfully a lot of attention has been given to
the implications of fine-grained persistence for OLTP. This keynote, however,
focuses on OLAP, and describes a new processing and storage framework called
Bitwise Dimensional Co-clustering (BDCC) that exploits the low-latency and
small granularity access capabilities of such modern storage.
Analytical workloads in data warehouses often include heavy joins where queries
involve multiple fact tables in addition to the typical star-patterns,
dimensional grouping and selections. Deeply clustered storage is an optimizing
database design that avoids replication and thus keeps updates fast, yet
accelerates all foreign key joins, efficiently supports grouping and pushes
down most dimensional selections, and sometimes eliminates dimensional joins.
The framework consists of sideways information passing operators at execution
time, query optimization rules and costing, and an automated physical design
algorithm that takes the schema as its only input.
While BDCC was designed for flash, its fine-grained philosophy matches
up-and-coming NVM capabilities, and this talk is intended to inspire new
research directions for analytics on these new memories.
Bio:
Peter Boncz holds appointments as tenured researcher at CWI and full professor
at VU University Amsterdam. His academic background is in core database
architecture, with the architecture of MonetDB the main topic of his PhD thesis
(MonetDB won the 2016 ACM SIGMOD Systems Award). This work focused on
architecture-conscious database research, which studies the interaction between
computer architecture and data management techniques. His specific
contributions are in cache-conscious join methods, query and transaction
processing in columnar database systems, and vectorized query execution. He has
a track record in bridging the gap between academia and commercial application,
receiving the Dutch ICT Regie Award 2006 for his role in the CWI spin-off
company Data Distilleries. In 2008 he founded a new CWI spin-off company called
Vectorwise, dedicated to state-of-the-art business intelligence technology. He
is also the co-recipient of the 2009 VLDB 10 Years Best Paper Award, and in
2013 received the Humboldt Research Award. He also provides advice on
high-performance data architectures to Databricks (the creators of Apache
Spark), who recently opened an Amsterdam R&D office to collaborate with CWI.
Caching in the Memory Hierarchy: 5 Minutes Ought to Be Enough for Everybody
Anastasia Ailamaki (EPFL)
In 1987, Jim Gray and Gianfranco Putzolu introduced the five-minute rule for
trading memory to reduce disk I/O using the then-current price-performance
characteristics of DRAM and Hard Disk Drives (HDD). Since then, the
five-minute rule has gained widespread acceptance as an important
rule-of-thumb in data engineering. In this talk, I will revisit the
five-minute rule three decades after its introduction and use it to identify
impending changes in today's multi-tier storage hierarchy given recent trends
in the storage hardware landscape. I will investigate the impact of the
five-minute rule -- explicit or implicit -- on the way we perform analytics.
We will see that the rule applies not only in the bottom tiers of the hierarchy,
which are based on new Cold Storage Devices (CSD), but also in main-memory
databases, where researchers have been working on hot-cold data separation and
on heterogeneity-aware caching techniques.
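As a rough illustration of the rule this abstract revisits, the break-even interval is commonly stated as (pages per MB of RAM / accesses per second per disk) x (price per disk drive / price per MB of RAM). The Python sketch below evaluates that formula. The function name and all input values are illustrative assumptions chosen for this example, not figures taken from the talk.

```python
def break_even_interval_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                                price_per_disk, price_per_mb_ram):
    """Return the re-access interval (seconds) below which keeping a page
    in RAM is cheaper than re-reading it from disk."""
    # Technology ratio: how many pages fit in 1 MB of RAM versus how many
    # random accesses one disk can serve per second.
    technology_ratio = pages_per_mb_ram / accesses_per_sec_per_disk
    # Economic ratio: cost of one disk drive versus cost of 1 MB of RAM.
    economic_ratio = price_per_disk / price_per_mb_ram
    return technology_ratio * economic_ratio


# Hypothetical inputs (assumptions for illustration only):
# 8 KB pages (128 pages per MB), 64 random accesses per second per disk,
# $2000 per disk drive, $15 per MB of RAM.
interval = break_even_interval_seconds(128, 64, 2000.0, 15.0)
print(f"Break-even interval: {interval:.0f} s ({interval / 60:.1f} min)")
```

With these assumed inputs the interval comes out to a few minutes; changing the price or access-rate inputs shows how strongly the break-even point depends on the hardware generation.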
Bio:
Anastasia Ailamaki is a Professor of Computer and Communication Sciences at the
École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland and the
co-founder of RAW Labs SA, a Swiss company developing real-time analytics
infrastructures for heterogeneous big data. Her research interests are in
data-intensive systems and applications, and in particular (a) in strengthening
the interaction between the database software and emerging hardware and I/O
devices, and (b) in automating data management to support computationally
demanding, data-intensive scientific applications. She has received an ERC
Consolidator Award (2013), a Finmeccanica endowed chair from the Computer
Science Department at Carnegie Mellon (2007), a European Young Investigator
Award from the European Science Foundation (2007), an Alfred P. Sloan Research
Fellowship (2005), eight best-paper awards in database, storage, and computer
architecture conferences, and an NSF CAREER award (2002). She received her Ph.D. in
Computer Science from the University of Wisconsin-Madison in 2000. She is an
ACM fellow, an IEEE fellow, and an elected member of the Swiss National
Research Council. She has served as a CRA-W mentor, and is a member of the
Expert Network of the World Economic Forum.
|
Organizers
|
|
PC Members
|
- Manos Athanassoulis, Harvard University
- Sebastian Breß, DFKI GmbH
- Bingsheng He, National University of Singapore
- Peiquan Jin, University of Science and Technology
of China
- Qiong Luo, Hong Kong University of Science and
Technology
- Roger Moussalli, Two Sigma
- Ilia Petrov, Reutlingen University
- Eva Sitaridi, Amazon
- Xiaodong Zhang, Ohio State University
|
Sponsor
|