|  | 
          
            | 
                 
                  Update: Due to the COVID-19 situation,
                  HardBD&Active 2020 will follow ICDE 2020's decision to be
                  held online using Zoom.  Registration for ICDE and the
                  workshops is free.
          
                  The morning sessions will be
                  in the HardBD & Active Zoom room.  The afternoon session
                  will be in the SMDB Zoom room. 
 |  
            |     Description |  
            | 
                HardBD: Data properties and hardware characteristics are two
                key aspects for efficient data management. A clear trend
                in the first aspect, data properties, is the increasing
                demand to manage and process Big Data in both enterprise
                and consumer applications, characterized by the fast
                evolution of Big Data Systems, such as Key-Value stores,
                Document stores, Graph stores, Spark, MapReduce/Hadoop,
                Graph Computation Systems, Tree-Structured Databases, as
                well as novel extensions to relational database systems.
                At the same time, the second aspect, hardware
                characteristics, is undergoing rapid changes, imposing
                new challenges for the efficient utilization of hardware
                resources. Recent trends include massive multi-core
                processing systems, high performance co-processors, very
                large main memory systems, persistent main memory, fast
                networking components, big computing clusters, and large
                data centers that consume massive amounts of energy.
                Utilizing new hardware technologies for efficient Big
                Data management is of urgent importance.

                Active:
                Existing approaches to solve data-intensive problems
                often require data to be moved near the computing
                resources for processing. These data movement costs can
                be prohibitive for large data sets. One promising
                solution is to bring virtualized computing resources
                closer to data, whether it is at rest or in motion. The
                premise of active systems is a new holistic
                view of the system in which every data medium and every
                communication channel becomes compute-enabled. The Active
                workshop aims to study different aspects of the active
                systems' stack, understand the impact of active
                technologies (including but not limited to hardware
                accelerators such as SSDs, GPUs, FPGAs, and ASICs) on
                different application workloads over the lifecycle of
                data, and revisit the interplay between algorithmic
                modeling, compiler and programming languages,
                virtualized runtime systems and environments, and
                hardware implementations, for effective exploitation of
                active technologies.

                HardBD & Active'20: Both HardBD and
                Active are interested in exploiting hardware
                technologies for data-intensive systems. The aim of this
                one-day joint workshop is to bring together researchers,
                practitioners, system administrators, and others
                interested in this area to share their perspectives on
                exploiting new hardware technologies for data-intensive
                workloads and big data systems, and to discuss and
                identify future directions and challenges in this area.
                The workshop aims at providing a forum for academia and
                industry to exchange ideas through research and position
                papers.
 
 |  
            |     Topics |  
            |  Topics of interest include but
                are not limited to: 
                 Systems Architecture on New Hardware
                 Data Management Issues in Software-Hardware-System Co-design
                 Main Memory Data Management (e.g., CPU Cache Behavior, SIMD, Lock-Free Designs, Transactional Memory)
                 Data Management on New Memory Technologies (e.g., SSDs, NVMs)
                 Active Technologies (e.g., GPUs, FPGAs, and ASICs) in Co-design Architectures
                 Distributed Data Management Utilizing New Network Technologies (e.g., RDMA)
                 Novel Applications of New Hardware Technologies in Query Processing, Transaction Processing, or Big Data Systems (e.g., Hadoop, Spark, NoSQL, NewSQL, Document Stores, Graph Platforms, etc.)
                 Novel Applications of Low-Power Modern Processors in Data-Intensive Workloads
                 Virtualizing Active Technologies on Cloud (e.g., Scalability and Security)
                 Benchmarking, Performance Models, and/or Tuning of Data Management Workloads on New Hardware Technologies
                
 
 |  
            |      Submission Guidelines  |  
            |  |  
            |        Important Dates |  
            | 
 
                
                  
                    | Paper submission: | January 20, 2020 (Monday) 11:59:00 PM PT; extended to January 24, 2020 (Friday) 11:59:00 PM PT
 |  
                    | Notification of
                      acceptance: | February 10, 2020 (Monday) |  
                    | Camera-ready
                      copies: | February 28, 2020 (Friday) |  
                    | Workshop: | April 20, 2020
                      (Monday) |  
 
 |  
            |     
                Program |  
              | Note: Times are displayed in CDT.
 
 HardBD&Active Zoom Room: all morning sessions 
 8:45-9:00am CDT  Welcome Messages
 9:00-9:45am CDT Keynote 1
 9:45-10:15am CDT Joint Invited Talk with SMDB
 10:15-10:30am CDT Break
 10:30-11:30am CDT Research Presentation
 
                On the Necessity of Explicit Cross-Layer Data Formats in Near-Data Processing Systems.
                Tobias Vinçon (Reutlingen University), 
                Arthur Bernhardt (Reutlingen University), 
                Lukas Weber (TU Darmstadt), 
                Andreas Koch (TU Darmstadt), 
                Ilia Petrov (Reutlingen University)
 
 
                Selective Caching: A Persistent Memory Approach for Multi-Dimensional Index Structures.
                Muhammad Attahir Jibril (TU Ilmenau), 
                Philipp Götze (TU Ilmenau), 
                David Broneske (OvG University Magdeburg & Anhalt University of Applied Sciences), 
                Kai-Uwe Sattler (TU Ilmenau)
 
 
                A Persistent Memory-Aware Buffer Pool Manager Simulator for Multi-Tenant Cloud Databases.
                Taras Basiuk (University of Oklahoma), 
                Le Gruenwald (University of Oklahoma),
                Laurent d'Orazio (Rennes 1 University), 
                Eleazar Leal (University of Minnesota)
 
 
 11:30-12:00pm CDT Break (Please switch to the SMDB Zoom room for the afternoon session) 
 
 SMDB Zoom Room: afternoon session 
 12:00-12:45pm CDT Joint Keynote 2 with SMDB
 
 |  
            |     
                Joint Keynote Talks with SMDB 2020
                       |  
            | 
 
 Software Hardware Co-Design for Cloud Native Database Systems
 
 
 Feifei Li
 Vice President of Alibaba Group, Professor at University of Utah
 Abstract: 
Cloud native database systems are becoming increasingly popular; they
leverage the virtualized resource pool provided by the underlying cloud
infrastructure to offer excellent elasticity, high availability, and
scalability. Decoupling resource usage and management across the stack (e.g.,
compute and storage) is a critical path towards realizing cloud native
properties. Software-hardware co-design plays an important role in this
paradigm, such as using kernel bypassing, RDMA for shared distributed storage,
FPGA acceleration, NVM for tiered memory hierarchy, TEE for secure and
trustworthy compute, to name a few. This talk shares our experience and lessons
learned from using software-hardware co-design principles towards building
cloud native database systems.
 
 Bio: 
Feifei Li is currently a Vice President of Alibaba Group, ACM Distinguished
Scientist, President of the Database Products Business Unit of Alibaba Cloud,
and director of the Database and Storage Lab of DAMO academy. He is a tenured
full professor at the School of Computing, University of Utah (on leave). He
has won multiple awards from NSF, ACM, IEEE, Visa, Google, HP, Microsoft, IBM,
etc. He is a recipient of the ACM SoCC 2019 Best Paper Award (runner-up), IEEE
ICDE 2014 10 Years Most Influential Paper Award, ACM SIGMOD 2016 Best Paper
Award, ACM SIGMOD 2015 Best System Demonstration Award, and IEEE ICDE 2004 Best
Paper Award. He has served as an associate editor, PC co-chair, and core
committee member for many prestigious journals and conferences.
 
 
 
 Abstract: 
Autonomous Database is one of the hottest Oracle products, and we have
attempted to use machine learning for several aspects of the service. We will
cover use cases for finding anomalies in logs and troubleshooting them at a
scale of several petabytes a year, using a log anomaly timeline and
semi-supervised machine learning techniques to reduce logs and match them in
near real time. We will also cover how we detect changing workloads, use
z-scores to pinpoint
faults, use time series analysis to find good times to do backups or
maintenance, models to detect performance tuning issues and root cause analysis
as well as fleet learning to apply knowledge of trends and issues across
multiple symptoms affecting the fleet including rediscovery. We will cover
examples, code where applicable and frameworks we use for this.
 
 Bio: 
Sandesh Rao is a VP running the AIOps Automation for the Autonomous Database
Group at Oracle Corporation, specializing in using AI/ML for different use cases,
from predicting faults before they happen to anomaly detection within log data
and metrics data. His previous positions have focused on performance tuning, high
availability, disaster recovery and architecting cloud-based solutions using
the Oracle Stack. With more than 18 years of experience working in the HA space
and having worked on several versions of Oracle with different application
stacks, he is a recognized expert in RAC, Database Internals, PaaS, SaaS, and
IaaS solutions and solving Big Data related problems. Most of his work involves
working with customers in the implementation of public and hybrid cloud
projects in the financial, retailing, scientific, insurance, biotech, and tech
space. He is also responsible for developing assessments for best practices for
the Oracle Grid Infrastructure 19c including products like RAC (Real
Application Clusters) and Storage (ASM, ACFS). More details:
https://bit.ly/1UCL46K
 
 
 |  
            |     
                Joint Invited Talk with SMDB 2020
                       |  
            | 
 
 AI-Native Database 
 
 Guoliang Li
 Tsinghua University
 Abstract: 
In the big data era, database systems face three challenges. Firstly,
traditional heuristics-based optimization techniques (e.g., cost estimation,
join order selection, knob tuning) cannot meet the high-performance requirements
of large-scale data, various applications and diversified data. We can design
learning-based techniques to make databases more intelligent. Secondly, many
database applications need to use AI algorithms, e.g., image search in
databases. We can embed AI algorithms into databases, utilize database techniques
to accelerate AI algorithms, and provide AI capability inside databases.
Thirdly, traditional databases focus on using general hardware (e.g., CPU), but
cannot fully utilize new hardware (e.g., AI chips). Moreover, besides the
relational model, we can utilize the tensor model to accelerate AI operations.
Thus, we need to design new techniques to make full use of new hardware.  To
address these challenges, we design an AI-native database. On one hand, we
integrate AI techniques into databases to provide self-configuring,
self-optimizing, self-healing, self-protecting and self-inspecting capabilities
for databases. On the other hand, we can enable databases to provide AI
capabilities using declarative languages, in order to lower the barrier of
using AI. In this talk, I will introduce the five levels of AI-native databases
and present the open challenges of designing an AI-native database. I will also
take automatic database knob tuning, a deep reinforcement learning based
optimizer, machine-learning based cardinality estimation, and an automatic
index/view advisor as examples to showcase the superiority of AI-native databases.
 
 Bio: 
Guoliang Li is a full Professor in the Department of Computer Science, Tsinghua
University, Beijing, China.  His research interests include AI-native databases,
big data analytics and mining, crowdsourced data management, big
spatio-temporal data analytics, and large-scale data cleaning and integration. He
has published more than 100 papers in premier conferences and journals, such as
SIGMOD, VLDB, ICDE, SIGKDD, SIGIR, TODS, VLDB Journal, and TKDE. He will be the
General co-chair of SIGMOD 2021 and demo chair of VLDB 2021. He is working as
associate editor for IEEE Transactions on Knowledge and Data Engineering, VLDB
Journal, ACM Transactions on Data Science, and IEEE Data Engineering Bulletin.
He has received several best paper awards in top conferences, such as the CIKM
2017 best paper award, ICDE 2018 best paper candidate, KDD 2018 best paper
candidate, DASFAA 2014 best paper runner-up, and APWeb 2014 best paper award. He
received the VLDB Early Research Contribution Award 2017 and the IEEE TCDE Early
Career Award 2014.
 
 
 |  
            |     
                Organizers |  
            | 
 
 |  
            |     
                PC
                      Members |  
            | 
 
                Manos Athanassoulis, Boston University
                Bingsheng He, National University of Singapore
                Peiquan Jin, University of Science and Technology of China
                Wolfgang Lehner, TU Dresden
                Yinan Li, Microsoft Research
                Qiong Luo, Hong Kong University of Science and Technology
                Stefan Manegold, CWI
                Ilia Petrov, Reutlingen University
                Eva Sitaridi, Amazon
                Tianzheng Wang, Simon Fraser University
                Xiaodong Zhang, Ohio State University
 
 |  |