Displayed time zone: GMT+9 (KST)
Monday, September 27
00:00 – 05:00 | Workshop: The 2nd International Workshop on Machine Learning for Software Hardware Co-Design (MLSH’21) |
15:00 – 16:00 | Keynote | Chair: Albert Cohen |
  - Making Sparse Array Programming On Par With Dense
    Saman Amarasinghe
16:00 – 16:30 | Break |
16:30 – 18:10 | Session 1: Tuning and Lifting | Chair: Jaejin Lee |
  - A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers
    Phitchaya Mangpo Phothilimthana, Amit Sabne, Nikhil Sarda, Karthik Srinivasa Murthy, Yanqi Zhou, Christof Angermueller, Mike Burrows, Sudip Roy, Ketan Mandke, Rezsa Farahani, Yu Emma Wang, Berkin Ilbeyi, Blake Hechtman, Bjarke Roune, Shen Wang, Yuanzhong Xu, Samuel J. Kaufman
  - PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning
    Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon
  - Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
    Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, Angshuman Parashar, Po-An Tsai, Sivasankaran Rajamanickam, Roberto Gioiosa, Tushar Krishna
  - Polygeist: Raising C to Polyhedral MLIR
    William S. Moses, Lorenzo Chelini, Ruizhe Zhao, Oleksandr Zinenko
  - Program Lifting using Gray-Box Behavior
    Bruce Collie, Michael O’Boyle
Tuesday, September 28
00:00 – 01:00 | Keynote (mirrored session) | Chair: Albert Cohen |
01:00 – 01:30 | Break |
01:30 – 03:10 | Session 1: Tuning and Lifting (mirrored session) | Chair: Riyadh Baghdadi |
03:10 – 04:00 | Break |
04:00 – 08:00 | Tutorial: How to parallelize your own language using OpenCilk components |
15:00 – 16:40 | Session 2: Heterogeneous Systems | Chair: Alex Zinenko |
  - NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models
    Joonsung Kim, Suyeon Hur, Eunbok Lee, Seungho Lee, Jangwoo Kim
  - HERTI: a Reinforcement Learning-Augmented System for Efficient Real-Time Inference on Heterogeneous Embedded Systems
    Myeonggyun Han, Woongki Baek
  - X-Layer: Building Composable Pipelined Dataflows for Low-Rank Convolutions
    Naveen Vedula, Reza Hojabr, Ahmad Khonsari, Arrvindh Shriraman
  - InnerSP: A Memory Efficient Sparse Matrix Multiplication Accelerator with Locality-aware Inner Product Processing
    Daehyeon Baek, Soojin Hwang, Taekyung Heo, Daehoon Kim, Jaehyuk Huh
  - PrecisionBatching: Bitserial Decomposition for Efficient Neural Network Inference on GPUs
    Maximilian Lam, Zachary Yedidia, Colby R Banbury, Vijay Janapa Reddi
16:40 – 17:00 | Break |
17:00 – 18:40 | Session 3: Characterization and Near-Memory Computing | Chair: Hyungmin Cho |
  - AIBench Scenario: Scenario-distilling AI Benchmarking
    Wanling Gao, Fei Tang, Jianfeng Zhan, Xu Wen, Lei Wang, Zheng Cao, Chuanxin Lan, Chunjie Luo, Xiaoli Liu, Zihan Jiang
  - Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks
    Amirali Boroumand, Saugata Ghose, Berkin Akin, Ravi Narayanaswami, Geraldo F. Oliveira, Xiaoyu Ma, Eric Shiu, Onur Mutlu
  - SEER: A Time Prediction Model for CNNs from GPU Kernel’s View
    Guodong Liu, Sa Wang, Yungang Bao
  - PIM-DL: Boosting DNN Inference on Digital Processing In-Memory Architectures via Data Layout Optimizations
    Minxuan Zhou, Guoyang Chen, Mohsen Imani, Saransh Gupta, Weifeng Zhang, Tajana Rosing
  - Ultra Efficient Acceleration for De Novo Genome Assembly via Near-Memory Computing
    Minxuan Zhou, Lingxi Wu, Muzhou Li, Niema Moshiri, Kevin Skadron, Tajana Rosing
Wednesday, September 29
00:00 – 01:40 | Session 2: Heterogeneous Systems (mirrored session) | Chair: Prasanth Chatarasi |
01:40 – 02:00 | Break |
02:00 – 03:40 | Session 3: Characterization and Near-Memory Computing (mirrored session) | Chair: Albert Cohen |
15:00 – 16:40 | Session 4: Memory Hierarchy | Chair: Milind Chabbi |
  - CBP: Coordinated management of cache partitioning, bandwidth partitioning and prefetch throttling
    Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström, Miquel Pericás
  - Invalidate or Update? Revisiting Coherence for Tomorrow’s Cache Hierarchies
    Mingcan Zhu, Amna Shahab, Antonios Katsarakis, Boris Grot
  - Write Prediction for Persistent Memory Systems
    Suyash Mahar, Sihang Liu, Korakit Seemakhupt, Vinson Young, Samira Khan
  - nuKSM: NUMA-aware Memory De-duplication on Multi-socket Servers
    Akash Panda, Ashish Panwar, Arkaprava Basu
  - CoPlace: Effectively Mitigating Cache Conflicts in Modern Clouds
    Xiaowei Shang, Weiwei Jia, Jianchen Shan, Xiaoning Ding
16:40 – 17:00 | Break |
17:00 – 18:40 | Session 5: Graphs and Applications | Chair: Albert Cohen |
  - Dryadic: Flexible and Fast Graph Pattern Matching at Scale
    Daniel Mawhirter, Sam Reinehr, Wei Han, Noah Fields, Miles Claver, Connor Holmes, Jedidiah McClurg, Tongping Liu, Bo Wu
  - Skywalker: Efficient Alias-method-based Graph Sampling and Random Walk on GPUs
    Pengyu Wang, Chao Li, Jing Wang, Taolei Wang, Lu Zhang, Jingwen Leng, Quan Chen, Minyi Guo
  - SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction
    Chuangyi Gui, Xiaofei Liao, Long Zheng, Pengcheng Yao, Qinggang Wang, Hai Jin
  - SURFNet: Super-resolution of Turbulent Flows with Transfer Learning using Small Datasets
    Octavi Obiols-Sales, Abhinav Vishnu, Nicholas Malaya, Aparna Chandramowlishwaran
  - Accelerating Fourier and Number Theoretic Transforms using Tensor Cores and Warp Shuffles
    Sultan Durrani, Muhammad Saad Chughtai, Mert Hidayetoglu, Rashid Tahir, Abdul Dakkak, Lawrence Rauchwerger, Fareed Zaffar, Wen-mei Hwu
18:40 – 18:50 | ACM SRC Award and Closing Remarks |
Thursday, September 30
00:00 – 01:40 | Session 4: Memory Hierarchy (mirrored session) | Chair: Lawrence Rauchwerger |
01:40 – 02:00 | Break |
02:00 – 03:40 | Session 5: Graphs and Applications (mirrored session) | Chair: Tyler Sorensen |
03:40 – 03:50 | ACM SRC Award and Closing Remarks (mirrored session) |
04:00 – 05:00 | PACT Steering Committee meeting |