Our position paper calling for a respite in the deep learning framework-building arms race has been accepted to appear at this year’s HotOS workshop. We make a simple observation: too many frameworks are being proposed with little interoperability between them, even though many target the same or similar workloads; this inevitably leads to repetition and reinvention from a machine learning perspective and suboptimal performance from a systems perspective. We identify two places for consolidation across many deep learning frameworks’ architectures that may enable interoperability as well as code, optimization, and resource sharing, benefitting both the machine learning and systems communities.
In recent years, deep learning has pervaded many areas of computing due to the confluence of explosive growth in large-scale computing capabilities, the availability of datasets, and advances in learning techniques. While this rapid growth has resulted in diverse deep learning frameworks, it has also led to inefficiencies for both the users and developers of these frameworks. Specifically, adopting useful techniques across frameworks — both to perform learning tasks and to optimize performance — involves significant repetition and reinvention.
In this paper, we observe that despite their diverse origins, many of these frameworks share architectural similarities. We argue that by introducing a common representation of learning tasks and a hardware abstraction model to capture compute heterogeneity, we might be able to relieve machine learning researchers from dealing with low-level systems issues and systems researchers from being tied to any specific framework. We expect this decoupling to accelerate progress in both domains.
Our foray into deep learning systems started with a class project by Peifeng and Linh last fall in my EECS 582 course. From a systems perspective, this is a very new and exciting area! We are learning new things every day, ranging from low-level GPU programming to communication over NVLink, and we are looking forward to a very exciting summer.
FTR, the HotOS PC accepted 29 papers out of 94 submissions this year.