Today’s cellular Radio Access Networks (RAN) are vast, and operators collect a variety of data to better understand and manage these networks. A fundamental challenge in any geo-distributed system, including cellular RANs, is the trade-off between data collection latency and the accuracy of insights derived from the collected data. While the general problem is difficult to address, one can side-step some of the challenges by leveraging domain-specific knowledge. That’s exactly what we achieve in CellScope in the context of cellular RANs through (1) intelligent data grouping and (2) task formulations that leverage domain characteristics. Specifically, these two techniques enable multi-task learning for performance analysis of cellular RANs.
An increasing amount of mobile analytics is performed on data that is collected from mobile devices in real time to make real-time decisions. Such tasks range from simple reporting on streams to sophisticated model building. However, the practicality of such analyses is impeded in several domains because they face a fundamental trade-off between data collection latency and analysis accuracy.
In this paper, we first study this trade-off in the context of a specific domain, Cellular Radio Access Networks (RAN). We find that the trade-off can be resolved using two broad, general techniques: intelligent data grouping and task formulations that leverage domain characteristics. Based on this, we present CellScope, a system that applies a domain-specific formulation of Multi-task Learning (MTL) to RAN performance analysis. It uses three techniques: feature engineering to transform raw data into effective features, a PCA-inspired similarity metric to group data from geographically nearby base stations sharing performance commonalities, and a hybrid online-offline model for efficient model updates. Our evaluation shows that its accuracy improvements over direct application of ML range from 2.5× to 4.4× while reducing the model update overhead by up to 4.8×. We have also used it to analyze an LTE network of over 2 million subscribers, where it reduced troubleshooting efforts by several orders of magnitude.
We then apply the underlying techniques in CellScope to another domain, mobile phone energy bug diagnosis, and show that the techniques are general.
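To make the data grouping idea above concrete, here is a minimal sketch of what a PCA-inspired similarity between base stations could look like. It is not CellScope's actual metric, which the abstract does not spell out. The assumptions are mine: each station has a 2-D location in kilometers and a small table of KPI samples, similarity is the absolute cosine between the stations' leading principal directions, and grouping is a simple greedy pass. All names (`top_component`, `group_stations`, `radius_km`, `min_sim`) and the sample data are hypothetical.

```python
import math

def top_component(rows, iters=200):
    """Leading principal direction of `rows` (>= 2 equal-length feature vectors).

    Uses power iteration on the sample covariance matrix. The start vector is
    arbitrary; a start exactly orthogonal to the true direction would fail.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [x / norm for x in w]
    return v

def group_stations(stations, radius_km=5.0, min_sim=0.9):
    """Greedily group stations that are both nearby and behave similarly."""
    comps = {name: top_component(s["kpis"]) for name, s in stations.items()}
    groups = []
    for name, s in stations.items():
        for g in groups:
            seed = g[0]  # compare against the first member of each group
            near = math.dist(s["loc"], stations[seed]["loc"]) <= radius_km
            sim = abs(sum(a * b for a, b in zip(comps[name], comps[seed])))
            if near and sim >= min_sim:
                g.append(name)
                break
        else:
            groups.append([name])
    return groups

# Made-up samples: each KPI row is (signal quality, throughput).
stations = {
    "A": {"loc": (0, 0), "kpis": [[1, 2], [2, 4.1], [3, 5.9], [4, 8.2]]},
    "B": {"loc": (1, 1), "kpis": [[1, 2.2], [2, 3.9], [3, 6.1], [4, 7.8]]},
    "C": {"loc": (2, 0), "kpis": [[1, 9], [2, 6], [3, 3], [4, 0]]},
    "D": {"loc": (50, 50), "kpis": [[1, 2], [2, 4.1], [3, 5.9], [4, 8.2]]},
}
groups = group_stations(stations)
print(groups)
```

In this toy data, A and B are close together and show the same KPI trend, C is nearby but trends the opposite way, and D matches A's trend but is far away. Once stations are grouped this way, each group can share training data, which is the intuition behind pairing grouping with multi-task learning.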
I got involved in this project about 3 years ago right before I was about to start in Michigan. The project faced many challenges, but it survived and saw the light of day only because of Anand’s persistence. This is my first MobiCom paper, and I’m pleasantly surprised by the level of rigorous reviews the paper went through — 9 reviews + 1 blind shepherding process! This fits right into our efforts toward geo-distributed analytics with many more exciting results to come soon. Stay tuned!