Computer Science and Information Systems 2023 Volume 20, Issue 1, Pages: 157-173
https://doi.org/10.2298/CSIS220927055N
Reinforcement learning-based adaptation and scheduling methods for multi-source DASH
Nguyen Nghia T. (School of Computer and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), ntnghia@hcmiu.edu.vn
Luu Long (School of Computer and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), ITITIU18079@student.hcmiu.edu.vn
Vo Phuong L. (School of Computer and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), vtlphuong@hcmiu.edu.vn
Nguyen Sang Thanh Thi (School of Computer and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), nttsang@hcmiu.edu.vn
Do Cuong T. (Department of Computer Engineering, Kyung Hee University, Korea), dtcuong@khu.ac.kr; dothecuong@gmail.com
Nguyen Ngoc-Thanh (Wroclaw University of Science and Technology, Poland), ngoc-thanh.nguyen@pwr.edu.pl
Dynamic adaptive streaming over HTTP (DASH) is widely used for video streaming. In DASH, the client downloads video chunks in order from a server. The rate adaptation function at the video client enhances the user's quality of experience (QoE) by choosing a suitable quality level for each video chunk based on the network conditions. Today's networks, such as content delivery networks, edge caching networks, and content-centric networks, usually replicate video content on multiple cache nodes. In this work, we study video streaming from multiple sources. In multi-source streaming, video chunks may arrive out of order because the network paths experience different conditions. Hence, to guarantee a high QoE, the video client needs not only rate adaptation but also chunk scheduling. Reinforcement learning (RL) has emerged as a state-of-the-art control method in various fields in recent years. This paper proposes two algorithms for streaming from multiple sources: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS). We also build a simulation environment for training and evaluation. The efficiency of the proposed algorithms is demonstrated through extensive simulations with real-trace data.
Keywords: multi-source streaming, reinforcement learning, proximal policy optimization, dynamic adaptive streaming over HTTP
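The out-of-order arrival problem described in the abstract can be illustrated with a small sketch. This is not the paper's RLAGS or RLAS algorithm: it is a hypothetical earliest-finish greedy rule, and the function name `greedy_schedule`, the fixed per-source throughputs, and the chunk sizes are all assumptions made for illustration. Each chunk is requested from the source expected to deliver it soonest, and a chunk counts as playable only once it and every earlier chunk have arrived.

```python
def greedy_schedule(chunk_sizes, throughputs):
    """Assign chunks, in playback order, to the source that finishes each one earliest.

    chunk_sizes: chunk sizes in megabits, in playback order (illustrative values).
    throughputs: assumed constant throughput of each source in Mbit/s.
    Returns (assignments, arrival_times, playable_times).
    """
    num_sources = len(throughputs)
    free_at = [0.0] * num_sources  # time at which each source becomes idle
    assignments, arrival_times = [], []

    for size in chunk_sizes:
        # Estimated completion time of this chunk on every source.
        finish = [free_at[s] + size / throughputs[s] for s in range(num_sources)]
        best = min(range(num_sources), key=lambda s: finish[s])
        free_at[best] = finish[best]
        assignments.append(best)
        arrival_times.append(finish[best])

    # Playback is in order: a chunk is playable only after all earlier chunks arrive.
    playable_times, latest = [], 0.0
    for t in arrival_times:
        latest = max(latest, t)
        playable_times.append(latest)

    return assignments, arrival_times, playable_times


# A large first chunk on the fast source, then two small chunks.
assignments, arrivals, playable = greedy_schedule([8, 1, 2], [2.0, 1.0])
print(assignments, arrivals, playable)
```

In this toy run the second and third chunks arrive (at 1 s and 3 s) before the first (at 4 s), so none of them can be played until 4 s. This gap between arrival time and playable time is why a multi-source client has to combine scheduling with rate adaptation.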