Computer Science and Information Systems 2023 Volume 20, Issue 1, Pages: 157-173
https://doi.org/10.2298/CSIS220927055N


Reinforcement learning-based adaptation and scheduling methods for multi-source DASH

Nguyen Nghia T. (School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), ntnghia@hcmiu.edu.vn
Luu Long (School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), ITITIU18079@student.hcmiu.edu.vn
Vo Phuong L. (School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), vtlphuong@hcmiu.edu.vn
Nguyen Sang Thanh Thi (School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam + Vietnam National University, Ho Chi Minh City, Vietnam), nttsang@hcmiu.edu.vn
Do Cuong T. (Department of Computer Engineering, Kyung Hee University, Korea), dtcuong@khu.ac.kr; dothecuong@gmail.com
Nguyen Ngoc-Thanh (Wroclaw University of Science and Technology, Poland), ngoc-thanh.nguyen@pwr.edu.pl

Dynamic adaptive streaming over HTTP (DASH) has been widely used in video streaming recently. In DASH, the client downloads video chunks in order from a server. The rate adaptation function at the video client enhances the user's quality-of-experience (QoE) by choosing a suitable quality level for each video chunk to download based on the network condition. Today's networks, such as content delivery networks, edge caching networks, and content-centric networks, usually replicate video content on multiple cache nodes. We study video streaming from multiple sources in this work. In multi-source streaming, video chunks may arrive out of order due to the differing conditions of the network paths. Hence, to guarantee a high QoE, the video client needs not only rate adaptation but also chunk scheduling. Reinforcement learning (RL) has emerged as a state-of-the-art control method in various fields in recent years. This paper proposes two algorithms for streaming from multiple sources: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS). We also build a simulation environment for training and evaluation. The efficiency of the proposed algorithms is demonstrated via extensive simulations with real-trace data.
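To illustrate the scheduling problem the abstract describes, below is a minimal Python sketch of greedy chunk scheduling across multiple sources: each pending chunk, taken in playback order, is assigned to the source expected to finish it earliest. This is only an assumption-laden illustration of the general idea; the names (Source, throughput_mbps, busy_until) are hypothetical, and the paper's RLAGS/RLAS algorithms replace parts of such logic with learned policies.

    # Illustrative sketch only; not the paper's algorithm.
    from dataclasses import dataclass

    @dataclass
    class Source:
        name: str
        throughput_mbps: float  # estimated from recent downloads (assumed)
        busy_until: float       # time (s) when this source finishes its queued work

    def greedy_schedule(chunk_sizes_kbits, sources, now=0.0):
        """Assign each pending chunk, in playback order, to the source
        expected to finish it earliest, so earlier chunks tend to arrive first."""
        assignments = []
        for size in chunk_sizes_kbits:
            # Estimated finish time = when the source is free + transfer time.
            best = min(
                sources,
                key=lambda s: max(s.busy_until, now) + size / (s.throughput_mbps * 1000),
            )
            start = max(best.busy_until, now)
            finish = start + size / (best.throughput_mbps * 1000)
            best.busy_until = finish
            assignments.append((best.name, finish))
        return assignments

    if __name__ == "__main__":
        sources = [Source("cache-A", 8.0, 0.0), Source("cache-B", 3.0, 0.0)]
        # Four 2-second chunks encoded at 4 Mbps => 8000 kbits each.
        print(greedy_schedule([8000] * 4, sources))

With the example inputs, the faster cache node absorbs most chunks while the slower one takes a chunk whenever doing so finishes it sooner, which is the intuition behind pairing a learned rate-adaptation policy with a greedy scheduler as in RLAGS.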

Keywords: multi-source streaming, reinforcement learning, proximal policy optimization, dynamic adaptive streaming over HTTP

