[1] Alexander L. Stolyar, et al. Scheduling for multiple flows sharing a time-varying channel: the exponential rule, 2000.
[2] Anurag Kumar, et al. Hybrid MAC Protocols for Low-Delay Scheduling, 2016, 2016 IEEE 13th International Conference on Mobile Ad Hoc and Sensor Systems (MASS).
[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[4] Shie Mannor, et al. Improper Learning with Gradient-based Policy Optimization, 2021, ArXiv.
[5] Adam Tauman Kalai, et al. Online convex optimization in the bandit setting: gradient descent without a gradient, 2004, SODA '05.
[6] Leandros Tassiulas, et al. Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks, 1992.