Environment reconstruction with hidden confounders for reinforcement learning based recommendation. Wenjie Shang, Yang Yu, Qingyang Li, Zhiwei Qin, Yiping Meng and Jieping Ye.

Reinforcement learning experience reuse with policy residual representation. Wen-Ji Zhou, Yang Yu, Yingfeng Chen, Kai Guan, Tangjie Lv, Changjie Fan and Zhi-Hua Zhou. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI'19), Macao, China, 2019.

Cascaded algorithm-selection and hyper-parameter optimization with extreme-region upper confidence bound bandit.

Reinforcement Learning with Derivative-Free Exploration. In: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS'19), Montreal, Canada, 2019, pp. 1880-1882.

Asynchronous Classification-Based Optimization. Yu-Ren Liu, Yi-Qi Hu, Hong Qian, Yang Yu. In: Proceedings of the 1st International Conference on Distributed Artificial Intelligence (DAI'19), Beijing, China, 2019.

Bridging machine learning and logical reasoning by abductive learning. Wang-Zhou Dai, Qiuling Xu, Yang Yu, and Z.-H. In: Advances in Neural Information Processing Systems 32 (NeurIPS'19), Vancouver, Canada, 2019.

An efficient evolutionary algorithm for subset selection with general cost constraints. Chao Bian, Chao Feng, Chao Qian, and Yang Yu. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI'20), New York, NY, 2020.

Derivative-free optimization with adaptive experience for efficient hyper-parameter tuning. Yi-Qi Hu, Zelin Liu, Hua Yang, Yang Yu, and Yunfeng Liu. In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI'20), Santiago de Compostela, Spain, 2020.

Offline imitation learning with a misspecified simulator. Shengyi Jiang, Jing-Cheng Pang, Yang Yu. In: Advances in Neural Information Processing Systems 33 (NeurIPS'20), Virtual Conference, 2020.

Error bounds of imitating policies and environments.

QPLEX: Duplex dueling multi-agent Q-Learning. Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang. In: Proceedings of the 9th International Conference on Learning Representations (ICLR'21), Virtual Conference, 2021.

The Top Ten Algorithms in Data Mining, Boca Raton, FL: Chapman & Hall, 2009.

Evolutionary Learning: Advances in Theories and Algorithms, Berlin: Springer, 2019.

ZOOpt: Toolbox for derivative-free optimization. Yu-Ren Liu, Yi-Qi Hu, Hong Qian, Yang Yu, and Chao Qian.

Short, Ze Jun Ding, Zhi Zeng and Ju Li, Scientific Reports 5 (2015) 18130.