Dynamically changing sequencing rules with reinforcement learning in a job shop system with stochastic influences
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
Proceedings of the 2020 Winter Simulation Conference, WSC 2020. ed. / K.-H. Bae; B. Feng; S. Kim; S. Lazarova-Molnar; Z. Zheng; T. Roeder; R. Thiesing. IEEE - Institute of Electrical and Electronics Engineers Inc., 2020. p. 1608-1618. Article 9383903. (Proceedings - Winter Simulation Conference; Vol. 2020-December).
RIS
TY - CHAP
T1 - Dynamically changing sequencing rules with reinforcement learning in a job shop system with stochastic influences
AU - Heger, Jens
AU - Voß, Thomas
PY - 2020/12/14
Y1 - 2020/12/14
N2 - Sequencing operations can be difficult, especially under uncertain conditions. Applying decentralized sequencing rules has been a viable option; however, no single rule outperforms all others under varying system conditions. For this reason, reinforcement learning (RL) is used as a hyper-heuristic to select a sequencing rule based on the system status. Based on multiple training scenarios considering stochastic influences, such as varying inter-arrival times or customers changing the product mix, the advantages of RL are presented. For evaluation, the trained agents are exploited in a generic manufacturing system. The best trained agent is able to dynamically adjust sequencing rules based on system performance, thereby matching and outperforming the presumed best static sequencing rules by approximately 3%. Using the trained policy in an unknown scenario, the RL heuristic is still able to change the sequencing rule according to the system status, thereby providing robust performance.
AB - Sequencing operations can be difficult, especially under uncertain conditions. Applying decentralized sequencing rules has been a viable option; however, no single rule outperforms all others under varying system conditions. For this reason, reinforcement learning (RL) is used as a hyper-heuristic to select a sequencing rule based on the system status. Based on multiple training scenarios considering stochastic influences, such as varying inter-arrival times or customers changing the product mix, the advantages of RL are presented. For evaluation, the trained agents are exploited in a generic manufacturing system. The best trained agent is able to dynamically adjust sequencing rules based on system performance, thereby matching and outperforming the presumed best static sequencing rules by approximately 3%. Using the trained policy in an unknown scenario, the RL heuristic is still able to change the sequencing rule according to the system status, thereby providing robust performance.
KW - Engineering
UR - http://www.scopus.com/inward/record.url?scp=85103874223&partnerID=8YFLogxK
U2 - 10.1109/WSC48552.2020.9383903
DO - 10.1109/WSC48552.2020.9383903
M3 - Article in conference proceedings
AN - SCOPUS:85103874223
T3 - Proceedings - Winter Simulation Conference
SP - 1608
EP - 1618
BT - Proceedings of the 2020 Winter Simulation Conference, WSC 2020
A2 - Bae, K.-H.
A2 - Feng, B.
A2 - Kim, S.
A2 - Lazarova-Molnar, S.
A2 - Zheng, Z.
A2 - Roeder, T.
A2 - Thiesing, R.
PB - IEEE - Institute of Electrical and Electronics Engineers Inc.
T2 - Winter Simulation Conference - WSC 2020
Y2 - 14 December 2020 through 18 December 2020
ER -
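
Illustrative sketch (not the authors' implementation): the abstract describes RL used as a hyper-heuristic that selects a sequencing rule from the observed system status. The following minimal Python sketch assumes a tabular Q-learning agent, a single machine with stochastic job arrivals, a coarse queue-length state, three candidate rules (FIFO, SPT, EDD), and negative tardiness as reward. All names, parameters, and the state/reward design are illustrative assumptions; the paper itself uses a job-shop simulation with a richer setup.

# Minimal sketch: tabular Q-learning hyper-heuristic that picks a dispatching
# rule per decision point, based on a coarse queue-state observation.
# Everything here (rules, state buckets, reward, hyperparameters) is assumed
# for illustration and is not taken from the paper.
import random

# Candidate sequencing rules: each maps the current queue to the job chosen next.
RULES = {
    "FIFO": lambda q: min(q, key=lambda j: j["arrival"]),
    "SPT":  lambda q: min(q, key=lambda j: j["proc"]),
    "EDD":  lambda q: min(q, key=lambda j: j["due"]),
}
ACTIONS = list(RULES)

def encode_state(queue):
    # Coarse state: bucketed queue length (illustrative choice).
    n = len(queue)
    return 0 if n <= 2 else 1 if n <= 5 else 2

def simulate_episode(policy, n_jobs=200, seed=0):
    # Single machine with stochastic inter-arrival times; returns
    # (state, action, reward, next_state) transitions for learning.
    rng = random.Random(seed)
    jobs, arrival = [], 0.0
    for _ in range(n_jobs):
        arrival += rng.expovariate(1.0)                 # stochastic inter-arrival time
        proc = rng.uniform(0.5, 1.5)
        jobs.append({"arrival": arrival, "proc": proc, "due": arrival + 3 * proc})
    now, queue, i, transitions = 0.0, [], 0, []
    while i < len(jobs) or queue:
        while i < len(jobs) and jobs[i]["arrival"] <= now:
            queue.append(jobs[i]); i += 1
        if not queue:                                    # idle: jump to next arrival
            now = jobs[i]["arrival"]
            continue
        s = encode_state(queue)
        a = policy(s)                                    # agent picks a rule, not a job
        job = RULES[a](queue)                            # the rule picks the job
        queue.remove(job)
        now += job["proc"]
        reward = -max(0.0, now - job["due"])             # negative tardiness (assumed)
        transitions.append((s, a, reward, encode_state(queue)))
    return transitions

def train(episodes=200, alpha=0.1, gamma=0.9, eps=0.2):
    # Standard tabular Q-learning with epsilon-greedy exploration.
    Q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}
    for ep in range(episodes):
        def policy(s):
            if random.random() < eps:
                return random.choice(ACTIONS)
            return max(ACTIONS, key=lambda a: Q[(s, a)])
        for s, a, r, s2 in simulate_episode(policy, seed=ep):
            best_next = max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q

if __name__ == "__main__":
    Q = train()
    for s in range(3):
        best = max(ACTIONS, key=lambda a: Q[(s, a)])
        print(f"queue bucket {s}: prefer {best}")

The rule set, state encoding, and epsilon-greedy exploration above are placeholders; the point of the sketch is only that the agent learns a mapping from system status to a preferred sequencing rule, rather than learning individual sequencing decisions directly.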