FlexSim Knowledge Base
Announcements, articles, and guides to help you take your simulations to the next level.
This article describes an example of Reinforcement Learning being used to solve scheduling problems. See the model and python files in the attached zip file: SchedulingRL.zip

Problem Description

This model represents a generic sheet metal processing plant. There are four machines in series, and each job requires time on all four machines. Jobs arrive in batches of 10. A poor sequence of jobs causes blocking between items, lowering throughput. If the time between batches is long, such as a shift or a day, you could use the optimizer to determine the best sequence. If the time between batches is short, however, using the optimizer may not be feasible: for real sequencing problems, finding a good sequence can take anywhere from 5 minutes to an hour, or even longer, which makes it impractical for high-velocity situations.

The attached model requests a decision every time the first machine in the series is available. The only action is an index for the Nth available job, so the decision can be interpreted as "which job should I do next?"

Solution

The general solution is to use reinforcement learning. However, this problem required customized python scripts:

- The model uses custom parameters for observations. This allows arbitrary values for observations.
- The model uses a custom observation space. The observations include a table of the required times at each station for the remaining jobs, as well as an array of the in-progress jobs and their predicted remaining times. By using a Dict space, the python scripts can combine all the observations into a single space.
- The model uses an Action Mask. An Action Mask is a binary array with one value per possible action, which tells the RL algorithm about invalid options.
- The python scripts require the sb3-contrib package. Use pip install sb3-contrib to install it.

A minimal sketch of how these pieces fit together is shown after the Results section below.

Results

After training for 500k time-steps, the agent learns to choose jobs moderately well. If you run the inference script, you can use the experimenter to compare a random policy to a trained agent.
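The sketch below shows how a Dict observation space and an action mask can be wired up with sb3-contrib's MaskablePPO. It is not the attached scripts: the SchedulingEnv class, space shapes, reward, and training length are illustrative assumptions, and the real scripts exchange observations and actions with the FlexSim model rather than sampling them locally.

```python
# Illustrative sketch only: a Dict observation space plus an action mask,
# trained with MaskablePPO from sb3-contrib. Shapes and values are assumptions.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from sb3_contrib import MaskablePPO

class SchedulingEnv(gym.Env):
    """Hypothetical environment: pick which of the remaining jobs to run next."""
    MAX_JOBS = 10      # jobs per batch (from the article)
    NUM_MACHINES = 4   # machines in series (from the article)

    def __init__(self):
        super().__init__()
        # A Dict space combines the table of remaining job times and the
        # in-progress remaining times into a single observation.
        self.observation_space = spaces.Dict({
            "job_times": spaces.Box(low=0.0, high=np.inf,
                                    shape=(self.MAX_JOBS, self.NUM_MACHINES),
                                    dtype=np.float32),
            "in_progress": spaces.Box(low=0.0, high=np.inf,
                                      shape=(self.NUM_MACHINES,),
                                      dtype=np.float32),
        })
        # The action is the index of the job to start next.
        self.action_space = spaces.Discrete(self.MAX_JOBS)
        self._remaining = self.MAX_JOBS

    def action_masks(self):
        # Binary array with one entry per action; False marks indices that no
        # longer correspond to an available job. MaskablePPO looks for this method.
        return np.arange(self.MAX_JOBS) < self._remaining

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self._remaining = self.MAX_JOBS
        return self.observation_space.sample(), {}

    def step(self, action):
        # In the real setup, the observation and reward come from FlexSim;
        # here they are random placeholders so the sketch runs on its own.
        self._remaining -= 1
        obs = self.observation_space.sample()
        reward = 0.0
        terminated = self._remaining == 0
        return obs, reward, terminated, False, {}

# MultiInputPolicy handles Dict observations; the article trains for 500k steps.
model = MaskablePPO("MultiInputPolicy", SchedulingEnv(), verbose=1)
model.learn(total_timesteps=10_000)
```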
FlexSim 2022 introduced a Reinforcement Learning tool that enables you to configure your model to be used as an environment for reinforcement learning algorithms. That tool makes connecting to FlexSim from a reinforcement learning algorithm easier, but it is not strictly necessary for this type of connectivity. The same socket communication protocols that the tool uses are available generally in FlexScript.

Attached (ChangeoverTimesRL_V22.0.fsm) is the FlexSim 2022 model that you build as part of the Using Reinforcement Learning documentation, which walks you through building and preparing a FlexSim model for reinforcement learning, training an agent within that model environment, evaluating the performance of the trained reinforcement learning model, and using that trained model in a real production environment.

Also attached (ChangeoverTimesRL_V6.0.fsm) is a model built with FlexSim 6.0.2 from 2012 that does the exact same thing, but with custom FlexScript user commands instead of the Reinforcement Learning tool. You can use this model with the example python scripts and FlexSim 6.0.2 in the same way that you can use the other model with those scripts in FlexSim 2022.

I'm providing this FlexSim 6 model as an example that demonstrates how you can communicate between FlexSim and other programs. The Reinforcement Learning tool certainly makes this type of communication easier and simpler, with a nice UI for specifying RL-specific parameters, but the fundamental principles of how this works have been available in FlexSim for many years using FlexScript. Hopefully this example can help teach and inspire those who wish to control or communicate with FlexSim from external sources for purposes other than just reinforcement learning. FlexSim is flexible, and the possibilities are endless.
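To make the socket idea concrete, here is a minimal Python sketch of the kind of exchange described above: an external program hosts a TCP server, a FlexSim model connects to it, and the two trade text messages (for example, observations one way and an action index back). The port number and message format are illustrative assumptions, not the actual protocol used by the attached models or by the Reinforcement Learning tool.

```python
# Illustrative sketch: a plain TCP server that an external controller could run
# to exchange text messages with a FlexSim model. Port and message contents are
# assumptions; adapt them to whatever the model's FlexScript socket code expects.
import socket

HOST, PORT = "127.0.0.1", 5005  # assumed address; match the model's configuration

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind((HOST, PORT))
    server.listen(1)
    print(f"Waiting for FlexSim to connect on {HOST}:{PORT} ...")
    conn, addr = server.accept()
    with conn:
        print(f"Connected by {addr}")
        while True:
            data = conn.recv(4096)           # e.g. an observation sent by the model
            if not data:
                break                        # the model closed the connection
            observation = data.decode("utf-8")
            print("Received:", observation)
            reply = "1"                      # e.g. an action index sent back
            conn.sendall(reply.encode("utf-8"))
```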