The Randall Museum in San Francisco hosts a large HO-scale model train layout. Created by the Golden Gate Model Railroad Club starting in 1961, the layout was donated to the Museum in 2015. Since then I have started automating the trains running on the layout. I am also the de facto layout maintainer. This blog describes various updates on the Randall project. I maintain a separate blog for all my electronics not directly related to Randall.
2018-03-28 - Error Conditions
Category: Randall
Now that we have some basic automation running and, most importantly, some feedback on how it is running, it’s time to account for error conditions. The post from 2018-02-12 did mention this earlier, but it was more academic, without any actual basis. Now I do have a basis:
- On the Branchline, a repeated issue is the RDC failing to stop and reverse on the reverse block.
- Now that the RDC has moved to the mainline, I experienced the same issue at least once. However, that case is not clear-cut, since at the time I was also trying to update/restart RTAC remotely. It then happened a second time with the RDC.
- At least once the same issue happened with the Amtrak train.
This picture shows, ironically, exactly what I want to avoid, and yet it happened:
In train parlance, this is called a runaway.
There is an obvious yet naive way to avoid this specific kind of runaway: turn blocks off at the boundaries of the automation. If a train overshoots its stopping or reversal point, it will run into the dead block and naturally stop. This “solution” has its share of issues though:
- The mainline passenger train is a push-pull configuration. Entering a dead block means the lead engine will stop being powered while the rear engine continues pushing. That is an ideal condition to make the whole train derail by pushing on the coaches in the middle. As a side effect, if the rear engine does not also reach the dead block, its wheels will keep spinning while the train is stationary, thus creating wear on wheels and track.
- The automation is not detecting or fixing the condition. This relies on a human report that “something went wrong” and implies human intervention to fix the situation.
So turning blocks off is a possible failsafe, yet one to be used sparingly, with the understanding that it’s not ideal.
One goal is to keep the automation simple. Too many fancy rules make it harder for a human observer to understand and/or predict what is going on. We can’t expect to have an exhaustive set of error conditions to detect, as by definition the unpredictable cannot be predicted, and these are not life-threatening scenarios. Worst case, the museum automation is broken for a few days or a week.
So what kind of error can we reasonably expect, and which ones can we detect, and can we fix them?
- Train fails to stop at the expected block.
  - This can be seen with the Rapido RDC failing to stop at the reverse block and overshooting that block.
  - The error condition is detecting the train in the next block. My original plan was to do that for the station where trains stop by wiring a block detector on an extra block. Turnouts also need to be aligned at the same time (to avoid shorting against a turnout).
  - The fix would be for the automation to e-stop that train and reverse it to its expected location.
- Train fails to reach an expected next block or stays on the current block for too long.
  - This could be due to the train derailing in between or a sensor malfunction.
  - The error condition is a timer on how long the train should be in the current block and how long before it reaches the next block. The timing should be generous enough to account for fluctuations.
  - There is no likely fix in that case except to issue an e-stop to the train and wait for manual intervention.
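To make these two checks concrete, here is a minimal sketch in plain Python. It is not the actual automation code and does not use any JMRI API. The names `BlockWatch`, `Action`, and the block names used below are made up for illustration, and the set of occupied blocks is assumed to come from whatever block detection is available.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "nothing to do"
    ESTOP_AND_REVERSE = "e-stop, then reverse back to the expected block"
    ESTOP_AND_WAIT = "e-stop, then wait for manual intervention"


@dataclass
class BlockWatch:
    """Hypothetical watcher for one train expected to stop in one block."""
    stop_block: str      # block where the train is expected to stop
    next_block: str      # detected block just past the stopping point
    max_seconds: float   # generous time allowance to reach stop_block

    def check(self, occupied: set, elapsed_seconds: float) -> Action:
        # Error 1: the train is seen past its stopping point (overshoot).
        if self.next_block in occupied:
            return Action.ESTOP_AND_REVERSE
        # Error 2: the train has not shown up in time
        # (possible derailment or sensor malfunction).
        if self.stop_block not in occupied and elapsed_seconds > self.max_seconds:
            return Action.ESTOP_AND_WAIT
        return Action.NONE
```

For example, with `BlockWatch("reverse", "beyond", 30.0)`, calling `check({"beyond"}, 12.0)` reports the overshoot case, while `check(set(), 45.0)` reports the timeout case. The 30-second allowance is an arbitrary placeholder. In practice it would need to be generous enough to absorb normal speed fluctuations.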
The RDC overshooting the reverse block could also be addressed using a timer. Trying to detect the “next” block may be counter-productive. For example on the branchline I can wire the next block, but I’m unlikely to detect the whole branchline, so eventually there’s always going to be a “next block” that has no detection.
On the station tracks, trying to detect the next block might also be counter-productive. Each station track consists of one turnout, a short block (about 1 foot), the main station block, another short block, and another turnout. “Ideally” I just need to detect each short block, but these blocks might actually be a tad too short. The Amtrak train makes it relatively easy since it’s composed of two engines in push-pull with three lighted cars in the middle, so the whole train can be detected. However, a short engine like the RDC can easily run over that block before we get a chance to detect it, as the JMRI / NCE-AIU sampling time is quite slow.
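A quick back-of-the-envelope check illustrates the concern. The sketch below compares how long some part of a train sits over a short block with the interval between two sensor polls. All numbers are placeholders I picked for illustration (train lengths, speed, and polling interval have not been measured), and the function names are hypothetical.

```python
def dwell_seconds(block_in: float, train_in: float, speed_in_per_s: float) -> float:
    """Time during which some detectable part of the train is over the block.

    The front enters at t=0 and the rear leaves once the train has moved
    by the block length plus its own detectable length.
    """
    return (block_in + train_in) / speed_in_per_s


def always_sampled(dwell_s: float, poll_interval_s: float) -> bool:
    """True if at least one poll is guaranteed to fall within the dwell time."""
    return dwell_s >= poll_interval_s


# Assumed values, for illustration only.
BLOCK_IN = 12.0        # short block, about 1 foot
SPEED_IN_PER_S = 6.0   # assumed train speed
POLL_S = 5.0           # assumed interval between two sensor polls

rdc = dwell_seconds(BLOCK_IN, 12.0, SPEED_IN_PER_S)      # single short engine
amtrak = dwell_seconds(BLOCK_IN, 60.0, SPEED_IN_PER_S)   # whole lighted train

print("RDC detected for sure:", always_sampled(rdc, POLL_S))
print("Amtrak detected for sure:", always_sampled(amtrak, POLL_S))
```

With these made-up values the short engine can slip between two polls while the longer train cannot. The real answer depends on the actual polling rate and train speeds, which would have to be measured on the layout.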
Generally speaking, we can model the train route as a succession of blocks, including gaps. Turnouts should be preset so that the train does not foul or run against a turnout if it overshoots its stopping block. Then, given a model of the route and the known travel direction of the train, we can detect that the train has been on the end-of-route block and is not on that block anymore. In that case, issuing an e-stop followed by a reversal should “logically” bring the train back into the expected end-of-route block. This should be combined with a timer, e.g. if the train does not get back to the expected block within some time, issue an e-stop, stop the automation, and ask for manual intervention.
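The recovery logic described above can be sketched as a small state machine. This is only an illustration in plain Python under a few assumptions: the monitor is active only while the train is heading toward the end of its route, commands are recorded in a list instead of being sent to a throttle, and all names (`EndOfRouteMonitor`, `State`, the command strings) are hypothetical.

```python
from enum import Enum


class State(Enum):
    RUNNING = "heading toward the end-of-route block"
    ARRIVED = "on the end-of-route block"
    RECOVERING = "overshot, reversing back"
    RECOVERED = "back on the end-of-route block"
    FAILED = "automation stopped, manual intervention needed"


class EndOfRouteMonitor:
    def __init__(self, end_block: str, recovery_timeout_s: float):
        self.end_block = end_block
        self.recovery_timeout_s = recovery_timeout_s
        self.state = State.RUNNING
        self.recovery_started = 0.0
        self.commands = []  # what the automation would send to the train

    def update(self, occupied: set, now: float) -> State:
        on_end = self.end_block in occupied
        if self.state is State.RUNNING and on_end:
            self.state = State.ARRIVED
        elif self.state is State.ARRIVED and not on_end:
            # The train was on the end block and is not anymore: overshoot.
            self.commands += ["estop", "reverse"]
            self.recovery_started = now
            self.state = State.RECOVERING
        elif self.state is State.RECOVERING:
            if on_end:
                self.commands.append("stop")
                self.state = State.RECOVERED
            elif now - self.recovery_started > self.recovery_timeout_s:
                # The reversal did not bring the train back in time.
                self.commands += ["estop", "stop_automation"]
                self.state = State.FAILED
        return self.state
```

The caller would invoke `update()` each time the block sensors are polled, passing the set of occupied blocks and the current time in seconds. Once the train is expected to leave the end block in the normal course of its run, the monitor should be discarded or reset, otherwise a regular departure would look like an overshoot.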