The robustness of Linux is widely acknowledged, but it can’t quite match the microsecond-level timing control of a real-time operating system (RTOS) in time-critical situations such as CNC machine instructions, vehicular control, or health sensor data collection. If your software must record, manage, or control events within a narrow and precise time window and you’re invested in Linux for core development, consider some of these strategies for handling time-critical tasks without abandoning your familiar environment.
Back in the early aughts, QNX made its flagship QNX 6 RTOS available for free to x86 developers. The company was already quite successful with QNX 4 in medical and other mission-critical equipment, and QNX 6 came with some very interesting innovations.
It ran a microkernel that did almost nothing but scheduling and message passing. Drivers were all in user space. Their Photon GUI had a microkernel architecture as well, so you could shut down parts of it (or device drivers, for that matter) and bring them back up from the command line. You could of course set up watchdogs and automate restarts of anything but the microkernel itself.
I had some fun with it on an old and failing 486.
I’ve been working on that exact problem for the last couple of weeks. My solution for now is the RT patch and a dedicated CPU core for RT tasks. This already works pretty reliably, but I notice small delays from time to time. I gather from the article that my problem might be page swapping. I don’t know how to improve that yet.
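One common way to keep paging out of the time-critical path is to lock the process's memory and touch the stack before the real-time loop starts. The following is a minimal sketch under assumptions, not the commenter's setup: `rt_prepare`, `prefault_stack`, and the 64 KiB `PREFAULT_BYTES` figure are hypothetical names and values chosen for illustration. Both `mlockall` and `sched_setscheduler` normally need elevated privileges (or suitable resource limits), so expect `EPERM` or `ENOMEM` when running as an ordinary user.

```c
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed worst-case stack use of the RT task; tune for the real workload. */
#define PREFAULT_BYTES (64 * 1024)

/* Touch the buffer at sub-page strides so every stack page is mapped now,
 * rather than on first use inside the time-critical loop. */
static void prefault_stack(void)
{
    volatile unsigned char scratch[PREFAULT_BYTES];
    for (size_t i = 0; i < sizeof scratch; i += 1024)
        scratch[i] = 0;
}

/* Returns 0 on success, -1 with errno set on failure.
 * Note: if the scheduler switch fails, memory stays locked. */
int rt_prepare(int priority)
{
    /* Reject priorities outside the SCHED_FIFO range before changing anything. */
    if (priority < sched_get_priority_min(SCHED_FIFO) ||
        priority > sched_get_priority_max(SCHED_FIFO)) {
        errno = EINVAL;
        return -1;
    }

    /* Pin all current and future pages in RAM so they cannot be paged out. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;

    /* Switch the caller to fixed-priority FIFO scheduling. */
    struct sched_param sp = { .sched_priority = priority };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -1;

    prefault_stack();
    return 0;
}
```

A task would call `rt_prepare(50)` once at startup, before entering its loop, and avoid allocating memory afterwards. Pinning the task to the dedicated core is left to whatever mechanism is already in place.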
Also, for anybody working on RT problems: I highly recommend the stress-ng tool for stress testing and finding the bottlenecks of your system.
If you’re working on something that truly needs faster response times at the kernel level, you might be better off looking at Zephyr or FreeRTOS for more consistency. “Real-time” mode in the plain Linux sense is just a series of patches that work towards one goal (skipping schedulers and such) but do not all work together coherently. Other RTOSes out there are designed from the outset to streamline such things.
That’s why many modern SoCs have a smaller core for real-time work in addition to the larger application processors. TI Sitara (BeagleBone) has two fast custom-architecture coprocessors for I/O, with access to most pins and the ability to DMA into the AP’s address space. All Raspberry Pis up through the Pi 4 run a proprietary ThreadX runtime on a graphics processor (VPU) to handle bootstrapping the ARM APs, housekeeping, and a large part of the I/O.
It’s been ages since I had to recompile a Linux kernel to deal with hard real time (via RTAI), but I recall EMC2 being a great alternative to all the fussing around with recompiling, as someone had already done all the work for that.
I also recall using this resource . Eventually I just made a class for the threads I was using, to wrap the POSIX and RTAI calls for periodic tasks, and selected the underlying method with a compiler flag. On my desktop I could prototype most things with POSIX and then test on the RTAI machine. If I need to revisit this, I may dust off my old class and add FreeRTOS support to it so I can prototype on Linux and then try to squeeze it onto an ESP32.
I haven’t recompiled a kernel since 2002. LoL!
I remember hearing about this in the context of space missions: Linux just isn’t a good fit for critical systems.
Yeah, space software is crazy restrictive. I read they don’t allow you to allocate ANYTHING on the heap. It’s all static or stack-based all the way.
A 10-year-long memory leak on the way to Pluto isn’t a good thing, I guess…
The new Mars helicopter, Ingenuity, runs Linux.
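To make the "no heap" style concrete, here is a minimal sketch of a fixed-size pool whose storage is reserved at compile time. It is an illustration only, not taken from any flight software: the `sample` type, the `POOL_SLOTS` bound, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 8    /* assumed upper bound, fixed at compile time */

/* Hypothetical payload type. */
typedef struct {
    uint32_t id;
    float value;
} sample;

/* Storage is static, so its size is known at link time; malloc is never called. */
static sample pool[POOL_SLOTS];
static unsigned char in_use[POOL_SLOTS];

/* Hand out a free slot, or NULL if the pool is exhausted. */
sample *sample_acquire(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return &pool[i];
        }
    }
    return NULL;    /* the pool cannot grow; the caller must handle this */
}

/* Return a slot to the pool.
 * Precondition: s is NULL or was returned by sample_acquire(). */
void sample_release(sample *s)
{
    if (s != NULL)
        in_use[s - pool] = 0;
}
```

Because the pool cannot grow, running out of slots shows up immediately as a `NULL` return rather than as memory use that creeps upward over time.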
https://www.theverge.com/2021/2/19/22291324/linux-perseverance-mars-curiosity-ingenuity
Their solution is to hold two copies of memory and double check operations as much as possible, and if any difference is detected they simply reboot. Ingenuity will start to fall out of the sky, but it can go through a full reboot and come back online in a few hundred milliseconds to continue flying.
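The duplicate-and-compare idea can be illustrated with a small sketch. This is not Ingenuity's code, and the names `guarded_u32`, `guarded_write`, and `guarded_read` are hypothetical: a value is stored twice, every read compares the copies, and a mismatch is reported so the caller can trigger whatever recovery (such as a reboot) the system uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* A value kept in two places; volatile forces both copies to really be
 * written and re-read instead of being cached in a register. */
typedef struct {
    volatile uint32_t primary;
    volatile uint32_t shadow;
} guarded_u32;

/* Store the value in both copies. */
void guarded_write(guarded_u32 *g, uint32_t v)
{
    g->primary = v;
    g->shadow = v;
}

/* Returns true and stores the value in *out if the copies agree.
 * Returns false, leaving *out untouched, if they differ (corruption). */
bool guarded_read(const guarded_u32 *g, uint32_t *out)
{
    uint32_t a = g->primary;
    uint32_t b = g->shadow;
    if (a != b)
        return false;
    *out = a;
    return true;
}
```

In a real system the failure branch would hand control to the recovery path rather than simply returning `false`.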
https://news.ycombinator.com/item?id=26181763
Dunno if future ones will run Linux though, since this is just an experiment.