MikroTik Solutions: Musings on Docker

Motivation

MikroTik added container support to RouterOS in 7.4beta4, which allows you to run third-party background services directly on the router. While there is likely to be a small explosion of services that work quite well in this environment, there is a large number that will not. I thought it was interesting to think through one such problem, that of a fail2ban configuration for monitoring SSH login failures. In principle, that could indeed run on some RouterOS devices, but as it turns out, there are a number of reasons it’s impractical, which I find instructive.

For some MikroTik routers, there’s a single fatal limitation that takes them out of the running, and for others, an unhappy concatenation of weak workarounds yields the same end result.

The Many, Many Problems with Containers on RouterOS

Before we get into the case study proper, I wish to direct you to my later and more broadly-based article here, “Container Limitations.” I’ve tightened this one’s focus on a single example case while the other goes into detail on each limitation relative to the big-boy container runtimes.¹

1. CPU Compatibility

This is best left to the CPU Limitations section of the companion article. For the purposes of this case study, it suffices to offer my belief that MikroTik’s incentive to support anything other than Intel and ARM CPUs will remain near-zero for the foreseeable future.

2. Third-Party Requirements

If you’re able to get past problem #1, you then have to build a container, or find one pre-built, that meets all the run-time requirements of your background service. The ideal case is a single statically-linked executable, but the vast majority of useful software has substantial requirements that one typically doesn’t even think about when running the software on laptops, desktops, and big server computers. It’s either all just there, ready to be shared, or you’ve got so much storage that it doesn’t matter if you cart some third-party dependencies around.

To build a container, you have to provide all of that inside the container, somehow. If it duplicates resources provided by the host, too bad: containers purposefully isolate the internal processes from the host, so there is no more sharing of programs, libraries, and so forth than you get between any two standalone computers.

To take our example of fail2ban, it’s historically a fairly portable tool, but any given version runs on only a subset of Python versions; roughly, those contemporaneous with the release. The practical path out of this trap is to start with an existing Linux distribution that has a version of fail2ban ported to it, which then brings us to the next problem.

3. Resource Usage

Python is not terribly resource efficient. A Linux distro and all of the dependencies needed to run fail2ban may exceed the persistent storage space and/or free RAM available on your router.

For instance, this fail2ban image hosted on DockerHub will take about half the space on the broad class of MikroTik routers with 128 MiB of storage space. This includes all current members of the CCR2004 line and the RB3011. It even includes otherwise high-end products like the CCR2116 and CCR2216. If we step across into MikroTik’s SOHO WiFi router range, we find that most of them don’t have even 128 MiB of storage space. The only two that did at the time MikroTik added container support to their OS were the hAP ac³ and the Audience. More products have since been released with more RAM and storage space, but it remains a small subset of current products; I expect it to take years before you have many choices of hardware capable of running flabby containers.
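The arithmetic behind that squeeze is simple enough to sketch. In the snippet below, the 128 MiB total and the "about half" image size come from the text above; the figure for what RouterOS itself occupies is a made-up placeholder, so substitute what your own router reports.

```python
def free_after_install(flash_mib: float, routeros_mib: float, image_mib: float) -> float:
    """Return the MiB left on flash once RouterOS and the unpacked image are stored."""
    return flash_mib - routeros_mib - image_mib


# Illustrative figures only.
FLASH_MIB = 128     # storage on the router class discussed above
IMAGE_MIB = 64      # "about half the space", per the text
ROUTEROS_MIB = 40   # hypothetical: space taken by RouterOS and its configuration

left = free_after_install(FLASH_MIB, ROUTEROS_MIB, IMAGE_MIB)
print(f"{left} MiB left for everything else")
```

Whatever the real numbers are on your device, the remainder is all you have for configuration, logs, and future RouterOS upgrades.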

Consider the RB4011 and the RB5009, both of which are big enough to comfortably host such a container on the SoC’s built-in flash. These are fairly expensive routers; you pay a steep premium to run a service like fail2ban on your router.

Worse, as we will see below, this problem has a way of expanding by surprise: for the fail2ban case, we also need a copy of rsyslog and an SSH client inside the container. This is true even though the host (RouterOS) has a logging subsystem and an SSH client built-in. We can’t share them with the processes running inside the container; we must duplicate them, at a storage, RAM, and CPU cost.

4. Run-Time Storage Requirements

The ideal use case for containers on RouterOS is when you have a background service that either requires no additional storage space beyond that covered in the prior section, or at least has a predictable and rarely-increasing requirement, such as stable configuration data. A good example is a DNS server where the local zone files rarely change, and data pulled from remote servers can safely be cached in RAM, rather than persisted to flash storage.

In all other cases, you must account for the growing storage space required by your container.
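One back-of-the-envelope way to do that accounting is to estimate how long the free space lasts at a given write rate. Every number in this sketch is a hypothetical placeholder, not a measurement from any router.

```python
def days_until_full(free_mib: float, lines_per_day: int, avg_line_bytes: int) -> float:
    """Estimate how many days of log growth fit in the given free space."""
    bytes_per_day = lines_per_day * avg_line_bytes
    if bytes_per_day <= 0:
        raise ValueError("the daily write volume must be positive")
    return free_mib * 1024 * 1024 / bytes_per_day


# Hypothetical inputs: 24 MiB free, 50,000 log lines a day, 120 bytes per line.
print(f"{days_until_full(24, 50_000, 120):.1f} days until the flash is full")
```

Log rotation caps the space consumed, but it does nothing for flash wear: every rotated-out byte was still written once.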

To return to the fail2ban example, you must account for the space the logs require, since you still have to redirect log data into the container, else fail2ban will have no input to crawl through. That requirement is likely to exclude every 128 MiB router all by itself. For the routers left in the running after that bottle-necking requirement, you’re then back to problem #2 at the top of the rsyslog article: storing logs on flash storage is likely to materially shorten the service life of the device. Only the swappable M.2 SSD in the RB1100 is immune to this.
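To make "crawling through" the logs concrete, here is a minimal sketch of the kind of scan involved. It is not fail2ban's actual implementation: the log message shape in the regular expression is an assumption to check against your own router's output, the `offenders` function and `max_retry` parameter are illustrative names, and the sketch ignores the time window a real filter would apply.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed message shape; verify it against what your router actually logs.
FAILURE = re.compile(r"login failure for user (?P<user>\S+) from (?P<ip>\S+) via ssh")


def offenders(lines: Iterable[str], max_retry: int = 3) -> list[str]:
    """Return the source addresses with at least max_retry SSH login failures."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILURE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return sorted(ip for ip, n in counts.items() if n >= max_retry)


# Sample input using documentation-only addresses.
sample = [
    "login failure for user admin from 192.0.2.10 via ssh",
    "login failure for user root from 192.0.2.10 via ssh",
    "login failure for user admin from 198.51.100.7 via ssh",
    "login failure for user test from 192.0.2.10 via ssh",
    "user admin logged in from 198.51.100.7 via ssh",
]
print(offenders(sample))
```

Note that the scan needs the log lines to exist somewhere it can read them, which is exactly the storage cost this section is about.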

If MikroTik ever releases my dream device — an ARM-based multi-core hEX S+, including a microSD slot and an SFP+ port — it may be a suitable container host, but only if I’m willing to replace the SD card from time to time, as each one wears out.

5. Host Isolation

If you can fix all of the above, you’re left with the fact that container engines purposefully isolate the internal processes from the host-side facilities.

Take the fail2ban example one last time: in order for it to reach out and issue “/ip/firewall/filter” commands on the RouterOS host, you’ll have to burn even more space in the container by installing an SSH client and configuring it to connect out to the host.
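As a rough illustration of that outbound hop, the sketch below builds the `ssh` invocation a ban action might run. The router address, the `f2b` account, the `comment` text, and the choice to append a plain drop rule to the `input` chain are all assumptions for the example; a real deployment would also need key-based login set up for that account and some thought about where the rule lands relative to existing ones.

```python
import ipaddress
import subprocess


def ban_argv(router: str, user: str, offender: str) -> list[str]:
    """Build the ssh command line that adds a drop rule for one IPv4 address."""
    # Validate first so text scraped from a log can never be injected into the command.
    addr = ipaddress.IPv4Address(offender)
    rule = (
        "/ip/firewall/filter/add chain=input "
        f"src-address={addr} action=drop comment=fail2ban"
    )
    return ["ssh", f"{user}@{router}", rule]


def ban(router: str, user: str, offender: str) -> None:
    """Run the command; requires an ssh binary inside the container."""
    subprocess.run(ban_argv(router, user, offender), check=True, timeout=10)


# Hypothetical router address and account name; this only prints the command.
print(ban_argv("192.168.88.1", "f2b", "192.0.2.10"))
```

The `ban` function is the part that costs you: it only works if the container image carries an SSH client, keys, and a known-hosts entry for the router.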

This will work, but it calls into question the logic behind the initial wish to run fail2ban on the router itself: surely you were hoping to avoid all this rsyslog and SSH stuff? Sorry; can’t be done.

Conclusion

It is my opinion that all of the above forces you to the bare-metal x86 version of RouterOS or to CHR for any service requiring substantial amounts of storage. (That, or possibly the RB1100 Dude edition.)

Sites that can justify the hardware expense for that likely have an underutilized Linux box sitting around, or at least a VM hypervisor or container host that can run a small fail2ban instance much more easily than the router can.

Services like fail2ban were written with the assumption of a big server-class CPU, plenty of RAM, and ludicrous amounts of storage to host logs with. Don’t fight the design.

One last thing: I do not point all of this out to say that RouterOS’s support of containers is terrible. Indeed, I think it’s wonderful. I’m simply pointing out that it has serious limitations that are worth taking into account when planning what services run, where. For some, containers on RouterOS will be just the thing. For others, the service is likely better run elsewhere.

License

¹ One such is Docker Engine, but beware that this article doesn’t focus on “Docker” proper at all despite the title. Early MikroTik documentation misused “docker” in places where it should have used a more generic term like “OCI containers,” and while I fixed all of that in later editions of this article, I can’t go back and change the title without breaking the URL to the article itself.