in Linux

Linux is for Bitches

Pardon the slight profanity; I don’t generally like to swear when I’m trying to make a point, but I didn’t invent the name of this site.

The views espoused by the author are obviously not much different from those in this excellent article in USENIX’s own journal, ;login:. (You’ll need to be a member to access that link, by the way) I’ve complained before about the proliferation of poorly-configured, poorly-managed Linux boxes taking over from the Windows boxes. It’s obviously still continuing to happen. Of course, the vendors are partly to blame, too. When the author of linuxforbitches.org writes about /var being an inappropriate place for web content (I wholeheartedly agree) you have many vendors to thank for that.

I lay the blame for the kernelized web-server, though, at the foot of Linus himself. Given that Linus is so militant about accepting patches, idiotic or not, I’m surprised — no, shocked — that he accepted this one. Considering that many kernel hackers are the same folks who probably bitched and whined about insecurity and instability when Windows NT 4.0 moved the drivers from user mode to supervisor mode (or Ring 1 to Ring 0, I don’t remember the exact terminology), the kernelized web server is a completely brain-damaged idea. It should be removed from the kernel at once, if it hasn’t already been so excised.

You know, despite all the claims about Linux’s stability, it still has a long way to go before it achieves the stability level of the BSDs. Under heavy workload, Linux still doesn’t cut mustard. Andrew Hume from AT&T Research presented a paper at HotOS-iX entitled Operating Systems: Shouldn’t They Be Better? True, he takes Solaris 2.6 to task in this paper as well, but the Linux flaws he describes are pretty shocking (these are from David Oppenheimer’s summary notes in August’s ;login::

Hume described eight problems the Gecko [his billing system] implementers experienced with Linux (versions 4.18 through 4.20), including Linux’s forcing all I/O through a file-system buffer cache with highly unpredictable performance scaling (30MB/sec. to write to one file system at a time, 2MB/sec. to write to two at a time), general I/O flakiness (1-5% of the time corrupting data read into gzip), TCP/IP networking that was slow and that behaved poorly under overload, lack of a good file system, nodes that didn’t survive two reboots, and slow operation of some I/O utilities such as df. In general, Hume said that he has concluded that "Linux is good if you want to run Apache or compile the kernel. Every other application is suspect."

The problem with many people measuring "stability" of Linux is that they think it’s a relative measurement: as long as it’s more stable than Windows, then it’s good. This is obviously a stupid way to look at it. Just because my Kia[1] doesn’t have exploding tires, doesn’t mean that it’s a particularly safe car.

People working on performance and stability in the Linux kernel are far outnumbered by the people trying to get their little pet project into the tree — vis à vis the kernelized webserver. Admittedly, performance and stability aren’t the most exciting research areas, but making Linux as stable as the BSDs is critical to its long term success. I mean, who cares if Linux can run on a zSeries or S/390 if the thing goes down like a ton of bricks when you throw a heavy workload at it?

Ultimately as a system administrator, I care much more about stability, and failing that, predictable, recoverable failure, rather than "feature-niftiness". When you have 1000 user accounts to manage and you get DDoSed, I want an OS that is feature-conservative but rock solid.

And that, in a convoluted way of my saying so, is why I don’t run Linux on my servers.

[1] I don’t, for the record, own a Kia. 🙂

Write a Comment

Comment