BSDCan 2005 Remarks

I’ve just returned from the 2nd annual BSDCan conference in Ottawa, Ontario, organized by the very capable Dan Langille. In addition to being a super nice guy, Dan is also the founder of the FreeBSD Diary, FreshPorts, and FreshSource, and is involved with the BSD Certification project. Let nobody say that Dan hasn’t given enough to the BSD community!

In the spirit and style of the USENIX conference summaries published in ;login: magazine, I’m going to summarize (and pollute with my own personal remarks) the sessions that I attended.

tape hardware, part two

While on the topic of tape hardware and backups… never mind my little DLT7000 drive at home. How do you back up a 4TB Titan NAS?

We bought one of these servers at work last year; we’re finally getting around to using it for something. Our current challenge is trying to figure out how to back up a 1TB Interwoven content store (we’ve just bought almost the entire product line from Interwoven) without IT screaming at us for taking up their entire tape rotation schedule. This is on top of having to back up a large MediaBin store as well.

I’ll be happy when the Titan is actually up and running, though. We’ve been having some problems getting the CIFS partitions running, because the Titan really needs an Active Directory server in order to enforce permissions, and all we have is a Windows NT 4 domain controller (think again about hacking it; it’s on an internal network). The problem is that we never originally intended the Titan to be used for Windows shares; the unit was purchased long before we decided to go with Interwoven on Windows entirely.

Interesting technical challenges abound…

[belated] report from BSDCan 2004

I attended BSDCan 2004 last month and I’ve been remiss in not speaking about it. In my view it is an example of a conference “done right”. Here’s why:

  • Inexpensive (registration fees were $150)
  • Excellent speakers
  • Good crowd, from developers to administrators to plain old users and students

There’s really not much more than that to ask for at a conference. Lots of FreeBSD developers attended; among them, Wes Peters, Robert Watson, Poul-Henning Kamp, Jacques Vidrine, and so on. It just proves that you don’t need to charge an arm and a leg to demonstrate how “professional” your conferences are, or to attract quality speakers.

The full text of all the papers presented is here.

RWL and maddog’s talk

Last Thursday I ducked out of work to hear maddog give a talk at the Real World Linux trade show, conveniently located across the street from my office. Given that RWL was largely a trade show for PHBs (Pointy-Haired Bosses), I was bracing myself for a PHB-oriented talk, and in many ways, it was. His subject matter was clearly intended to help win over whatever proportion of the audience was not already enamoured with Linux. That’s fair, and I applaud him for that. Linux is still suffering slow adoption in large, conservative corporations (financial institutions, for example), and anyone making an effort to loosen the ties of conservative CTOs, on whatever grounds, should be applauded.

I do want to point out the hilarious juxtaposition of some of maddog’s talking points with the circumstances of the show. Let me summarize the central points of maddog’s talk:

  1. Between the 50’s and the 70’s all software development was open source — when you paid for software, you got the source code if you wanted it. (Ignore the historical inaccuracies of this generalization.)
  2. In the late 70’s and early 80’s when a company was developing (closed-source) software, they had, for example, 100 engineers, and 2500 customers. Each customer would generate on average one feature request and one bug fix per year, so per year you would have 5000 requests. No problem; each engineer would handle 50 requests a year.
  3. Once IT became a huge industry, the company in question might now have 200 engineers, but 2.5 million customers, each generating two requests a year. Therefore each engineer would be required (theoretically) to handle 25,000 requests a year, which is clearly untenable.
  4. Therefore, open-source software development is better because even if there are 2.5 million consumers, the number of developers is limitless.
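
The arithmetic in points 2 and 3 is easy to check. Here is a minimal sketch in Python; the function name and parameters are made up for illustration, and the only inputs are the figures quoted from the talk.

```python
# Back-of-the-envelope check of the support-load figures quoted from the talk.
# requests_per_engineer() is a hypothetical helper, not anything maddog presented.

def requests_per_engineer(customers: int, requests_per_customer: int, engineers: int) -> float:
    """Average number of requests each engineer must handle per year."""
    return customers * requests_per_customer / engineers

# Point 2: 100 engineers, 2500 customers, two requests each per year.
early = requests_per_engineer(customers=2_500, requests_per_customer=2, engineers=100)

# Point 3: 200 engineers, 2.5 million customers, two requests each per year.
later = requests_per_engineer(customers=2_500_000, requests_per_customer=2, engineers=200)

print(f"early era: {early:,.0f} requests per engineer per year")
print(f"later era: {later:,.0f} requests per engineer per year")
print(f"growth in per-engineer load: {later / early:,.0f}x")
```

Doubling the engineering staff while the customer base grows a thousandfold leaves each engineer with 500 times the load, which is the whole point of the argument.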

Obviously this is a gross oversimplification, and I’m not trying to criticize maddog on these grounds. As I pointed out above, he’s trying to convince PHBs to use Linux, and to explain why the quality of Linux as an OS can be better, thanks to more eyes looking at the code.

Maddog went on to talk about how large commercial organizations are unresponsive to customers’ concerns for this very reason (scope/feature creep), and also used this to justify OSS development as better. Okay, that’s probably a reasonable statement too.

While I was sitting there and listening to maddog outline these truisms about how OSS software development and community support, etc. is better than that of commercial software development and commercial support, not ten feet away we had an entire trade show floor of exactly the same closed-source-type, commercial organizations, pitching their products the same as they would be pitching them at COMDEX or CeBIT! The only difference is that, perhaps, some of the products were built on OSS technology, or they ran on Linux. Nevertheless, when I go up to ACCPAC’s booth at RWL and talk to the sales drone, how is this any different than when I go up to ACCPAC’s booth at COMDEX and talk to the same drone? There’s no difference; ACCPAC is still the same, massive, monolithic commercial company with the same problems regarding creeping featurism that maddog outlined in his talk!

The fundamental problem I now have with Linux is that rather than companies developing software the way OSS developers would develop software (which, if you believe maddog, would be the better way), those same companies are just taking Linux (as they have every right to do, mind you), inserting it into their own corporate framework, and selling it just like any other product that they would sell. It doesn’t matter whether they contribute the code back to the community; the development model is still all wrong. To see this in action, I point you to Novell.

Novell made their money selling a proprietary server operating system called Netware, and now makes some money selling copies of SuSE Linux, Red Carpet, Evolution, and so on. Problem is, they’re selling these things like they used to sell Netware. They haven’t realized there’s a paradigm shift here: all the benefits of OSS development that maddog pointed out in his talk aren’t worth a damn if they have to be funnelled through a vendor who’s just as inflexible (in terms of support) with their distribution of Linux as they are with their own, proprietary, closed-source software!

RedHat is another example of this: in order to meet the demands of their customers, they heavily bastardize the stock Linux kernel with their own patches, written by their own developers. But there’s nothing to say that these patches have to be incorporated back into the kernel: that’s up to Linus’ personal discretion. Eventually RedHat winds up with The RedHat Linux Kernel which is significantly different than the stock kernel, and voilà you lose the benefits of having the greater OSS community available to help you with all those feature requests and bug fixes. We’re back again to the situation where only the vendor’s 200 engineers understand the end product, and the support sucks again because those 200 engineers can’t handle the five million support requests.

Explain to me how this is different than non-OSS, commercial software?

In conclusion, what I find most perverse about RWL and the state of Linux in general is that rather than it changing the paradigm of the way software development is done in the world, it is, in fact, being subsumed into the closed-source software development paradigm. To put it another way: rather than leading formerly closed-source companies to open their source in order to reap the benefits of limitless development manpower, Linux is now, by virtue of the vendors, being closed down.

I should note that this closing-down isn’t absolute, and it won’t ever be, so long as we have independent distributions like Debian. But I was still surprised to see maddog get up on stage and trumpet the virtues of the OSS development model, when those benefits are being circumvented by many of the vendors before his very eyes.

The Windows Crash Gallery

If you want a good laugh, head on over to windowscrash.com. There’s nothing like a whole gallery of BSODs showing up on public billboards, airport screens and payphones to demonstrate just how unstable and unusable Windows is.

There’s also a fantastic collection of meaningless and just downright hilarious Windows error messages there. My favourite is this one entitled Mouse. This one comes a close second though.

messy Linux dmesgs

Season’s greetings, everyone! It’s time for yet-another-edition of Things In IT That Bug Me. Today’s victim is: overly chatty Linux dmesgs. This may seem a somewhat frivolous complaint. However, I feel that since the dmesg is one of the first things one sees when one boots an operating system, having a ridiculously chatty and verbose bootup sequence makes Linux look like it’s patched together with no overarching control. Basically, I don’t think 90% of end-users care about seeing:

  • Memory address space allocation dumps
  • The compiler used to create the kernel
  • RCS ID strings, version numbers, names and companies of the authors of various pieces
  • Debugging information only useful to the developers of a particular piece.

I’m a big fan of the way the BSD kernel messages are structured. With a few exceptions, all one really needs to know when the OS is booting up is what devices were detected. And that’s all.

Just have a look at the following bootup sequence from my work machine. Do you really think an end-user cares, for example, that "Linux NET4.0 for Linux 2.4" is "[b]ased upon Swansea University Computer Society NET3.039" or that the USB UHCI driver was committed on October 11 at 3:36 p.m. with revision 1.275, or that Richard Gooch ([email protected]) wrote the mtrr driver? I highly doubt someone is going to e-mail Richard Gooch directly based on the contents of the dmesg, but this shows up on every Linux dmesg.

The following dmesg is nearly 140 lines long. Booting FreeBSD on the same machine yields a dmesg that’s around 80 lines. It’s time that Linux got its act together and cleaned up the messy dmesg, or the problem will continue to balloon out of control.

My dmesg:


Linux version 2.4.20-20.9.XFS1.3.1 ([email protected]) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #1 Sat Oct 11 15:23:43 CDT 2003
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003ff77000 (usable)
BIOS-e820: 000000003ff77000 - 000000003ff79000 (ACPI NVS)
BIOS-e820: 000000003ff79000 - 0000000040000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
On node 0 totalpages: 262007
zone(0): 4096 pages.
zone(1): 225280 pages.
zone(2): 32631 pages.
Kernel command line: auto BOOT_IMAGE=2.4.20-20.9.XFS ro BOOT_FILE=/boot/vmlinuz-2.4.20-20.9.XFS1.3.1 hdd=ide-scsi root=LABEL=/
ide_setup: hdd=ide-scsi
Initializing CPU#0
Detected 1993.983 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 3971.48 BogoMIPS
Memory: 1026556k/1048028k available (1407k kernel code, 17896k reserved, 1072k data, 136k init, 130524k highmem)
kdb version 4.3 by Keith Owens, Scott Lurndal. Copyright SGI, All Rights Reserved
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode cache hash table entries: 65536 (order: 7, 524288 bytes)
Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer-cache hash table entries: 65536 (order: 6, 262144 bytes)
Page-cache hash table entries: 262144 (order: 8, 1048576 bytes)
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After generic, caps: bfebfbff 00000000 00000000 00000000
CPU: Common caps: bfebfbff 00000000 00000000 00000000
CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz stepping 07
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch ([email protected])
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfbe5e, last bus=2
PCI: Using configuration type 1
PCI: Probing PCI hardware
Transparent bridge - Intel Corp. 82801BA/CA/DB PCI Bridge
PCI: Using IRQ router PIIX [8086/2440] at 00:1f.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16)
Starting kswapd
allocated 32 pages and 32 bhs reserved for the highmem bounces
VFS: Disk quotas vdquot_6.5.1
pty: 2048 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS0 at 0x03f8 (irq = 4) is a 16550A
ttyS1 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10e
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
NET4: Frame Diverter 0.46
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00beta3-.2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH2: IDE controller at PCI slot 00:1f.1
ICH2: chipset revision 4
ICH2: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:DMA
hda: ST340016A, ATA DISK drive
blk: queue c03ed4e0, I/O limit 4095Mb (mask 0xffffffff)
hdc: Lite-On LTN486S 48x Max, ATAPI CD/DVD-ROM drive
hdd: HL-DT-ST GCE-8481B, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: attached ide-disk driver.
hda: host protected area => 1
hda: 78165360 sectors (40021 MB) w/2048KiB Cache, CHS=4865/255/63, UDMA(100)
ide-floppy driver 0.99.newide
Partition check:
hda: hda1 hda2 hda3
ide-floppy driver 0.99.newide
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 394k freed
VFS: Mounted root (ext2 filesystem).
SGI XFS 1.3.1 with ACLs, no debug enabled
SGI XFS Quota Management subsystem
XFS mounting filesystem ide0(3,2)
Ending clean XFS mount for filesystem: ide0(3,2)
Freeing unused kernel memory: 136k freed
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
usb-uhci.c: $Revision: 1.275 $ time 15:36:30 Oct 11 2003
usb-uhci.c: High bandwidth mode enabled
PCI: Found IRQ 11 for device 00:1f.2
PCI: Setting latency timer of device 00:1f.2 to 64
usb-uhci.c: USB UHCI at I/O 0xff80, IRQ 11
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
PCI: Found IRQ 9 for device 00:1f.4
PCI: Setting latency timer of device 00:1f.4 to 64
usb-uhci.c: USB UHCI at I/O 0xff60, IRQ 9
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 2
hub.c: USB hub found
hub.c: 2 ports detected
usb-uhci.c: v1.275:USB Universal Host Controller Interface driver
usb.c: registered new driver hiddev
usb.c: registered new driver hid
hid-core.c: v1.8.1 Andreas Gal, Vojtech Pavlik
hid-core.c: USB HID support drivers
mice: PS/2 mouse device common for all mice
hub.c: new USB device 00:1f.2-1, assigned address 2
Adding Swap: 1044216k swap-space (priority -1)
input0: USB HID v1.10 Mouse [Logitech USB Optical Mouse] on usb1:2.0
XFS mounting filesystem ide0(3,1)
Ending clean XFS mount for filesystem: ide0(3,1)
hdc: attached ide-cdrom driver.
hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12
SCSI subsystem driver Revision: 1.00
hdd: attached ide-scsi driver.
scsi0 : SCSI host adapter emulation for IDE ATAPI devices
Vendor: HL-DT-ST Model: CD-RW GCE-8481B Rev: C102
Type: CD-ROM

When I get a chance, I’ll capture a FreeBSD dmesg on this same box and you can see how much cleaner it is.
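
As a rough illustration of the complaint, here is a minimal sketch in Python that drops the kinds of lines listed above and keeps the device detections. The patterns are illustrative guesses matched against a few lines from the dmesg above; they are not an exhaustive classification, and `is_noise` and `quiet` are hypothetical names.

```python
import re

# Illustrative patterns, roughly one per complaint in the list above.
# They are guesses tuned to the sample lines below, not a complete filter.
NOISE_PATTERNS = [
    re.compile(r"^BIOS-e820:|^zone\(\d+\):|hash table entries"),  # memory layout dumps
    re.compile(r"gcc version"),                                    # compiler used to build the kernel
    re.compile(r"\$Revision|\bv\d+\.\d+|Copyright"),               # revision strings, versions, authorship
]

def is_noise(line: str) -> bool:
    """True if the line matches any of the noise patterns."""
    return any(pattern.search(line) for pattern in NOISE_PATTERNS)

def quiet(lines):
    """Return only the lines that are not classified as noise."""
    return [line for line in lines if not is_noise(line)]

# A handful of lines taken from the dmesg above.
sample = [
    "BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)",
    "Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)",
    "usb-uhci.c: $Revision: 1.275 $ time 15:36:30 Oct 11 2003",
    "hid-core.c: v1.8.1 Andreas Gal, Vojtech Pavlik",
    "hda: ST340016A, ATA DISK drive",
    "ttyS0 at 0x03f8 (irq = 4) is a 16550A",
]

kept = quiet(sample)
for line in kept:
    print(line)
```

On this sample only the disk and serial port lines survive, which is about what I would want to see at boot. A real filter would need far more care than a few regular expressions.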

*LET* the Lawsuits Fly.

Wow. It’s been a while since I last wrote. Happy Hallowe’en, by the way.

I now work for CBC.ca in the New Media Production & Operations department as a software developer. Primarily this involves Java development, but I also do a bit of Perl (as much as I hate to). I also weigh in on system administration matters quite a bit, since I think of myself as a half-time sysadmin, half-time programmer.

I just had to respond to this slightly brain-damaged article which appeared in eWeek recently. Now I know eWeek is one of these magazines for PHBs but I still like to flip through it (very briefly) to see what the PHBs are being told these days, and how I can counteract that. So this article — if you go and read it — basically says well, Microsoft has to make "Longhorn" really secure, and improve their security in general, or else legislatures will impose security warranties upon software developers, and this will impact all developers and not just Microsoft.

My reaction, as both a sysadmin and a developer, is: so what? Isn’t that a good thing? I’ve often railed about the fact that software is one of the few industries where you can sell an expensive product to someone and not be held to any legal liability whatsoever. No warranty to speak of beyond the value of the actual compact disc that the software arrived on. In my mind, this is a bad thing. So when Brian Livingston says something like

Such an earthquake could emerge not just from legislatures but also from courts. All it would take would be a precedent-setting ruling that the “we’re-not-liable” language that’s commonplace in shrink-wrap licenses is “unconscionable and unenforceable.” The lawsuits would fly.

I think he’s bang on — but the lawsuits should fly, in fact, if the software is defective. This would certainly stop not only Microsoft from shipping poorly-tested products, but all software vendors.

On a lighter note — check this out. If you work for Allstate, you can submit your resignation online. No word on whether a security guard will be e-mailed to you to escort you out of the building, too. (Speaking of which — someone at work was joking about dressing up for Hallowe’en as a manager who was unceremoniously sacked some time ago. “I’ll just dress up two mannequins in security guard uniforms, put one under each arm, and I’ll be [name removed] being escorted out of the building!”)

California Gubernatorial Race

Now that the California gubernatorial race has turned into a complete circus sideshow, with both Arnold Schwarzenegger and Larry Flynt of Hustler running, I’m suggesting that Darl McBride should mount a campaign, as well. Since the state of California isn’t doing so well financially, he can mount frivolous lawsuits against other states in an attempt to prop up the economy.

In fact, he could have the State of CalifOrnia (SCO) claim to own the copyright to the concept of rolling blackouts, which they purchased from PG&E. Then, he can sue, say, Idaho, for initiating blackouts without paying proper licensing fees.

Or perhaps, after IBM’s lawyers are finished breaking his spine on the Catherine wheel, he’ll just have to find another ailing public company in need of a business model that involves suing people.

Linux is for Bitches

Pardon the slight profanity; I don’t generally like to swear when I’m trying to make a point, but I didn’t invent the name of this site.

The views espoused by the author are obviously not much different from those in this excellent article in USENIX’s own journal, ;login:. (You’ll need to be a member to access that link, by the way.) I’ve complained before about the proliferation of poorly-configured, poorly-managed Linux boxes taking over from the Windows boxes. It’s obviously still continuing to happen. Of course, the vendors are partly to blame, too. When the author of linuxforbitches.org writes about /var being an inappropriate place for web content (I wholeheartedly agree) you have many vendors to thank for that.

I lay the blame for the kernelized web server, though, at the foot of Linus himself. Given that Linus is so militant about accepting patches, idiotic or not, I’m surprised (no, shocked) that he accepted this one. Considering that many kernel hackers are the same folks who probably bitched and whined about insecurity and instability when Windows NT 4.0 moved the graphics drivers from user mode to kernel mode (or Ring 3 to Ring 0, if I remember the terminology correctly), the kernelized web server is a completely brain-damaged idea. It should be removed from the kernel at once, if it hasn’t already been so excised.

You know, despite all the claims about Linux’s stability, it still has a long way to go before it achieves the stability level of the BSDs. Under heavy workload, Linux still doesn’t cut the mustard. Andrew Hume from AT&T Research presented a paper at HotOS-IX entitled Operating Systems: Shouldn’t They Be Better? True, he takes Solaris 2.6 to task in this paper as well, but the Linux flaws he describes are pretty shocking (these are from David Oppenheimer’s summary notes in August’s ;login:):

Hume described eight problems the Gecko [his billing system] implementers experienced with Linux (versions 2.4.18 through 2.4.20), including Linux’s forcing all I/O through a file-system buffer cache with highly unpredictable performance scaling (30MB/sec. to write to one file system at a time, 2MB/sec. to write to two at a time), general I/O flakiness (1-5% of the time corrupting data read into gzip), TCP/IP networking that was slow and that behaved poorly under overload, lack of a good file system, nodes that didn’t survive two reboots, and slow operation of some I/O utilities such as df. In general, Hume said that he has concluded that "Linux is good if you want to run Apache or compile the kernel. Every other application is suspect."

The problem with many people measuring "stability" of Linux is that they think it’s a relative measurement: as long as it’s more stable than Windows, then it’s good. This is obviously a stupid way to look at it. Just because my Kia[1] doesn’t have exploding tires, doesn’t mean that it’s a particularly safe car.

People working on performance and stability in the Linux kernel are far outnumbered by the people trying to get their little pet project into the tree; witness the kernelized web server. Admittedly, performance and stability aren’t the most exciting research areas, but making Linux as stable as the BSDs is critical to its long-term success. I mean, who cares if Linux can run on a zSeries or S/390 if the thing goes down like a ton of bricks when you throw a heavy workload at it?

Ultimately, as a system administrator, I care much more about stability (and, failing that, predictable, recoverable failure) than about "feature-niftiness". When I have 1000 user accounts to manage and I get DDoSed, I want an OS that is feature-conservative but rock solid.

And that, in my convoluted way of saying so, is why I don’t run Linux on my servers.

[1] I don’t, for the record, own a Kia. 🙂