Linux’s Curse (Again)

By Adrian Sutton

July 25, 2004

The story so far:

Preston Gralla commented
I commented
Brian McCallister commented
I commented again
Brian McCallister commented again

At least I think that’s how it went.

Firstly, Brian was right to call me on my use of Cygwin to bring UNIX capabilities to Windows. It’s not in the default install, it’s not at all obvious and 99% of Windows users will never even hear about it. As Brian says, “if you don’t use it, you don’t learn it”. So if we concede that the command line is a killer attraction, then Linux has its big advantage over Windows. That’s where I think Brian and I may disagree.

First let me start by saying that the command line is great; it’s an incredibly powerful tool with a lot of really great advantages. It does, however, require a lot of learning, and it’s not one-off learning either. The command line requires you to constantly learn: every new task requires research to find out which command does what you want. Then you have to constantly remember all these different commands so that you can use them when you need them. Nothing is obvious, nothing is intuitive. Everything is powerful.

“This is a paradigm thing: the drive to ubiquitize computers required them to have an interface comparable to that of a toaster. Now that they are ubiquitous, let’s bring back the idea of a powerful interface. Please.”

I agree that we need to make user interfaces more powerful and let people do more with their computers, but don’t throw the baby out with the bath water. People are no more capable of learning interfaces today than they were 15 years ago. The ubiquity of the GUI does not make it easier for people to learn the command line; in fact it makes it harder because of the unlearning required. Having a command line available with powerful tools is great for advanced users who want to get into that, but it’s still not an option for the vast majority of computer users because they will never get enough benefit out of it to justify the learning cost. Furthermore, the learning cost will be excessively high for casual users because they will continually forget the commands that are available to them.

So how do we reconcile the two goals: having a simple interface and providing full power to advanced users? Most people will suggest creating two interfaces, one for novices and one for advanced users (the novice interface is usually called a “wizard”). This is outright bad user interface design. Jef Raskin provides the best argument I’ve seen on why this is bad user interface design in The Humane Interface, but sadly I don’t have a copy at hand to give an exact reference.

Essentially, though, the argument is that instead of users having to learn one interface, they must now learn two to be able to use the software. When they first start using the program they are novices and learn to use the wizard interface. Then they become more familiar with the program and consider themselves advanced, so they switch to the advanced interface. Unfortunately, once they change the interface they are no longer advanced users: they are completely new to the interface and are in fact beginners.
All the learning they did with the beginner interface is worthless and they have to start from scratch with the advanced interface. Worse still, the advanced interface will almost certainly have been designed with “these are advanced users, they’ll work it out” in mind and is thus much more difficult to learn than it should be.

The other big problem with having two interface modes is the amount of extra developer time required to achieve it. That time could have been better spent making the advanced interface easier to learn.

How does this relate to the GUI vs command line debate? Firstly, it shows a weakness in systems like Linux, where you can do a lot with the GUI but quite often have to switch to the command line, as well as a weakness in Windows, where you can do a lot with the command line but often have to switch to the GUI. It’s also a weakness in OS X in both directions (some things are GUI only, some things are command line only). More importantly, though, it explains why we can’t expect people to learn a command line interface now any more than we could when computers first got started.

So how do we make things more powerful while keeping the baby firmly in the bathtub? The first thing I’d point to is AppleScript, which is an awesomely cool way to bring some of the power of the command line to the GUI. The ability to pipe one program into another is realized through AppleScript and in fact extended well beyond what command line pipes can do. AppleScript is shell scripting for the GUI. AppleScript is, however, difficult to learn and the language is awful, but these are implementation details; the idea itself is still sound.

The biggest problem with the AppleScript concept, though, is that you effectively always have to write a shell script, which involves firing up the script editor. Too slow. What if we mixed the concept of the GUI and the command line together, though?
Most of the time you’re in the GUI just like normal because it’s easy to use and for the most common computing tasks it’s the most efficient way to do things (how many sighted people surf the web exclusively from lynx?). When you need the power of the command line, though, you hit a key combination and a command line pops up to allow you to write AppleScript snippets (though in a more intuitive language than AppleScript).

Oddly enough, HyperCard contains pretty much this exact interface. If you hit Apple-M in a HyperCard stack, the message box pops up and you can enter any command you like to control that stack, another stack or even execute an AppleScript. One key thing here, though, is that it’s not a terminal window that pops up; it’s a floating window that by default operates on the current application.

So if I’m in Microsoft Word typing away and I think to myself, “I need to insert all the GIF images of my charts into this appendix”, today I would have to click “Insert->Image…->From File…->chart.gif” however many times. With the built-in command prompt I’d just hit the magical key combination to bring it up, type “insert /Users/aj/Documents/charts/*.gif” and let Word do the rest. Note that insert would be an AppleScript command defined by Word, and tab completion is a necessity.

Similarly, if I wanted to attach a zip archive of a particular folder to an email, I’d bring up the command prompt with a keystroke and enter something like “attach `zip /Users/aj/Documents/emailDocs`” or, better, “zip /Users/aj/Documents/emailDocs and attach it”, which is much more HyperCard-like. That scheme combines the power of command lines with the power and simplicity of GUIs.

Coming back to Brian’s comments though:

“The huge thing linux has that nothing else does is that it provides this interface, integrates everything with it, is free to obtain, straightforward to set up, and available right now.”

Linux provides a command line interface, sure, but so do OS X and Windows (though the DOS command line is pretty ordinary). Linux is free, but that’s not really a consideration because the cost of the initial computer setup is dwarfed by the costs associated with running it. Besides, I want good software, not necessarily free software (though I’d be most happy if it were good and free). Linux is available now, but so is everything else that’s currently available (oddly enough). Linux is getting a lot easier to set up, but it’s definitely not straightforward in many cases. Most people who try installing Linux seem to have at least a few issues, but I agree it’s nothing insurmountable.

So we’re left with the integration of Linux and the command line. Frankly, I wouldn’t consider it integrated very well at all. Sure, a terminal emulator is readily available and pretty obvious in Linux interfaces by default, but the world of the command line mostly stays in the terminal emulator. There are very few GUI applications that really integrate the command line, and certainly GNOME and KDE don’t. If the command line is really Linux’s killer feature, it needs to be put to use a heck of a lot more, right throughout the system.

I should be able to tell Mozilla to send the currently selected text of the web page through ‘sort’ and show me the result. I should be able to select a folder in KDE and run grep or zip, or run grep -l ‘Vector’ **/*.java | kdeSelect and have all Java files inside the selected folder which contain the word Vector become selected in KDE. That’s integration, and that’s not available in Linux as far as I know. That’s the kind of innovation that Linux is missing, and it would make Linux more powerful, not less powerful.
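
To make the two wishes above a little more concrete, here is a rough sketch in Python of the plumbing a GUI application would need. It is only an illustration under some loud assumptions: pipe_selection and files_containing are hypothetical names invented for this sketch, the code assumes a Unix-like system with sort on the PATH, and it does not talk to Mozilla or KDE at all (kdeSelect does not exist as far as I know). The application would still have to supply the selected text or folder and act on the result.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def pipe_selection(selected_text: str, command: Sequence[str]) -> str:
    """Send the selected text to a command's stdin and return what it prints.

    Hypothetical helper: a browser would call this with the current
    selection and then display the returned text.
    """
    result = subprocess.run(
        list(command),
        input=selected_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout


def files_containing(folder: Path, word: str, pattern: str = "*.java") -> List[Path]:
    """Rough stand-in for `grep -l word **/*.java` run inside `folder`.

    Hypothetical helper: a file manager would mark the returned paths
    as selected.
    """
    matches = []
    for path in sorted(folder.rglob(pattern)):
        # Plain substring match (not a regular expression); bytes that
        # cannot be decoded are skipped rather than raising an error.
        if path.is_file() and word in path.read_text(errors="ignore"):
            matches.append(path)
    return matches
```

A browser could then show the result of pipe_selection(text, ["sort"]) in place of the selection, and a file manager could highlight whatever files_containing(selected_folder, "Vector") returns. The hard part, of course, is not this plumbing but getting every application to expose its selection to it.
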
Update: I just noticed that Tyler Mitchell was talking about “How do you bridge the CLI vs. GUI gap in app. design?” which I think I just answered so let me throw a ping over that way too. Many of the comments there are interesting as well.