So! I have all these cool things I want to write about, but I broke my thumbnail. Can you tell that’s a long story?
See, this summer I got excited about playing guitar again. I usually switch between all-guitar and all-piano every other year or so. This summer I dusted off the guitars and learned a bunch of pieces, and even composed one. I was prepping for — among other things — a multimedia blog entry. It was going to have a YouTube video, and a detailed discussion of a wacky yet powerful music programming language you’ve probably heard of but never used, and generally just be really cool.
And then it all came crashing down when I busted my thumbnail off. And I mean off — it broke off at least a quarter inch below where the nail and skin meet. Ick. I just accidentally jabbed my steering wheel, and that was that.
I remember reading an interview with some dude who said he had punched a shark in the nose. He said it was like punching a steering wheel. So now I know more or less what it’s like to punch a shark in the nose, I guess. There’s always an upside!
Anyway, that was going to be my magnum opus (literally: Op. 1) for the year, but it fell through for now. I’ll have to revisit the idea next year. My thumbnail’s back, but it’s been at least 2 months since I touched my guitar, so I’ll have to start over again.
Work has been extraordinarily busy, what with having to collect all these Nuka-Cola Quantum bottles and so on. I’m sure you can imagine. So I haven’t had much time to blog lately.
But I do like to publish at least once a month, whether or not anyone actually cares. It’s been about a month, or it feels that way anyway, and all I have to show for it is this box of Blamco Mac and Cheese.
So I’m cheating this month.
You know how on Halloween you walk around in your costume holding your little bag and you say “trick or treat”, and every once in a while some asshole does a trick instead of dumping half a pound of candy into your bag? And then he has to explain to all the dumbfounded and unhappy kids that “Trick or Treat” means that a trick is perfectly legal according to the semantics of logical-OR, and the kids remember that a-hole for the rest of their childhoods and avoid his house next year?
So I’m doing a trick this time. Hee. It’s actually kind of fun when you’re on the giving end.
My trick is this: in lieu of saying anything meaningful or contemporarily relevant, I’m writing about something I did over a year ago. And there isn’t much to say, so this really will be short.
Around a year ago, I wrote a blog called Stevey’s Boring Status Update, mostly in response to wild rumors that I’d been fired from Google. Not so. Not yet, anyway.
This meant I wound up having to write my own Ecma-262 runtime, so that the evaluator would have something to chew on. In particular, the Ecma-262 runtime consists of all the built-in properties, functions and objects: Object, Function, Array, String, Math, Date, RegExp, Boolean, Infinity, NaN, parseInt, encodeURIComponent, etc. A whole bunch of stuff.
I didn’t know Emacs-Lisp all that well before I started, but boy howdy, I know it now.
Emacs actually has a pretty huge runtime of its own — bigger than you would ever, ever expect given its mundane title of “text editor”. Emacs has arbitrary-precision mathematics, deep Unicode support, rich Date and Calendar support, and an extensive, fairly complete operating system interface. So a lot of the porting time was just digging through Emacs documentation (also extensive) looking for the Emacs version of whatever it was I was porting. That was nice.
And then I broke my thumbnail.
Actually, what happened was js2-mode.
I spent a little time trying to beef up the parser, then realized it would be a lot faster to just rewrite it by porting Mozilla Rhino’s parser, which is (only) about 2500 lines of Java code. Ejacs is something like 12,000 lines of Emacs-Lisp code, all told, so that didn’t seem like a big deal.
So I jumped in, only to find that while the parser is 2500 lines of code, the scanner is another 2000 lines of code, and there’s another 500 or so lines of support code in other files. So I was really looking at porting 5000 lines of Java code.
Moreover, the parse tree Rhino builds is basically completely incompatible with the Ejacs parse tree. It was richer and more complex, and needed more complicated structures to represent it.
So after I’d ported the Rhino parse tree, what I really had was a different code base. I went ahead and finished up the editing mode, or at least enough to make it barely workable (another 5000 lines of code), and launched it. It was a surprisingly big effort.
And it left poor Ejacs lying unused in the basement.
So today, faced with nothing to write about, I figured I’d dust off Ejacs, launch it with lots of fanfare, and then you’d hardly notice that I cheated you. Right?
You’re not coming to my house next year. I can tell already.
Anyway, here’s the code: http://code.google.com/p/ejacs/
There’s a README and a Wiki and installation instructions and stuff. I can’t remember how to put the code in SVN, and I’m having trouble finding it on the code.google.com site. As soon as I figure it out I’ll also make it available via SVN.
So… the best way to compare programming languages is by analogy to cars. Lisp is a whole family of languages, and can be broken down approximately as follows:
- Scheme is an exotic sports car. Fast. Manual transmission. No radio.
- Emacs Lisp is a 1984 Subaru GL 4WD: “the car that’s always in front of you.”
- Common Lisp is Howl’s Moving Castle.
This succinct yet completely accurate synopsis shows that all Lisps have their attractions, and yet each also has a niche. You can choose a Lisp for the busy person, a Lisp for someone without much time, or a Lisp for the dedicated hobbyist, and you’ll find that no matter which one you choose, it’s missing the library you need.
Emacs Lisp can get the job done. No question. It’s a car, and it moves. It’s better than walking. But it pretty much combines the elegance of Common Lisp with the industrial strength of Scheme, without hitting either of them, if you catch my drift.
Problem #1: Momentum
It’s easier to resign yourself to a workaround when you know it’s temporary. If you know the language is going to be enhanced, you can even design your code to accommodate the enhancements more easily when they appear.
Problem #2: No encapsulation
Every symbol in Emacs-Lisp is in the global namespace. There is rudimentary support for hand-rolled namespaces using obarrays, but there’s no equivalent to Common Lisp’s in-package, making obarrays effectively useless as a tool for code isolation.
The only effective workaround for this problem is to prefix every symbol with the package name. This practice has become so entrenched in Emacs-Lisp programming that many packages (e.g. apropos and the elp elisp profiler) rely on the convention for proper operation.
The main adverse consequence of this problem in practice is program verbosity; it makes Emacs-Lisp more difficult to read and write than Common Lisp or Scheme. It can also have a non-negligible impact on performance, especially of interpreted code, as the prefix characters can approach 5% to 10% of total program size in some cases.
The problems run slightly deeper than simple verbosity. Without namespaces you have no real encapsulation facility: there is no convenient way to make a “package-private” variable or function. In practice there’s little problem with program integrity; it’s hard for an external package to change a “private” variable inadvertently in the presence of symbol prefixes. However, it makes it annoyingly difficult for users of the package to discern the “important” top-level configuration variables and program entry points from the unimportant ones. Elisp attempts a few conventions here, but it’s a far cry from real encapsulation support.
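To make the prefix convention concrete, here is a minimal sketch of a made-up package called my-pkg. Every name in it is hypothetical. The double hyphen marking "private" symbols is one of the informal conventions in common use; nothing in the language enforces it.

```elisp
;; Hypothetical package "my-pkg": every global symbol carries the prefix.
(defvar my-pkg-greeting "hello"
  "Public configuration variable: users are expected to customize this.")

(defvar my-pkg--call-count 0
  "Internal counter.  The double hyphen marks it private by convention only.")

(defun my-pkg--format (name)
  "Internal helper: build the greeting string for NAME."
  (format "%s, %s" my-pkg-greeting name))

(defun my-pkg-greet (name)
  "Public entry point: return a greeting for NAME and count the call."
  (setq my-pkg--call-count (1+ my-pkg--call-count))
  (my-pkg--format name))

;; Nothing stops outside code from touching the "private" symbol:
;; (setq my-pkg--call-count -1) works from anywhere.
```

Four definitions, and the prefix shows up more than a dozen times. That is the verbosity cost in miniature.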
- elisp is not lexically scoped and has no closures
- elisp nested defuns are still entered into the global namespace
- `flet' and `labels' are only weakly supported, via macros, and they frequently confuse the debugger, indenter, and other tools.
Some elisp code (e.g. much of the code in cc-engine) prefers to work around the namespace problem by using enormous functions that can be thousands of lines long, since let-bound variables are slightly better encapsulated. Even this is broken by elisp’s dynamic scope:
(defun foo ()
  (setq x 7))

(defun bar ()
  (let ((x 6))
    (foo)
    x))  ; you would expect x to still be 6 here

(bar)
7  ; d'oh!
So let-bound variables in elisp can still be destroyed by your callee: a dangerous situation at best.
Emacs is basically one big program soup. There’s almost no encapsulation to speak of, and it hurts.
Problem #3: No delegation
One of the big advantages to object-oriented programming is that there is both syntactic support and runtime support for automatic delegation to a “supertype”. You can specialize a type and delegate some of the functionality to the base type. Call it virtual methods or prototype inheritance or whatever you like; most successful languages support some notion of automatic delegation.
Emacs Lisp is a lot like ANSI C: it gives you arrays, structs and functions. You don’t get pointers, but you do get garbage collection and good support for linked lists, so it’s roughly a wash.
Object (and in some cases,
Function, which inherits from
Writing your own virtual method dispatch is just not something you should have to do in 2008.
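For a sense of what "writing your own" involves, here is a minimal sketch of hand-rolled prototype delegation. It is an illustration only, not the Ejacs implementation: the proto- names and the cons-cell object layout are assumptions made up for this example.

```elisp
;; An "object" here is a cons cell: (PROTOTYPE . ALIST-OF-OWN-PROPERTIES).
;; All names are hypothetical; this is not how Ejacs represents objects.
(defun proto-make (parent)
  "Create an empty object that delegates to PARENT (nil for no parent)."
  (cons parent nil))

(defun proto-put (obj key value)
  "Set own property KEY of OBJ to VALUE."
  (let ((cell (assq key (cdr obj))))
    (if cell
        (setcdr cell value)
      (setcdr obj (cons (cons key value) (cdr obj)))))
  value)

(defun proto-get (obj key)
  "Look up KEY on OBJ, walking the prototype chain by hand."
  (let ((result nil) (found nil))
    (while (and obj (not found))
      (let ((cell (assq key (cdr obj))))
        (if cell
            (setq result (cdr cell) found t)
          (setq obj (car obj)))))       ; not found here: delegate to the parent
    result))

(defun proto-send (obj method &rest args)
  "Call METHOD, found via the chain, with OBJ as the receiver."
  (let ((fn (proto-get obj method)))
    (unless fn (error "No method %s" method))
    (apply fn obj args)))

(defun proto-demo ()
  "Define a method on a base object and invoke it through a child."
  (let* ((base (proto-make nil))
         (child (proto-make base)))
    (proto-put base 'name "base")
    (proto-put base 'describe
               (lambda (self) (format "I am %s" (proto-get self 'name))))
    (proto-put child 'name "child")
    (proto-send child 'describe)))      ; returns "I am child"
```

The method lives on the base object, but it runs against the child, so the child's own name wins. That is the whole trick, and the language gives you no help with any of it.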
Problem #4: Properties
And so I did. Your implementation choice for object property lists has a huge impact on runtime performance. Emacs has hashtables, but they’re heavyweight: if you try to instantiate thousands of them it slows Emacs to a crawl. So they’re no good for the default Object property list. Emacs also has association lists (alists), but their lookup performance is O(n), making them no good for objects with more than maybe 30 or 40 properties.
I wound up writing a hybrid model, where the storage starts with lightweight alists, and once the number of properties on an object instance crosses a threshold (I set it to 50, which seemed to be about right from profiling), it copies the properties into a hashtable. This gave a dramatic increase in performance, but it was a lot of work.
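The hybrid idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Ejacs code: the props- names, the one-element vector used as a mutable holder, and the restriction to symbol keys are all invented for the example. Only the threshold of 50 comes from the text above.

```elisp
;; Hypothetical sketch; names and layout are not taken from Ejacs.
(defconst props-threshold 50
  "Property count beyond which the alist is promoted to a hash table.")

(defun props-make ()
  "Create an empty store: a one-element vector holding the backing data."
  (vector nil))

(defun props-put (store key value)
  "Set KEY (a symbol) to VALUE in STORE, promoting the alist when it grows."
  (let ((data (aref store 0)))
    (if (hash-table-p data)
        (puthash key value data)
      (let ((cell (assq key data)))
        (if cell
            (setcdr cell value)         ; existing key: update in place
          (setq data (cons (cons key value) data))
          (aset store 0 data)
          ;; `length' is O(n), which is acceptable while n stays small.
          (when (> (length data) props-threshold)
            (let ((table (make-hash-table :test 'eq)))
              (dolist (pair data)
                (puthash (car pair) (cdr pair) table))
              (aset store 0 table)))))))
  value)

(defun props-get (store key)
  "Return the value of KEY in STORE, or nil if it is absent."
  (let ((data (aref store 0)))
    (if (hash-table-p data)
        (gethash key data)
      (cdr (assq key data)))))
```

Small objects never pay for a hash table, and big ones stop paying for linear scans. Callers only ever see props-put and props-get, so the switch is invisible to them.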
I experimented with using a splay tree. I implemented Sleater and Tarjan’s splay tree algorithm in elisp; Ejacs comes with a standalone splay-tree.el that you can use in your programs if you like. I was hoping that its LRU cache-like properties would help, but I never found a use case where it was faster than my alist/hashtable hybrid, so it’s not currently used for anything.
You really want syntactic support. Sure, people have ported subsets of CLOS to Emacs Lisp, but I’ve always found them a bit clunky. And even in CLOS it’s hard to implement the Properties Pattern. You don’t get it by default. CLOS has lots of support for compile-time slots and virtual dispatch, but very little support for dynamic properties. It’s not terribly hard to build in, but that’s my point: for something that fundamental, you don’t want to have to build it.
Problem #5: No polymorphic
In Emacs Lisp, some objects have first-class print representations. Lists and vectors do, for instance:
(let ((my-list '()))
  (push 1 my-list)
  (push 2 my-list)
  (push 3 my-list)
  my-list)
(3 2 1)

(let ((v (make-vector 3 nil)))
  (aset v 0 1)
  (aset v 1 2)
  (aset v 2 "three")
  v)
[1 2 "three"]
But in Emacs Lisp, many built-in types (notably hashtables and functions) do NOT have a way to serialize back as source code. This is a serious omission.
Also, trying to print a sufficiently large tree made entirely of defstructs will crash Emacs, which caused me a lot of grief until I migrated my parse tree to use a mixture of defstructs and lists. Note that simply typing the name of a defstruct, or passing over it ephemerally in the debugger, will cause Emacs to try to print it, and crash. Fun.
So it sucks. Printing data structures in Emacs just sucks.
Emacs advantages: Macros and S-expressions
Elisp does have a few places where it shines, though. One of them is the cl (Common Lisp emulation) package, which provides a whole bunch of goodies that make Elisp actually usable for real work. Defstruct and the loop macro are especially noteworthy standouts.
Some programmers are still operating under the (ancient? legacy?) assumption that the cl package is somehow deprecated or distasteful or something. They’re just being silly; don’t listen to them. Practicality should be the ONLY consideration.
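Here is a small taste of those two standouts. The sketch assumes a recent Emacs, where the library is loaded as cl-lib and the names carry a cl- prefix (cl-defstruct, cl-loop). The demo-token record and its fields are hypothetical, invented for this example.

```elisp
(require 'cl-lib)  ; assumption: a recent Emacs that ships cl-lib

;; Hypothetical record type, loosely shaped like a scanner token.
(cl-defstruct demo-token
  type    ; a symbol such as 'name or 'number
  start   ; position where the token begins
  end)    ; position just past the token

(defun demo-token-lengths (tokens)
  "Return the lengths of all TOKENS whose type is `name'."
  (cl-loop for tok in tokens
           when (eq (demo-token-type tok) 'name)
           collect (- (demo-token-end tok) (demo-token-start tok))))
```

One form gives you a constructor (make-demo-token), a predicate, and an accessor per field, and the loop reads close to plain English. Doing the same with bare vectors and while loops is a lot less pleasant.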
Emacs Lisp has defmacro, which makes up for a LOT of its deficiencies. However, it really only has one flavor. Ideally, at the very least, it should support reader macros. The Emacs documentation says they were left out because they felt it wasn’t worth it. Who are they to make the call? It’s the users who need them. Implementer convenience is a pretty lame metric for deciding whether to support a feature, especially after 20 years of people asking for it.
Elisp is s-expression based, which is a mixed bag. It has some advantages, no question. However, it fares poorly in two very common domains: object property access, and algebraic expressions.
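Two tiny examples show the friction. Both functions are hypothetical and exist only to put the prefix form next to the infix or dotted form you would write in most other languages; the nested property lists in the second one are just a stand-in for an object.

```elisp
(defun demo-quadratic-root (a b c)
  "Return the `+' root of a*x^2 + b*x + c = 0, assuming real roots.
The infix equivalent is (-b + sqrt(b*b - 4*a*c)) / (2*a)."
  (/ (+ (- b) (sqrt (- (* b b) (* 4 a c))))
     (* 2.0 a)))

(defun demo-city (person)
  "Return the city of PERSON, a nested property list.
The JavaScript equivalent would be person.address.city."
  (plist-get (plist-get person :address) :city))
```

Neither is hard to write, but both read inside-out compared with the one-line originals in their docstrings.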
But 30,000 lines is a pretty good hunk of code for getting to know a language. Especially if you’re writing an interpreter for one language in another language: you wind up knowing both better than you ever wanted to know them.
I would love to see Emacs Lisp get reader macros, closures, some namespace support, and the ability to install your own print functions. This reasonably small set of features would be a huge step in usability.
I tell ya: if you’re a programming language, it’s a very good thing to have smart people liking you.
In the meantime, it doesn’t actually do squat except interpret EcmaScript in a little isolated console, so don’t get your hopes up.
Reminder — here’s the Ejacs URL: http://code.google.com/p/ejacs – enjoy!
And with that, I’m off to find some Nuka-Cola Quantum. I just wish those bastards hadn’t capped me at level 20.
Source: Steve Yegge