First published in Linux Journal, May 2000.
Copyright 2000 Specialized Systems Consultants, Inc. All rights reserved.
Introduction
My first look at Python was an accident, and I didn't much like what I
saw at the time. It was early 1997, and Mark Lutz's book Programming
Python from O'Reilly & Associates had recently come out. O'Reilly books
occasionally land on my doorstep, selected from among the new releases
by some mysterious benefactor inside the organization using a random
process I've given up trying to understand.
One of them was Programming Python. I found this somewhat interesting,
as I collect computer languages. I know over two dozen general-purpose
languages, write compilers and interpreters for fun, and have designed
any number of special-purpose languages and markup formalisms myself.
My most recently completed project, as I write this, is a
special-purpose language called SNG for manipulating PNG (Portable
Network Graphics) images; interested readers can find it on the SNG home
page. I have also written implementations of several odd general-purpose
languages, which are on my Retrocomputing Museum page.
I had already heard just enough about Python to know that it is what is
nowadays called a "scripting language", an interpretive language with
its own built-in memory management and good facilities for calling and
cooperating with other programs. So I dived into Programming Python
with one question uppermost in my mind: what has this got that Perl
does not?
Perl, of course, is the 800-pound gorilla of modern scripting
languages. It has largely replaced shell as the scripting language of
choice for system administrators, thanks partly to its comprehensive
set of UNIX library and system calls, and partly to the huge collection
of Perl modules built by a very active Perl community. The language is
commonly estimated to be the CGI language behind about 85% of the
"live" content on the Net. Larry Wall, its creator, is rightly
considered one of the most important leaders in the Open Source
community, and often ranks third behind Linus Torvalds and Richard
Stallman in the current pantheon of hacker demigods.
At that time, I had used Perl for a number of small projects. I'd found
it quite powerful, even if the syntax and some other aspects of the
language seemed rather ad hoc and prone to bite one if not used with
care. It seemed to me that Python would have quite a hill to climb as
yet another scripting language, so as I read, I looked first for what
seemed to set it apart from Perl.
I immediately tripped over the first odd feature of Python that
everyone notices: the fact that whitespace (indentation) is actually
significant in the language syntax. The language has no analog of the C
and Perl brace syntax; instead, changes in indentation delimit
statement groups. And, like most hackers on first realizing this fact,
I recoiled in reflexive disgust.
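For readers who have not seen the feature, here is a minimal sketch of what it means in practice. The function name and sample values are made up for illustration and written in present-day Python; the point is only that the body of each def, for and if is the run of lines indented beneath it, with no braces or closing keywords.

```python
def classify(numbers):
    """Split numbers into evens and odds; indentation alone marks the blocks."""
    evens, odds = [], []
    for n in numbers:
        if n % 2 == 0:
            evens.append(n)    # indented under `if`, so part of the if block
        else:
            odds.append(n)     # indented under `else`
    return evens, odds         # dedented to function level: runs after the loop

print(classify([1, 2, 3, 4, 5]))
```

Moving the return line one level to the right would place it inside the loop and change the program's behavior, which is exactly the property that provokes the initial recoil.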
I am just barely old enough to have programmed in batch FORTRAN for a
few months back in the 1970s. Most hackers aren't these days, but
somehow our culture seems to have retained a pretty accurate folk
memory of how nasty those old-style fixed-field languages were. Indeed,
the term "free format", used back then to describe the newer style of
token-oriented syntax in Pascal and C, has almost been forgotten; all
languages have been designed that way for decades now. Or almost all,
anyway. It's hard to blame anyone, on seeing this Python feature, for
initially reacting as though they had unexpectedly stepped in a
steaming pile of dinosaur dung.
That's certainly how I felt. I skimmed through the rest of the language
description without much interest. I didn't see much else to recommend
Python, except maybe that the syntax seemed rather cleaner than Perl's
and the facilities for doing basic GUI elements like buttons and menus
looked fairly good.
I put the book back on the shelf, making a mental note that I should
code some kind of small GUI-centered project in Python sometime, just
to make sure I really understood the language. But I didn't believe
what I'd seen would ever compete effectively with Perl.
Perl Wears Thin
A lot of other things conspired to keep that note way down on my
priority list for many months. The rest of 1997 was eventful for me; it
was, among other things, the year I wrote and published the original
version of "The Cathedral and the Bazaar". But I did find time to
write several Perl programs, including two of significant size and
complexity. One of them, keeper, is the assistant still used to file
incoming submissions at the Metalab software archive. It generates the
web pages you see at that archive. The other, anthologize, was used to
automatically generate the PostScript
for the sixth edition of Linux from the Linux Documentation Project's
archive of HOWTOs. Both programs are available at Metalab.
Writing these programs left me progressively less satisfied with Perl.
Larger project size seemed to magnify some of Perl's annoyances into
serious, continuing problems. The syntax that had seemed merely
eccentric at a hundred lines began to seem like a nigh-impenetrable
hedge of thorns at a thousand. "More than one way to do it" lent
flavor and expressiveness at a small scale, but made it significantly
harder to maintain consistent style across a wider code base. And many
of the features that were later patched into Perl to address the
complexity-control needs of bigger programs (objects, lexical scoping,
"use strict", etc.) had a fragile, jerry-rigged feel about them.
These problems combined to make large volumes of Perl code seem
unreasonably difficult to read and grasp as a whole after only a few
days' absence. Also, I found I was spending more and more time
wrestling with artifacts of the language rather than my application
problems. And, most damning of all, the resulting code was ugly -- this
matters. Ugly programs are like ugly suspension bridges: they're much
more liable to collapse than pretty ones, because the way humans
(especially engineer-humans) perceive beauty is intimately related to
our ability to process and understand complexity. A language that makes
it hard to write elegant code makes it hard to write good code.
With a baseline of two dozen languages under my belt, I could detect
all the telltale signs of a language design that had been pushed to the
edge of its functional envelope. By mid-1997, I was thinking "there
has to be a better way" and began casting about for a more elegant
scripting language.
One course I did not consider was going back to C as a default
language. The days when it made sense to do your own memory management
in a new program are long over, outside of a few specialty areas like
kernel hacking, scientific computing and 3-D graphics -- places where you
absolutely must get maximum speed and tight control of memory usage,
because you need to push the hardware as hard as possible.
For most other situations, accepting the debugging overhead of buffer
overruns, pointer-aliasing problems, malloc/free memory leaks and all
the other associated ills is just crazy on today's machines. Far better
to trade a few cycles and a few kilobytes of memory for the overhead of
a scripting language's memory manager and economize on far more
valuable human time. Indeed, the advantages of this strategy are
precisely what has driven the explosive growth of Perl since the
mid-1990s.
I flirted with Tcl, only to discover quickly that it scales up even
more poorly than Perl. Old LISPer that I am, I also looked at various
current dialects of Lisp and Scheme -- but, as is historically usual for
Lisp, lots of clever design was rendered almost useless by scanty or
nonexistent documentation, incomplete access to POSIX/UNIX facilities,
and a small but nevertheless deeply fragmented user community. Perl's
popularity is not an accident; most of its competitors are either worse
than Perl for large projects or somehow nowhere near as useful as their
theoretically superior designs ought to make them.
Python Reconsidered
My second look at Python was almost as accidental as my first. In
October 1997, a series of questions on the fetchmail-friends mailing
list made it clear that end users were having increasing trouble
generating configuration files for my fetchmail utility. The file uses
a simple, classically UNIX free-format syntax, but can become
forbiddingly complicated when a user has POP3 and IMAP accounts at
multiple sites. As an example, see Listing 1 for a somewhat simplified
version of mine.
    set postmaster "esr"
    set daemon 300

    poll imap.ccil.org with proto IMAP and options no dns
        aka snark.thyrsus.com locke.ccil.org ccil.org
        user esr there is esr here options fetchall dropstatus warnings 3600

    poll imap.netaxs.com with proto IMAP
        user "esr" there is esr here options dropstatus warnings 3600

    skip imap.21cn.com with proto IMAP
        user esr here is tranxww there options fetchall

    skip pop.tems.com with proto POP3:
        user esr here is ed there options fetchall

    skip mail.frequentis.com with proto IMAP:
        user esr here is imaptest there with options fetchall
Listing 1
I decided to attack the problem by writing an end-user-friendly
configuration editor, fetchmailconf. The design objective of
fetchmailconf was clear: to completely hide the control file syntax
behind a fashionable, ergonomically correct GUI interface replete with
selection buttons, slider bars and fill-out forms.
The thought of implementing this in Perl did not thrill me. I had seen
GUI code in Perl, and it was a spiky mixture of Perl and Tcl that
looked even uglier than my own pure-Perl code. It was at this point I
remembered the bit I had set more than six months earlier. This could
be an opportunity to get some hands-on experience with Python.
Of course, this brought me face to face once again with Python's pons
asinorum, the significance of whitespace. This time, however, I charged
ahead and roughed out some code for a handful of sample GUI elements.
Oddly enough, Python's use of whitespace stopped feeling unnatural
after about twenty minutes. I just indented code, pretty much as I
would have done in a C program anyway, and it worked.
That was my first surprise. My second came a couple of hours into the
project, when I noticed (allowing for pauses needed to look up new
features in Programming Python) I was generating working code nearly as
fast as I could type. When I realized this, I was quite startled. An
important measure of effort in coding is the frequency with which you
write something that doesn't actually match your mental representation
of the problem, and have to backtrack on realizing that what you just
typed won't actually tell the language to do what you're thinking. An
important measure of good language design is how rapidly the percentage
of missteps of this kind falls as you gain experience with the
language.
When you're writing working code nearly as fast as you can type and
your misstep rate is near zero, it generally means you've achieved
mastery of the language. But that didn't make sense, because it was
still day one and I was regularly pausing to look up new language and
library features!
This was my first clue that, in Python, I was actually dealing with an
exceptionally good design. Most languages have so much friction and
awkwardness built into their design that you learn most of their
feature set long before your misstep rate drops anywhere near zero.
Python was the first general-purpose language I'd ever used that
reversed this process.
Not that it took me very long to learn the feature set. I wrote a
working, usable fetchmailconf, with GUI, in six working days, of which
perhaps the equivalent of two days were spent learning Python itself.
This reflects another useful property of the language: it is
compact -- you can hold its entire feature set (and at least a concept
index of its libraries) in your head. C is a famously compact language.
Perl is notoriously not; one of the things the notion "There's more
than one way to do it!" costs Perl is the possibility of compactness.
Delving Deeper
But my most dramatic moment of discovery lay ahead. My design had a
problem: I could easily generate configuration files from the user's
GUI actions, but editing them was a much harder problem. Or, rather,
reading them into an editable form was a problem.
The parser for fetchmail's configuration file syntax is rather
elaborate. It's actually written in YACC and Lex, two classic UNIX
tools for generating language-parsing code in C. In order for
fetchmailconf to be able to edit existing configuration files, I
thought it would have to replicate that elaborate parser in Python. I
was very reluctant to do this, partly because of the amount of work
involved and partly because I wasn't sure how to ascertain that two
parsers in two different languages accept the same syntax. The last thing I
needed was the extra labor of keeping the two parsers in
synchronization as the configuration language evolved!
This problem stumped me for a while. Then I had an inspiration: I'd let
fetchmailconf use fetchmail's own parser! I added a --configdump option
to fetchmail that would parse .fetchmailrc and dump the result to
standard output in the format of a Python initializer. For the file
above, the result would look roughly like Listing 2 (to save space,
some data not relevant to the example is omitted).
    fetchmailrc = {
        'poll_interval':300,
        "logfile":None,
        "postmaster":"esr",
        'bouncemail':TRUE,
        "properties":None,
        'invisible':FALSE,
        'syslog':FALSE,
        # List of server entries begins here
        'servers': [
            # Entry for site `imap.ccil.org' begins:
            {
                "pollname":"imap.ccil.org",
                'active':TRUE,
                "via":None,
                "protocol":"IMAP",
                'port':0,
                'timeout':300,
                'dns':FALSE,
                "aka":["snark.thyrsus.com", "locke.ccil.org", "ccil.org"],
                'users': [
                    {
                        "remote":"esr",
                        "password":"Malvern",
                        'localnames':["esr"],
                        'fetchall':TRUE,
                        'keep':FALSE,
                        'flush':FALSE,
                        "mda":None,
                        'limit':0,
                        'warnings':3600,
                    },
                ]
            },
            # Entry for site `imap.netaxs.com' begins:
            {
                "pollname":"imap.netaxs.com",
                'active':TRUE,
                "via":None,
                "protocol":"IMAP",
                'port':0,
                'timeout':300,
                'dns':TRUE,
                "aka":None,
                'users': [
                    {
                        "remote":"esr",
                        "password":"d0wnthere",
                        'localnames':["esr"],
                        'fetchall':FALSE,
                        'keep':FALSE,
                        'flush':FALSE,
                        "mda":None,
                        'limit':0,
                        'warnings':3600,
                    },
                ]
            },
            # Entry for site `imap.21cn.com' begins:
            {
                "pollname":"imap.21cn.com",
                'active':FALSE,
                "via":None,
                "protocol":"IMAP",
                'port':0,
                'timeout':300,
                'dns':TRUE,
                "aka":None,
                'users': [
                    {
                        "remote":"tranxww",
                        "password":None,
                        'localnames':["esr"],
                        'fetchall':TRUE,
                        'keep':FALSE,
                        'flush':FALSE,
                        "mda":None,
                        'limit':0,
                        'warnings':3600,
                    },
                ]
            },
            # Entry for site `pop.tems.com' begins:
            {
                "pollname":"pop.tems.com",
                'active':FALSE,
                "via":None,
                "protocol":"POP3",
                'port':0,
                'timeout':300,
                'dns':TRUE,
                'uidl':FALSE,
                "aka":None,
                'users': [
                    {
                        "remote":"ed",
                        "password":None,
                        'localnames':["esr"],
                        'fetchall':TRUE,
                        'keep':FALSE,
                        'flush':FALSE,
                        "mda":None,
                        'limit':0,
                        'warnings':3600,
                    },
                ]
            },
            # Entry for site `mail.frequentis.com' begins:
            {
                "pollname":"mail.frequentis.com",
                'active':FALSE,
                "via":None,
                "protocol":"IMAP",
                'port':0,
                'timeout':300,
                'dns':TRUE,
                "aka":None,
                'users': [
                    {
                        "remote":"imaptest",
                        "password":None,
                        'localnames':["esr"],
                        'fetchall':TRUE,
                        'keep':FALSE,
                        'flush':FALSE,
                        "mda":None,
                        'limit':0,
                        'warnings':3600,
                    },
                ]
            },
        ]
    }
Listing 2
Python could then evaluate the fetchmail --configdump output and have
the configuration available as the value of the variable "fetchmailrc".
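As a rough illustration of that evaluation step, here is a sketch in present-day Python. It is not the code from fetchmailconf: the helper name load_initializer, the numeric definitions of TRUE and FALSE, and the trimmed sample text are all assumptions made for the example. The idea is simply that text in the form of a Python assignment can be executed in a private namespace, and the resulting dictionary picked up by name.

```python
# Stand-ins for the TRUE/FALSE names that appear in the dump; defining
# them as 1 and 0 is an assumption made for this sketch.
TRUE, FALSE = 1, 0

def load_initializer(dump_text, varname="fetchmailrc"):
    """Execute a Python initializer and return the value bound to varname."""
    namespace = {"TRUE": TRUE, "FALSE": FALSE}
    exec(dump_text, namespace)      # runs the assignment inside `namespace`
    return namespace[varname]

# A heavily trimmed, hypothetical dump in the style of Listing 2.
sample_dump = """
fetchmailrc = {
    'poll_interval': 300,
    "postmaster": "esr",
    'syslog': FALSE,
    'servers': [
        {"pollname": "imap.ccil.org", 'active': TRUE, 'users': []},
    ],
}
"""

config = load_initializer(sample_dump)
print(config["poll_interval"], config["servers"][0]["pollname"])
```

Executing text as code is only reasonable when the text comes from a trusted source, as it does here, where the producer is the user's own fetchmail binary.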
This wasn't quite the last step in the dance. What I really wanted
wasn't just for fetchmailconf to have the existing configuration, but
to turn it into a linked tree of live objects. There would be three
kinds of objects in this tree:
Configuration (the top-level object
representing the entire configuration), Site (representing one of the
sites to be polled) and User (representing user data attached to a
site). The example file describes five site objects, each with one user
object attached to it.
I had already designed and written the three object classes (that's
what took four days, most of it spent getting the layout of the widgets
just right). Each had a method that caused it to pop up a GUI edit
panel to modify its instance data. My last remaining problem was
somehow to transform the dead data in this Python initializer into live
objects.
I considered writing code that would explicitly know about the
structure of all three classes and use that knowledge to grovel through
the initializer creating matching objects, but rejected that idea
because new class members were likely to be added over time as the
configuration language grew new features. If I wrote the
object-creation code in the obvious way, it would be fragile and tend
to fall out of sync when either the class definitions or the
initializer structure changed.
What I really wanted was code that would analyze the shape and members
of the initializer, query the class definitions themselves about their
members, and then adjust itself to impedance-match the two sets.
This kind of thing is called metaclass hacking and is generally
considered fearsomely esoteric -- deep black magic. Most object-oriented
languages don't support it at all; in those that do (Perl being one),
it tends to be a complicated and fragile undertaking. I had been
impressed by Python's low coefficient of friction so far, but here was
a real test. How hard would I have to wrestle with the language to get
it to do this? I knew from previous experience that the bout was likely
to be painful, even assuming I won, but I dived into the book and read
up on Python's metaclass facilities. The resulting function is shown in
Listing 3, and the code that calls it is in Listing 4.
    def copy_instance(toclass, fromdict):
        # Initialize a class object of given type from a conformant dictionary.
        class_sig = toclass.__dict__.keys(); class_sig.sort()
        dict_keys = fromdict.keys(); dict_keys.sort()
        common = intersect(class_sig, dict_keys)
        if 'typemap' in class_sig:
            class_sig.remove('typemap')
        if tuple(class_sig) != tuple(dict_keys):
            print "Conformability error"
            # print "Class signature: " + `class_sig`
            # print "Dictionary keys: " + `dict_keys`
            print "Not matched in class signature: " + `setdiff(class_sig, common)`
            print "Not matched in dictionary keys: " + `setdiff(dict_keys, common)`
            sys.exit(1)
        else:
            for x in dict_keys:
                setattr(toclass, x, fromdict[x])
Listing 3
    # The tricky part -- initializing objects from the configuration global
    # `Configuration' is the top level of the object tree we're going
    # to mung
    Configuration = Controls()
    copy_instance(Configuration, configuration)
    Configuration.servers = [];
    for server in configuration['servers']:
        Newsite = Server()
        copy_instance(Newsite, server)
        Configuration.servers.append(Newsite)
        Newsite.users = [];
        for user in server['users']:
            Newuser = User()
            copy_instance(Newuser, user)
            Newsite.users.append(Newuser)
Listing 4
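Listings 3 and 4 are written in the Python of 1997 and call two helpers, intersect and setdiff, that are not shown. For readers who want to try the technique today, the following is a sketch of the same idea in current Python, not a transcription of the fetchmailconf source. It swaps in built-in sets for the two helpers and raises ValueError where the original prints a message and exits; the cut-down Server and User classes and the build_server function are hypothetical stand-ins for illustration.

```python
def copy_instance(target, fromdict, skip=("typemap",)):
    """Copy values from fromdict onto target after checking the key sets match."""
    # Attribute names the instance already carries (set up by its __init__).
    class_sig = set(vars(target)) - set(skip)
    dict_keys = set(fromdict)
    if class_sig != dict_keys:
        # Report the mismatch in both directions, much as Listing 3 does.
        raise ValueError(
            "Conformability error: in class but not dictionary %s, "
            "in dictionary but not class %s"
            % (sorted(class_sig - dict_keys), sorted(dict_keys - class_sig))
        )
    for name in dict_keys:
        setattr(target, name, fromdict[name])

class User:
    def __init__(self):
        self.remote = None
        self.localnames = []

class Server:
    def __init__(self):
        self.pollname = None
        self.users = []

def build_server(server_dict):
    """Turn one server dictionary into a Server holding live User objects."""
    site = Server()
    copy_instance(site, server_dict)
    site.users = []                      # replace the raw dicts with objects
    for user_dict in server_dict["users"]:
        user = User()
        copy_instance(user, user_dict)
        site.users.append(user)
    return site

site = build_server({
    "pollname": "imap.ccil.org",
    "users": [{"remote": "esr", "localnames": ["esr"]}],
})
print(site.pollname, site.users[0].remote)
```

Because the check compares whatever attributes the instance has against whatever keys the dictionary has, adding a new field to both a class and the dump requires no change to copy_instance, which is the property the design was after.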
That doesn't look too bad for deep black magic, does it? Thirty-two
lines, counting comments. Just from knowing what I've said about the
class structure, the calling code is even readable. But the size of
this code isn't the real shocker. Brace yourself: this code only took
me about ninety minutes to write -- and it worked correctly the first
time I ran it.
To say I was astonished would have been positively wallowing in
understatement. It's remarkable enough when implementations of simple
techniques work exactly as expected the first time; but my first
metaclass hack in a new language, six days from a cold standing start?
Even if we stipulate that I am a fairly talented hacker, this is an
amazing testament to Python's clarity and elegance of design.
There was simply no way I could have pulled off a coup like this in
Perl, even with my vastly greater experience level in that language. It
was at this point I realized I was probably leaving Perl behind.
Conclusion
This was my most dramatic Python moment. But, when all is said and
done, it was just a clever hack. The long-term usefulness of a language
comes not from its ability to support clever hacks, but from how well and
how unobtrusively it supports the day-to-day work of programming. The
day-to-day work of programming consists not of writing new programs,
but mostly reading and modifying existing ones.
So the real punchline of the story is this: weeks and months after
writing fetchmailconf, I could still read the fetchmailconf code and
grok what it was doing without serious mental effort. And the true
reason I no longer write Perl for anything but tiny projects is that this
was never true when I was writing large masses of Perl code. I fear the
prospect of ever having to modify keeper or anthologize again -- but
fetchmailconf gives me no qualms at all.
Perl still has its uses. For tiny projects (100 lines or fewer) that
involve a lot of text pattern matching, I am still more likely to
tinker up a Perl-regexp-based solution than to reach for Python. For
good recent examples of such things, see the timeseries and growthplot
scripts in the fetchmail distribution. Actually, these are much like
the things Perl did in its original role as a sort of combination
awk/sed/grep/sh, before it had functions and direct access to the
operating system API. For anything larger or more complex, I have come
to prefer the subtle virtues of Python -- and I think you will, too.
All listings referred to in this article are available by anonymous
download in the file
ftp://ftp.linuxjournal.com/pub/lj/listings/issue73/3882.tgz
About the Author
Eric Raymond is a Linux advocate and the author of The Cathedral & The
Bazaar. He can be reached via e-mail at esr@thyrsus.com.