SOBGTR OCCC AILD FUNEX?

August 10th, 2012

Suppose you had to pick a very small set of character strings that you, and only you, could identify without hesitation in a particular way. What would you choose? How small a set could you choose and still be unique? For example, SOBGTR OCCC AILD FUNEX? is a set of strings that I think would uniquely identify me. (My interpretation is below.) I’m pretty sure that almost any subset of 3 of them would suffice. Coming up with a set of two wouldn’t be hard, I don’t think – but it feels risky.

There are 7 billion people on the planet. So if you just pick 3 reasonably obscure acronyms, e.g., things that only 1 person in 2000 would recognize, you’re heading in the right direction (since 2000 cubed is 8 billion). But that’s only if the obscurity of the things you pick is independent. For example, it’s less good to pick 3 computer acronyms from the 1960s than to choose just one of them plus some things from very different areas of your knowledge.
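The arithmetic is easy to check. A quick sketch (assuming, as above, that the obscurity of your strings is independent; the function name is mine, not anything from the post):

```python
# How obscure must each string be so that k independent strings
# jointly identify one person among a 7 billion population?
population = 7_000_000_000

def required_obscurity(k, population=population):
    # If each string is recognized by only 1 person in n, and
    # recognition is independent, k strings narrow things down to
    # roughly population / n**k people. We need n**k >= population,
    # i.e., n >= population ** (1/k).
    return population ** (1 / k)

print(round(required_obscurity(3)))  # ~1913, i.e., 1 in ~2000 per string
print(round(required_obscurity(2)))  # ~83666: each string must be far rarer
```

This is why a set of two feels risky: each string has to be recognizable by only about 1 person in 84,000, which is hard to judge.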

The rules

  1. Each of your strings with its meaning to you must be findable on Google.
  2. To match with you, another person must interpret all your strings the way you do.

Rule 1 prevents you from choosing something like your bank PIN, which only you could possibly know. Without it, everyone could trivially choose a set of one string. The rule makes thinking up a uniquely identifying set for yourself into a game. Because all your strings and their interpretations are on Google, each of them will likely be recognized by someone else in the way you recognize it, so your set will probably need at least 2 strings. You need to choose a set of strings whose interpretations, taken as a whole, make you unique (Rule 2).

Why is this interesting?

I find this interesting for many reasons. It seems clear that uniquely identifying sets are fairly easy to construct for people and they’re very small. Certainly small enough to fit in a tweet. Although it’s easy to make a set for yourself, it’s hard to make one for someone else – you might even argue that by definition it’s not possible. If someone else makes one, you can’t produce their set of interpretations without spending time on Google, and even then you’d probably have to know the person pretty well.

Is there a new authentication scheme here somewhere? It’s tempting to think so, but probably not: since each string and its interpretation is findable on Google, any scheme built on them is almost certain to be less secure than the same scheme built on a set of actual secrets. It’s more of a fun thought exercise (or Twitter game). Still, it’s not hard to imagine some form of authentication. For example, identify which of a set of symbols are special to you (the others being chosen randomly from, say, the set of all acronyms), along with their correct interpretations for you, and do it rapidly. Or if a clone shows up one day, claiming to be you, and you’ve thoughtfully put a sealed set of uniquely identifying strings in your safe, you should be able to convince people that you’re the real you :-)

Answer

Here’s my unhesitating interpretation of the set of 4 strings above:

Remember, to be me you have to get them all. It’s not enough to get a couple, or even three of them.

describejson – a Python script for summarizing JSON structure

August 9th, 2012

Yesterday I was sent a 24M JSON file and needed to look through it to give someone an opinion on its contents. I did what I normally do with JSON: piped it into python -m json.tool. The result looked reasonable. I scrolled through a few screens showing a single long list and jumped to the bottom to see what looked like the end of that list. What I didn’t know at the time was that the output was 495,647 lines long! And there was some important stuff in the middle of the output that I didn’t see at all.

So I decided to write a quick program to recursively summarize JSON. You can grab it from Github at https://github.com/terrycojones/describejson.

Usage is simple, just send it JSON on stdin. Here’s an example input:

{
  "cats": 3,
  "dogs": 6,
  "parrots": 1
}

Which gets you this output:

$ python describejson.py < input.json
1 dict of length 3. Values:
  3 ints

The output is a little cryptic, but you’ll get used to it (and I may improve it). In words, this is telling you that (after loading the JSON) the input contained 1 Python dictionary with 3 elements. The values of the 3 elements are all integers. The indentation is meaningful, of course. You can see that the script is summarizing the fact that the 3 values in the dict were all of the same type.
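The core idea is simple to sketch. Here’s a toy version (my own, much simpler than the real describejson.py) that recursively reports the type and length of each level; unlike the real script, it doesn’t collapse runs of “equal” values into lines like “3 ints”:

```python
import json

def summarize(obj, indent=0):
    # Recursively describe the type and length of each level of a
    # loaded JSON structure. This is a sketch of the idea only; the
    # real describejson.py also collapses values it considers "the
    # same" (according to the --strictness option) into one line.
    pad = '  ' * indent
    if isinstance(obj, dict):
        print('%s1 dict of length %d. Values:' % (pad, len(obj)))
        for value in obj.values():
            summarize(value, indent + 1)
    elif isinstance(obj, list):
        print('%s1 list of length %d. Values:' % (pad, len(obj)))
        for value in obj:
            summarize(value, indent + 1)
    else:
        print('%s1 %s' % (pad, type(obj).__name__))

summarize(json.loads('{"cats": 3, "dogs": 6, "parrots": 1}'))
```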

Here’s another sample input:

[
  ["fluffy", "kitty", "ginger"],
  ["fido", "spot", "rover"],
  ["squawk"]
]

Which gets you:

$ python describejson.py < input.json
1 list of length 3. Values:
  2 lists of length 3. Values:
    3 unicodes
  1 list of length 1. Values:
    1 unicode

In words, the input was a list of length 3. Its contents were 2 lists of length 3 that both contained 3 unicode strings, and a final list that contains just a single unicode string.

Specifying equality strictness

The script currently takes just one option, --strictness (or just -s), to indicate how strict it should be in deciding whether things are “the same” for the purposes of summarizing them. In the above output, the default strictness (length) is used, so the script considers the first two inner lists collapsible in the summary, and prints a separate line for the last list since it’s of a different length. Here’s the output from running with --strictness type:

$ python describejson.py --strictness type < input.json
1 list of length 3. Values:
  3 lists of length 3. Values:
    3 unicodes

The lists are all considered equal here. The output is a little misleading, since it tells us there are 3 lists of length 3, each containing 3 unicodes. I may fix that.

We can also be more strict. Here’s the output from --strictness keys:

$ python describejson.py --strictness keys < input.json
1 list of length 3. Values:
  1 list of length 3. Values:
    3 unicodes
  1 list of length 3. Values:
    3 unicodes
  1 list of length 1. Values:
    1 unicode

The 3 inner lists are each printed separately because their contents differ. The keys argument is also a bit confusing for lists: it just means the list values. It’s clearer when you have dictionaries in the input.

This input:

[
  {
    "a": 1,
    "b": 2,
    "c": 3
  },
  {
    "d": 4,
    "e": 5,
    "f": 6
  }
]

produces

$ python describejson.py < input.json
1 list of length 2. Values:
  2 dicts of length 3. Values:
    3 ints

I.e., one list, containing 2 dictionaries, each containing 3 int values. Note that this is using the default of --strictness length so the two dicts are considered the same. If we run that input with strictness of keys, we’ll instead get this:

$ python describejson.py --strictness keys < input.json
1 list of length 2. Values:
  1 dict of length 3. Values:
    3 ints
  1 dict of length 3. Values:
    3 ints

The dicts are considered different because their keys differ. If we change the input to make the keys the same:

[
  {
    "a": 1,
    "b": 2,
    "c": 3
  },
  {
    "a": 4,
    "b": 5,
    "c": 6
  }
]

and run again with --strictness keys, the dicts are considered the same:

$ python describejson.py --strictness keys < input.json
1 list of length 2. Values:
  2 dicts of length 3. Values:
    3 ints

but if we use --strictness equal, the dicts will be considered different:

$ python describejson.py --strictness equal < input.json
1 list of length 2. Values:
  1 dict of length 3. Values:
    3 ints
  1 dict of length 3. Values:
    3 ints

Finally, making the dicts the same:

[
  {
    "a": 1,
    "b": 2,
    "c": 3
  },
  {
    "a": 1,
    "b": 2,
    "c": 3
  }
]

and running with --strictness equal will collapse the summary as you’d expect:

$ python describejson.py --strictness equal < input.json
1 list of length 2. Values:
  2 dicts of length 3. Values:
    3 ints

Hopefully it’s clear that the less strict the matching, the more concise the output (things are more casually considered “the same”), and the more strict the matching, the more verbose the output, all the way up to using strict equality for both lists and dicts.

Here’s the full set of --strictness options:

  • type: compare things by type only.
  • length: compare lists and objects by length.
  • keys: compare lists by equality, dicts by keys.
  • equal: compare lists and dicts by equality.
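One way to think about these options is as a fingerprint function: two values get collapsed into one summary line exactly when their fingerprints match. Here’s a hypothetical sketch of that idea (my own illustration, not the script’s actual implementation):

```python
import json

def fingerprint(obj, strictness):
    # Two values are summarized together iff their fingerprints match.
    # This is a sketch of the idea only, not describejson.py's code.
    kind = type(obj).__name__
    if strictness == 'type':
        return kind
    if isinstance(obj, (list, dict)):
        if strictness == 'length':
            return (kind, len(obj))
        if strictness == 'keys':
            # Dicts: compare by their keys. Lists: compare by equality.
            return (kind, tuple(obj) if isinstance(obj, dict)
                    else json.dumps(obj))
        if strictness == 'equal':
            return (kind, json.dumps(obj, sort_keys=True))
    # Primitives are compared by type under every strictness setting.
    return kind

a, b = {'a': 1, 'b': 2}, {'a': 9, 'b': 8}
print(fingerprint(a, 'length') == fingerprint(b, 'length'))  # True
print(fingerprint(a, 'keys') == fingerprint(b, 'keys'))      # True
print(fingerprint(a, 'equal') == fingerprint(b, 'equal'))    # False
```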

Improvements

The naming of the --strictness options could be improved. The keys option should probably be called values (but that is confusing, since dictionaries have values and it’s a comparison based on their keys!). A values option should probably also compare the value of primitive things, like integers and strings.

There are quite a few other things I might do to this script, if I ever have time. It would be helpful to print out some keys and values when these are short and unchanging. It would be good to show an example representative value of something that repeats (modulo strictness) many times. It might be good to be able to limit the depth to go into a JSON structure.

Overall though, I already find the script useful and I’m not in a rush to “improve” it by adding features. You can though :-)

You might also find it helpful to take what you learn about a JSON object via describejson and use that to grep out specific pieces of the structure using jsongrep.py.

If you’re curious, here’s the 24-line output summary of the 24M JSON file I received. Much more concise than the nearly half a million lines from python -m json.tool:

1 dict of length 3. Values:
  1 int
  1 dict of length 4. Values:
    1 list of length 17993. Values:
      17993 dicts of length 5. Values:
        1 unicode
        1 int
        1 list of length 0.
        2 unicodes
    1 list of length 0.
    1 list of length 11907. Values:
      11907 dicts of length 5. Values:
        1 unicode
        1 int
        1 list of length 1. Values:
          1 unicode
        2 unicodes
    1 list of length 28068. Values:
      28068 dicts of length 5. Values:
        1 unicode
        1 int
        1 list of length 0.
        2 unicodes
  1 unicode

Autovivification in Python: nested defaultdicts with a specific final type

May 26th, 2012

I quite often miss the flexibility of autovivification in Python. That Wikipedia link shows a cute way to get what Perl has:

from collections import defaultdict

def tree():
    return defaultdict(tree)

lupin = tree()
lupin["express"][3] = "stand and deliver"

It’s interesting to think about what’s going on in the above code and why it works. I really like defaultdict.

What I more often want, though, is not infinitely deep nested dictionaries like the above, but a (known) finite number of nested defaultdicts with a specific type at the final level. Here’s a tiny function I wrote to get just that:

from collections import defaultdict

def autovivify(levels=1, final=dict):
    return (defaultdict(final) if levels < 2 else
            defaultdict(lambda: autovivify(levels - 1, final)))

So let’s say you were counting the number of occurrences of words said by people by year, month, and day in a chat room. You’d write:

words = autovivify(5, int)

words["sam"][2012][5][25]["hello"] += 1
words["sue"][2012][5][24]["today"] += 1

Etc. It’s pretty trivial, but it was fun to think about how to do it with a function and to play with some alternatives. You could do it manually with nested lambdas:

words = defaultdict(lambda: defaultdict(int))
words["sam"]["hello"] += 1

But that gets messy quickly and isn’t nearly as much fun.
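The recursion bottoms out the way you’d hope: every level but the last is a defaultdict of dicts-to-come, and the last defaults to the final type, so reads of missing keys give you a zero rather than a KeyError. A self-contained demonstration (reproducing the function from above):

```python
from collections import defaultdict

def autovivify(levels=1, final=dict):
    # Same function as above, repeated so this snippet runs on its own.
    return (defaultdict(final) if levels < 2 else
            defaultdict(lambda: autovivify(levels - 1, final)))

words = autovivify(5, int)
words["sam"][2012][5][25]["hello"] += 1
words["sam"][2012][5][25]["hello"] += 1

print(words["sam"][2012][5][25]["hello"])  # 2
# Merely reading a path that was never written creates it and
# returns int() == 0, rather than raising KeyError:
print(words["sue"][2012][5][24]["today"])  # 0
```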

The love of pleasure and the love of action

April 28th, 2012

There are two very natural propensities which we may distinguish in the most virtuous and liberal dispositions, the love of pleasure and the love of action. If the former is refined by art and learning, improved by the charms of social intercourse, and corrected by a just regard to economy, to health, and to reputation, it is productive of the greatest part of the happiness of private life. The love of action is a principle of a much stronger and more doubtful nature. It often leads to anger, to ambition, and to revenge; but when it is guided by the sense of propriety and benevolence, it becomes the parent of every virtue, and if those virtues are accompanied with equal abilities, a family, a state, or an empire, may be indebted for their safety and prosperity to the undaunted courage of a single man. To the love of pleasure we may therefore ascribe most of the agreeable, to the love of action we may attribute most of the useful and respectable, qualifications. The character in which both the one and the other should be united and harmonized, would seem to constitute the most perfect idea of human nature. The insensible and inactive disposition, which should be supposed alike destitute of both, would be rejected, by the common consent of mankind, as utterly incapable of procuring any happiness to the individual, or any public benefit to the world. But it was not in this world, that the primitive Christians were desirous of making themselves either agreeable or useful.

Edward Gibbon
From Chapter XV: Progress Of The Christian Religion.
http://ancienthistory.about.com/library/bl/bl_text_gibbon_1_15_5.htm

A more melancholy duty is imposed on the historian

April 8th, 2012

The theologian may indulge the pleasing task of describing Religion as she descended from Heaven, arrayed in her native purity. A more melancholy duty is imposed on the historian. He must discover the inevitable mixture of error and corruption, which she contracted in a long residence upon earth, among a weak and degenerate race of beings.

From Gibbon vol 1 ch 15.

The ruling passion of his soul

February 28th, 2012

Yet Commodus was not, as he has been represented, a tiger born with an insatiate thirst of human blood, and capable, from his infancy, of the most inhuman actions. Nature had formed him of a weak rather than a wicked disposition. His simplicity and timidity rendered him the slave of his attendants, who gradually corrupted his mind. His cruelty, which at first obeyed the dictates of others, degenerated into habit, and at length became the ruling passion of his soul.

Gibbon.

It is almost superfluous to enumerate the unworthy successors of Augustus

February 7th, 2012

It is almost superfluous to enumerate the unworthy successors of Augustus. Their unparalleled vices, and the splendid theatre on which they were acted, have saved them from oblivion. The dark, unrelenting Tiberius, the furious Caligula, the feeble Claudius, the profligate and cruel Nero, the beastly Vitellius, and the timid, inhuman Domitian, are condemned to everlasting infamy. During fourscore years (excepting only the short and doubtful respite of Vespasian’s reign) Rome groaned beneath an unremitting tyranny, which exterminated the ancient families of the republic, and was fatal to almost every virtue and every talent that arose in that unhappy period.

From Gibbon, Decline & Fall of the Roman Empire (vol 1).

Women’s guide to HTTP status codes for dealing with unwanted geek advances

January 21st, 2012

Here’s a women’s guide to the most useful HTTP status codes for dealing with unwanted geek advances. Someone should make and sell a deck of these.


301 MOVED PERMANENTLY
303 SEE OTHER
305 USE PROXY
306 RESERVED
307 TEMPORARY REDIRECT
400 BAD REQUEST
401 UNAUTHORIZED
402 PAYMENT REQUIRED
403 FORBIDDEN
404 NOT FOUND
405 METHOD NOT ALLOWED
406 NOT ACCEPTABLE
407 PROXY AUTHENTICATION REQUIRED
408 REQUEST TIMEOUT
409 CONFLICT
410 GONE
411 LENGTH REQUIRED
412 PRECONDITION FAILED
413 REQUEST ENTITY TOO LARGE
414 REQUEST-URI TOO LONG
415 UNSUPPORTED MEDIA TYPE
416 REQUESTED RANGE NOT SATISFIABLE
417 EXPECTATION FAILED
422 UNPROCESSABLE ENTITY
423 LOCKED
424 FAILED DEPENDENCY
426 UPGRADE REQUIRED
500 INTERNAL SERVER ERROR
501 NOT IMPLEMENTED
502 BAD GATEWAY
503 SERVICE UNAVAILABLE
504 GATEWAY TIMEOUT
507 INSUFFICIENT STORAGE
508 LOOP DETECTED
510 NOT EXTENDED

Source: Hypertext Transfer Protocol (HTTP) Status Code Registry

Emacs buffer mode histogram

November 10th, 2011

Tonight I noticed that I had over 200 buffers open in emacs. I’ve been programming a lot in Python recently, so many of them are in Python mode. I wondered how many Python files I had open, and I counted them by hand. About 90. I then wondered how many were in Javascript mode, in RST mode, etc. I wondered what a histogram would look like, for me and for others, at times when I’m programming versus working on documentation, etc.

Because it’s emacs, it wasn’t hard to write a function to display a buffer mode histogram. Here’s mine:

235 buffers open, in 23 distinct modes

91               python +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
47          fundamental +++++++++++++++++++++++++++++++++++++++++++++++
24                  js2 ++++++++++++++++++++++++
21                dired +++++++++++++++++++++
16                 html ++++++++++++++++
 7                 text +++++++
 4                 help ++++
 4           emacs-lisp ++++
 3                   sh +++
 3       makefile-gmake +++
 2          compilation ++
 2                  css ++
 1          Buffer-menu +
 1                 mail +
 1                 grep +
 1      completion-list +
 1                   vm +
 1                  org +
 1               comint +
 1              apropos +
 1                 Info +
 1           vm-summary +
 1      vm-presentation +

Tempting as it is, I’m not going to go on about the heady delights of having a fully programmable editor. You either already know, or you can just drool in slack-jawed wonder.

Unfortunately I’m a terrible emacs lisp programmer. I can barely remember a thing each time I use it. But the interpreter is of course just emacs itself and the elisp documentation is in emacs, so it’s a really fun environment to develop in. And because emacs lisp has a ton of support for doing things to itself, code that acts on emacs and your own editing session or buffers is often very succinct. See for example the save-excursion and with-output-to-temp-buffer functions below.

(defun buffer-mode-histogram ()
  "Display a histogram of emacs buffer modes."
  (interactive)
  (let* ((totals '())
         (buffers (buffer-list))
         (total-buffers (length buffers))
         (ht (make-hash-table :test 'equal)))
    (save-excursion
      (dolist (buffer buffers)
        (set-buffer buffer)
        (let ((mode-name (symbol-name major-mode)))
          (puthash mode-name (1+ (gethash mode-name ht 0)) ht))))
    (maphash (lambda (key value)
               (setq totals (cons (list key value) totals)))
             ht)
    (setq totals (sort totals (lambda (x y) (> (cadr x) (cadr y)))))
    (with-output-to-temp-buffer "Buffer mode histogram"
      (princ (format "%d buffers open, in %d distinct modes\n\n"
                     total-buffers (length totals)))
      (dolist (item totals)
        (let ((key (car item))
              (count (cadr item)))
          (if (equal (substring key -5) "-mode")
              (setq key (substring key 0 -5)))
          (princ (format "%2d %20s %s\n" count key
                         (make-string count ?+))))))))

Various things about the formatting could be improved. E.g., don’t use fixed-width fields for the count and the mode names, and make each + sign stand for more than one buffer when there are many.
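For non-emacs readers, the same counting-and-sorting pattern is compact in Python too. Here the list of mode names stands in for asking each buffer for its major-mode, which of course only elisp can do from inside the editor:

```python
from collections import Counter

def mode_histogram(mode_names):
    # mode_names stands in for the major-mode of each open buffer.
    # Strip the conventional "-mode" suffix, count, and sort by
    # frequency, mimicking the elisp function's output format.
    counts = Counter(name.removesuffix('-mode') for name in mode_names)
    lines = ['%d buffers open, in %d distinct modes\n' %
             (len(mode_names), len(counts))]
    for name, count in counts.most_common():
        lines.append('%3d %20s %s' % (count, name, '+' * count))
    return '\n'.join(lines)

print(mode_histogram(['python-mode', 'python-mode', 'dired-mode']))
```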

The Grapes of Wrath & Occupy Wall Street

October 31st, 2011

I’m reading The Grapes of Wrath for the first time. I can’t believe it took me so long to finally read it. It’s great.

Below is a section I just ran across that I imagine will resonate strongly with the people involved in Occupy Wall Street. I’ve long been fascinated to watch how power tries to maintain itself by attempting to enforce isolation and to restrict information flow, and, on the contrary, how increased information flow between the subjects of power naturally undermines this basis.

Awareness of these opposing forces, even if not explicitly understood, is what I think accounts for the tenacity and ferocity on both sides of the OWS (and many other) movements, even (especially) when the movements are still only tiny. The occupiers experience the surge of energy and determination and self-identification that comes from solidarity, while those in power recognize the danger and act in heavy-handed ways to try to crush it, usually after trying to ignore and then ridicule.

The consistent characteristic of the reaction against these movements, as Steinbeck notes, is that those in power do not understand what’s going on. So in their efforts to snuff out the protests they instead fan the flames, which they then have to react even more violently to. It seems an extraordinarily difficult task for power to successfully defuse a popular movement without resorting to extremes. Hence the absurd justifications of needing to clean (often already cleaned – by the protesters) public spaces, to make the public spaces once again available to the public, etc. Disperse, ridicule, isolate. If the gentle pretenses do not work, then we’ll do what we can to get rid of or evade the media (in all its forms), and then come in and beat the shit out of you.

So for all those out there in the OWS camps around the world (don’t forget there were protests in almost one thousand cities worldwide), and especially for those in the US, here’s some beautiful Steinbeck:

One man, one family driven from the land; this rusty car creaking along the highway to the West. I lost my land, a single tractor took my land. I’m alone and I am bewildered. In the night one family camps in a ditch and another family pulls in and the tents come out. The two men squat on their hams and the women and children listen. Here’s the node, you who hate change and fear revolution. Keep these two squatting men apart; make them hate, fear, suspect each other. Here is the anlage of the thing you fear. This is the zygote. For here “I lost my land” is changed; a cell is split and from its splitting grows the thing you hate — “we lost our land.” The danger is here, for two men are not as lonely and perplexed as one. And from this first “we” there grows a still more dangerous thing; “I have a little food” plus “I have none”. If from this problem the sum is “we have a little food”, the thing is on its way, the movement has direction. Only a little multiplication now, and this land, this tractor are ours. The two men squatting in a ditch, the little fire, the side-meat stewing in a single pot, the silent, stone-eyed women; behind, the children listening with their souls to words their minds do not understand. The night draws down. The baby has a cold. Here, take this blanket. It’s wool. It was my mother’s blanket — take it for the baby. This is the thing to bomb. This is the beginning — from “I” to “we”.

If you who own the things people must have could understand this, you might preserve yourself. If you could separate causes from results, if you could know that Paine, Marx, Jefferson, Lenin were results, not causes, you might survive. But that you cannot know. For the quality of owning freezes you forever into “I”, and cuts you off forever from the “we”.

The Western states are nervous under the beginning change. Need is the stimulus to concept, concept to action. A half-million people moving over the country; one million more restive, ready to move; 10 million more feeling the first nervousness.

And tractors turning the multiple furrows in the vacant land.

Leaving Barcelona

October 7th, 2011

I’m leaving Barcelona on October 19th and have a bunch of stuff I need to get rid of before then. If you’re interested in anything below, please let me know ASAP. You’ll need to come pick things up in the Born, right next to Santa Maria del Mar. I’ve not put prices on anything. So either make an offer or tell me why I should just give you what you want for free. You can reach me via email to terry at-sign jon dot es.

  • Cheap ironing board
  • Braun iron
  • Vacuum cleaner
  • Dell DN1815 multi-function networked laser printer (black & white). Fax, copy, scan, print. 5 years old, works great.
  • 20" Miyata deluxe (48 spoke) unicycle
  • 26" Semcycle unicycle
  • 6 Renegade juggling clubs
  • Bag of about 15 silicone juggling balls
  • 2 minitorre computers (from about 2002) without hard drives
  • 3 Ikea CD shelves (each holds about 200 CDs)
  • 7 60cm wide x 2.5m tall white Ikea (Billy) bookshelves
  • 1 40cm wide x 2.5m tall white Ikea (Billy) bookshelf
  • 1 30cm wide x 2.5m tall white Ikea (Billy) bookshelf
  • 19" CRT monitor
  • 2 100Mbit ethernet hubs (5 port, 8 port)
  • 5 cable modems (DLink, Cisco, 3Com)
  • 2 Siemens Gigaset AS29H DECT phones, like new, white
  • White wooden Ikea TV/DVD table
  • Massive (3m by 1.2m) wall-mounted whiteboard
  • White Ikea filing cabinet (2 wide roll-out shelves)
  • Green wooden 6-drawer small rolling shelves
  • DVD player with sub-woofer & 5 external speakers
  • Sony CD player with sub-woofer & 2 external speakers
  • Panasonic NVGS230 hand-held video recorder, perfect condition
  • K2 rollerblades 6000 series, good condition, size 41/42
  • 5-wheel speed skates, size 41/42
  • Philips toaster
  • Large wooden Ikea cutting board
  • Kettle
  • Electric juice extractor
  • Hand-held electric blender
  • Barcelona apartment floor tiles. I have about 20 that I’ve accumulated over the years.

fishus

August 16th, 2011

Date: Tue, 26 Sep 95 15:55 MDT
From: mosterin@hydra.unm.edu (Ana Mosterin)
To: dsmith@cs.unm.edu

well, you should love wild cooking too
you have to find the right attitude:

you have to be sensitive enough
to feel the fear and shudder a bit at what you’re doing
and to love your piece of fishus enough
to touch it and smell it
with patience and lust
and then aaaaaaaarrrh! sacrifice it
and chop it skillfully
and be matter-of-fact enough
to to act like you’ve done it before
and professionally dry your hands
with your apron
and and have your hands on your hip
as you listen and smell to
the sound of the frying
breath in through your nose
as you watch the pan with love and think
“no, no more garlic,
just a half-cup of wine”
and relax!
it’s the ferocious poetry
of the wild cooking job
and then eating it will be twice as lovely
you’ll see

hey, derek,
cooking is not mary poppins!

La Storia di San Michele

August 8th, 2011

Image: Villa San Michele

[Written in 2003, as the first of a two-part story of a remarkable connection. Here’s part two.]

Axel Munthe

In 1928, Axel Munthe, a Swedish physician living on the isle of Capri, published The Story of San Michele. Munthe’s villa on the slopes of Mount Barbarossa stands on a site chosen almost two thousand years earlier by the emperor Tiberius, who from tiny Capri held sway over the entire Roman empire. Extraordinarily beautiful, the island passed at various times through the hands of the Greeks, the Romans (Caesar Augustus was captivated), the Duchy of Naples, the Saracens, the Longobards, the Normans, the Angevins, the Aragonese, the Spanish, and the Bourbons.

On completing his medical studies, Munthe was the youngest physician in Europe. The Story of San Michele describes his time in Paris and Rome, his years as the physician to the Swedish Royal family and later his years as private physician to the queen of Sweden, who had also taken a liking to Capri. Written in English, The Story of San Michele, which remains in print, was an instant success, becoming the best-selling non-fiction book in the U.S. in 1930. Munthe’s novel approach to medicine and the book’s mixture of adventure, treasure, and royalty continue to inspire. The Story of San Michele was the mysterious target of one Henry Arthur Harrington, a petty thief who crisscrossed the UK, stealing 1,321 copies from second-hand bookstores before his eventual arrest in 1982. Even in 2003, Munthe’s contributions are the subject of learned attention: the Second International Symposium on Axel Munthe’s life and work will be held in Sweden tomorrow (September 13).

With the rapid success of The Story of San Michele, the book was a natural target for would-be translators. Editions in several languages were soon completed. Given its origin, it was odd that such a popular book was not more quickly translated into Italian.

Patricia Volterra

Living in Florence, Patricia Volterra was fascinated by the book and was eager for her husband Gualti to read it too. A minor obstacle: Gualti did not speak English. Undeterred, Volterra decided to translate the book into Italian. She wrote to John Murray, the publisher, requesting permission. To her surprise, she received a reply directly from Munthe. From Volterra’s diary, Munthe told her that:

the book had already been translated into several languages and was selling like wildfire. To date he had refused offers for it to be translated into Italian as, he wrote, this language, when written, was apt to become too flowery and overloaded and that he had written the book in an extremely simple style which he wished to retain. However, he continued, he suggested I should translate the last chapter, which he considered the most difficult, and send it to him to the Torre Matterita at Anacapri. He would then let me know whether he thought he could permit me to translate the rest.

Volterra sent off her translation of the final chapter and spent several weeks waiting for an answer. Finally her manuscript was returned “with an extremely complimentary letter from Munthe, telling me to proceed to do the rest.” Later she wrote that at that time nothing seemed impossible to her but that now she wouldn’t have even considered the translation.

While working on the translation, she had lunch with Munthe in Rome when Gualti, an Italian concert pianist, was playing at the Augusteum. Munthe was staying at Villa Svezia, the Queen of Sweden’s residence on the Via Aldovrandi. When Munthe saw her he exclaimed ‘My goodness, how old are you?’ She: ‘Twenty three.’ He: ‘And you are translating San Michele!’ Munthe was over 70 at the time.

Volterra sent the work to an Italian publisher, Mondadori, who refused her. “Their great loss,” she wrote. Another, Treves, accepted. Munthe “had decreed that the entire royalties should go to the Society for the Protection of Animals in Naples.” Volterra was to sell her translation for whatever she could get for it. This amounted to the equivalent of 50 pounds sterling for 8 months work.

Later that spring, Volterra traveled to Capri. In a horse-drawn cab they drove to Anacapri where they visited San Michele. From there on foot through the olives to the Torre di Materita to have lunch with Munthe. A variety of his dogs scampered round his heels as he showed them the old tower which was then his home. They had a vegetarian lunch served by Rosina, so affectionately mentioned by Munthe in his book.

The Volterra translation ran quickly into 35 editions and was still selling well when she left Italy in 1938. Mussolini was so impressed by La Storia di San Michele that he passed a law prohibiting the shooting of migratory birds on Capri.

Volterra saw Munthe one final time, in Jermyn Street, London. Munthe died in 1949, leaving the villa of San Michele to Sweden. Owned today by the Swedish Munthe Foundation, it is home to an ornithological research center and is open to the public.

[Continued in part two, "Bob Arno".]

Bob Arno

August 8th, 2011

Image: ABC Tasmania

[Written in 2003, this is the 2nd part of the story of a remarkable connection. You’ll need to read part one for the set up.]

For the last seven years, I’ve kept a web page full of people’s email about street scams they’ve been involved in (as victims) in Barcelona.

In the beginning I just wrote down brief descriptions of things that I saw or was involved in soon after moving to Spain. I’d seen hardly any street crime in my (then) 33 years and I found it fascinating to watch for. It certainly wasn’t hard to find. Often it came right to my door or to the street under my balcony. Before long I began to receive email from others who had visited or lived in Barcelona, each with their own story to tell. I put the stories onto the web page and they soon outnumbered my own. I continue to receive a few emails a month from people who’ve read the web page (generally after being robbed, though sometimes before leaving on a trip). I don’t often reply to these emails, apart from a line or two to say thanks when I put their messages on the web page, often months after they mail me.

For whatever reason, I’ve never been very interested to meet these people, though I’ve had plenty of chances to. In general I don’t seem to have much interest in meeting new people – it’s quite rare that I do. I should probably be more sociable (or something) because once in a while the consequences are immediately extraordinary.

Among my email, I get occasional contacts from people in the tourism industry. Lonely Planet, Fodor’s, people writing books or running travel services or web sites. Mainly they want to know if they can link to the web page, or to use some of the content in their own guides. I always agree without condition. After all, the main (but not the only) point is to help people be more aware, and besides, the majority of the content was written by other people who clearly share the same advisory aim. With this attention from various professionals who are trying to pass on the information, I began to wonder how many such people there were. Maybe there were other people with web sites devoted to street crime. So once in a while I’d do a web search on “street scams”, or something similar, just to see what came up. It’s usually interesting.

On July 30th 2001, I went looking around for similar web sites and ran across Bob Arno. I took a quick look around and fired off an email to say hello, and offered to buy him a beer the next time he was in Barcelona:

    Hi Bob

    I was just having a wander around the web when I ran into your
    pages about pickpockets. They look good, very useful.

    You might be interested to see a page of my own: http://jon.es/barna/scams.html

    All about things that have happened to people in Barcelona. It's
    not too well organized, but there's a lot of it. Most of it falls
    into well known classes of petty crime. Things are getting worse
    here, with the most recent tactics being strangulation from behind
    and squirting a flammable liquid onto people's backs and then, you
    guessed it, setting them on fire.

    Let me know next time you're in Barcelona and I'll buy you a
    beer. I'm also in Manhattan very often.

    Regards,
    Terry Jones.

Bob looked very interesting, and we seemed to have the same point of view on street crime. He’s a seasoned professional, a Vegas showman, and is constantly traveling the world studying many forms of crime and passing on his knowledge. Check out his website.

I sent mail to Derek, passing on Bob Arno’s URL. I said a little of how funny and random it seemed to me, of how over all the years of doing different things and meeting any number of famous and high-powered academics and intellectuals etc., and not really having much interest in any of them, that I’m sending email to this Bob Arno guy suggesting we meet up.

The next day I read more about Bob’s exploits and interests and I guessed that we would probably get on really well. I sent off a longer email with some more of my observations about Barcelona:

    Hi again.

    I sent off that first email without having looked at more than a
    page or two of your web site.

    It's very interesting to read more. I spend far too much time
    thinking about and watching for petty thieves in Barcelona. I've
    thought about many of the issues touched on in the interview with
    you by your own TSJ. The whole thing is very intriguing and lately
    I've begun to wonder increasingly what I can do about it, and if I
    want to do anything about it. I have tended to act to try to stop
    pickpockets, but I've also seen things many times from a distance
    or a height, read many things, seen freshly robbed people weeping,
    talked to many people who have been robbed, thought of this as an
    art (I'm interviewed in a Barcelona newspaper under the headline
    "Some crimes are a work of art" - I'm not sure if they understood
    what I meant), etc. I've never tried filming these people. But I
    know how they look at you when they know they have been spotted,
    how their faces look when the wallet hits the floor, how they prey
    on Western or "rich" psychology, and so many other things.  My
    focus has been Barcelona, after coming to live here 5 years ago
    and (at that time) having an apartment 1 floor up about 100 meters
    from Plaza Real. If I had had a net I could have caught people
    several times a day.

    I recently got a video camera and was thinking of interviewing the
    woman on my web site who was strangled here earlier this month. By
    the way, the papers reported up to 9 cases of such stranglings in a
    single day. I wasn't quite sure what to do with the tape. It hadn't
    occurred to me to film the thieves, but it would be so easy.  In
    Barcelona it's trivial to spot these people, and also feels very
    safe since many of them have been arrested literally hundreds of
    times.  There is basically no deterrent. There are undoubtedly more
    sophisticated pickpockets here too, but there is little in the way
    of evolutionary pressure to make them improve their methods. The
    tourists are too many and too unaware, the police are too few, and
    the laws are too slack. Why would you even bother to improve or
    think?

    I also know the boredom that comes with professional acts. I used to
    do a lot of juggling and unicycling, practicing 6 hours a day for a
    long time. But I could never stand to have a canned show that I did
    time after time - it was just too routine to have a routine. So I
    refused and eventually drifted into other things.

    How can I get a copy of your book? It doesn't seem to say on the web
    site. Also, the menu of links at the top left of your pages looks
    extremely garbled under my browser (Opera).

    Terry

As it turned out, my timing was perfect. I got a mail back the next day from Bob’s wife Bambi (yes, really). She said they’d be in Barcelona in just 5 days’ time and that they’d love to meet up.

And meet up we did!

They came to our apartment and we all hit it off immediately. As I’d thought, we did have a lot in common, both in terms of what we had done and in outlook. They told me they also get lots of email through their web site and hardly ever reply. Ana and I took them out for food. We sat outside at the nearby Textile Museum. Later, Ana went home to look after Sofia, and I stayed with Bob and Bambi. In the end I was with them about five hours and I had a really good time. We arranged to meet the next day to go hunting for thieves on the Ramblas. In one sense, “hunting” isn’t at all the right word: the thieves are typically very obvious to anyone who’s actually paying attention. But there’s a lot of subtlety in tracking and filming them, so it really is something like a hunt. I’ve since spent many hours, on several occasions, in action with Bob and Bambi in Barcelona. But that’s another story.

After getting home that first night, I went back to Bob’s web site and read more of his pages. He’s had a pretty colorful life. Actually, it’s extraordinarily colorful by almost any measure. “Who is this Bob Arno?” I wondered. Fortunately, Bob has a “Who is Bob Arno?” page, which I finally got around to reading.

Halfway down… unbelievable… I want to cry.

    Born in Sweden, Bob Arno is a great-grandson of Dr. Axel Munthe,
    who is most famous for his novel The Story of San Michele.

Patricia Volterra was my great aunt.

txdpce: a Twisted class for deferred parallel command execution

July 12th, 2011

I just uploaded a simple Twisted Python class, txdpce, to Launchpad. It’s designed for situations where you have multiple ways of obtaining a function result and you want to try them all in parallel and return the result from the fastest function. A typical case is that you can either compute a result via a network call or try to get it out of a cache (perhaps also via a network call, to memcached). You might also be able to compute it locally, etc.

Things can be more complicated than provided for here, of course. E.g., you might like to cache the result of a local call (if it finishes first or if the cache lookup fails). The txdpce class is supposed to be a simple demonstration. I wrote it for a bit of fun this morning and also because it’s yet another nice example of how you can click together the building blocks of Twisted to form increasingly sophisticated classes.

Here’s the class. You’ll find a test suite at the Launchpad site. You can download the code using bzr via bzr branch lp:txdpce.

from twisted.internet import defer
from twisted.python import failure

class ParallelCommandException(Exception):
    pass

class DeferredParallelCommandExecutor(object):
    """
    Provides a mechanism for the execution of a command by multiple methods,
    returning the result from the one that finishes first.
    "
""

    def __init__(self):
        self._functions = []

    def registerFunction(self, func):
        """
        Add func to the list of functions that will be run when execute is
        called.

        @param func: A callable that should be run when execute is called.
        """
        self._functions.append(func)

    def execute(self, *args, **kwargs):
        """
        Run all the functions in self._functions on the given arguments and
        keyword arguments.

        @param args: Arguments to pass to the registered functions.

        @param kwargs: Keyword arguments to pass to the registered functions.

        @raise RuntimeError: if no execution functions have been registered.

        @return: A C{Deferred} that fires when the first of the functions
        finishes.
        """
        if not self._functions:
            raise RuntimeError('No execution functions have been registered.')

        deferreds = [defer.maybeDeferred(func, *args, **kwargs)
                     for func in self._functions]
        d = defer.DeferredList(deferreds, fireOnOneCallback=1, consumeErrors=1)
        d.addCallback(self._done, deferreds)
        return d

    def _done(self, deferredListResult, deferreds):
        """
        Process the result of the C{DeferredList} execution of all the
        functions in self._functions. If result is a tuple, it's the result
        of a successful function, in which case we cancel all the other
        deferreds (none of which has finished yet) and give back the
        result.  Otherwise, all the function calls must have failed (since
        we passed fireOnOneCallback=True to the C{DeferredList}), and we
        return a L{ParallelCommandException} containing the failures.

        @param deferredListResult: The result of a C{DeferredList} returned
        by self.execute firing.

        @param deferreds: A list of deferreds for other functions that were
        trying to compute the result.

        @return: Either the result of the first function to successfully
        compute the result or a C{failure.Failure} containing a
        L{ParallelCommandException} with a list of the failures from all
        functions that tried to get the result.
        """
        if isinstance(deferredListResult, tuple):
            # A tuple result means the DeferredList fired with a successful
            # result.  Cancel all other deferreds and return the result.
            result, index = deferredListResult
            for i in range(len(self._functions)):
                if i != index:
                    deferreds[i].cancel()
            return result
        else:
            # All the deferreds failed. Return a list of all the failures.
            failures = [fail for (result, fail) in deferredListResult]
            return failure.Failure(ParallelCommandException(failures))
 

A resizable dispatch queue for Twisted

June 27th, 2011

In December 2009 I posted to the Twisted mailing list about what I called a Resizable Dispatch Queue. I’ve just spent some time making a new version that’s much improved. You can pick up the new version via pip install txrdq, from PyPI, or from the txRDQ project page on Launchpad. Here’s the example use case, taken from my original posting:

You want to write a server with a web interface that allows people to enter their phone number so you can send them an SMS. You anticipate lots of people will use the service. But sending SMS messages is quite slow, and the company that you ship those jobs off to is concerned that you’ll overrun their service (or maybe they have an API limit, etc). So you need to queue up jobs locally and send them off at a certain rate. You’d like to be able to adjust that rate up or down. You also want to be able to shut your service down cleanly (i.e., not in the middle of a task), and when you restart it you want to be able to re-queue the jobs that were queued last time but which hadn’t gone out.

To make the example more concrete, suppose your function that sends the SMS is called sendSMS and that it takes a (number, message) tuple argument. Here are some of the kinds of things you can do:

from txrdq.rdq import ResizableDispatchQueue

# Create a queue that will allow 5 simultaneous underway jobs.
rdq = ResizableDispatchQueue(sendSMS, 5)

# Later... send off some SMS messages.
d1 = rdq.put((2127399921, 'Hello...'), priority=5)
d1.addCallback(...)

d2 = rdq.put((5052929919, 'Test...'), priority=5)
d2.addCallback(...)

# Cancel the second job
d2.cancel()

# Widen the outgoing pipeline to 10 simultaneous jobs.
rdq.width = 10

# We're dispatching jobs too fast, turn it down a little.
rdq.width = 7

# Get a copy of the list of pending jobs.
jobs = rdq.pending()

# Cancel one of the pending jobs from the jobs list.
job.cancel()

# Reprioritize one of the pending jobs from the jobs list.
rdq.reprioritize(job, -1)

# Arrange to increase the queue width to 20 in one hour.
reactor.callLater(3600, rdq.setWidth, 20)

# Pause processing.
rdq.pause()

# Resume processing, with a new width of 8.
rdq.resume(8)

# Shutdown. Wait for any underway jobs to complete, and save
# the list of jobs not yet processed.

def saveJobs(jobs):
    pickle.dump(jobs, ...)

d = rdq.stop()
d.addCallback(saveJobs)
 

I’ve come up with many uses for this class over the last 18 months, and have quite often recommended it to people on the #twisted IRC channel. Other examples include fetching a large number of URLs in a controlled way, making many calls to the Twitter API, etc.

Usage of the class is very simple. You create the dispatch queue, giving it a function that will be called to process all jobs. Then you just put job arguments as fast as you like. Each call to put gets you a Twisted Deferred instance. If your function runs successfully on the argument, the deferred will call back with an instance of txrdq.job.Job. The job contains information about when it was queued, when it was launched, when it stopped, and of course the result of the function. If your function hits an error or the job is canceled (by calling cancel on the deferred), the deferred will errback and the failure will again contain a job instance with the details.

It’s also useful to have an admin interface to the queue, so calls such as pending and underway are provided. These return lists of job instances. You can call cancel on a job instance too. You can reprioritize jobs. And you can pause processing or shut the queue down cleanly.

The code also contains two independently useful Twisted classes: DeferredPriorityQueue (which I plan to write about) and DeferredPool (which I described earlier).

Back of the envelope calculations with The Rule of 72

June 20th, 2011

Image: internetworldstats.com

The Rule of 72 deserves to be better known among technical people. It’s a widely known financial rule of thumb for understanding and calculating interest rates. But others, including computer scientists and start-up founders, are often concerned with growth rates. Knowing and applying the rule of 72 can help in developing numerical literacy (numeracy) around growth.

For example, consider Moore’s Law, which describes how "the number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years." If something doubles every two years, at what rate does it increase per month, on average? If you know the rule of 72, you’ll instantly know that the monthly growth rate is about 3%. You get the answer by dividing 72 by 24 (the number of months).

Computer scientists are usually very familiar with powers of two. It’s often convenient to take advantage of the fact that 2^10 is about 1,000. That means that when something increases by a factor of 1,000, it has doubled about 10 times. By extension, and with a little more error, an increase of a million corresponds to 20 doublings, and a billion is 30 doublings (log base two of a billion is actually 29.897, so the error isn’t too wild). You can use this to ballpark the number of doublings in a process really easily, and go directly from that to a growth rate using the rule of 72.

For example, the bottom of this page tells us that there were about 16,000 internet domains on July 1st 1992, and 1.3M of them on July 1st 1997. Let’s think in thousands: that’s a jump from 16 to just over 1,000 in 5 years. To get from 1 to 16 is four doublings, so from 16 to 1,000 is six doublings (because 1,000 is ten doublings from 1). So the number of domains doubled 6 times in 5 years, or 6 times in 60 months, or once every 10 months (on average). If you want something to double in 10 months, the rule of 72 tells us we need a growth rate of 7.2% per month. To check: 16,000 * (1.072 ^ 60) = 1,037,067. That’s a damned good estimate (remember that we were shooting for 1M, not 1.3M) for five seconds of mental arithmetic! Note that the number of domains was growing much faster than Moore’s law (3% per month).

You can quickly get very good at doing these sorts of calculations. Here’s another easy example. This page shows the number of internet users growing from 16M in December 1995 to 2,072M in March of 2011. That’s just like the above example, but it’s 7 doublings in 15.25 years, or 183 months. That’s pretty close to a doubling every 24 months, which we know from above corresponds to 3% growth per month.

You can use facility with growth rates to have a good sense for interest rates in general. You can use it when building simple (exponential) models of product growth. E.g., suppose you’re launching a product and you reckon you’ll have 300K users in a year’s time. You want to map this out in a spreadsheet using a simple exponential model. What should the growth rate be? 300K is obviously not much more than 256 * 1,024, which is 18 doublings in 365 days, or a doubling roughly every 20 days. The rule of 72 gives 72/20 = ~3.5, so you need to grow 3.5% every day to hit your target. Is that reasonable? If it is, it means that when you hit 300K users, you’ll be signing up about 3.5% of that number, or 10,500 users per day. As you can see, familiarity with powers of two (i.e., estimating number of doublings) and with the rule of 72 can give you ballpark figures really easily. You can even use your new math powers to avoid looking stupid in front of VCs.

The math behind the rule of 72 is easy to extend to triplings (rule of 110), quadrupling (rule of 140), quintupling (rule of 160), etc.

Finally, you can use these rules of thumb to do super geeky party tricks. E.g., what’s the tenth root of two? Put another way, what interest rate do you need for something to double after ten periods? The rule of 72 tells you it’s 72/10 = 7.2%, so the tenth root of two will be about 1.072 (in fact 1.072 ^ 10 = 2.004). What’s the 20th root of 5? The rule of 160 tells you you need 160/20 = 8% growth each period, so 1.08 should be about right (the correct answer is ~1.0838).
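If you'd like to sanity-check these rules numerically, a few lines of Python will do it (the function names here are just for illustration):

```python
import math

def rule_of_72(rate_percent):
    # Rule-of-72 estimate of the number of periods to double
    # at rate_percent growth per period.
    return 72.0 / rate_percent

def exact_doubling(rate_percent):
    # Exact number of periods: solve (1 + r/100) ** n == 2 for n.
    return math.log(2) / math.log(1 + rate_percent / 100.0)

for r in (1, 3, 7.2, 10):
    print('%4.1f%% -> rule: %5.1f  exact: %5.2f' %
          (r, rule_of_72(r), exact_doubling(r)))

# The party tricks: tenth root of 2 and twentieth root of 5.
print(round(2 ** (1 / 10.0), 4))   # 1.0718
print(round(5 ** (1 / 20.0), 4))   # 1.0838
```

The rule is most accurate for rates in the middling range (roughly 4-12%); at 1% the exact answer is nearer 70 periods than 72.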

As with all rules of thumb, it’s good to have a sense of when it’s most applicable. See the wikipedia page or this page for more detailed information. It’s also of course good to understand that it may not be suitable to model growth as an exponential at all.

How to asynchronously exchange a dictionary using Twisted deferreds

June 10th, 2011

Here’s a fun class that I can’t think of a good use for :-) But I like its simplicity and it’s another illustration of what I like to call asynchronous data structures.

Suppose you have a producer who’s building a dictionary and a consumer who wants to look things up in it. The producer is working at their own pace, making new dictionary entries in whatever order suits them, and the consumer is requesting dictionary items in whatever order they need them. The two orders are obviously extremely unlikely to be the same if the dictionary is of non-trivial size. How do you write an asynchronous server (or data structure) that sits between the producer and consumer?

Yes, it’s far-fetched, perhaps, but here’s a simple asynchronous dictionary class that lets you do it gracefully:

from collections import defaultdict
from twisted.internet import defer

class DeferredDict(dict):
    def __init__(self, *args, **kwargs):
        self._deferreds = defaultdict(set)
        dict.__init__(self, *args, **kwargs)

    def __getitem__(self, item):
        try:
            return defer.succeed(dict.__getitem__(self, item))
        except KeyError:
            d = defer.Deferred()
            self._deferreds[item].add(d)
            return d

    def __setitem__(self, item, value):
        if item in self._deferreds:
            for d in self._deferreds[item]:
                d.callback(value)
            del self._deferreds[item]
        dict.__setitem__(self, item, value)
 

When a consumer tries to get an element from the dictionary, they always get a deferred. The deferred will fire with the value from the dictionary when (if) it becomes available. Of course if the value is already known, they get it in a deferred that has already fired (via succeed). When the producer puts an element into the dictionary, any consumer deferreds that were waiting on that element’s value are given the value.

Note that a __delitem__ isn’t needed; we just inherit that from dict. If a non-existent item is deleted, you get the normal dictionary behavior (a KeyError). If the item does exist, the list of waiting deferreds for it must be empty (the item’s existence means any waiting deferreds for it have all been fired and its entry in the self._deferreds dictionary was deleted), so we can just let the dictionary class delete the item, as usual.

Graceful shutdown of a Twisted service with outstanding deferreds

June 10th, 2011

I’ve been spending a bit of time thinking again about queues and services. I wrote a Twisted class in 2009 to maintain a resizable dispatch queue (code in Launchpad, description on the Twisted mailing list). For this post I’ve pulled out (and simplified slightly) one of its helper classes, a DeferredPool.

This simple class maintains a set of deferreds and gives you a mechanism to get a deferred that will fire when (if!) the size of the set ever drops to zero. This is useful because it can be used to gracefully shut down a service that has a bunch of outstanding requests in flight. For each incoming request (that’s handled via a deferred), you add the deferred to the pool. When a signal arrives to tell the service to stop, you stop taking new requests and ask the pool for a deferred that will fire when all the outstanding deferreds are done, then you exit. This can all be done elegantly in Twisted, the last part by having the stopService method return the deferred you get back from the pool (perhaps after you add more cleanup callbacks to it).

Here’s the code:

from twisted.internet import defer

class DeferredPool(object):
    """Maintains a pool of not-yet-fired deferreds and gives a mechanism to
    request a deferred that fires when the pool size goes to zero."""

    def __init__(self):
        self._pool = set()
        self._waiting = []

    def _fired(self, result, d):
        """Callback/errback each pooled deferred runs when it fires. The
        deferred first removes itself from the pool. If the pool is then
        empty, fire all the waiting deferreds (which were returned by
        notifyWhenEmpty)."""
        self._pool.remove(d)
        if not self._pool:
            waiting, self._waiting = self._waiting, []
            for waiter in waiting:
                waiter.callback(None)
        return result

    def add(self, d):
        """Add a deferred to the pool."""
        d.addBoth(self._fired, d)
        self._pool.add(d)

    def notifyWhenEmpty(self, testImmediately=True):
        """Return a deferred that fires (with None) when the pool empties.
        If testImmediately is True and the pool is empty, return an already
        fired deferred (via succeed)."""
        if testImmediately and not self._pool:
            return defer.succeed(None)
        else:
            d = defer.Deferred()
            self._waiting.append(d)
            return d
 

As usual I’m posting this example because I find Twisted’s deferreds so elegant. Here are a few comments on the above that might help you understand deferreds better.

A frequent pattern when creating and managing deferreds is that you can add callbacks and errbacks to them yourself to transparently do some housekeeping when they fire. In this case, for each deferred passed to add, I’m adding a callback and an errback that will run self._fired when the deferred fires. The first thing that method does is take the deferred out of the pool of outstanding deferreds. So the deferred itself cleans up the pool. It does that transparently, by which I mean that the call/errback function (self._fired) always returns whatever result it was passed. It’s on both the callback and errback chains of the deferred and has no effect on the result. The deferred may already have call/errbacks on it when it is passed to add, and it may have them added to it after add is done. Whoever created and is otherwise using the deferred will be none the wiser and is in no way affected.

When a deferred in the pool fires, it also checks to see if the pool size is zero and if there are any deferreds waiting to be informed of that. If so, it fires all the waiting deferreds and empties the list of waiting deferreds. This doesn’t mean the action is necessarily over. More deferreds can be added, more waiters can be added, etc. The pool size can go to zero again, and if no waiters are waiting, no big deal, etc.

It’s easy to add functionality to e.g., record what time deferreds were added, provide stats, allow outstanding deferreds to be cancelled, add notifications when high/low water marks are reached, etc. But that’s enough for now. Feel free to ask questions below.

The eighty six non-trivial powers ≤ 2^20

March 30th, 2011

Tonight Jamu Kakar mentioned in IRC that a program of his had unexpectedly crashed after processing 1,048,376 items. I think being able to recognize numbers like that (it’s very close to 2^20) is a useful debugging skill. I’ve often wanted to write a tiny program to print out all the non-trivial powers, and since I have far more important and pressing things to be doing, I immediately went to write the code. At a minimum it seems prudent to recognize all powers up to 1000, and the powers of 2 to much higher. Below you have all 86 non-trivial powers up to 2^20. I don’t know them all, but I wish I did.

  4 = 2^2                  729 = 3^6, 9^3                32768 = 2^15, 8^5
  8 = 2^3                 1000 = 10^3                    38416 = 14^4
  9 = 3^2                 1024 = 2^10, 4^5               46656 = 6^6
 16 = 2^4, 4^2            1296 = 6^4                     50625 = 15^4
 25 = 5^2                 1331 = 11^3                    59049 = 3^10, 9^5
 27 = 3^3                 1728 = 12^3                    65536 = 2^16, 4^8, 16^4
 32 = 2^5                 2048 = 2^11                    78125 = 5^7
 36 = 6^2                 2187 = 3^7                     83521 = 17^4
 49 = 7^2                 2197 = 13^3                   100000 = 10^5
 64 = 2^6, 4^3, 8^2       2401 = 7^4                    104976 = 18^4
 81 = 3^4, 9^2            2744 = 14^3                   117649 = 7^6
100 = 10^2                3125 = 5^5                    130321 = 19^4
121 = 11^2                3375 = 15^3                   131072 = 2^17
125 = 5^3                 4096 = 2^12, 4^6, 8^4, 16^3   160000 = 20^4
128 = 2^7                 4913 = 17^3                   161051 = 11^5
144 = 12^2                5832 = 18^3                   177147 = 3^11
169 = 13^2                6561 = 3^8, 9^4               248832 = 12^5
196 = 14^2                6859 = 19^3                   262144 = 2^18, 4^9, 8^6
216 = 6^3                 7776 = 6^5                    279936 = 6^7
225 = 15^2                8000 = 20^3                   371293 = 13^5
243 = 3^5                 8192 = 2^13                   390625 = 5^8
256 = 2^8, 4^4, 16^2     10000 = 10^4                   524288 = 2^19
289 = 17^2               14641 = 11^4                   531441 = 3^12, 9^6
324 = 18^2               15625 = 5^6                    537824 = 14^5
343 = 7^3                16384 = 2^14, 4^7              759375 = 15^5
361 = 19^2               16807 = 7^5                    823543 = 7^7
400 = 20^2               19683 = 3^9                   1000000 = 10^6
512 = 2^9, 8^3           20736 = 12^4                  1048576 = 2^20, 4^10, 16^5
625 = 5^4                28561 = 13^4

I produced the above with this quick hack:

from collections import defaultdict

powers = defaultdict(list)
lim = 20

for a in range(2, lim + 1):
    for b in range(2, lim + 1):
        n = a ** b
        if n > 2 ** lim:
            break
        powers[n].append((a, b))

for n in sorted(powers.keys()):
    print '%7d = %s' % (n,
                        ', '.join('%d^%d' % (a, b)
                                  for (a, b) in powers[n]))