Archive for June, 2009

Python code for retrieving all your tweets

Wednesday, June 24th, 2009

Here’s a little Python code to retrieve all of a user’s tweets. Make sure you read the notes at the bottom if you want to use it.

import sys, twitter, operator
from dateutil.parser import parse

twitterURL = 'http://twitter.com'

def fetch(user):
    data = {}
    api = twitter.Api()
    max_id = None
    total = 0
    while True:
        statuses = api.GetUserTimeline(user, count=200, max_id=max_id)
        newCount = ignCount = 0
        for s in statuses:
            if s.id in data:
                ignCount += 1
            else:
                data[s.id] = s
                newCount += 1
        total += newCount
        print >>sys.stderr, "Fetched %d/%d/%d new/old/total." % (
            newCount, ignCount, total)
        if newCount == 0:
            break
        # Next page: ask for tweets no newer than the oldest one in this batch.
        max_id = min([s.id for s in statuses])
    return data.values()

def htmlPrint(user, tweets):
    for t in tweets:
        t.pdate = parse(t.created_at)
    key = operator.attrgetter('pdate')
    tweets = sorted(tweets, key=key)
    f = open('%s.html' % user, 'wb')
    print >>f, """<html><title>Tweets for %s</title>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <body><small>""" % user
    for i, t in enumerate(tweets):
        print >>f, '%d. %s <a href="%s/%s/status/%d">%s</a><br/>' % (
            i, t.pdate.strftime('%Y-%m-%d %H:%M'), twitterURL,
            user, t.id, t.text.encode('utf8'))
    print >>f, '</small></body></html>'
    f.close()

if __name__ == '__main__':
    user = 'terrycojones' if len(sys.argv) < 2 else sys.argv[1]
    data = fetch(user)
    htmlPrint(user, data)
 

Notes:

The script fetches all of a user’s tweets and writes them to a file username.html, where username is given on the command line (it defaults to terrycojones if you give none).

Output goes to a file rather than to stdout because tweet texts are unicode and sys.stdout.encoding is ascii on my machine, which prevents printing non-ASCII characters.
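
A minimal sketch of the same idea in isolation: writing the text to a file opened with an explicit UTF-8 encoding sidesteps whatever sys.stdout.encoding happens to be. The helper name write_lines is made up for illustration, and io.open is used on the assumption that the snippet should behave the same on Python 2.6+ and Python 3.

```python
import io

def write_lines(path, lines):
    """Write unicode strings to path as UTF-8, one per line."""
    # io.open encodes on write, so the terminal's encoding never matters.
    with io.open(path, 'w', encoding='utf-8') as f:
        for line in lines:
            f.write(line + u'\n')
```

Calling write_lines('sample.html', [u'caf\u00e9']) would, for example, produce a UTF-8 file regardless of how the terminal is configured.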

This code uses the Python-Twitter library. You need to get (via SVN) the very latest version, and then you need to fix a tiny bug, described here. Or wait a while and the SVN trunk will be patched.

This worked flawlessly for my 2,300 tweets, but only retrieved about half the tweets of someone who had over 7,000. I’m not sure what happened there.
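
One possible explanation, offered as an assumption rather than a diagnosis: Twitter has documented its user timeline endpoint as returning only roughly the most recent 3,200 tweets, however far back you page. The sketch below replays the max_id loop against a made-up in-memory timeline to show how such a cap would make the loop stop early without raising any error. The names fake_timeline, fetch_ids and LIMIT, and the integer tweet IDs, are all hypothetical stand-ins and not python-twitter calls.

```python
LIMIT = 3200  # assumed server-side cap on how far back paging can reach

def fake_timeline(all_ids, count=200, max_id=None, limit=LIMIT):
    """Stand-in for a timeline call: newest first, max_id inclusive."""
    visible = sorted(all_ids, reverse=True)[:limit]  # older tweets are hidden
    if max_id is not None:
        visible = [i for i in visible if i <= max_id]
    return visible[:count]

def fetch_ids(all_ids, limit=LIMIT):
    """Page backwards with max_id until a page brings nothing new."""
    seen = set()
    max_id = None
    while True:
        page = fake_timeline(all_ids, max_id=max_id, limit=limit)
        new = [i for i in page if i not in seen]
        if not new:
            break
        seen.update(new)
        max_id = min(page)  # oldest ID in this page becomes the next ceiling
    return seen
```

Under these assumptions, fetch_ids(range(1, 7001)) collects 3,200 IDs while fetch_ids(range(1, 2301)) collects all 2,300, which is at least roughly consistent with what I saw.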

There are tons of things that could be done to make the output more attractive and useful. And yes, for nitpickers, the code has a couple of slight inefficiencies :-)
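
As one example of such an improvement, the tweet text could be HTML-escaped before it goes into the page, so that characters like < and & display literally instead of being read as markup. This is a sketch only: format_tweet and its arguments are hypothetical names, and it uses Python 3's html.escape rather than the Python 2 of the script above.

```python
from html import escape

TWITTER_URL = 'http://twitter.com'

def format_tweet(index, user, tweet_id, when, text):
    """Return one HTML line for a tweet, with the text escaped."""
    url = '%s/%s/status/%d' % (TWITTER_URL, user, tweet_id)
    return '%d. %s <a href="%s">%s</a><br/>' % (
        index, when, escape(url, quote=True), escape(text))
```

Here when is assumed to be an already formatted date string, such as the output of strftime in htmlPrint.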

Paella

Sunday, June 14th, 2009

[Image: paella]

A middling tower

Tuesday, June 9th, 2009

[Image: Eiffel Tower]

2 cents

Friday, June 5th, 2009

My bank account hits rock bottom, at 2 cents, while building Fluidinfo.

[Image: BBVA account balance, highlighted]