Posted Wednesday, June 24th, 2009 at 9:44 pm under python, twitter.

Python code for retrieving all your tweets

Here’s a little Python code to retrieve all of a user’s tweets. Make sure you read the notes at the bottom if you want to use it.

import sys, twitter, operator
from dateutil.parser import parse

twitterURL = 'http://twitter.com'

def fetch(user):
    data = {}
    api = twitter.Api()
    max_id = None
    total = 0
    while True:
        statuses = api.GetUserTimeline(user, count=200, max_id=max_id)
        newCount = ignCount = 0
        for s in statuses:
            if s.id in data:
                ignCount += 1
            else:
                data[s.id] = s
                newCount += 1
        total += newCount
        print >>sys.stderr, "Fetched %d/%d/%d new/old/total." % (
            newCount, ignCount, total)
        if newCount == 0:
            break
        max_id = min([s.id for s in statuses]) - 1
    return data.values()

def htmlPrint(user, tweets):
    for t in tweets:
        t.pdate = parse(t.created_at)
    key = operator.attrgetter('pdate')
    tweets = sorted(tweets, key=key)
    f = open('%s.html' % user, 'wb')
    print >>f, """<html><title>Tweets for %s</title>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <body><small>""" % user
    for i, t in enumerate(tweets):
        print >>f, '%d. %s <a href="%s/%s/status/%d">%s</a><br/>' % (
            i, t.pdate.strftime('%Y-%m-%d %H:%M'), twitterURL,
            user, t.id, t.text.encode('utf8'))
    print >>f, '</small></body></html>'
    f.close()
   
if __name__ == '__main__':
    user = 'terrycojones' if len(sys.argv) < 2 else sys.argv[1]
    data = fetch(user)
    htmlPrint(user, data)
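
The paging loop in fetch can be tried out without touching the network. The following is only a rough sketch, written for Python 3 rather than the Python 2 of the listing above. The fake_timeline function is a made-up stand-in for api.GetUserTimeline, and it assumes that max_id is inclusive (a status with exactly that id may be returned), which is why the loop steps to one below the oldest id it has seen.

```python
def fake_timeline(all_ids, count=200, max_id=None):
    """Hypothetical stand-in for api.GetUserTimeline.

    Returns up to `count` ids, newest first, none greater than max_id.
    """
    eligible = [i for i in sorted(all_ids, reverse=True)
                if max_id is None or i <= max_id]
    return eligible[:count]


def fetch_ids(all_ids, count=200):
    """Page backwards through the fake timeline until nothing new arrives."""
    seen = set()
    max_id = None
    requests = 0
    while True:
        page = fake_timeline(all_ids, count=count, max_id=max_id)
        requests += 1
        new = [i for i in page if i not in seen]
        seen.update(new)
        if not new:
            break
        # Assumption: max_id is inclusive, so ask for ids strictly older
        # than the oldest one on this page.
        max_id = min(page) - 1
    return seen, requests


ids = range(1000, 1450)  # 450 made-up tweet ids
fetched, requests = fetch_ids(ids)
print(len(fetched), requests)
```

The final request always comes back empty, which is what ends the loop, so the number of requests is one more than the number of non-empty pages.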
 

Notes:

Fetch all of a user’s tweets and write them to a file username.html (where username is given on the command line).

Output is to a file instead of to stdout as tweet texts are unicode and sys.stdout.encoding is ascii on my machine, which prevents printing non-ASCII chars.
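
One way to keep the file output independent of the console encoding is to open the file with an explicit encoding and write unicode text to it directly. This is a minimal sketch, not the code used above; the sample text and the temporary file path are made up for illustration.

```python
import io
import os
import tempfile

# Made-up tweet text containing non-ASCII characters.
text = u"caf\u00e9 \u4e2d\u6587"

# Hypothetical output location; the script above writes username.html instead.
path = os.path.join(tempfile.mkdtemp(), "tweets.html")

# io.open encodes on write, so sys.stdout.encoding plays no part here.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text + u"\n")

# Reading back with the same encoding returns the original text.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read().rstrip(u"\n")
```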

This code uses the Python-Twitter library. You need to get (via SVN) the very latest version, and then you need to fix a tiny bug, described here. Or wait a while and the SVN trunk will be patched.

This worked flawlessly for my 2,300 tweets, but only retrieved about half the tweets of someone who had over 7,000. I’m not sure what happened there.
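
A commenter below quotes the Twitter API documentation as limiting timeline requests to 3,200 statuses. If that limit is what applied here, a little arithmetic gives a result consistent with "about half". The sketch below assumes the 3,200 figure from that quote and the count of 200 used in fetch; the function names are made up for illustration.

```python
import math

PAGE_SIZE = 200   # the count passed to GetUserTimeline in fetch
API_CAP = 3200    # limit quoted from the Twitter API docs in the comments


def reachable(total_tweets, cap=API_CAP):
    """Number of tweets that can be retrieved if the cap applies."""
    return min(total_tweets, cap)


def requests_needed(total_tweets, page_size=PAGE_SIZE, cap=API_CAP):
    """Number of non-empty pages needed to retrieve the reachable tweets."""
    return int(math.ceil(reachable(total_tweets, cap) / float(page_size)))


for total in (2300, 7000):
    got = reachable(total)
    share = 100.0 * got / total
    print("%d tweets: %d reachable (%.0f%%) in %d pages"
          % (total, got, share, requests_needed(total)))
```

Under these assumptions an account with 2,300 tweets is fully reachable, while one with 7,000 tweets yields at most 3,200, a bit under half.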

There are tons of things that could be done to make the output more attractive and useful. And yes, for nitpickers, the code has a couple of slight inefficiencies :-)

  • am

    nice, but didn’t work for me, not to mention those ‘smart quotes’.

    i prefer twitter-log username… it’s also python and outputs to plain text… (pip install twitter)

  • Keerthantantry

    I am a beginner with Python. Can you please tell me how to run the code? I have installed Python 2.7 and downloaded the Python-Twitter library too.

  • iake

    How do I retrieve all the tweets in the favorites timeline?

  • http://www.bionovanaturalpools.com natural pond design

    this is nice information need to know more

    Thanks
    sam hardsy

  • http://www.haykranen.nl/en/ Hay

    > This worked flawlessly for my 2,300 tweets, but only retrieved
    > about half the tweets of someone who had over 7,000. I’m not
    > sure what happened there.

    See the API docs: http://apiwiki.twitter.com/Things-Every-Developer-Should-Know#6Therearepaginationlimits

    “Clients may request up to 3,200 statuses via the page and count parameters for timeline REST API methods. Requests for more than the limit will result in a reply with a status code of 200 and an empty result in the format requested. Twitter still maintains a database of all the tweets sent by a user. However, to ensure performance of the site, this artificial limit is temporarily in place.”

  • http://www.mikehedge.com Mike Hedge

    genius

  • http://twitter.com/amarshwren Steve Ryner

    Cheers! I cut-n-pasted this into my terminal and then had to clean up “educated” quotes and at least one en-dash that was masquerading as a minus. Not sure if it’s safari, your HTML, EBCAK, but FYI.

  • http://xavire.combelle.free.fr Xavier Combelle

    I think the number of tweets that can be retrieved with the API is limited. I could check in the API docs but I’m lazy. However, none of the tweets are lost.
