Python code for retrieving all your tweets
Here’s a little Python code to pull back all of a user’s tweets. Make sure you read the notes at the bottom if you want to use it.
    import sys, twitter, operator
    from dateutil.parser import parse

    twitterURL = 'http://twitter.com'

    def fetch(user):
        data = {}  # Statuses seen so far, keyed by status id.
        api = twitter.Api()
        max_id = None
        total = 0
        while True:
            # Ask for up to 200 statuses with ids no greater than max_id.
            statuses = api.GetUserTimeline(user, count=200, max_id=max_id)
            newCount = ignCount = 0
            for s in statuses:
                if s.id in data:
                    ignCount += 1
                else:
                    data[s.id] = s
                    newCount += 1
            total += newCount
            print >>sys.stderr, "Fetched %d/%d/%d new/old/total." % (
                newCount, ignCount, total)
            if newCount == 0:
                break
            # Continue from just below the oldest status in this batch.
            max_id = min([s.id for s in statuses]) - 1
        return data.values()

    def htmlPrint(user, tweets):
        # Parse each creation date so the tweets can be sorted oldest first.
        for t in tweets:
            t.pdate = parse(t.created_at)
        key = operator.attrgetter('pdate')
        tweets = sorted(tweets, key=key)
        f = open('%s.html' % user, 'wb')
        # Assumed markup: a heading, then one linked line per tweet.
        # The link format <twitterURL>/<user>/status/<id> is an assumption.
        print >>f, '<html><body><h1>Tweets for %s</h1>' % user
        for i, t in enumerate(tweets):
            print >>f, '%d. %s <a href="%s/%s/status/%d">%s</a><br>' % (
                i, t.pdate.strftime('%Y-%m-%d %H:%M'), twitterURL,
                user, t.id, t.text.encode('utf8'))
        print >>f, '</body></html>'
        f.close()

    if __name__ == '__main__':
        user = 'terrycojones' if len(sys.argv) < 2 else sys.argv[1]
        data = fetch(user)
        htmlPrint(user, data)
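The paging in fetch() is easy to try without touching Twitter. Here is a minimal sketch, written for Python 3, that runs the same max_id loop against an in-memory list of ids. The names fake_timeline and fetch_all are hypothetical stand-ins made up for this illustration; they are not part of the Python-Twitter library, and the real API may behave differently at the edges.

```python
# Python 3 sketch of the max_id paging used by fetch(), run against an
# in-memory list instead of Twitter. fake_timeline and fetch_all are
# hypothetical names used only for this illustration.

def fake_timeline(all_ids, count, max_id=None):
    """Return up to `count` ids, newest first, none greater than max_id."""
    eligible = [i for i in sorted(all_ids, reverse=True)
                if max_id is None or i <= max_id]
    return eligible[:count]


def fetch_all(all_ids, count=200):
    """Collect every id by repeatedly asking for ids below the oldest seen."""
    seen = set()
    max_id = None
    requests = 0
    while True:
        page = fake_timeline(all_ids, count, max_id)
        requests += 1
        new = [i for i in page if i not in seen]
        if not new:
            # An empty (or fully repeated) page means there is nothing older.
            break
        seen.update(new)
        # Continue from just below the oldest id in this page.
        max_id = min(page) - 1
    return seen, requests


if __name__ == '__main__':
    ids = range(1000, 1450)  # 450 pretend tweet ids
    seen, requests = fetch_all(ids)
    print(len(seen), requests)  # prints: 450 4
```

With 450 ids and 200 per request, the loop makes three requests that return data and a fourth that comes back empty, which is what ends it.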
Notes:
Fetch all of a user's tweets and write them to a file username.html (where username is given on the command line).
Output is to a file instead of to stdout as tweet texts are unicode and sys.stdout.encoding is ascii on my machine, which prevents printing non-ASCII chars.
This code uses the Python-Twitter library. You need to get (via SVN) the very latest version, and then you need to fix a tiny bug, described here. Or wait a while and the SVN trunk will be patched.
This worked flawlessly for my 2,300 tweets, but only retrieved about half the tweets of someone who had over 7,000. I'm not sure what happened there.
There are tons of things that could be done to make the output more attractive and useful. And yes, for nitpickers, the code has a couple of slight inefficiencies :-)
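On the encoding note above: one way around an ASCII sys.stdout is to open the output file with an explicit encoding, so non-ASCII tweet text is written as UTF-8 no matter how the terminal is configured. The sketch below is for Python 3 and is not what the script above does; write_tweets is a hypothetical helper, and escaping the text with html.escape is an extra step added here on the assumption that tweet text may contain characters like & or <.

```python
# Python 3 sketch: write text containing non-ASCII characters to an HTML
# file as UTF-8. write_tweets is a hypothetical helper for illustration.
import html
import io
import os
import tempfile


def write_tweets(path, texts):
    """Write each text as one numbered HTML line, encoded as UTF-8."""
    with io.open(path, 'w', encoding='utf-8') as f:
        for i, text in enumerate(texts):
            # html.escape turns &, < and > into entities.
            f.write('%d. %s<br>\n' % (i, html.escape(text)))


if __name__ == '__main__':
    path = os.path.join(tempfile.mkdtemp(), 'example.html')
    write_tweets(path, ['caf\u00e9 & cr\u00e8me', 'plain ascii'])
    with io.open(path, encoding='utf-8') as f:
        print(f.read())
```

Because the encoding is given when the file is opened, no per-tweet encode call is needed.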
July 7th, 2009 at 8:26 pm
I think the amount of tweets that can be retrieved with the API is limited. I could check in the API docs but I’m lazy. However, none of the tweets are lost.
July 8th, 2009 at 6:19 am
Cheers! I cut-n-pasted this into my terminal and then had to clean up “educated” quotes and at least one en-dash that was masquerading as a minus. Not sure if it’s Safari, your HTML, or PEBCAK, but FYI.
July 14th, 2009 at 11:42 am
genius
August 12th, 2009 at 5:39 pm
> This worked flawlessly for my 2,300 tweets, but only retrieved
> about half the tweets of someone who had over 7,000. I'm not
> sure what happened there.
See the API docs: http://apiwiki.twitter.com/Things-Every-Developer-Should-Know#6Therearepaginationlimits
“Clients may request up to 3,200 statuses via the page and count parameters for timeline REST API methods. Requests for more than the limit will result in a reply with a status code of 200 and an empty result in the format requested. Twitter still maintains a database of all the tweets sent by a user. However, to ensure performance of the site, this artificial limit is temporarily in place.”
November 11th, 2009 at 11:31 pm
this is nice information need to know more
Thanks
sam hardsy
November 13th, 2009 at 2:31 am
How can I retrieve all the tweets in the favorites timeline?
March 15th, 2012 at 6:56 pm
I am a beginner with Python. Can you please tell me how to run the code? I have installed Python 2.7 and downloaded the Python-Twitter library too.
April 5th, 2012 at 11:11 pm
nice, but didn’t work for me, not to mention those ‘smart quotes’.
i prefer twitter-log username… it’s also python and outputs to plain text… (pip install twitter)