How To Get Tweets From A Twitter Account Using Python And Tweepy
In this blog post, I’ll explain how to obtain data from a specified Twitter account using tweepy and Python. Let’s jump straight into the code!
As usual, we’ll start off by importing dependencies. I’ll use the datetime module and the Counter class later on to do some simple analysis tasks.
from collections import Counter
from datetime import datetime, timedelta
import sys

from tweepy import API, Cursor, OAuthHandler
The next bit creates a tweepy API object that we will use to query for data from Twitter. As usual, you’ll need to create a Twitter application in order to obtain the relevant authentication keys and fill in those empty strings. You can find a link to a guide about that in one of the previous articles in this series.
# Fill these in with the keys from your own Twitter application.
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
auth_api = API(auth)
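If you would rather not keep keys in the script itself, one option is to read them from environment variables. The following is a minimal sketch of that idea, not part of the original script: the variable names and the load_credentials() helper are assumptions made up for illustration.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

The returned values can then be passed to OAuthHandler() and set_access_token() in place of the empty strings.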
Names of accounts to be queried will be passed in as command-line arguments. I'm going to exit the script if no args are passed, since there would be no reason to continue.
account_list = sys.argv[1:]
if not account_list:
    print("Please provide a list of usernames at the command line.")
    # Exit with a non-zero status, since nothing was processed.
    sys.exit(1)
Next, let's iterate through the account names passed and use tweepy's API.get_user() to obtain a few details about the queried account.
for target in account_list:
    print("Getting data for " + target)
    # Pass the name by keyword; recent Tweepy releases no longer accept it positionally.
    item = auth_api.get_user(screen_name=target)
    print("name: " + item.name)
    print("screen_name: " + item.screen_name)
    print("description: " + item.description)
    print("statuses_count: " + str(item.statuses_count))
    print("friends_count: " + str(item.friends_count))
    print("followers_count: " + str(item.followers_count))
Twitter User Objects contain a created_at field that holds the creation date of the account. We can use this to calculate the age of the account, and since we also know how many Tweets that account has published (statuses_count), we can calculate the account's average Tweets per day. Tweepy provides time-related values as datetime objects, which make calculations such as time deltas straightforward.
    tweets = item.statuses_count
    account_created_date = item.created_at
    # Assumes created_at is a naive UTC datetime. If your Tweepy version
    # returns a timezone-aware value, use datetime.now(timezone.utc) instead.
    delta = datetime.utcnow() - account_created_date
    account_age_days = delta.days
    print("Account age (in days): " + str(account_age_days))
    if account_age_days > 0:
        print("Average tweets per day: " + "%.2f" % (float(tweets) / float(account_age_days)))
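The same calculation can be tried without touching the Twitter API. The sketch below uses made-up numbers, and the tweets_per_day() helper is a hypothetical name introduced here for illustration. It treats naive datetimes as UTC, so it works whether the library hands back naive or timezone-aware values.

```python
from datetime import datetime, timezone


def tweets_per_day(statuses_count, created_at, now=None):
    """Return (account_age_days, average_tweets_per_day)."""
    # Treat naive datetimes as UTC so naive and aware values can be mixed.
    if created_at.tzinfo is None:
        created_at = created_at.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    elif now.tzinfo is None:
        now = now.replace(tzinfo=timezone.utc)
    age_days = (now - created_at).days
    # Accounts less than a day old have no meaningful daily average yet.
    average = statuses_count / age_days if age_days > 0 else None
    return age_days, average


# Made-up example: 1000 Tweets from an account created ten days earlier.
age, avg = tweets_per_day(
    1000,
    datetime(2018, 1, 1),
    now=datetime(2018, 1, 11, tzinfo=timezone.utc),
)
print("Account age (in days): %d" % age)     # Account age (in days): 10
print("Average tweets per day: %.2f" % avg)  # Average tweets per day: 100.00
```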
Next, let's iterate through the user's Tweets using tweepy's API.user_timeline(). Tweepy's Cursor handles pagination for us, so we can loop over the results without manually requesting each batch. The Twitter API limits this method to roughly the 3,200 most recent Tweets for an account, and fetching all of them can take a while. To make things quicker, and to show another example of datetime usage, we're going to break out of the loop once we hit Tweets that are more than 30 days old. While looping, we'll collect lists of all hashtags and mentions seen in Tweets.
    hashtags = []
    mentions = []
    tweet_count = 0
    # Cutoff as a naive UTC datetime, matching the naive created_at values assumed here.
    end_date = datetime.utcnow() - timedelta(days=30)
    for status in Cursor(auth_api.user_timeline, screen_name=target).items():
        # The timeline is returned newest first, so stop at the first Tweet
        # older than the cutoff (and do not count it).
        if status.created_at < end_date:
            break
        tweet_count += 1
        if hasattr(status, "entities"):
            entities = status.entities
            if "hashtags" in entities:
                for ent in entities["hashtags"]:
                    if ent is not None and "text" in ent:
                        hashtag = ent["text"]
                        if hashtag is not None:
                            hashtags.append(hashtag)
            if "user_mentions" in entities:
                for ent in entities["user_mentions"]:
                    if ent is not None and "screen_name" in ent:
                        name = ent["screen_name"]
                        if name is not None:
                            mentions.append(name)
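The entity-handling part of that loop can be pulled out into a small function and tried on its own. This is only a sketch: extract_entities() is a hypothetical helper name, and the sample dictionary is hand-written to mimic the shape of the entities field rather than taken from a real API response.

```python
def extract_entities(entities):
    """Return (hashtags, mentions) found in one Tweet's entities dict."""
    entities = entities or {}
    hashtags = [
        ent["text"]
        for ent in entities.get("hashtags") or []
        if ent and ent.get("text") is not None
    ]
    mentions = [
        ent["screen_name"]
        for ent in entities.get("user_mentions") or []
        if ent and ent.get("screen_name") is not None
    ]
    return hashtags, mentions


# Hand-written sample data, not a real API response.
sample = {
    "hashtags": [{"text": "python"}, {"text": "tweepy"}, None],
    "user_mentions": [{"screen_name": "example_user"}, {"id": 1}],
}
tags, names = extract_entities(sample)
print(tags)   # ['python', 'tweepy']
print(names)  # ['example_user']
```

Inside the loop, the results could be added to the running lists with hashtags.extend(tags) and mentions.extend(names).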
Finally, we'll use Counter.most_common() to print out the ten most used hashtags and mentions.
    print()
    print("Most mentioned Twitter users:")
    for mention, count in Counter(mentions).most_common(10):
        print(mention + "\t" + str(count))

    print()
    print("Most used hashtags:")
    for hashtag, count in Counter(hashtags).most_common(10):
        print(hashtag + "\t" + str(count))

    print()
    print("All done. Processed " + str(tweet_count) + " tweets.")
    print()
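To see what Counter.most_common() does without querying Twitter, here is a short sketch using a made-up list of hashtags. The format_top() helper is a hypothetical name used only for this illustration.

```python
from collections import Counter


def format_top(items, n=10):
    """Return tab-separated 'item<TAB>count' lines for the n most common items."""
    return [item + "\t" + str(count) for item, count in Counter(items).most_common(n)]


# Made-up sample data.
sample_hashtags = ["infosec", "python", "infosec", "malware", "infosec", "python"]
for line in format_top(sample_hashtags, n=2):
    print(line)
# infosec	3
# python	2
```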
And that's it. A simple tool. But effective. And, of course, you can extend this code in any direction you like.