How To Get Tweets From A Twitter Account Using Python And Tweepy

Threats & Research

ws_business_metro_couple_looking_at_phone
Reading time: 4 min
Andrew Patel
26.01.18

 

 

In this blog post, I’ll explain how to obtain data from a specified Twitter account using tweepy and Python. Let’s jump straight into the code!

As usual, we’ll start off by importing dependencies. I’ll use the datetime and Counter modules later on to do some simple analysis tasks.

from tweepy import OAuthHandler
from tweepy import API
from tweepy import Cursor
from datetime import datetime, date, time, timedelta
from collections import Counter
import sys

The next bit creates a tweepy API object that we will use to query for data from Twitter. As usual, you’ll need to create a Twitter application in order to obtain the relevant authentication keys and fill in those empty strings. You can find a link to a guide about that in one of the previous articles in this series.

consumer_key=""
consumer_secret=""
access_token=""
access_token_secret=""

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
auth_api = API(auth)

Names of accounts to be queried will be passed in as command-line arguments. I'm going to exit the script if no args are passed, since there would be no reason to continue.

account_list = []
if (len(sys.argv) > 1):
  account_list = sys.argv[1:]
else:
  print("Please provide a list of usernames at the command line.")
  sys.exit(0)

Next, let's iterate through the account names passed and use tweepy's API.get_user() to obtain a few details about the queried account.

if len(account_list) > 0:
  for target in account_list:
    print("Getting data for " + target)
    item = auth_api.get_user(target)
    print("name: " + item.name)
    print("screen_name: " + item.screen_name)
    print("description: " + item.description)
    print("statuses_count: " + str(item.statuses_count))
    print("friends_count: " + str(item.friends_count))
    print("followers_count: " + str(item.followers_count))

Twitter User Objects contain a created_at field that holds the creation date of the account. We can use this to calculate the age of the account, and since we also know how many Tweets that account has published (statuses_count), we can calculate the average Tweets per day rate of that account. Tweepy provides time-related values as datetime objects which are easy to calculate things like time deltas with.

    tweets = item.statuses_count
    account_created_date = item.created_at
    delta = datetime.utcnow() - account_created_date
    account_age_days = delta.days
    print("Account age (in days): " + str(account_age_days))
    if account_age_days > 0:
      print("Average tweets per day: " + "%.2f"%(float(tweets)/float(account_age_days)))

Next, let's iterate through the user's Tweets using tweepy's API.user_timeline(). Tweepy's Cursor allows us to stream data from the query without having to manually query for more data in batches. The Twitter API will return around 3200 Tweets using this method (which can take a while). To make things quicker, and show another example of datetime usage we're going to break out of the loop once we hit Tweets that are more than 30 days old. While looping, we'll collect lists of all hashtags and mentions seen in Tweets.

    hashtags = []
    mentions = []
    tweet_count = 0
    end_date = datetime.utcnow() - timedelta(days=30)
    for status in Cursor(auth_api.user_timeline, id=target).items():
      tweet_count += 1
      if hasattr(status, "entities"):
        entities = status.entities
        if "hashtags" in entities:
          for ent in entities["hashtags"]:
            if ent is not None:
              if "text" in ent:
                hashtag = ent["text"]
                if hashtag is not None:
                  hashtags.append(hashtag)
        if "user_mentions" in entities:
          for ent in entities["user_mentions"]:
            if ent is not None:
              if "screen_name" in ent:
                name = ent["screen_name"]
                if name is not None:
                  mentions.append(name)
      if status.created_at < end_date:
        break

Finally, we'll use Counter.most_common() to print out the ten most used hashtags and mentions.

    print
    print("Most mentioned Twitter users:")
    for item, count in Counter(mentions).most_common(10):
      print(item + "t" + str(count))

    print
    print("Most used hashtags:")
    for item, count in Counter(hashtags).most_common(10):
      print(item + "t" + str(count))

    print
    print "All done. Processed " + str(tweet_count) + " tweets."
    print

And that's it. A simple tool. But effective. And, of course, you can extend this code in any direction you like.

Related posts

Read more
meet-threat-hunters_1940x970-1024x512
  • Blog post
  • 2017
  • Melissa Michael
  • Attack Surface Management

Of Cameras & Compromise: How IoT Could Dull Your Competitive Edge

The Internet of Things is here. And with it are exciting possibilities, cost savings and efficiencies. But there’s a dark side to this bright new world, and it can be summed up in what we call Hypponen’s Law: If it’s smart, it’s vulnerable.

Read more
ws_woman_looking_at_computer_screen_with_pen
  • Blog post
  • Noora Hyvärinen
  • 2018
  • Python
August 25, 2023

How to decompile any Python binary

At WithSecure we often encounter binary payloads that are generated from compiled Python. These are usually generated with tools such as py2exe or PyInstaller to create a Windows executable.

Read more
ws_cold_boot_attack_demo
  • Blog post
  • Adam Pilkey
  • 2018
August 25, 2023

The Chilling Reality of Cold Boot Attacks

What do you do when you finish working with your laptop? Do you turn it off? Put it to sleep? Just close the lid and walk away?

Read more