GitHub

Twitter ~ Playing with Python

Requirements:

Usage: change to the user_history directory and run:

scrapy crawl user_history_spider -s LOG_FILE=scrapy.log -s LOG_LEVEL='INFO' -o scraped_data.json -a handle=BarackObama

(replace "BarackObama" with the Twitter user handle of interest)

Tweets are returned as best approximating the Twitter tweet format (ref: https://dev.twitter.com/overview/api/tweets) as possible.

Additional fields include:

created_at_ts: UTC timestamp corresponding to the created_at field
entities item contains pure array lists of the correpsonding entity, without additional information (i.e. tweet.entities.hashtags is an array of hashtag terms only)
date_scraped: date and time when the record was scraped ("now")
url: actual url from which the record was scraped

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
user_history		user_history
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback