Consider the code in this repo as sample code.
This webapp performed scheduled web scraping of politically oriented calls to action (CTAs). It has two main components:
- Scrape sites and send the results to the CTAAggregator API.
- Admin Panel to display unsuccessful scraping attempts (alerting admins that the scraper scripts need to be updated)
Code for scraping websites is more "Mr. Right Now" than "Mr. Right." We anticipate periodic changes on the websites that this app scrapes. For that reason, the code quality isn't very high and may not always demonstrate best practices. In addition, there is no test coverage.
If you plan to contribute a scraper, then please ensure that there's a rake task that runs your code. For instance, this command:
rake scape:emilys_list
will scrape everything available on Emily's List. Failures will be stored in the DB, so that we know when scraper scripts need to be updated.
Scraper rake tasks can be found in lib/tasks/scraper.rake. To run all the scrapers, use the following command:
rake scape:all
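As a rough illustration of the pattern described above (one rake task per scraper, an aggregate task, and failures recorded rather than raised), here is a minimal sketch. It is not the code in lib/tasks/scraper.rake: the FAILURES array stands in for the database table, and SCRAPERS, run_scraper, and the broken_example scraper are hypothetical names invented for this example. In a Rails app the tasks would normally also depend on the environment task and write to a model instead.

```ruby
require "rake"

# Hypothetical in-memory stand-in for the failures table in the DB.
FAILURES = []

# Hypothetical registry: task name => callable that performs the scrape.
# The bodies are placeholders, not real scraping code.
SCRAPERS = {
  emilys_list: -> { [] },
  broken_example: -> { raise "expected element not found" },
}.freeze

# Runs one scraper. A failure is recorded instead of aborting the whole
# run, so the remaining scrapers still execute.
def run_scraper(name, scraper)
  scraper.call
  true
rescue StandardError => e
  FAILURES << { scraper: name, error: e.message }
  false
end

namespace :scape do
  # One task per registered scraper, e.g. scape:emilys_list.
  SCRAPERS.each do |name, scraper|
    desc "Scrape #{name}"
    task(name) { run_scraper(name, scraper) }
  end

  # scape:all lists every scraper task as a prerequisite.
  desc "Run every registered scraper"
  task all: SCRAPERS.keys
end
```

Because each task rescues its own errors, one site changing its markup does not stop the other scrapers from running; the recorded failure is what the Admin Panel would surface.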
There is a rake task for creating an admin. You'll need to pass an email and a password as arguments.
rake admin:create['admin@example.com','password123']
This process will work well enough when running the app locally. If you want credentials on staging or production, have an existing admin run this rake task on your behalf. This will set you up in the system; you can then reset your password to something that no one else knows.
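For readers unfamiliar with rake task arguments, the sketch below shows how a task can receive an email and a password in brackets. It is an assumption about the general shape, not the repo's actual admin task: the ADMINS array is a hypothetical stand-in for the admin table, and a real task would create the record through the app's authentication layer, which hashes the password.

```ruby
require "rake"

# Hypothetical in-memory stand-in for the admin table.
ADMINS = []

namespace :admin do
  desc "Create an admin: rake admin:create[email,password]"
  task :create, [:email, :password] do |_task, args|
    email = args[:email].to_s.strip
    password = args[:password].to_s

    # Fail loudly when either argument is missing or blank.
    if email.empty? || password.empty?
      raise ArgumentError, "usage: rake admin:create[email,password]"
    end

    # A real app would persist the admin and hash the password here;
    # this sketch only records the email.
    ADMINS << { email: email }
  end
end
```

Some shells, zsh in particular, treat square brackets as glob characters, so the whole task name may need quoting, for example: rake 'admin:create[admin@example.com,password123]'.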
This project uses RSpec for unit and integration tests.
- This app uses PostgreSQL, so be sure to have that installed and running on your machine.
- Clone this repo.
- cd into the root directory and run bin/setup. This command should create your database and install any dependencies.
This app is deployed on Heroku. There's a staging site and a production site.