Prerequisites: introductory knowledge of statistics and Python. This assignment can be used to stimulate interest in data science and machine learning. It is appropriate for undergraduate students in data mining who are interested in exploring and visualizing data. Three models are developed: k-nearest neighbors (KNN), decision tree, and random forest.
All data and notebooks are provided in a public GitHub repository. The notebooks open and run in Google Colab.
The objective of this research is to improve our understanding of the presence of sharks during tourist seasons in mid-Atlantic and southeastern coastal waters, specifically off North Carolina and South Carolina. The study focuses on the analysis of existing data: records from the International Shark Attack database, weather and water data from NOAA, calculated moon phase dates, and crab and turtle populations. Quantitative analysis of these data is expected to yield new knowledge that will ultimately provide the basis for improved data collection and for an app providing advance information on the likelihood of sharks in coastal waters where tourists swim, surf, and wade.
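As an illustration of what "calculated moon phase dates" can mean in practice, the sketch below estimates the moon's phase for a given date from the mean synodic month. It is an approximation under stated assumptions, not necessarily the method used to build the project's data: the reference new moon (2000-01-06 18:14 UTC), the constant names, and the functions `moon_age_days` and `moon_phase_name` are all chosen here for illustration.

```python
from datetime import datetime, timezone

# Mean length of one lunar cycle (new moon to new moon), in days.
SYNODIC_MONTH_DAYS = 29.530588853
# A commonly used reference new moon; assumed here as the starting point.
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)

PHASE_NAMES = [
    "new moon", "waxing crescent", "first quarter", "waxing gibbous",
    "full moon", "waning gibbous", "last quarter", "waning crescent",
]


def moon_age_days(when):
    """Approximate days elapsed since the most recent new moon."""
    if when.tzinfo is None:
        # Treat naive datetimes as UTC (an assumption of this sketch).
        when = when.replace(tzinfo=timezone.utc)
    elapsed = (when - REFERENCE_NEW_MOON).total_seconds() / 86400.0
    return elapsed % SYNODIC_MONTH_DAYS


def moon_phase_name(when):
    """Map the moon's age onto one of eight named phases."""
    fraction = moon_age_days(when) / SYNODIC_MONTH_DAYS
    # Round to the nearest eighth of the cycle; wrap 8 back to 0 (new moon).
    index = int(fraction * 8 + 0.5) % 8
    return PHASE_NAMES[index]


if __name__ == "__main__":
    sample = datetime(2015, 7, 1, tzinfo=timezone.utc)
    print(sample.date(), round(moon_age_days(sample), 2), moon_phase_name(sample))
```

Because the real lunar cycle varies in length, this estimate can be off by several hours; a date column derived this way is best treated as a coarse feature (for example, one of the eight phases) rather than an exact timestamp.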
Final data sets are provided. Raw data are not always provided because some of the data we analyzed are sensitive, for example turtle nesting and false crawl behavior.
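For readers who want a preview of the modeling step, here is a minimal sketch of fitting and comparing the three model types named above (k-nearest neighbors, decision tree, and random forest) with scikit-learn. It uses synthetic data from `make_classification` as a stand-in, since the column names and target of the final data sets are not described here; the hyperparameters and variable names are assumptions for illustration, not the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): replace with the repository's final data set.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)

# Hold out a quarter of the rows for testing, preserving the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Illustrative hyperparameters only; the notebook may use different ones.
models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: test accuracy = {score:.3f}")
```

Accuracy on a single split is only a first look; on a small or imbalanced data set, cross-validation and per-class metrics would give a fairer comparison.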
To get started, please open the following Jupyter Notebook in Google Colab (click on link):
Thank you to the following individuals and organizations for providing data and/or support for the project:
Turtle Data: From the National Oceanic and Atmospheric Administration (NOAA) National Marine Fisheries Service, Southeast Fisheries Science Center: Michelle Pate, Coordinator, SC Marine Turtle Conservation Program
Dr. Matthew Godfrey, State Coordinator, NC Sea Turtle Program
Shark habitat information and overall feedback on research: Dr. Charles Bagley, East Carolina University
Crab Landings, NC: Alan Bianchi, Trip Ticket Coordinator, License and Statistics Section, North Carolina Division of Marine Fisheries
Data Science Initiative, UNC Charlotte: Mr. Rick Hudson; Dr. Mirsad Hadzikadic, Executive Director
Sarah Beardmore, DORSAL app, Professional Surfer
Lavanya Vinodh, MS, Data Scientist, Shark Researcher
KDD Class of Summer 2015: class project analyzing the shark attacks of summer 2015 along the East Coast, taught by Dr. Pamela Thompson, CS, UNCC and Catawba College, NC
Dr. Pamela Thompson Blog (Shark Research): https://www.linkedin.com/in/drpamelathompson/
UPDATE: This is the preliminary assignment, designed to introduce students to machine learning with an interesting data set. Current data involve receiver-transmitter sightings, recorded to the minute, of tagged great white sharks off the coast of Cape Cod, Massachusetts. Work has been ongoing since Spring 2023, with a paper being developed for submission in early 2025.