
AKDDResearch/Shark-Attack


Understanding-Presence-of-Sharks-in-Near-Shore-Waters-NC-SC

ASSIGNMENT OBJECTIVES:

Beginning undergraduate students will learn the Machine Learning Life Cycle by applying AI and machine learning to prepare data for visualization and classification, and by developing a model that predicts near-shore shark presence off the coast of North Carolina and South Carolina. After building domain understanding, students prepare and visualize data from the International Shark Attack File, along with weather, moon phase, and prey data from online sources. Classification and ensemble models are then used to predict the presence of sharks. Students will also come to understand the limitations and bias of this preliminary research as they explore improved data collection for great white sharks.

PREREQUISITES:

Introductory knowledge of statistics and Python. This assignment can be used to stimulate interest in data science and machine learning. The exercise is appropriate for undergraduate students in data mining who are interested in exploring and visualizing data. Three models are developed: KNN, Decision Tree, and Random Forest.
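As a warm-up for the first of the three models, here is a minimal sketch of k-nearest-neighbors classification in plain Python. It is not taken from the course notebooks: the function name `knn_predict`, the two features, and every data value below are made up purely for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical, made-up rows: (water temperature in deg C, moon illumination 0-1).
train_X = [(18.0, 0.1), (19.5, 0.3), (20.0, 0.2),
           (26.0, 0.9), (27.5, 0.8), (28.0, 1.0)]
# Illustrative labels only: 1 = shark presence recorded, 0 = none.
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (27.0, 0.85)))
print(knn_predict(train_X, train_y, (18.5, 0.2)))
```

Because the features are left unscaled, temperature dominates the distance here. In practice, features are usually standardized before KNN is applied.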

SOFTWARE:

All data and notebooks are provided in a public GitHub repository. Notebooks open and run in Google Colab.

START OF ASSIGNMENT:

RESEARCH SUMMARY:

The objective of this research is to improve our understanding of the presence of sharks during tourist seasons in mid-Atlantic and southeastern coastal waters, specifically off North Carolina and South Carolina. The study focuses on the analysis of existing data from the International Shark Attack database, weather and water data from NOAA, calculated moon phase dates, and crab and turtle populations. Quantitative analysis of this data is intended to produce new knowledge that will ultimately provide the basis for improved data collection and for an app giving advance information on the likelihood of sharks in coastal waters where tourists swim, surf, and wade.
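The summary does not say how the moon phase dates were calculated. One common approximation, shown here only as a sketch under that assumption, counts the days since a known new moon and takes the remainder modulo the mean synodic month. The function names, the reference new moon (6 January 2000, 18:14 UTC), and the eight-phase naming are choices made for this example, not details from the project.

```python
from datetime import datetime, timezone

SYNODIC_MONTH = 29.530588853  # mean length of one lunar cycle, in days
# Assumed reference point: a new moon on 6 January 2000 at 18:14 UTC.
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)

PHASE_NAMES = [
    "New Moon", "Waxing Crescent", "First Quarter", "Waxing Gibbous",
    "Full Moon", "Waning Gibbous", "Last Quarter", "Waning Crescent",
]


def moon_age_days(when):
    """Approximate days elapsed since the most recent new moon."""
    elapsed = (when - REFERENCE_NEW_MOON).total_seconds() / 86400.0
    return elapsed % SYNODIC_MONTH


def moon_phase_name(when):
    """Map the moon's age onto the nearest of eight named phases."""
    fraction = moon_age_days(when) / SYNODIC_MONTH
    index = int(fraction * 8 + 0.5) % 8  # round to the nearest eighth of a cycle
    return PHASE_NAMES[index]


# Example: label a timezone-aware timestamp with its approximate phase.
sample = datetime(2015, 6, 14, 12, 0, tzinfo=timezone.utc)
print(round(moon_age_days(sample), 1), moon_phase_name(sample))
```

This approximation can be off by several hours or more because the real lunar cycle varies in length, so it suits coarse features such as a phase label better than precise timing.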

Final data sets are provided. Raw data is not always provided because some of the data we analyzed is sensitive, for example turtle nesting and false crawl behavior.

To get started, please open the following Jupyter Notebook in Google Colab (click on link):

Thank you to the following individuals and organizations for providing data and/or support for the project:

Turtle Data: From the National Oceanic and Atmospheric Administration (NOAA) National Marine Fisheries Service, Southeast Fisheries Science Center: Michelle Pate, Coordinator, SC Marine Turtle Conservation Program

Dr. Matthew Godfrey, State Coordinator, NC Sea Turtle Program

Shark habitat information and overall feedback on research: Dr. Charles Bagley, East Carolina University

Crab Landings, NC: Alan Bianchi, Trip Ticket Coordinator, License and Statistics Section, North Carolina Division of Marine Fisheries

Data Science Initiative, UNC Charlotte: Mr. Rick Hudson; Dr. Mirsad Hadzikadic, Executive Director

Sarah Beardmore, DORSAL app, Professional Surfer

Lavanya Vinodh, MS, Data Scientist, Shark Researcher

KDD Class of Summer 2015: class project analyzing the shark attacks of summer 2015 along the East Coast, taught by Dr. Pamela Thompson, CS, UNCC and Catawba College, NC

Dr. Pamela Thompson Blog (Shark Research): https://www.linkedin.com/in/drpamelathompson/

UPDATE: This preliminary assignment is designed to introduce students to machine learning with an interesting data set. Current data consists of receiver-transmitter sightings, recorded to the minute, of tagged great white sharks off the coast of Cape Cod, Massachusetts. Work has been ongoing since Spring 2023, with a paper being developed for submission in early 2025.
