Datacamp Stats is a modular Streamlit-based dashboard that analyzes learner engagement across Courses, Projects, and Tracks on DataCamp. It helps visualize learning behaviors, highlight popular technologies, and identify undervalued or overhyped content.
🚀 Currently supports HTML data downloaded from the DataCamp Report Dashboard.
- Technologies Boosting Retention – Identifies technologies learners stick with the most.
- Top Courses & Tracks – Quick snapshot of popular learning paths.
- Quadrant Analysis – Scatter plot with completion vs start metrics:
  - Underrated Gems
  - Overhyped / Hard
  - Ideal Balance
  - Neither Here Nor There
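
As a rough illustration of the quadrant idea above, the sketch below labels each item by comparing its number of starts and its completion rate against the medians of the whole set. The median thresholds, the mapping of labels to quadrants, the function name `assign_quadrants`, and the sample rows are all assumptions made for this example; the dashboard itself may define the quadrants differently.

```python
from statistics import median


def assign_quadrants(items):
    """Label each item by comparing its starts and completion rate to the medians.

    `items` is a list of (title, starts, completions) tuples. The median
    thresholds and the label-to-quadrant mapping are assumptions.
    """
    # Skip items nobody started, so the completion rate is always defined.
    rows = [(t, s, c / s) for t, s, c in items if s > 0]
    if not rows:
        return {}
    start_cut = median(s for _, s, _ in rows)
    rate_cut = median(r for _, _, r in rows)

    labels = {}
    for title, starts, rate in rows:
        popular = starts >= start_cut
        sticky = rate >= rate_cut
        if popular and sticky:
            labels[title] = "Ideal Balance"
        elif popular:
            labels[title] = "Overhyped / Hard"
        elif sticky:
            labels[title] = "Underrated Gems"
        else:
            labels[title] = "Neither Here Nor There"
    return labels


# Made-up sample rows, not real DataCamp figures.
sample = [
    ("Course A", 1000, 800),
    ("Course B", 1000, 100),
    ("Course C", 50, 45),
    ("Course D", 50, 5),
]
print(assign_quadrants(sample))
```

Using medians keeps the split relative to the loaded data, so roughly half of the items land on each side of each axis regardless of absolute scale.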
We’re expanding Datacamp Stats into a broader learner intelligence tool. Here’s what’s next:
- Automatically suggests next steps based on a user's past course completions
- Works with public profile data
- Upload or paste a job description
- Recommends most relevant DataCamp content based on skill requirements
These tools aim to make the dashboard not just reflective but prescriptive, helping learners act on insights.
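
To make the job-description idea more concrete, here is a minimal sketch of one possible approach: plain keyword overlap between the pasted text and per-item skill tags. Nothing here reflects a planned implementation. The function name `rank_by_skill_overlap`, the catalog shape, and the example titles and tags are hypothetical.

```python
import re


def rank_by_skill_overlap(job_description, catalog):
    """Rank catalog titles by how many of their skill tags appear in the text.

    `catalog` maps a title to a set of lowercase skill tags. Both the catalog
    shape and the plain keyword matching are assumptions for illustration.
    """
    words = set(re.findall(r"[a-z0-9+#]+", job_description.lower()))
    scored = []
    for title, skills in catalog.items():
        hits = sorted(skills & words)
        if hits:
            scored.append((title, hits))
    # Most matching skills first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-len(pair[1]), pair[0]))
    return scored


# Hypothetical catalog entries, not actual DataCamp content.
catalog = {
    "Example SQL course": {"sql"},
    "Example pandas course": {"python", "pandas"},
    "Example R course": {"r"},
}
job = "We need an analyst fluent in Python, pandas and SQL."
print(rank_by_skill_overlap(job, catalog))
```

Exact token matching is deliberately simple; a real matcher would likely need synonyms and multi-word skills, which this sketch does not attempt.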
- Clone the repository:

      git clone https://github.com/your-username/datacamp-stats.git
      cd datacamp-stats
- Install the required dependencies:

      pip install -r requirements.txt
- Optional: Add your raw `.html` data files to the `raw_data/` folder. Use the following naming convention for the files:

      courses_N.html
      projects_N.html
      tracks_N.html

  where `N` can be any identifier (e.g., date or version number). Once added, process the data by running:

      python process_data.py

  This will extract the relevant data and generate CSVs in the `data/` folder for use by the dashboard.
- Start the Streamlit app:

      streamlit run app.py
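
Before running the processing step, it can help to check that the raw files follow the naming convention described above. The sketch below is a standalone helper, not part of `process_data.py`, whose internals are not shown here. The function name `group_raw_files` and the pattern `RAW_NAME` are assumptions for illustration.

```python
import re
from pathlib import Path

# Matches courses_N.html, projects_N.html, tracks_N.html, where N is any non-empty identifier.
RAW_NAME = re.compile(r"^(courses|projects|tracks)_(.+)\.html$")


def group_raw_files(filenames):
    """Split filenames into {kind: [names]} plus a list of names that don't match."""
    grouped = {"courses": [], "projects": [], "tracks": []}
    skipped = []
    for name in filenames:
        # Only the final path component is checked against the convention.
        match = RAW_NAME.match(Path(name).name)
        if match:
            grouped[match.group(1)].append(name)
        else:
            skipped.append(name)
    return grouped, skipped


names = ["courses_2024-01.html", "tracks_v2.html", "notes.txt", "course_1.html"]
grouped, skipped = group_raw_files(names)
print(grouped)
print(skipped)
```

To check a real folder, the list of names could be built with `[p.name for p in Path("raw_data").glob("*.html")]`. Anything that ends up in `skipped` probably needs renaming before processing.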
Feel free to fork, contribute, or raise issues! Whether it's a bug fix, design improvement, or a whole new feature, PRs are welcome.