gthomas08/Big-Data-Management-Systems-Project

Big Data Management Systems Project

About The Project

This project was part of the curriculum of the Computer Engineering and Informatics Department (CEID) at the University of Patras.

Exercise

The goal of this project was to write queries against a large dataset and measure the time each query took to return its results. The queries were run on a local machine and on a virtual machine (with different configurations) set up by the University. The queries were written in PySpark and executed with Apache Spark.
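
As a rough illustration of how the elapsed time of a query can be measured, here is a minimal sketch. It is not the code used in this project: the helper `time_query`, the function `example_with_spark`, and the `csv_path` parameter are hypothetical names introduced for the example, and the dataset is assumed to be a CSV file with a header row. Because Spark evaluates transformations lazily, the timed call has to include an action such as `count()`, otherwise only the construction of the query plan is measured.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def time_query(run_action: Callable[[], T]) -> Tuple[T, float]:
    """Run a callable that triggers a query and return (result, seconds)."""
    start = time.perf_counter()
    result = run_action()
    elapsed = time.perf_counter() - start
    return result, elapsed


def example_with_spark(csv_path: str) -> None:
    """Hypothetical usage: time a group-by query on a CSV dataset."""
    # Imported here so that time_query itself works without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("query-timing-sketch").getOrCreate()
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # groupBy(...).count() is a lazy transformation; the final count()
    # is the action that actually executes the query.
    first_column = df.columns[0]
    groups, seconds = time_query(
        lambda: df.groupBy(first_column).count().count()
    )
    print(f"{groups} groups in {seconds:.3f} s")

    spark.stop()
```

Running the same timed call on the local machine and on the virtual one would give comparable figures, although the first query in a session also tends to include start-up overhead, so repeating each measurement is advisable.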

Technologies

  • Java
  • Python
  • PySpark
  • Apache Spark
  • Jupyter Notebook
