This Python project deploys several machine learning classifiers on the MMTF-14K dataset, which consists of visual, audio, and metadata features of films. The aim of the task was to predict each film's genre from this data. See the report PDF for further detail.
jockharkness/Movie-Genre-Classification---COMP90049-Introduction-to-Machine-Learning-Assignment-2
Below is a description of each of the files submitted:

get_data.py: extracts the data from the TSV files.

baseline.py: performs the Zero-R baseline.

feature_extraction.py: performs the preprocessing. The datasets are concatenated and processed jointly so that the vectorisation is consistent. The tags and titles features are lemmatised and stop words are removed. There is a getter for each feature used in the analysis.

classifiers.py: used in the preliminary testing of the features. The program iterates through four classifiers, outputting results for each of them.

decisiontree.py: contains the code to implement the decision tree and its testing, as well as the pruning component. The figures plot the effects of pruning on accuracy.

neuralnet_gridsearch.py: used to test which parameters may increase accuracy for the MLP classifier. Cross-validated grid search (GridSearchCV) from the sklearn package was used to iterate through different parameter settings.
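To make two of the ideas above concrete, here is a minimal standard-library sketch of a Zero-R baseline (always predict the most frequent training genre) and of building one vocabulary over the concatenated datasets so that train and test vectors share the same columns. It is not the code from baseline.py or feature_extraction.py: every function name, the stop-word list, and the toy titles and genres are assumptions made for illustration, and lemmatisation is left out.

```python
from collections import Counter

# Hypothetical stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "and", "in"}


def zero_r_fit(train_labels):
    """Zero-R: return the most frequent label seen in training."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


def tokenise(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_vocabulary(*datasets):
    """Map every token found in any of the datasets to a column index."""
    tokens = sorted(
        {tok for data in datasets for text in data for tok in tokenise(text)}
    )
    return {tok: i for i, tok in enumerate(tokens)}


def vectorise(texts, vocabulary):
    """Turn each text into a bag-of-words count vector over the vocabulary."""
    rows = []
    for text in texts:
        row = [0] * len(vocabulary)
        for tok in tokenise(text):
            if tok in vocabulary:
                row[vocabulary[tok]] += 1
        rows.append(row)
    return rows


# Toy data (invented, not taken from MMTF-14K).
train_titles = ["The Dark Forest", "Love in the City", "Forest of Love"]
train_genres = ["Thriller", "Romance", "Romance"]
test_titles = ["City of the Dark"]
test_genres = ["Thriller"]

# Zero-R baseline: every test film gets the majority training genre.
majority = zero_r_fit(train_genres)
baseline_preds = [majority] * len(test_genres)
baseline_acc = accuracy(baseline_preds, test_genres)

# Build the vocabulary from both splits so the columns line up.
vocab = build_vocabulary(train_titles, test_titles)
X_train = vectorise(train_titles, vocab)
X_test = vectorise(test_titles, vocab)
```

Because the vocabulary is built once over both splits, `X_train` and `X_test` always have the same width, which is the consistency the joint preprocessing is meant to give. Any classifier that scores at or below `baseline_acc` on real data is doing no better than guessing the majority genre.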