Uni/machine-learning

README.md

Machine Learning Project – Summer Semester 2025

This project was developed for the "Machine Learning" course in the Practical Computer Science program at HTW Saar in the summer semester of 2025. The goal is to predict the genres of a game from its description using several machine learning techniques.

Project Overview

We use a cleaned Steam dataset that contains game descriptions and genre labels along with many other features. The main challenge was to build a robust multi-label classification model that handles several genres per game and still works with a relatively small dataset, since computational constraints limited how much data we could train on.

Our workflow includes:

Data cleaning and preprocessing
Feature extraction
Multi-label genre encoding
Model selection and evaluation
Optimization suggestions for future work
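
As an illustration of the multi-label genre encoding step listed above, here is a minimal sketch in plain Python. It is not taken from the notebook: the function names and the sample genre lists are hypothetical, and the notebook may use a library helper such as scikit-learn's `MultiLabelBinarizer` instead. The idea is that each game's genre list becomes a fixed-length vector with a 1 for every genre that applies.

```python
def fit_genre_vocabulary(genre_lists):
    """Collect every distinct genre and assign it a stable column index."""
    genres = sorted({genre for genres in genre_lists for genre in genres})
    return {genre: index for index, genre in enumerate(genres)}


def encode_genres(genres, vocabulary):
    """Turn one game's genre list into a multi-hot vector of 0s and 1s."""
    row = [0] * len(vocabulary)
    for genre in genres:
        if genre in vocabulary:  # genres not seen during fitting are ignored
            row[vocabulary[genre]] = 1
    return row


# Hypothetical sample data, not rows from the Steam dataset.
games = [["Action", "Indie"], ["RPG"], ["Action", "RPG", "Strategy"]]
vocabulary = fit_genre_vocabulary(games)
encoded = [encode_genres(genres, vocabulary) for genres in games]
```

With this representation a classifier can predict each column independently, which is what allows one game to receive several genres at once.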

Dataset

The dataset used for this project is available here:
Steam Games Dataset from Kaggle

Repository

The full project, including the Jupyter Notebook, code, results, and every dataset size we used, can be found on GitHub:
GitHub FlorianSpeicher04/machine-learning

Large File Storage (git-lfs)

Some files in this repository (such as the datasets) are managed using git-lfs.
To clone the repository with all large files, please make sure you have git-lfs installed:

git lfs install
git clone https://github.com/FlorianSpeicher04/machine-learning
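
If the repository is cloned without git-lfs, the large files are checked out as small text pointer stubs instead of the real data. The following sketch shows one way to detect that from Python before running the notebook. The helper names are hypothetical; the prefix it looks for is the first line of the Git LFS pointer file format.

```python
# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(head):
    """Return True if the leading bytes of a file match an LFS pointer stub."""
    return head.startswith(LFS_POINTER_PREFIX)


def is_lfs_pointer(path):
    """Read just enough of the file at `path` to apply the check above."""
    with open(path, "rb") as handle:
        return looks_like_lfs_pointer(handle.read(len(LFS_POINTER_PREFIX)))
```

For example, `is_lfs_pointer("games_march2025_cleaned_2k.csv")` returning `True` would suggest that the dataset was not downloaded; running `git lfs pull` inside the clone fetches the actual files.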

How to Run

Clone the repository (see above).
Install the required Python packages.
Open Machine-Learning.ipynb in Jupyter Notebook or VS Code.
Follow the steps in the notebook to reproduce the results (Run All).

Results

Our model achieves reasonable performance given the dataset size and computational limitations. For more details, see the evaluation and conclusion sections in the notebook.
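
For readers unfamiliar with how multi-label predictions are scored, the sketch below computes two common metrics, micro-averaged F1 and Hamming loss, on multi-hot vectors. This README does not state which metrics the notebook reports, so treat the choice of metrics as an assumption. The function names and the toy labels are hypothetical and are not results from this project.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives over all games and genres."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for truth, guess in zip(true_row, pred_row):
            if truth and guess:
                tp += 1
            elif guess:
                fp += 1
            elif truth:
                fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def hamming_loss(y_true, y_pred):
    """Fraction of individual genre slots that were predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        truth != guess
        for true_row, pred_row in zip(y_true, y_pred)
        for truth, guess in zip(true_row, pred_row)
    )
    return wrong / total if total else 0.0


# Toy multi-hot labels for three games and four genres (illustrative only).
y_true = [[1, 1, 0, 0], [0, 0, 1, 0], [1, 0, 1, 1]]
y_pred = [[1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 1, 0]]
f1 = micro_f1(y_true, y_pred)
loss = hamming_loss(y_true, y_pred)
```

Micro-averaging weights frequent genres more heavily, while Hamming loss treats every genre slot equally, so the two numbers can tell different stories on an imbalanced genre distribution.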

Contributors

Maximilian Kany 5016118
Florian Speicher 5014185
Tim Wall 5014365