# Kaggle ML Projects & AI Models
Welcome to my Machine Learning portfolio! I am an AI Tools engineering student and a Python developer. This repository contains my code, notebooks, and models from various Kaggle competitions and Data Science challenges.
- Goal: To predict which passengers survived the Titanic shipwreck using passenger data (e.g., name, age, gender, socio-economic class).
- Tech Stack: Python, Pandas, Machine Learning Models
- My Achievement: Trained a model that achieved an accuracy score of 0.77511 (77.5%) on the first submission.
- Python
- Jupyter Notebooks
- Data Analysis & Predictive Modeling
- Goal: To analyze the gaming market and identify the most popular genres using worldwide sales data.
- Tech Stack: Python, Pandas, Matplotlib
- Key Insight: Used automated file discovery to handle dynamic datasets and visualized top gaming trends.
- Goal: To predict which passengers were transported to an alternate dimension during the spaceship's collision with a spacetime anomaly.
- Tech Stack: Python, Pandas, Data Preprocessing
- My Achievement (Day 2): Performed advanced data cleaning and handled missing values (`NaN`) on live competition data, then generated a baseline prediction with an initial score of 0.49310, setting the stage for a trained Machine Learning model.
- Goal: To predict which passengers were transported to an alternate dimension during the spaceship's collision with a spacetime anomaly.
- Tech Stack: Python, Pandas, Scikit-Learn, Machine Learning (Random Forest Classifier)
- My Achievement:
  - Day 2: Performed advanced data cleaning and handled missing values (`NaN`) on live competition data, establishing a baseline.
  - Day 3: Trained, validated, and submitted a Random Forest model that beat the baseline with an accuracy score of 0.79448 (79.4%).
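
The two steps above (filling `NaN` values, then training and validating a Random Forest) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's actual code: the column names (`Age`, `Spend`, `HomePlanet`, `Transported`), the median/mode imputation strategy, and the 80/20 split are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the competition data (all columns are hypothetical).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Age": rng.integers(1, 80, n).astype(float),
    "Spend": rng.exponential(200.0, n),
    "HomePlanet": rng.choice(["Earth", "Europa", "Mars"], n),
})
df["Transported"] = (df["Spend"] < 150).astype(int)

# Knock out roughly 10% of the feature values so there is something to clean.
for col in ["Age", "Spend", "HomePlanet"]:
    df.loc[rng.random(n) < 0.1, col] = np.nan

# Numeric gaps -> column median; categorical gaps -> most frequent value.
for col in ["Age", "Spend"]:
    df[col] = df[col].fillna(df[col].median())
df["HomePlanet"] = df["HomePlanet"].fillna(df["HomePlanet"].mode()[0])

# One-hot encode the categorical column so the forest gets numeric input.
X = pd.get_dummies(df.drop(columns="Transported"))
y = df["Transported"]

# Hold out 20% of the rows to estimate accuracy before submitting.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
val_accuracy = accuracy_score(y_val, model.predict(X_val))
print(f"Validation accuracy: {val_accuracy:.3f}")
```

Checking accuracy on a held-out split gives a rough idea of the leaderboard score before a submission is spent.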
**4. Credit Card Fraud Detection:** Developed a Credit Card Fraud Detection model with 92% recall using balanced Logistic Regression.
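
As a rough sketch of what "balanced" Logistic Regression can look like in Scikit-Learn, the example below compares recall with and without `class_weight="balanced"` on synthetic, imbalanced data. The dataset, the 2% fraud rate, and the split sizes are assumptions for illustration, so the numbers it prints are unrelated to the 92% recall reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic, heavily imbalanced data: roughly 2% of samples are "fraud" (class 1).
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=5,
    weights=[0.98, 0.02], random_state=0,
)

# Stratify so the rare class appears in both splits in the same proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Same model twice: once unweighted, once with classes reweighted
# inversely to their frequency.
plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
balanced = LogisticRegression(class_weight="balanced", max_iter=1000).fit(
    X_train, y_train
)

recall_plain = recall_score(y_test, plain.predict(X_test))
recall_balanced = recall_score(y_test, balanced.predict(X_test))
print(f"recall without weighting: {recall_plain:.2f}")
print(f"recall with class_weight='balanced': {recall_balanced:.2f}")
```

Recall is the metric to watch here because a missed fraud case usually costs more than a false alarm; the weighting trades some precision for it.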
This project is a Computer Vision classification model built to correctly identify hand-written digits (0-9) from a dataset of tens of thousands of scanned images. Instead of using standard image files, the model processes 784 pixel values (28x28 images) per digit to recognize patterns.
- Accuracy Score: 96.57%
- Algorithm: Random Forest Classifier (`n_estimators=100`)
- Platform: Kaggle
- Language: Python
- Libraries: Pandas, Scikit-Learn
- Environment: Kaggle Notebooks / Jupyter
- Handling large datasets and pixel-level data extraction.
- Training an ensemble Machine Learning model (Random Forest) for image classification.
- Linking Kaggle directly to GitHub for automated version control.
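
The pixel-level extraction described above (784 values per digit, one 28x28 image per row) might look something like this. The data is random and the column names (`label`, `pixel0` ... `pixel783`) and row-major pixel order are assumptions made for the sketch, not details confirmed by this project.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a digit CSV: one label column plus 784 pixel columns.
# The column names ("label", "pixel0" ... "pixel783") are assumptions.
rng = np.random.default_rng(0)
n_images = 5
pixel_cols = [f"pixel{i}" for i in range(784)]
pixels = rng.integers(0, 256, size=(n_images, 784))
df = pd.DataFrame(pixels, columns=pixel_cols)
df.insert(0, "label", rng.integers(0, 10, size=n_images))

# Split the labels from the flat pixel rows.
y = df["label"].to_numpy()
X = df[pixel_cols].to_numpy(dtype=np.float32)

# Assuming row-major order, each flat row of 784 values is one 28x28 image.
images = X.reshape(-1, 28, 28)
print(X.shape, images.shape)

# A Random Forest can take the flat X as-is; scaling to [0, 1] is optional
# for tree models but handy when plotting or switching to other algorithms.
X_scaled = X / 255.0
```

The flat `X` is what a classifier such as `RandomForestClassifier(n_estimators=100)` would be fitted on, while `images` is convenient for displaying individual digits.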
Docs: Added comprehensive documentation for the NLP Kaggle competition (Score: 0.796).
A comprehensive data analysis of global video game sales to uncover market trends, popular genres, and dominant publishers. This market research serves as a foundational study for upcoming game development projects at VS Gaming Studio, helping the team understand player demands.
- Top Genres: Action and Sports games have historically seen the highest number of releases.
- Top Publishers: Nintendo and Electronic Arts (EA) maintain a massive lead in global sales.
- Data Value: Understanding platform adoption and genre popularity helps indie developers make data-driven decisions.
- Language: Python
- Libraries: Pandas, Matplotlib, Seaborn
- Environment: Kaggle Notebooks
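
For the video game sales analysis, automated file discovery followed by a per-genre summary could be sketched as below. The helper `find_csv_files` is hypothetical, the tiny dataset is made up, and the column names (`Genre`, `Publisher`, `Global_Sales`) are assumptions; the sketch writes its own CSV to a temporary folder so it runs anywhere.

```python
import os
import tempfile

import pandas as pd


def find_csv_files(root):
    """Return the paths of all .csv files under root, sorted for stable order."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".csv"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Build a tiny made-up dataset in a temporary folder.
# On Kaggle, root would typically be the notebook's input directory instead.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sales"))
    sample = pd.DataFrame({
        "Genre": ["Action", "Sports", "Action", "Puzzle"],
        "Publisher": ["A", "B", "B", "A"],
        "Global_Sales": [3.0, 2.5, 1.5, 0.5],
    })
    sample.to_csv(os.path.join(root, "sales", "games.csv"), index=False)

    # Discover the file instead of hard-coding its path, then load it.
    csv_paths = find_csv_files(root)
    df = pd.read_csv(csv_paths[0])

# Number of releases and total sales per genre, largest totals first.
by_genre = (
    df.groupby("Genre")["Global_Sales"]
    .agg(releases="count", total_sales="sum")
    .sort_values("total_sales", ascending=False)
)
print(by_genre)
```

Walking the input folder rather than hard-coding a filename keeps the notebook working if the dataset is renamed or re-versioned.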
This repository/notebook contains my Day 9 progress of the '15 Days of Python Basics' learning track. Today, I shifted focus to Data Science and Analytics using Kaggle. The project demonstrates how to create, manipulate, and visualize datasets using Python's most powerful data libraries.
As a practical example, I analyzed a custom player dataset for VS Gaming Studio, filtering pro players and visualizing their scores using a dark-themed cyberpunk bar chart.
- Language: Python 3
- Libraries: Pandas (Data Manipulation), Matplotlib (Data Visualization)
- Environment: Kaggle Notebooks
- Dictionary to DataFrame Conversion: Creating structured tabular data (`pd.DataFrame`).
- Data Inspection: Using `.head()` and `.describe()` to get a quick statistical summary of the data.
- Data Filtering: Extracting specific rows based on conditions (e.g., filtering players with a score > 1000).
- Data Visualization: Plotting customized bar charts using Matplotlib with custom themes (`dark_background`), labels, and titles.
- Open Kaggle.
- Create a New Notebook.
- Copy the code cells provided in this project.
- Hit `Shift + Enter` on each cell to see the data tables and graphs render in real time.
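
The concepts listed above (dictionary to DataFrame, inspection, filtering, and a dark-themed bar chart) fit in a few cells. The sketch below uses made-up player names and scores, not the project's actual dataset, and the output filename is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; this line can be dropped in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up player data; the names, scores, and column labels are illustrative only.
players = {
    "player": ["Nova", "Blitz", "Echo", "Pixel"],
    "score": [1500, 800, 1200, 950],
}

# Dictionary to DataFrame conversion.
df = pd.DataFrame(players)

# Quick inspection: first rows and a statistical summary.
print(df.head())
print(df.describe())

# Filtering: keep only players with a score above 1000.
pro_players = df[df["score"] > 1000]

# Visualization: bar chart using Matplotlib's built-in dark theme.
plt.style.use("dark_background")
fig, ax = plt.subplots()
ax.bar(pro_players["player"], pro_players["score"], color="cyan")
ax.set_xlabel("Player")
ax.set_ylabel("Score")
ax.set_title("Pro players (score > 1000)")
fig.savefig("pro_players.png")
```

In a Kaggle notebook the figure renders inline, so the `savefig` call is only needed when the chart should be kept as a file.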
"Consistency is the ultimate hack." - Keeping the daily learning streak alive!
Developed as part of my daily AI & Machine Learning practice.
Quietly working away and building the future with AI.