
krishsharma-code/Kaggle-ML-Projects


🤖 Kaggle ML Projects & AI Models

Welcome to my Machine Learning portfolio! I am an AI Tools engineering student and a Python developer. This repository contains my code, notebooks, and models from various Kaggle competitions and Data Science challenges.

🚀 Featured Projects

1. Titanic - Machine Learning from Disaster 🚢

  • Goal: To predict which passengers survived the Titanic shipwreck using passenger data (e.g., name, age, gender, socio-economic class).
  • Tech Stack: Python, Pandas, Machine Learning Models
  • My Achievement: Trained a model that scored 0.77511 (77.5% accuracy) on the first submission.

🛠️ Tools & Technologies

  • Python
  • Jupyter Notebooks
  • Data Analysis & Predictive Modeling

2. Video Game Sales Analysis 🎮

  • Goal: To analyze the gaming market and identify the most popular genres using worldwide sales data.
  • Tech Stack: Python, Pandas, Matplotlib
  • Key Insight: Used automated file discovery to handle dynamic datasets and visualized top gaming trends.
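The "automated file discovery" mentioned above can be sketched as a recursive search for CSV files, so the notebook does not depend on a hard-coded dataset path. This is a minimal sketch and not the notebook's actual code: the function name `discover_csv_files`, the folder name, the file name `vgsales.csv`, and the column names are all hypothetical, and the demo builds a temporary folder so it runs anywhere.

```python
import tempfile
from pathlib import Path
from typing import List

import pandas as pd


def discover_csv_files(root: str) -> List[Path]:
    """Return every CSV file under `root` (searched recursively), sorted by path."""
    root_path = Path(root)
    if not root_path.is_dir():
        # A missing input folder simply yields no files.
        return []
    return sorted(root_path.rglob("*.csv"))


# Demo on a temporary folder (hypothetical layout and file name).
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "videogamesales"
    nested.mkdir()
    sample = pd.DataFrame({"Genre": ["Action", "Sports"], "Global_Sales": [1.5, 2.0]})
    sample.to_csv(nested / "vgsales.csv", index=False)

    found = discover_csv_files(tmp)
    df = pd.read_csv(found[0])  # load whichever CSV was discovered first
    print(found[0].name, df.shape)
```

On Kaggle, the same function could be pointed at the input directory (`/kaggle/input`) instead of the temporary folder.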

3. Spaceship Titanic - Cosmic Mystery 🚀

  • Goal: To predict which passengers were transported to an alternate dimension during the spaceship's collision with a spacetime anomaly.
  • Tech Stack: Python, Pandas, Scikit-Learn, Machine Learning (Random Forest Classifier)
  • My Achievement:
      • Day 2: Cleaned the live competition data, handled missing values (NaN), and generated a baseline prediction that scored 0.49310.
      • Day 3: Trained, validated, and submitted a Random Forest model that raised the score from the 0.49310 baseline to 0.79448 (79.4% accuracy).
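The two steps above (filling missing values, then training and validating a Random Forest) can be sketched as follows. This is an illustration under stated assumptions, not the competition notebook: the data is synthetic, only a few columns named after the competition's fields are used, and median/mode imputation is one common choice that the README does not confirm. The accuracy it prints is not comparable to the scores reported above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for the competition data (values are made up).
df = pd.DataFrame({
    "HomePlanet": rng.choice(["Earth", "Europa", "Mars"], size=n),
    "CryoSleep": rng.choice([True, False], size=n),
    "Age": rng.integers(1, 80, size=n).astype(float),
    "RoomService": rng.exponential(200.0, size=n),
})
# Target loosely tied to CryoSleep, with roughly 20% of labels flipped as noise.
df["Transported"] = df["CryoSleep"] ^ (rng.random(n) < 0.2)

# Knock out some values to imitate missing data.
df.loc[rng.choice(n, size=40, replace=False), "Age"] = np.nan
df.loc[rng.choice(n, size=40, replace=False), "HomePlanet"] = np.nan

# Data cleaning: median for the numeric column, most frequent value for the categorical one.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["HomePlanet"] = df["HomePlanet"].fillna(df["HomePlanet"].mode()[0])

# One-hot encode the categorical column and convert everything to floats.
X = pd.get_dummies(df.drop(columns="Transported"), columns=["HomePlanet"]).astype(float)
y = df["Transported"].astype(int)

# Hold out 25% of the rows for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
val_accuracy = accuracy_score(y_val, model.predict(X_val))
print(f"Validation accuracy on synthetic data: {val_accuracy:.3f}")
```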

4. Credit Card Fraud Detection

  • Developed a credit card fraud detection model with 92% recall using balanced Logistic Regression.
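For readers unfamiliar with "balanced" Logistic Regression, here is a minimal sketch on synthetic, heavily imbalanced data. It assumes that "balanced" refers to scikit-learn's `class_weight="balanced"` option, which the README does not confirm; the dataset and the resulting recall values are illustrative only and unrelated to the 92% figure above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic data where only about 2% of samples belong to the positive ("fraud") class.
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.98, 0.02], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Unweighted model versus a model that reweights classes inversely to their frequency.
plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
balanced = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_train, y_train)

# Recall = share of true positive-class samples that the model flags.
recall_plain = recall_score(y_test, plain.predict(X_test))
recall_balanced = recall_score(y_test, balanced.predict(X_test))
print(f"Recall without weighting: {recall_plain:.2f}")
print(f"Recall with class_weight='balanced': {recall_balanced:.2f}")
```

Class weighting typically trades some precision for higher recall, which tends to suit fraud detection, where a missed fraud case usually costs more than a false alarm.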

👁️ Digit Recognizer - Computer Vision (Kaggle)

🎯 Project Overview

This project is a computer vision classification model built to identify handwritten digits (0-9) from a dataset of tens of thousands of scanned images. Instead of reading standard image files, the model takes each digit as 784 pixel values (a flattened 28x28 image) and learns patterns from them.

🏆 Model Performance

  • Accuracy Score: 96.57%
  • Algorithm: Random Forest Classifier (n_estimators=100)
  • Platform: Kaggle
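The setup described above (784 pixel values per 28x28 digit, a Random Forest with `n_estimators=100`) can be sketched as follows. This is a hedged illustration, not the project's notebook: the `label` and `pixel0` ... `pixel783` column layout is an assumption, and the pixels are random noise, so the printed accuracy will sit near chance (about 10%) rather than anywhere close to 96.57%.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 300

# Assumed layout: one row per image, a "label" column plus 784 pixel columns.
pixel_cols = [f"pixel{i}" for i in range(784)]
train = pd.DataFrame(rng.integers(0, 256, size=(n_samples, 784)), columns=pixel_cols)
train.insert(0, "label", rng.integers(0, 10, size=n_samples))

X = train[pixel_cols]
y = train["label"]

# Each flat row of 784 values can be viewed as a 28x28 image again.
first_image = X.iloc[0].to_numpy().reshape(28, 28)

# Hold out 20% of the rows for validation.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
val_accuracy = model.score(X_val, y_val)
print(f"Validation accuracy on random pixels: {val_accuracy:.3f}")
```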

🛠️ Tech Stack & Tools

  • Language: Python
  • Libraries: Pandas, Scikit-Learn
  • Environment: Kaggle Notebooks / Jupyter

🚀 Key Learnings

  • Handling large datasets and pixel-level data extraction.
  • Training an ensemble Machine Learning model (Random Forest) for image classification.
  • Linking Kaggle directly to GitHub for automated version control.

Docs: Added documentation for an NLP Kaggle competition entry (score: 0.796).

🎮 Video Game Sales & Market Analysis (VS Gaming Studio)

🎯 Project Overview

A comprehensive data analysis of global video game sales to uncover market trends, popular genres, and dominant publishers. This market research serves as a foundational study for upcoming game development projects at VS Gaming Studio, helping the team understand player demands.

📊 Key Insights

  • Top Genres: Action and Sports games have historically seen the highest number of releases.
  • Top Publishers: Nintendo and Electronic Arts (EA) maintain a massive lead in global sales.
  • Data Value: Understanding platform adoption and genre popularity helps indie developers make data-driven decisions.
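Insights like the ones above usually come from simple counts and grouped sums. The sketch below shows the idea on a tiny made-up table: the rows and sales figures are invented for illustration, and the column names (`Genre`, `Publisher`, `Global_Sales`) are assumptions about the dataset's layout, so the output says nothing about the real market.

```python
import pandas as pd

# Made-up rows for illustration only.
games = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game C", "Game D", "Game E", "Game F"],
    "Genre": ["Action", "Action", "Sports", "Sports", "Action", "Puzzle"],
    "Publisher": ["Nintendo", "Electronic Arts", "Electronic Arts",
                  "Nintendo", "Other Studio", "Nintendo"],
    "Global_Sales": [10.0, 4.0, 6.0, 8.0, 1.0, 3.0],  # millions of units, invented
})

# Number of releases per genre, most common first.
releases_per_genre = games["Genre"].value_counts()

# Total global sales per publisher, largest first.
sales_per_publisher = (
    games.groupby("Publisher")["Global_Sales"].sum().sort_values(ascending=False)
)

print(releases_per_genre)
print(sales_per_publisher)
```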

🛠️ Tech Stack & Tools

  • Language: Python
  • Libraries: Pandas, Matplotlib, Seaborn
  • Environment: Kaggle Notebooks

📊 Day 9: Python Data Analysis Basics & Visualization

📝 Overview

This repository/notebook contains my Day 9 progress of the '15 Days of Python Basics' learning track. Today, I shifted focus to Data Science and Analytics using Kaggle. The project demonstrates how to create, manipulate, and visualize datasets using Python's most powerful data libraries.

As a practical example, I analyzed a custom player dataset for VS Gaming Studio, filtering pro players and visualizing their scores using a dark-themed cyberpunk bar chart.

🛠️ Tech Stack Used

  • Language: Python 3
  • Libraries: Pandas (Data Manipulation), Matplotlib (Data Visualization)
  • Environment: Kaggle Notebooks

🧠 Concepts Mastered Today

  1. Dictionary to DataFrame Conversion: Creating structured tabular data (pd.DataFrame).
  2. Data Inspection: Using .head() and .describe() to get a quick statistical summary of the data.
  3. Data Filtering: Extracting specific rows based on conditions (e.g., filtering players with a score > 1000).
  4. Data Visualization: Plotting customized bar charts using Matplotlib with custom themes (dark_background), labels, and titles.
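The four concepts above fit in one short script. This is a sketch with hypothetical player names and scores, not the original notebook; the `Agg` backend line is only there so the script also runs outside a notebook and can be dropped on Kaggle.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; unnecessary inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# 1. Dictionary to DataFrame (hypothetical players and scores).
players = pd.DataFrame({
    "player": ["Nova", "Blitz", "Echo", "Pixel"],
    "score": [1500, 800, 1200, 950],
})

# 2. Data inspection: first rows and summary statistics.
print(players.head())
print(players.describe())

# 3. Data filtering: keep players with a score above 1000.
pro_players = players[players["score"] > 1000]

# 4. Data visualization: bar chart on the built-in dark theme.
plt.style.use("dark_background")
fig, ax = plt.subplots()
ax.bar(pro_players["player"], pro_players["score"], color="cyan")
ax.set_title("Pro Players (score > 1000)")
ax.set_xlabel("Player")
ax.set_ylabel("Score")
```

In a Kaggle notebook the figure renders inline once the cell finishes.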

🚀 How to Run (Kaggle)

  1. Open Kaggle.
  2. Create a New Notebook.
  3. Copy the code cells provided in this project.
  4. Hit Shift + Enter on each cell to see the data tables and graphs render in real-time.

🔥 "Consistency is the ultimate hack." - Keeping the daily learning streak alive!


Developed as part of my daily AI & Machine Learning practice.


Quietly working away and building the future with AI.

About

🚀 A collection of my Data Science, AI/ML models, and Kaggle competition submissions, starting with predictive analysis in Python.
