Projects

Advanced Data Warehouse Project - Sales & Returns Analysis

Built an end-to-end data pipeline integrating OLTP and Data Warehouse systems across MySQL and Oracle, transforming 51K+ transactional records into structured analytical insights for reporting and decision-making.

  • Designed and implemented a normalized OLTP system and a star-schema Data Warehouse across two database platforms
  • Built ETL pipeline to extract, transform, and load data from OLTP to DW environment
  • Implemented Change Data Capture (CDC) for incremental loading, reducing processing time by ~30%
  • Performed complex SQL transformations including joins, aggregations, and schema mapping
  • Designed fact and dimension tables (customer, product, date, location, ship mode) for analytical queries
  • Developed Tableau dashboards tracking 8–10 key KPIs across sales, profit, and returns
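
The incremental-loading idea behind the CDC bullet above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: it assumes a timestamp watermark as the change-detection mechanism (the project may have used a different CDC method), uses Python's built-in sqlite3 as a stand-in for MySQL and Oracle, and the table and column names (orders, fact_sales, updated_at) are hypothetical.

```python
import sqlite3


def incremental_load(src, dw, last_watermark):
    """Copy only source rows changed since last_watermark into the warehouse.

    Returns the new watermark (the latest updated_at value that was loaded).
    Table and column names are hypothetical, chosen for illustration.
    """
    rows = src.execute(
        "SELECT order_id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # INSERT OR REPLACE acts as an upsert keyed on the primary key,
    # so re-delivered or updated rows overwrite the old version.
    dw.executemany(
        "INSERT OR REPLACE INTO fact_sales (order_id, amount, updated_at) "
        "VALUES (?, ?, ?)",
        rows,
    )
    dw.commit()
    return rows[-1][2] if rows else last_watermark


def make_demo_databases():
    """Create tiny in-memory source and warehouse databases for the sketch."""
    src = sqlite3.connect(":memory:")
    src.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    dw = sqlite3.connect(":memory:")
    dw.execute(
        "CREATE TABLE fact_sales (order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    return src, dw
```

Each run only touches rows newer than the stored watermark, which is where the time saving over a full reload comes from.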

IBM Capstone Project (Developer Survey Analysis)

Analyzed real-world developer survey datasets to uncover technology trends and generate actionable insights. Focused on data cleaning, exploratory analysis, and dashboard development for KPI tracking.

  • Cleaned and transformed large-scale survey datasets using Python (Pandas) to prepare structured data for analysis.
  • Performed exploratory data analysis (EDA) to identify trends in programming languages, cloud platforms, and database technologies.
  • Queried and aggregated data using SQL to extract key metrics for reporting and analysis.
  • Built interactive dashboards in IBM Cognos Analytics to track KPIs and visualize technology adoption trends.
  • Translated analytical findings into actionable insights to support data-driven decision-making.

Bank Customer Churn Prediction Project

Tools: Python (Pandas, Scikit-learn, Matplotlib, Seaborn), Power BI, Jupyter Notebook

  • Developed an end-to-end customer churn prediction system using machine learning and visual analytics.
  • Built and compared classification models (Logistic Regression, Decision Tree, Random Forest) to predict whether a customer will churn based on demographics, account activity, and balance.
  • Achieved 87% accuracy and a significant improvement in recall and F1-score using Random Forest after preprocessing and feature scaling.
  • Performed EDA and feature engineering, including missing value checks, categorical encoding, and feature scaling.
  • Designed a Power BI dashboard to visualize churn distribution across gender, age, credit score groups, and tenure, revealing key business insights.
  • Evaluated all models using Accuracy, Precision, Recall, F1-Score, and Confusion Matrix to determine the best-performing algorithm.
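
For reference, the evaluation metrics named above can all be derived from the four confusion-matrix counts. The sketch below is plain Python with hypothetical function names; the project itself used Scikit-learn, and no numbers here are the project's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Recall and F1 matter more than raw accuracy for churn, because churners are usually the minority class and a model that predicts "no churn" for everyone can still score a high accuracy.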

NYC Turnstile Data Analysis

Tools: Apache Spark, PySpark, Kafka, SparkML, MongoDB, Python, SQL, Pandas, Seaborn, Matplotlib

  • Analyzed over 13 million records of NYC MTA Subway turnstile data to uncover ridership trends and station-level traffic patterns using PySpark and Pandas.
  • Built an end-to-end real-time data pipeline using Apache Kafka, PySpark Structured Streaming, and MongoDB for live prediction of subway foot traffic.
  • Developed and deployed machine learning models (Random Forest, Decision Tree, Linear Regression) using SparkML to predict hourly passenger entries/exits with 93.36% accuracy.
  • Engineered custom Kafka producer scripts to simulate realistic subway traffic, publishing streaming data every 3–5 seconds.
  • Designed insightful visualizations including heatmaps, boxplots, and bar charts to reveal peak hours, seasonal usage, and holiday traffic dips using Seaborn and Matplotlib.
  • Implemented delta-based transformation logic to convert cumulative turnstile counts into accurate interval-based foot traffic data.
  • Managed structured and semi-structured data using MongoDB to enable flexible querying and integration with Spark SQL.
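
The delta-based transformation mentioned above can be illustrated with a small plain-Python sketch. Turnstile counters are cumulative, so the traffic for an interval is the difference between consecutive readings of the same turnstile. The function name, the tuple layout, and the max_delta cutoff are assumptions made for illustration; the project itself used PySpark, where a window function such as lag would be the more natural fit.

```python
def cumulative_to_intervals(readings, max_delta=10_000):
    """Convert cumulative counter readings into per-interval counts.

    readings: iterable of (turnstile_id, timestamp, cumulative_count),
    where timestamps sort chronologically (e.g. ISO-8601 strings).
    Returns a list of (turnstile_id, timestamp, delta) tuples.
    """
    last_seen = {}
    intervals = []
    for turnstile_id, timestamp, count in sorted(
        readings, key=lambda r: (r[0], r[1])
    ):
        previous = last_seen.get(turnstile_id)
        last_seen[turnstile_id] = count
        if previous is None:
            # The first reading has nothing to diff against, so it is skipped.
            continue
        delta = count - previous
        if delta < 0 or delta > max_delta:
            # Assumption: counter resets and implausible jumps count as 0.
            delta = 0
        intervals.append((turnstile_id, timestamp, delta))
    return intervals
```

Zeroing out negative or oversized deltas is one simple policy for counter resets; dropping those rows instead is an equally reasonable choice.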

NICE Cruise Case Study

Tools: MySQL, PHP, HTML, CSS, JS, XAMPP

  • Designed and implemented a robust relational database schema to manage cruise operations.
  • Developed a RESTful web application using PHP, MySQL, HTML, CSS and JS.
  • Enhanced security by implementing SQL injection prevention, password hashing, XSS protection and secure session handling.
  • Integrated CRUD operations for user and admin interactions supporting booking, updating and cancelling functionalities.
  • Created SQL queries providing insights into passenger behavior, trip occupancy and revenue trends.
  • Built a data visualization module to analyze passenger statistics through interactive graphs and charts.
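
The security measures listed above follow well-known patterns, sketched here in Python for illustration only. The application itself was written in PHP, so this is not its code; the passengers table, the function names, and the iteration count are assumptions. The three ideas shown are parameterized queries (SQL injection prevention), salted password hashing, and output escaping (XSS protection).

```python
import hashlib
import hmac
import html
import os
import sqlite3


def hash_password(password, salt=None, iterations=200_000):
    """Derive a salted PBKDF2-SHA256 hash; returns (salt, digest)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected, iterations=200_000):
    """Re-derive the hash and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


def find_passenger(conn, name):
    """Look up passengers by name using a bound parameter.

    The ? placeholder keeps user input out of the SQL text, so input
    such as "x' OR '1'='1" is treated as a literal name.
    """
    return conn.execute(
        "SELECT id, name FROM passengers WHERE name = ?", (name,)
    ).fetchall()


def render_name(name):
    """Escape user-supplied text before placing it in HTML output."""
    return html.escape(name)
```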