Projects

Power BI Dashboard

Bank Customer Churn Prediction Project

Tools: Python (Pandas, Scikit-learn, Matplotlib, Seaborn), Power BI, Jupyter Notebook

  • Developed a complete end-to-end customer churn prediction system using machine learning and visual analytics.
  • Built and compared three classification models (Logistic Regression, Decision Tree, and Random Forest) to predict whether a customer will churn based on demographics, account activity, and balance.
  • Achieved 87% accuracy with Random Forest after preprocessing and feature scaling, along with a significant improvement in recall and F1-score.
  • Performed EDA and feature engineering, including missing value checks, categorical encoding, and feature scaling.
  • Designed a Power BI dashboard to visualize churn distribution across gender, age, credit score groups, and tenure, surfacing key business insights.
  • Evaluated all models using Accuracy, Precision, Recall, F1-Score, and Confusion Matrix to determine the best-performing algorithm.
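The evaluation metrics named above can be sketched in plain Python. This is a minimal illustration, not the project's code: the function names (`confusion_matrix`, `scores`) and the toy labels used to exercise them are assumptions, and the project itself lists Scikit-learn as its tooling. It shows how accuracy, precision, recall, and F1-score all derive from the four cells of a binary confusion matrix.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # churners correctly predicted as churners
    fp: int  # non-churners wrongly predicted as churners
    fn: int  # churners the model missed
    tn: int  # non-churners correctly predicted as non-churners


def confusion_matrix(y_true, y_pred, positive=1):
    """Count the four outcome types for a binary classifier."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


def scores(cm):
    """Derive accuracy, precision, recall, and F1 from a confusion matrix."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / (cm.tp + cm.fp) if (cm.tp + cm.fp) else 0.0
    recall = cm.tp / (cm.tp + cm.fn) if (cm.tp + cm.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Computing the same dictionary for each candidate model's predictions on a held-out test set gives a side-by-side comparison. Recall and F1 are worth tracking alongside accuracy because a model can score well on accuracy while still missing many of the customers who actually churn.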
Model Comparison
View on GitHub ↗

NYC Turnstile Data Analysis

Tools: Apache Spark, PySpark, Kafka, SparkML, MongoDB, Python, SQL, Pandas, Seaborn, Matplotlib

  • Analyzed over 13 million records of NYC MTA Subway turnstile data to uncover ridership trends and station-level traffic patterns using PySpark and Pandas.
  • Built an end-to-end real-time data pipeline using Apache Kafka, PySpark Structured Streaming, and MongoDB for live prediction of subway foot traffic.
  • Developed and deployed machine learning models (Random Forest, Decision Tree, Linear Regression) using SparkML to predict hourly passenger entries/exits with 93.36% accuracy.
  • Engineered custom Kafka producer scripts to simulate realistic subway traffic, publishing streaming data every 3–5 seconds.
  • Designed insightful visualizations including heatmaps, boxplots, and bar charts to reveal peak hours, seasonal usage, and holiday traffic dips using Seaborn and Matplotlib.
  • Implemented delta-based transformation logic to convert cumulative turnstile counts into accurate interval-based foot traffic data.
  • Managed structured and semi-structured data using MongoDB to enable flexible querying and integration with Spark SQL.
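The delta-based transformation mentioned above can be illustrated with a small sketch. It is written in plain Python rather than PySpark, and the record layout `(turnstile_id, timestamp, cumulative_count)`, the function name `cumulative_to_interval`, and the `max_jump` threshold are all assumptions made for illustration. The idea is that each turnstile reports a running total, so traffic for an interval is the difference between consecutive readings from the same device.

```python
from collections import defaultdict


def cumulative_to_interval(readings, max_jump=100_000):
    """Convert cumulative counter readings into per-interval counts.

    readings: iterable of (turnstile_id, timestamp, cumulative_count).
    Timestamps are assumed to sort chronologically (e.g. ISO 8601 strings).
    Negative deltas (assumed to be counter resets) and implausibly large
    jumps are dropped instead of being reported as traffic.
    """
    by_turnstile = defaultdict(list)
    for turnstile_id, timestamp, count in readings:
        by_turnstile[turnstile_id].append((timestamp, count))

    intervals = []
    for turnstile_id, rows in by_turnstile.items():
        rows.sort()  # order each device's readings by time
        # Pair every reading with the one immediately before it.
        for (_, prev_count), (timestamp, count) in zip(rows, rows[1:]):
            delta = count - prev_count
            if delta < 0 or delta > max_jump:
                continue  # skip resets and outliers (hypothetical rule)
            intervals.append((turnstile_id, timestamp, delta))
    return intervals
```

The first reading of each turnstile produces no output row because it has no predecessor to subtract from. In a Spark setting, the same pairing of each row with its predecessor is typically expressed with a window function partitioned by device and ordered by time.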
View on GitHub ↗

NICE Cruise Screenshot

NICE Cruise Case Study

Tools: MySQL, PHP, HTML, CSS, JS, XAMPP

  • Designed and implemented a robust relational database schema to manage cruise operations.
  • Developed a RESTful web application using PHP, MySQL, HTML, CSS and JS.
  • Enhanced security by implementing SQL injection prevention, password hashing, XSS protection and secure session handling.
  • Integrated CRUD operations for user and admin interactions, supporting booking, updating, and cancelling.
  • Created SQL queries providing insights into passenger behavior, trip occupancy and revenue trends.
  • Built a data visualization module to analyze passenger statistics through interactive graphs and charts.
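Two of the security measures listed above, SQL injection prevention and password hashing, can be sketched as follows. The project itself was built with PHP and MySQL; this illustration swaps in Python's standard library (`sqlite3`, `hashlib`, `hmac`) to show the same ideas. The `users` table, its column names, the function names, and the iteration count are hypothetical.

```python
import hashlib
import hmac
import os
import sqlite3

ITERATIONS = 200_000  # assumed work factor; tune for the target hardware


def hash_password(password, salt=None):
    """Return (salt, digest) using salted PBKDF2-HMAC-SHA256."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


def make_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "username TEXT PRIMARY KEY, salt BLOB, password_hash BLOB)")
    return conn


def create_user(conn, username, password):
    salt, digest = hash_password(password)
    # Placeholders (?) keep user input out of the SQL text itself.
    conn.execute(
        "INSERT INTO users (username, salt, password_hash) VALUES (?, ?, ?)",
        (username, salt, digest))


def login(conn, username, password):
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?",
        (username,)).fetchone()
    return row is not None and verify_password(password, row[0], row[1])
```

Because the query is parameterized, an input such as `alice' OR '1'='1` is treated as a literal username and not as SQL. Storing only a salted hash means the plaintext password never reaches the database.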