Amey-Thakur/DATA-WAREHOUSING-AND-MINING-AND-DATA-WAREHOUSING-AND-MINING-LAB

University of Mumbai

Data Warehousing and Mining and Data Warehousing and Mining Laboratory

CSC603 & CSL603 · Semester VI · Computer Engineering

License: CC BY 4.0

A comprehensive academic resource for Data Warehousing and Mining (DWM) and Data Warehousing and Mining Laboratory, covering data warehouse design, OLAP operations, data mining algorithms, classification, clustering, and association rule mining.


Overview  ·  Contents  ·  Reference Books  ·  Personal Preparation  ·  Assignments  ·  Laboratory  ·  Mock Test  ·  Internal Assessment Test  ·  Semester Exam  ·  Question Papers  ·  Syllabus  ·  Usage Guidelines  ·  License  ·  About  ·  Acknowledgments


Overview

Data Warehousing and Mining (CSC603) and Data Warehousing and Mining Laboratory (CSL603) are core subjects in the Third Year (Semester VI) of the Computer Engineering curriculum at the University of Mumbai. These courses provide foundational knowledge of data warehouse architecture, ETL processes, OLAP operations, and various data mining techniques including classification, clustering, and association rule mining.

Course Topics

The curriculum encompasses several key domains in Data Warehousing and Mining:

  • Introduction to Data Warehousing: Data warehouse architecture, schemas (Star, Snowflake, Fact Constellation), metadata.
  • OLAP Technology: OLAP operations (roll-up, drill-down, slice, dice, pivot), OLAP server architectures.
  • Data Preprocessing: Data cleaning, integration, transformation, and reduction techniques.
  • Classification & Prediction: Decision trees, Bayesian classification, rule-based classification, neural networks.
  • Clustering Analysis: K-Means, hierarchical clustering, density-based clustering (DBSCAN).
  • Association Rule Mining: Apriori algorithm, FP-Growth, mining multi-level and multi-dimensional associations.
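As one concrete illustration of the clustering topic listed above, here is a minimal K-Means sketch in plain Python. It is written for this overview and is not the repository's own kmeans_clustering.py; the function name `kmeans`, its parameters, and the sample points are assumptions made for the example. Starting centroids are passed in explicitly so the run is reproducible.

```python
from math import dist


def kmeans(points, centroids, max_iters=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    `centroids` holds the starting centres; passing them in explicitly
    (instead of sampling at random) keeps the run reproducible.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, labels


# Hypothetical sample data: two visually separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

In practice the result depends on the starting centroids, so it is common to run the algorithm several times with different starts and keep the best clustering.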

Repository Purpose

This repository is a curated collection of study materials, reference books, lab experiments, and personal preparation notes compiled during my academic journey. I created and maintain this archive to preserve what I learned for continued study and future reference.

For a computer engineer, understanding data warehousing and mining techniques is essential for extracting valuable insights from large datasets. This repository serves as my reference point: a resource I can return to for relearning concepts, reviewing methodologies, and strengthening my understanding when needed.

Why this repository exists:

  • Knowledge Preservation: To maintain organized access to comprehensive study materials beyond the classroom.
  • Continuous Learning: To support lifelong learning by enabling easy revisitation of fundamental DWM concepts.
  • Academic Documentation: To authentically document my learning journey through Data Warehousing and Mining and Data Warehousing and Mining Laboratory.
  • Community Contribution: To share these resources with students and learners who may benefit from them.

Note

All materials in this repository were created, compiled, and organized by me throughout my undergraduate program (2018-2022) as part of my coursework, laboratory assignments, and project implementations.


Repository Contents

Reference Books

This collection includes comprehensive reference materials covering all major topics:

# | Resource | Focus Area
1 | DWM Techmax | Complete syllabus coverage
2 | Data Mining: Concepts and Techniques | Standard Textbook (Han & Kamber)
3 | DWM Toppers Solutions | Solved questions and exam-oriented summaries
4 | DWM - Toppers Solutions (2019) | Additional solved questions (2019)
5 | DWM Toppers Solutions (Alt) | Comprehensive solved questions
6 | DWM Question Bank | Practice questions for exam preparation
7 | DWM - IMCQ | Important MCQs for entrance and exams
8 | DWM VIVA Questions | Viva preparation material
9 | DWM Notes | Concise notes
10 | BH Plan | Study planning and strategy

Personal Preparation

Study materials and planning resources for effective exam preparation:

# | Resource | Description
1 | Blueprint | DWM exam blueprint and marking scheme
2 | Semester 6 Timetable | Academic schedule for Semester VI
3 | Computer Semester 6 Timetable | Detailed computer engineering timetable

Assignments

Course assignments demonstrating understanding of DWM concepts:

# | Assignment | Description | Date
1 | Assignment 1 | Data Warehouse Architecture, Metadata, ETL, OLAP, KDD Process | May 04, 2021
2 | Assignment 2 | Classification vs Clustering, Regression, Apriori, Web Mining | May 04, 2021

Topics Covered: Data Warehouse Architecture · Metadata Types · ETL Process · OLAP Operations · KDD Process · Classification, Prediction & Clustering · Linear Regression · Apriori Algorithm · Association Rule Mining · Spatial & Web Mining
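To make the Apriori and association rule topics above more concrete, the following is a small sketch of level-wise frequent itemset mining in plain Python. It is an illustration under stated assumptions, not the content of the assignments: the function name `apriori`, the use of an absolute support count, and the grocery baskets are all invented for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    `min_support` is an absolute transaction count, not a fraction.
    """
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Support count = number of transactions containing the candidate.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Level 1: frequent single items.
    items = {frozenset([i]) for t in transactions for i in t}
    current = {c: n for c, n in count(items).items() if n >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c: n for c, n in count(candidates).items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent = apriori(baskets, min_support=3)

# Confidence of the rule {bread} -> {milk} = support(bread, milk) / support(bread).
confidence = frequent[frozenset({"bread", "milk"})] / frequent[frozenset({"bread"})]
print(sorted((sorted(s), n) for s, n in frequent.items()))
print(confidence)
```

With these baskets every single item and every pair meets the threshold, while the three-item set appears in only two transactions and is dropped.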


Data Warehousing and Mining Laboratory

The laboratory component (CSL603) focuses on hands-on implementation of data mining algorithms and data warehousing concepts using WEKA, SQL, Python, and Java.

Tip

Strategic Insight: The success of a data mining project depends heavily on the quality of data preprocessing ("Garbage In, Garbage Out"). A well-designed data warehouse schema (Star/Snowflake) is the backbone of efficient OLAP. When implementing algorithms in WEKA or Python, prioritize data cleaning, normalization, and correct format conversion (ARFF/CSV) before modeling, because results are only as trustworthy as the input data.
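The preprocessing steps named in the tip can be sketched in a few lines of Python. This is a minimal example, assuming a CSV with a header row, numeric feature columns, one nominal class column, and no missing values; the helper names `min_max_normalize` and `csv_to_arff` and the sample rows are hypothetical and do not come from the lab files.

```python
import csv
import io


def min_max_normalize(values):
    """Rescale numbers linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def csv_to_arff(csv_text, relation, class_column):
    """Convert CSV text (with a header row) into ARFF text.

    Assumes every column except `class_column` is numeric and that there are
    no missing values; real data usually needs cleaning before this step.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    lines = [f"@relation {relation}", ""]
    for col in columns:
        if col == class_column:
            # Nominal attribute: list the distinct class labels in braces.
            labels = sorted({row[col] for row in rows})
            lines.append(f"@attribute {col} {{{','.join(labels)}}}")
        else:
            lines.append(f"@attribute {col} numeric")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(row[col] for col in columns))
    return "\n".join(lines) + "\n"


# Hypothetical two-row sample in the style of the Iris dataset.
sample_csv = "petal_length,petal_width,species\n1.4,0.2,setosa\n4.7,1.4,versicolor\n"
arff_text = csv_to_arff(sample_csv, relation="iris_sample", class_column="species")
print(arff_text)
print(min_max_normalize([2.0, 4.0, 6.0]))
```

Min-max scaling is only one normalization choice; z-score standardization is a common alternative when the data contain outliers.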

# | Experiment | Date | Report
1 | Star Schema for E-Commerce Data Warehouse | February 02, 2021 | View
2 | Dimension Tables and Fact Table Implementation | February 02, 2021 | View
3 | OLAP Operations (Roll-up, Drill-down, Slice, Dice) | March 23, 2021 | View
4 | Decision Tree Classification using J-48 (WEKA) | April 10, 2021 | View
5 | Naive Bayes Classification (Python) | April 15, 2021 | View
6 | K-Means Clustering (Python) | April 15, 2021 | View
7 | Hierarchical Clustering (Python) | April 15, 2021 | View
8 | Association Pattern Mining (Apriori & FPM) | April 28, 2021 | View
9 | Apriori Algorithm (WEKA) | April 28, 2021 | View
10 | Web Mining and Spatial Data Mining | April 28, 2021 | View

Program Details

Experiment 1: Star Schema for E-Commerce Data Warehouse (1 Program)

Program | Category | Description | Code
Star_Schema_ECommerce.sql | Schema Design | Implementation of Star Schema for E-Commerce Data Warehouse | View

Experiment 2: Dimension Tables and Fact Table Implementation (1 Program)

Program | Category | Description | Code
Dimension_Fact_Tables.sql | SQL Implementation | Creation of Dimension and Fact tables with relationships | View

Experiment 3: OLAP Operations (1 Program)

Program | Category | Description | Code
OLAP_Operations.sql | OLAP Queries | Implementation of Roll-up, Drill-down, Slice, and Dice operations | View

Experiment 4: Decision Tree Classification using J-48 (1 Resource)

Resource | Category | Description | Link
iris_decision_tree.arff | Dataset | Iris dataset formatted for WEKA Decision Tree analysis | View

Experiment 5: Naive Bayes Classification (1 Program)

Program | Category | Description | Code
naive_bayes_classification.py | Classification | Python implementation of Naive Bayes algorithm | View

Experiment 6: K-Means Clustering (1 Program)

Program | Category | Description | Code
kmeans_clustering.py | Clustering | Python implementation of K-Means clustering algorithm | View

Experiment 7: Hierarchical Clustering (1 Program)

Program | Category | Description | Code
hierarchical_clustering.py | Clustering | Python implementation of Hierarchical clustering algorithm | View

Experiment 8: Association Pattern Mining (2 Programs)

Program | Category | Description | Code
FPM.java | FP-Growth | Java implementation of Frequent Pattern Growth algorithm | View
association_rule_mining.py | Apriori | Python implementation of Association Rule Mining | View

Experiment 9: Apriori Algorithm (WEKA) (1 Resource)

Resource | Category | Description | Link
car.arff | Dataset | Car evaluation dataset formatted for WEKA Apriori analysis | View

Experiment 10: Web Mining and Spatial Data Mining (1 Resource)

Resource | Category | Description | Link
web_spatial_mining_study.txt | Study Material | Notes and study material for Web and Spatial Data Mining | View
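For readers who want a feel for what Experiments 1 to 3 cover before opening the SQL files, here is a small self-contained sketch using Python's built-in sqlite3 module. It builds a tiny star schema in memory and runs roll-up, slice, and dice queries. The table names (`dim_product`, `dim_date`, `fact_sales`), columns, and rows are assumptions made up for this example; they are not taken from Star_Schema_ECommerce.sql, Dimension_Fact_Tables.sql, or OLAP_Operations.sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema: one fact table referencing two dimension tables (hypothetical).
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, month TEXT, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES
    (1, 'Laptop', 'Electronics'), (2, 'Phone', 'Electronics'), (3, 'Desk', 'Furniture');
INSERT INTO dim_date VALUES (1, 'Jan', 2021), (2, 'Feb', 2021);
INSERT INTO fact_sales VALUES
    (1, 1, 1000), (2, 1, 500), (3, 1, 200), (1, 2, 1200), (3, 2, 300);
""")

# Roll-up: aggregate sales from the product level up to the category level.
rollup = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

# Slice: fix a single value on one dimension (month = 'Jan').
slice_jan = conn.execute("""
    SELECT p.name, f.amount
    FROM fact_sales f
    JOIN dim_product p ON f.product_id = p.product_id
    JOIN dim_date d ON f.date_id = d.date_id
    WHERE d.month = 'Jan'
    ORDER BY p.name
""").fetchall()

# Dice: restrict two dimensions at once (category and month).
dice = conn.execute("""
    SELECT p.name, f.amount
    FROM fact_sales f
    JOIN dim_product p ON f.product_id = p.product_id
    JOIN dim_date d ON f.date_id = d.date_id
    WHERE p.category = 'Electronics' AND d.month = 'Feb'
    ORDER BY p.name
""").fetchall()

print(rollup)
print(slice_jan)
print(dice)
```

Drill-down is the reverse of the roll-up query: grouping by `p.name` instead of `p.category` returns the finer-grained totals.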

Laboratory Documentation

# | Resource | Description
1 | Lab README | Detailed navigation guide with program descriptions

Mock Test

Technical mock test conducted for placement preparation:

# | Resource | Description
1 | Technical Mock Test | Campus Corners Mock Test for Terna Engineering College

Internal Assessment Test

Internal assessment evaluations conducted during the course:

IAT - 1

# | Resource | Description | Marks
1 | Question Paper | DWM Internal Assessment Test 1 Question Paper |
2 | Answer Sheet | DWM Internal Assessment Test 1 Answer Sheet |
3 | MCQ | DWM Internal Assessment Test 1 MCQ |

IAT - 2

# | Resource | Description | Marks
1 | Answer Sheet | DWM Internal Assessment Test 2 Answer Sheet |

Additional Resources:

# | Resource | Description
1 | Answer Sheet Template | IAT Answer Sheet Template

Semester Exam

Important

COVID-19 Impact: This coursework was completed during the COVID-19 pandemic. All examinations and assessments were conducted in a digital format.

Final semester examination submissions:

# | Resource | Description | Date
1 | MCQ | DWM Semester Exam MCQ Paper | June 07, 2021
2 | Question 2 | DWM Semester Exam Answer Sheet | June 07, 2021
3 | Question 3 | DWM Semester Exam Answer Sheet | June 07, 2021

Additional Resources:

# | Resource | Description
1 | Answer Sheet Template | Semester Exam Answer Sheet Template
2 | DWM Questions | DWM Exam Questions Document

Question Papers

University of Mumbai examination papers from 2012-2019:

# | Exam Session | Syllabus | Resource
1 | May 2019 | CBCGS | View
2 | December 2019 | CBCGS | View
3 | May 2018 | CBCGS | View
4 | December 2018 | CBCGS | View
5 | May 2017 | CBCGS | View
6 | December 2017 | CBCGS | View
7 | May 2016 | CBCGS | View
8 | December 2016 | CBCGS | View
9 | May 2015 | CBGS | View
10 | December 2015 | CBGS | View
11 | May 2014 | CBGS | View
12 | December 2014 | CBGS | View
13 | May 2013 | CBGS | View
14 | December 2013 | CBGS | View
15 | May 2012 | CBGS | View
16 | December 2012 | CBGS | View

Syllabus

Official CBCGS Syllabus
Complete Third Year Computer Engineering syllabus document from the University of Mumbai, including detailed course outcomes, assessment criteria, and module specifications for Data Warehousing and Mining and Data Warehousing and Mining Laboratory.

Important

Always verify the latest syllabus details with the official University of Mumbai website, as curriculum updates may occur after this repository's archival date.


Usage Guidelines

This repository is openly shared to support learning and knowledge exchange across the academic community.

For Students
Use these resources as reference materials for understanding data mining algorithms, concepts, and preparing for examinations. All content is organized for self-paced learning.

For Educators
These materials may serve as curriculum references, lab examples, or supplementary teaching resources. Attribution is appreciated when utilizing content.

For Researchers
The documentation and organization may provide insights into academic resource curation and educational content structuring.


License

This repository and all linked academic content are made available under the Creative Commons Attribution 4.0 International License (CC BY 4.0). See the LICENSE file for complete terms.

Note

Summary: You are free to share and adapt this content for any purpose, even commercially, as long as you provide appropriate attribution to the original author.


About This Repository

Created & Maintained by: Amey Thakur
Academic Journey: Bachelor of Engineering in Computer Engineering (2018-2022)
Institution: Terna Engineering College, Navi Mumbai
University: University of Mumbai

This repository is a comprehensive collection of study materials, reference books, assignments, and personal preparation notes curated during my academic journey. All content has been organized and documented to serve as a resource for students pursuing Data Warehousing and Mining and Data Warehousing and Mining Laboratory.

Connect: GitHub  ·  LinkedIn  ·  ORCID

Acknowledgments

Grateful acknowledgment to the faculty members of the Department of Computer Engineering at Terna Engineering College for their guidance and instruction in Data Warehousing and Mining. Their clear teaching and continued support helped develop a strong understanding of data mining techniques and warehouse concepts.

Special thanks to the mentors and peers whose encouragement, discussions, and support contributed meaningfully to this learning experience.



Computer Engineering (B.E.) - University of Mumbai

Semester-wise curriculum, laboratories, projects, and academic notes.