A comprehensive academic resource for Data Warehousing and Mining (DWM) and Data Warehousing and Mining Laboratory, covering data warehouse design, OLAP operations, data mining algorithms, classification, clustering, and association rule mining.
Overview · Contents · Reference Books · Personal Preparation · Assignments · Laboratory · Mock Test · Internal Assessment Test · Semester Exam · Question Papers · Syllabus · Usage Guidelines · License · About · Acknowledgments
Data Warehousing and Mining (CSC603) and Data Warehousing and Mining Laboratory (CSL603) are core subjects in the Third Year (Semester VI) of the Computer Engineering curriculum at the University of Mumbai. These courses provide foundational knowledge of data warehouse architecture, ETL processes, OLAP operations, and various data mining techniques including classification, clustering, and association rule mining.
The curriculum encompasses several key domains in Data Warehousing and Mining:
- Introduction to Data Warehousing: Data warehouse architecture, schemas (Star, Snowflake, Fact Constellation), metadata.
- OLAP Technology: OLAP operations (roll-up, drill-down, slice, dice, pivot), OLAP server architectures.
- Data Preprocessing: Data cleaning, integration, transformation, and reduction techniques.
- Classification & Prediction: Decision trees, Bayesian classification, rule-based classification, neural networks.
- Clustering Analysis: K-Means, hierarchical clustering, density-based clustering (DBSCAN).
- Association Rule Mining: Apriori algorithm, FP-Growth, mining multi-level and multi-dimensional associations.
This repository is a curated collection of study materials, reference books, lab experiments, and personal preparation notes compiled during my academic journey. The primary motivation for creating and maintaining this archive is to preserve knowledge for continuous learning and future reference.
As a computer engineer, understanding data warehousing and mining techniques is essential for extracting valuable insights from large datasets. This repository serves as my reference point: a resource I can return to for relearning concepts, reviewing methodologies, and strengthening understanding when needed.
Why this repository exists:
- Knowledge Preservation: To maintain organized access to comprehensive study materials beyond the classroom.
- Continuous Learning: To support lifelong learning by enabling easy revisitation of fundamental DWM concepts.
- Academic Documentation: To authentically document my learning journey through Data Warehousing and Mining and Data Warehousing and Mining Laboratory.
- Community Contribution: To share these resources with students and learners who may benefit from them.
Note
All materials in this repository were created, compiled, and organized by me throughout my undergraduate program (2018-2022) as part of my coursework, laboratory assignments, and project implementations.
This collection includes comprehensive reference materials covering all major topics:
| # | Resource | Focus Area |
|---|---|---|
| 1 | DWM Techmax | Complete syllabus coverage |
| 2 | Data Mining: Concepts and Techniques | Standard Textbook (Han & Kamber) |
| 3 | DWM Toppers Solutions | Solved questions and exam-oriented summaries |
| 4 | DWM - Toppers Solutions (2019) | Additional solved questions (2019) |
| 5 | DWM Toppers Solutions (Alt) | Comprehensive solved questions |
| 6 | DWM Question Bank | Practice questions for exam preparation |
| 7 | DWM - IMCQ | Important MCQs for entrance tests and exams |
| 8 | DWM VIVA Questions | Viva preparation material |
| 9 | DWM Notes | Concise notes |
| 10 | BH Plan | Study planning and strategy |
Study materials and planning resources for effective exam preparation:
| # | Resource | Description |
|---|---|---|
| 1 | Blueprint | DWM exam blueprint and marking scheme |
| 2 | Semester 6 Timetable | Academic schedule for Semester VI |
| 3 | Computer Semester 6 Timetable | Detailed computer engineering timetable |
Course assignments demonstrating understanding of DWM concepts:
| # | Assignment | Description | Date |
|---|---|---|---|
| 1 | Assignment 1 | Data Warehouse Architecture, Metadata, ETL, OLAP, KDD Process | May 04, 2021 |
| 2 | Assignment 2 | Classification vs Clustering, Regression, Apriori, Web Mining | May 04, 2021 |
Topics Covered: Data Warehouse Architecture · Metadata Types · ETL Process · OLAP Operations · KDD Process · Classification, Prediction & Clustering · Linear Regression · Apriori Algorithm · Association Rule Mining · Spatial & Web Mining
The laboratory component (CSL603) focuses on hands-on implementation of data mining algorithms and data warehousing concepts using tools like WEKA and SQL.
Tip
Strategic Insight: The success of a data mining project depends heavily on the quality of data preprocessing ("Garbage In, Garbage Out"), and a well-designed data warehouse schema (Star/Snowflake) is the backbone of efficient OLAP. When implementing algorithms in WEKA or Python, prioritize data cleaning, normalization, and correct format conversion (ARFF/CSV) so that results are reliable and reproducible.
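The preprocessing steps named in the tip can be sketched in a few lines of standard-library Python. This is only an illustration, not code from the lab reports: the function names `min_max_normalize` and `csv_to_arff`, the `people` relation, and the sample columns are all hypothetical. It shows min-max normalization, marking missing values, and writing a CSV table in WEKA's ARFF layout.

```python
import csv
import io


def min_max_normalize(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def csv_to_arff(csv_text, relation, nominal=()):
    """Convert CSV text with a header row into ARFF text.

    Columns named in `nominal` become nominal attributes; all others are
    declared NUMERIC. Empty cells are written as '?', the ARFF marker for
    a missing value.
    """
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    header = rows[0]
    data = [[cell.strip() or "?" for cell in row] for row in rows[1:]]

    lines = [f"@RELATION {relation}", ""]
    for i, name in enumerate(header):
        if name in nominal:
            # Nominal attributes list their allowed labels in braces.
            labels = sorted({row[i] for row in data} - {"?"})
            lines.append(f"@ATTRIBUTE {name} {{{','.join(labels)}}}")
        else:
            lines.append(f"@ATTRIBUTE {name} NUMERIC")
    lines += ["", "@DATA"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines)


# Hypothetical sample data with one missing height value.
sample_csv = """height,weight,label
150,50,short
,65,tall
180,80,tall
"""
print(min_max_normalize([150.0, 165.0, 180.0]))
print(csv_to_arff(sample_csv, "people", nominal=("label",)))
```

The sketch treats every non-nominal column as numeric and does not quote values containing spaces or commas, so a real dataset may need extra handling before WEKA accepts the file.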
| # | Experiment | Date | Report |
|---|---|---|---|
| 1 | Star Schema for E-Commerce Data Warehouse | February 02, 2021 | View |
| 2 | Dimension Tables and Fact Table Implementation | February 02, 2021 | View |
| 3 | OLAP Operations (Roll-up, Drill-down, Slice, Dice) | March 23, 2021 | View |
| 4 | Decision Tree Classification using J48 (WEKA) | April 10, 2021 | View |
| 5 | Naive Bayes Classification (Python) | April 15, 2021 | View |
| 6 | K-Means Clustering (Python) | April 15, 2021 | View |
| 7 | Hierarchical Clustering (Python) | April 15, 2021 | View |
| 8 | Association Pattern Mining (Apriori & FPM) | April 28, 2021 | View |
| 9 | Apriori Algorithm (WEKA) | April 28, 2021 | View |
| 10 | Web Mining and Spatial Data Mining | April 28, 2021 | View |
Experiment 1: Star Schema for E-Commerce Data Warehouse (1 Program)
| Program | Category | Description | Code |
|---|---|---|---|
| Star_Schema_ECommerce.sql | Schema Design | Implementation of Star Schema for E-Commerce Data Warehouse | View |
Experiment 2: Dimension Tables and Fact Table Implementation (1 Program)
| Program | Category | Description | Code |
|---|---|---|---|
| Dimension_Fact_Tables.sql | SQL Implementation | Creation of Dimension and Fact tables with relationships | View |
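To make the star schema idea from Experiments 1 and 2 concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the content of the repository's SQL files: the table names (`dim_product`, `dim_customer`, `dim_date`, `fact_sales`), their columns, and the sample rows are assumptions chosen for illustration. One central fact table references each dimension table by key, and a typical query joins the fact table to a dimension and aggregates a measure.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_customer (
            customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
        -- The fact table holds measures plus one foreign key per dimension.
        CREATE TABLE fact_sales (
            sale_id     INTEGER PRIMARY KEY,
            product_id  INTEGER REFERENCES dim_product(product_id),
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            date_id     INTEGER REFERENCES dim_date(date_id),
            quantity    INTEGER,
            amount      REAL);
        """
    )
    # Dimensions are loaded before the fact rows that reference them.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Laptop", "Electronics"), (2, "Novel", "Books")])
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                     [(1, "Asha", "Mumbai"), (2, "Ravi", "Pune")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)",
                     [(1, 2, 2, 2021), (2, 23, 3, 2021)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, 1, 50000.0),
                      (2, 2, 1, 2, 2, 600.0),
                      (3, 2, 2, 2, 1, 300.0)])
    return conn


def sales_by_category(conn):
    """Join the fact table to the product dimension and total the amounts."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


print(sales_by_category(build_star_schema()))
```

A snowflake variant would further normalize a dimension, for example by moving `category` out of `dim_product` into its own table.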
Experiment 3: OLAP Operations (1 Program)
| Program | Category | Description | Code |
|---|---|---|---|
| OLAP_Operations.sql | OLAP Queries | Implementation of Roll-up, Drill-down, Slice, and Dice operations | View |
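The four OLAP operations listed for Experiment 3 can each be expressed as a plain SQL aggregate. The sketch below runs them through Python's `sqlite3` module on a small, made-up `sales` table. The table, its columns, and the figures are assumptions for illustration and are not taken from `OLAP_Operations.sql`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (year INTEGER, quarter TEXT, city TEXT, "
    "category TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (2020, "Q4", "Mumbai", "Books", 50),
        (2021, "Q1", "Mumbai", "Books", 100),
        (2021, "Q1", "Pune", "Books", 150),
        (2021, "Q2", "Mumbai", "Electronics", 400),
        (2021, "Q2", "Pune", "Electronics", 250),
    ],
)


def query(sql):
    """Run a read-only query and return all rows."""
    return conn.execute(sql).fetchall()


# Roll-up: climb the time hierarchy from quarter to year.
roll_up = query(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
)

# Drill-down: descend from year back to quarter for more detail.
drill_down = query(
    "SELECT year, quarter, SUM(amount) FROM sales "
    "GROUP BY year, quarter ORDER BY year, quarter"
)

# Slice: fix one dimension (city) to a single value.
slice_mumbai = query(
    "SELECT category, SUM(amount) FROM sales WHERE city = 'Mumbai' "
    "GROUP BY category ORDER BY category"
)

# Dice: restrict several dimensions at once to select a sub-cube.
dice = query(
    "SELECT city, SUM(amount) FROM sales "
    "WHERE year = 2021 AND category = 'Books' AND city IN ('Mumbai', 'Pune') "
    "GROUP BY city ORDER BY city"
)

for name, rows in [("roll-up", roll_up), ("drill-down", drill_down),
                   ("slice", slice_mumbai), ("dice", dice)]:
    print(name, rows)
```

A single denormalized table keeps the example short; against a star schema the same queries would join the fact table to the relevant dimension tables first.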
Experiment 4: Decision Tree Classification using J48 (1 Resource)
| Resource | Category | Description | Link |
|---|---|---|---|
| iris_decision_tree.arff | Dataset | Iris dataset formatted for WEKA Decision Tree analysis | View |
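J48 is WEKA's implementation of the C4.5 decision tree learner, which chooses splits using the gain ratio, an entropy-based measure. The sketch below computes entropy, information gain, and gain ratio for nominal attributes in plain Python. It illustrates the split criterion only and is not a reimplementation of J48: it ignores numeric attributes such as those in the Iris data, pruning, and missing values, and the function names and toy weather rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr_index, labels):
    """Reduction in label entropy after splitting on one nominal attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(rows, attr_index, labels):
    """Information gain normalized by the entropy of the split itself."""
    split_info = entropy([row[attr_index] for row in rows])
    if split_info == 0:  # attribute has a single value: no useful split
        return 0.0
    return information_gain(rows, attr_index, labels) / split_info


# Toy data: (outlook, temperature) with a play / don't-play label.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rain", "hot"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
for index, name in enumerate(["outlook", "temperature"]):
    print(name, information_gain(rows, index, labels), gain_ratio(rows, index, labels))
```

In this toy data `outlook` separates the classes perfectly while `temperature` carries no information, so a tree learner would split on `outlook` first.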
Experiment 5: Naive Bayes Classification (1 Program)
| Program | Category | Description | Code |
|---|---|---|---|
| naive_bayes_classification.py | Classification | Python implementation of Naive Bayes algorithm | View |
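As a rough illustration of the technique named in Experiment 5, the following is a small categorical Naive Bayes classifier with Laplace (add-one) smoothing, written with the standard library only. It is a sketch under stated assumptions and not the repository's `naive_bayes_classification.py`: the function names `train_naive_bayes` and `predict` and the weather data are invented for the example.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the class with the highest log posterior for one row."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = log(count / total)  # log prior
        for i, value in enumerate(row):
            # Laplace smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(vocab[i])
            score += log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data: (outlook, temperature) -> play?
train_rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
              ("rainy", "cool"), ("overcast", "hot")]
train_labels = ["no", "no", "yes", "yes", "yes"]
model = train_naive_bayes(train_rows, train_labels)
print(predict(model, ("sunny", "hot")))
print(predict(model, ("rainy", "cool")))
```

Working in log space avoids underflow when many features are multiplied together. Numeric features would need a different likelihood model, such as a Gaussian per class.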
Experiment 6: K-Means Clustering (1 Program)
| Program | Category | Description | Code |
|---|---|---|---|
| kmeans_clustering.py | Clustering | Python implementation of K-Means clustering algorithm | View |
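The K-Means procedure from Experiment 6 alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The sketch below shows that loop in plain Python. It is not the repository's `kmeans_clustering.py`. For reproducibility it seeds the centroids with the first `k` points, which is an assumption of this example; practical implementations usually use random or k-means++ initialization.

```python
def kmeans(points, k, iterations=100):
    """Cluster points with Lloyd's algorithm; returns (centroids, assignment)."""
    # Assumption for this sketch: deterministic seeding with the first k points.
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:  # converged: nothing moved
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignment


# Two visually separate groups of 2-D points (made-up data).
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
centroids, assignment = kmeans(points, k=2)
print(assignment)
print(centroids)
```

Because the result depends on the starting centroids, it is common to run K-Means several times with different seeds and keep the clustering with the lowest within-cluster sum of squares.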
Experiment 7: Hierarchical Clustering (1 Program)
| Program | Category | Description | Code |
|---|---|---|---|
| hierarchical_clustering.py | Clustering | Python implementation of Hierarchical clustering algorithm | View |
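Agglomerative hierarchical clustering, the bottom-up form of the technique in Experiment 7, starts with every point in its own cluster and repeatedly merges the two closest clusters. The sketch below uses single linkage (the distance between two clusters is the distance between their closest members), which is an assumption of this example; the source does not say which linkage the repository's script uses. The function name and sample points are hypothetical.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def single_linkage(points, num_clusters):
    """Merge closest clusters until num_clusters remain.

    Returns (clusters, merges): clusters are lists of point indices and
    merges records each merge as (cluster_a, cluster_b, distance).
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters], merges


# Made-up 2-D points: two tight pairs and one outlier.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters, merges = single_linkage(points, num_clusters=3)
print(clusters)
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

The `merges` list is the information a dendrogram plots. This brute-force version recomputes all pairwise distances on every pass, so it is only suitable for small inputs.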
Experiment 8: Association Pattern Mining (2 Programs)
| Program | Category | Description | Code |
|---|---|---|---|
| FPM.java | FP-Growth | Java implementation of Frequent Pattern Growth algorithm | View |
| association_rule_mining.py | Apriori | Python implementation of Association Rule Mining | View |
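The Apriori idea behind Experiment 8 is that every subset of a frequent itemset must itself be frequent, so candidates of size k are built only from frequent itemsets of size k-1. The following sketch, in plain Python, finds frequent itemsets by support count and computes rule confidence. It is an illustration under assumptions, not the repository's `association_rule_mining.py`: the function names and the market-basket transactions are made up, and `min_support` is an absolute count rather than a fraction.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]  # size-1 candidates
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(A and C) / support(A)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(transactions, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
print(confidence(frequent, {"beer"}, {"diapers"}))
```

FP-Growth, the other algorithm listed for this experiment, finds the same frequent itemsets without generating candidates by compressing the transactions into a prefix tree.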
Experiment 9: Apriori Algorithm (WEKA) (1 Resource)
| Resource | Category | Description | Link |
|---|---|---|---|
| car.arff | Dataset | Car evaluation dataset formatted for WEKA Apriori analysis | View |
Experiment 10: Web Mining and Spatial Data Mining (1 Resource)
| Resource | Category | Description | Link |
|---|---|---|---|
| web_spatial_mining_study.txt | Study Material | Notes and study material for Web and Spatial Data Mining | View |
| # | Resource | Description |
|---|---|---|
| 1 | Lab README | Detailed navigation guide with program descriptions |
Technical mock test conducted for placement preparation:
| # | Resource | Description |
|---|---|---|
| 1 | Technical Mock Test | Campus Corners Mock Test for Terna Engineering College |
Internal assessment evaluations conducted during the course:
| # | Resource | Description | Marks |
|---|---|---|---|
| 1 | Question Paper | DWM Internal Assessment Test 1 Question Paper | — |
| 2 | Answer Sheet | DWM Internal Assessment Test 1 Answer Sheet | — |
| 3 | MCQ | DWM Internal Assessment Test 1 MCQ | — |
| # | Resource | Description | Marks |
|---|---|---|---|
| 1 | Answer Sheet | DWM Internal Assessment Test 2 Answer Sheet | — |
Additional Resources:
| # | Resource | Description |
|---|---|---|
| 1 | Answer Sheet Template | IAT Answer Sheet Template |
Important
COVID-19 Impact: This coursework was completed during the COVID-19 pandemic. All examinations and assessments were conducted in a digital format.
Final semester examination submission:
| # | Resource | Description | Date |
|---|---|---|---|
| 1 | MCQ | DWM Semester Exam MCQ Paper | June 07, 2021 |
| 2 | Question 2 | DWM Semester Exam Answer Sheet | June 07, 2021 |
| 3 | Question 3 | DWM Semester Exam Answer Sheet | June 07, 2021 |
Additional Resources:
| # | Resource | Description |
|---|---|---|
| 1 | Answer Sheet Template | Semester Exam Answer Sheet Template |
| 2 | DWM Questions | DWM Exam Questions Document |
University of Mumbai examination papers from 2012 to 2019:
| # | Exam Session | Syllabus | Resource |
|---|---|---|---|
| 1 | May 2019 | CBCGS | View |
| 2 | December 2019 | CBCGS | View |
| 3 | May 2018 | CBCGS | View |
| 4 | December 2018 | CBCGS | View |
| 5 | May 2017 | CBCGS | View |
| 6 | December 2017 | CBCGS | View |
| 7 | May 2016 | CBCGS | View |
| 8 | December 2016 | CBCGS | View |
| 9 | May 2015 | CBGS | View |
| 10 | December 2015 | CBGS | View |
| 11 | May 2014 | CBGS | View |
| 12 | December 2014 | CBGS | View |
| 13 | May 2013 | CBGS | View |
| 14 | December 2013 | CBGS | View |
| 15 | May 2012 | CBGS | View |
| 16 | December 2012 | CBGS | View |
Official CBCGS Syllabus
Complete Third Year Computer Engineering syllabus document from the University of Mumbai, including detailed course outcomes, assessment criteria, and module specifications for Data Warehousing and Mining and Data Warehousing and Mining Laboratory.
Important
Always verify the latest syllabus details with the official University of Mumbai website, as curriculum updates may occur after this repository's archival date.
This repository is openly shared to support learning and knowledge exchange across the academic community.
For Students
Use these resources as reference materials for understanding data mining algorithms and concepts and for preparing for examinations. All content is organized for self-paced learning.
For Educators
These materials may serve as curriculum references, lab examples, or supplementary teaching resources. Attribution is appreciated when utilizing content.
For Researchers
The documentation and organization may provide insights into academic resource curation and educational content structuring.
This repository and all linked academic content are made available under the Creative Commons Attribution 4.0 International License (CC BY 4.0). See the LICENSE file for complete terms.
Note
Summary: You are free to share and adapt this content for any purpose, even commercially, as long as you provide appropriate attribution to the original author.
Created & Maintained by: Amey Thakur
Academic Journey: Bachelor of Engineering in Computer Engineering (2018-2022)
Institution: Terna Engineering College, Navi Mumbai
University: University of Mumbai
This repository represents a comprehensive collection of study materials, reference books, assignments, and personal preparation notes curated during my academic journey. All content has been carefully organized and documented to serve as a valuable resource for students pursuing Data Warehousing and Mining & Data Warehousing and Mining Laboratory.
Connect: GitHub · LinkedIn · ORCID
Grateful acknowledgment to the faculty members of the Department of Computer Engineering at Terna Engineering College for their guidance and instruction in Data Warehousing and Mining. Their clear teaching and continued support helped develop a strong understanding of data mining techniques and warehouse concepts.
Special thanks to the mentors and peers whose encouragement, discussions, and support contributed meaningfully to this learning experience.