B.Tech. (V Sem.)
23AD04 - Data Warehousing and Data Mining 3(L) 0(T) 0(P) 3(C)
Pre-requisites: Data Structures, Algorithms, Probability & Statistics, Database Management Systems
Course Educational Objectives: The main objectives of the course are to
- Introduce basic concepts and techniques of data warehousing and data mining.
- Examine the types of data to be mined and apply pre-processing methods on raw data.
- Discover interesting patterns, analyze supervised and unsupervised models, and estimate the accuracy of the algorithms.
Course Outcomes:
CO2: Understand data preprocessing techniques required to convert raw data into a suitable format for effective machine learning applications. (Understand L2)
CO3: Apply classification techniques using different algorithms to solve real-world problems and evaluate their performance. (Apply L3)
CO4: Apply Apriori and FP-Growth algorithms to analyze frequent patterns and uncover insights from large datasets. (Apply L3)
CO5: Understand clustering concepts and various cluster analysis methods to group similar data points effectively. (Understand L2)
Syllabus:
UNIT–I: Data Warehousing and Online Analytical Processing: Basic Concepts, Data Warehouse Modeling: Data Cube and OLAP, Data Warehouse Design and Usage, Data Warehouse Implementation, Cloud Data Warehouse, Data Mining and Pattern Mining, Technologies, Applications, Major Issues, Data Objects & Attribute Types, Basic Statistical Descriptions of Data, Data Visualization, Measuring Data Similarity and Dissimilarity. (Textbook 1)
Link: https://drive.google.com/file/d/1ba0qGm58_n5GI2PxaZMmYtcdVO6iMTwg/view?usp=sharing
UNIT–II: Data Preprocessing: An Overview, Data Cleaning, Data Integration, Data Reduction, Data Transformation and Data Discretization. (Textbook 1)
Part 1: https://drive.google.com/file/d/1re_eV2oN-W9SWsOapCN2eprvFagjT432/view?usp=sharing
Part 2: https://drive.google.com/file/d/1M934fR57wkRiSLNnkfZP-0kdVti_P9Iz/view?usp=sharing
UNIT–III: Classification: Basic Concepts, General Approach to Solving a Classification Problem, Decision Tree Induction: Attribute Selection Measures, Tree Pruning, Scalability and Decision Tree Induction, Visual Mining for Decision Tree Induction, Bayesian Classification Methods: Bayes' Theorem, Naïve Bayes Classification, Rule-Based Classification, Model Evaluation and Selection. (Textbook 2)
Link: https://drive.google.com/file/d/1mlkDd8AGAO88-JSAJyUmx4_IoSYkCTcz/view?usp=sharing
UNIT–IV: Association Analysis: Problem Definition, Frequent Itemset Generation, Rule Generation: Confidence-Based Pruning, Rule Generation in Apriori Algorithm, Compact Representation of Frequent Itemsets, FP-Growth Algorithm. (Textbook 2)
Link:
UNIT–V: Cluster Analysis: Overview, Basics and Importance of Cluster Analysis, Clustering Techniques, Different Types of Clusters; K-means: The Basic K-means Algorithm, K-means Additional Issues, Bisecting K-means; Agglomerative Hierarchical Clustering: Basic Agglomerative Hierarchical Clustering Algorithm; DBSCAN: Traditional Density: Center-Based Approach, DBSCAN Algorithm, Strengths and Weaknesses. (Textbook 2)
Link:
Textbooks:
2. Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson, 2012.
Reference Books:
2. Data Mining Techniques, Arun K Pujari, 3rd edition, Universities Press, 2013.
3. (NPTEL course by Prof. Pabitra Mitra) http://onlinecourses.nptel.ac.in/noc17_mg24/preview
4. http://www.saedsayad.com/data_mining_map.htm