Data Warehousing and Data Mining

  B.Tech. (V Sem.)

 23AD04 - Data Warehousing and Data Mining   3(L) 0(T) 0(P) 3(C)

 Pre-requisites: Data Structures, Algorithms, Probability & Statistics, Database Management Systems

 Course Educational Objectives: The main objectives of the course are to:

  • Introduce basic concepts and techniques of data warehousing and data mining 
  • Examine the types of data to be mined and apply pre-processing methods on raw data 
  • Discover interesting patterns, analyze supervised and unsupervised models and estimate the accuracy of the algorithms. 

 Course Outcomes 

 CO1: Design data warehouses to support effective data modeling, integration, and analytical processing. (Understand L2) 
 CO2: Understand data preprocessing techniques required to convert raw data into a suitable format for effective machine learning applications. (Understand L2) 
 CO3: Apply classification techniques using different algorithms to solve real-world problems and evaluate their performance. (Apply L3) 
 CO4: Apply Apriori and FP-Growth algorithms to analyze frequent patterns and uncover insights from large datasets. (Apply L3) 
 CO5: Understand clustering concepts and various cluster analysis methods to group similar data points effectively. (Understand L2) 

 Syllabus: 

UNIT–I: Data Warehousing and Online Analytical Processing: Basic concepts, Data Warehouse Modeling: Data Cube and OLAP, Data Warehouse Design and Usage, Data Warehouse Implementation, Cloud Data Warehouse, Data Mining and Pattern Mining, Technologies, Applications, Major issues, Data Objects & Attribute Types, Basic Statistical Descriptions of Data, Data Visualization, Measuring Data Similarity and Dissimilarity. (Textbook 1) 

Link: https://drive.google.com/file/d/1ba0qGm58_n5GI2PxaZMmYtcdVO6iMTwg/view?usp=sharing

 UNIT–II: Data Preprocessing: An Overview, Data Cleaning, Data Integration, Data Reduction, Data Transformation and Data Discretization. (Textbook 1) 

 UNIT–III: Classification: Basic Concepts, General Approach to Solving a Classification Problem, Decision Tree Induction: Attribute Selection Measures, Tree Pruning, Scalability and Decision Tree Induction, Visual Mining for Decision Tree Induction, Bayesian Classification Methods: Bayes Theorem, Naïve Bayes Classification, Rule-Based Classification, Model Evaluation and Selection. (Textbook 2) 

Link: https://drive.google.com/file/d/1mlkDd8AGAO88-JSAJyUmx4_IoSYkCTcz/view?usp=sharing

 UNIT–IV: Association Analysis: Problem Definition, Frequent Itemset Generation, Rule Generation: Confidence-Based Pruning, Rule Generation in Apriori Algorithm, Compact Representation of Frequent Itemsets, FP-Growth Algorithm. (Textbook 2)

 UNIT–V: Cluster Analysis: Overview, Basics and Importance of Cluster Analysis, Clustering Techniques, Different Types of Clusters; K-means: The Basic K-means Algorithm, K-means Additional Issues, Bisecting K-means; Agglomerative Hierarchical Clustering: Basic Agglomerative Hierarchical Clustering Algorithm; DBSCAN: Traditional Density: Center-Based Approach, DBSCAN Algorithm, Strengths and Weaknesses. (Textbook 2)

 Textbooks: 

 1. Data Mining: Concepts and Techniques, 3rd edition, Jiawei Han, Micheline Kamber, Elsevier, 2011. 
 2. Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson, 2012. 

 Reference Books: 

 1. Data Mining, Vikram Pudi and P. Radha Krishna, Oxford Publisher. 
 2. Data Mining Techniques, Arun K Pujari, 3rd edition, Universities Press, 2013. 
 3. (NPTEL course by Prof. Pabitra Mitra) http://onlinecourses.nptel.ac.in/noc17_mg24/preview 
 4. http://www.saedsayad.com/data_mining_map.htm
