Data Warehouse & Data Mining Syllabus

Prerequisite:

  • Familiarity with database concepts and basic understanding of statistics and machine learning techniques would be beneficial for comprehending data mining principles and methodologies.

Course Objective:

  • This course provides an introductory-level treatment of data mining. It covers how to identify significant data and extract important patterns from it.

Unit-01 Introduction to Data Mining (17%)

  • What is data mining?
  • What kind of data is mined?
  • Database data
  • Data Warehouses
  • Transactional data
  • Other kinds of data
  • What kind of patterns can be mined?
  • Introduction to the Knowledge Discovery in Databases (KDD) Process
  • Steps in the KDD Process
  • Real-life Applications of KDD
  • Examples from business intelligence, healthcare, and marketing

Unit-02 Data Pre-processing (17%)

  • Data Pre-processing: An Overview
  • Data Quality: Why Pre-process the Data?
  • Major Tasks in Data Pre-processing
  • Data Cleaning, Missing Values, Noisy Data
  • Data Cleaning as a Process
  • Data Integration
  • Entity identification problem
  • Redundancy and correlation analysis

Unit-03 Data Warehouse (24%)

  • What is a Data Warehouse?
  • Difference between Operational Database Systems and Data Warehouse
  • Why have a separate Data Warehouse?
  • Data Warehousing: A multitier architecture
  • Data Warehouse Models: Enterprise Warehouse, Data Mart, and Virtual Warehouse
  • Extraction, Transformation, and Loading
  • Data Warehouse Modelling: Data cube and OLAP
  • Data Cube: A Multidimensional Data Model
  • Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Data Models
  • Typical OLAP Operations
  • Data Warehouse Design and Usage
  • Information Processing: From OLAP to Multidimensional Data Mining

Unit-04 Mining Frequent Patterns, Association: Basic Concepts and Methods (17%)

  • Market Basket Analysis: A Motivating Example
  • Frequent Itemsets, Closed Itemsets, and Association Rules
  • Frequent Itemset Mining Methods
  • Apriori Algorithm: Finding Frequent Itemsets by Confined Candidate Generation
  • Generating Association Rules from Frequent Itemsets

Unit-05 Classification and Cluster Analysis (25%)

  • Basic Concepts of Classification
  • What is Classification?
  • General approach to Classification
  • Decision Tree Induction
  • Bayes Classification Methods
  • Basic Concepts of Clustering
  • What is Cluster Analysis?
  • Requirements of Cluster Analysis
  • Overview of Basic Clustering Methods
  • Partitioning Methods
  • K-Means: A Centroid-Based Technique

Made By SOU Student for SOU Students