Data Warehouse & Data Mining Syllabus

    Prerequisite:

    • Familiarity with database concepts and a basic understanding of statistics and machine learning techniques will help in understanding data mining principles and methodologies.

    Course Objective:

    • This course introduces data warehousing and data mining. It covers how to identify significant data and extract important patterns from it.

    Unit-01 Introduction to Data Mining (17%)

    • What is Data Mining?
    • What kind of data is mined?
    • Database data
    • Data Warehouses
    • Transactional data
    • Other kinds of data
    • What kind of patterns can be mined?
    • Introduction to the Knowledge Discovery in Databases (KDD) Process
    • Steps in the KDD Process
    • Real-life Applications of KDD
    • Examples from business intelligence, healthcare, marketing

    Unit-02 Data Pre-processing (17%)

    • Data Pre-processing: An Overview
    • Data Quality: Why Pre-process the Data?
    • Major Tasks in Data Pre-processing
    • Data Cleaning: Missing Values and Noisy Data
    • Data Cleaning as a Process
    • Data Integration
    • Entity identification problem
    • Redundancy and correlation analysis

    Unit-03 Data Warehouse (24%)

    • What is a Data Warehouse?
    • Difference between Operational Database Systems and Data Warehouse
    • Why have a separate Data Warehouse?
    • Data Warehousing: A multitier architecture
    • Data Warehouse Models: Enterprise Warehouse, Data Mart, and Virtual Warehouse
    • Extraction, Transformation, and Loading
    • Data Warehouse Modelling: Data cube and OLAP
    • Data Cube: A Multidimensional Data Model
    • Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Data Models
    • Typical OLAP Operations
    • Data Warehouse Design and Usage
    • From Information Processing and OLAP to Multidimensional Data Mining

    Unit-04 Mining Frequent Patterns, Association: Basic Concepts and Methods (17%)

    • Market Basket Analysis: A Motivating Example
    • Frequent Itemsets, Closed Itemsets, and Association Rules
    • Frequent Itemset Mining Methods
    • Apriori Algorithm: Finding Frequent Itemsets by Confined Candidate Generation
    • Generating Association Rules from Frequent Itemsets

    Unit-05 Classification and Cluster Analysis (25%)

    • Basic Concepts of Classification
    • What is Classification?
    • General approach to Classification
    • Decision Tree Induction
    • Bayes Classification Methods
    • Basic Concepts of Clustering
    • What is Cluster Analysis?
    • Requirements of Cluster Analysis
    • Overview of Basic Clustering Methods
    • Partitioning Methods
    • k-Means: A Centroid-Based Technique

    Made By SOU Student for SOU Students