Data Mining and Management Strategies

Online Certification

Michigan State University


Course Overview

Instructor: Dr. Arend Hintze

Course Objectives 

After successfully completing this course, students will be able to:

  • Determine the difference between data and information

  • Explain the role of asking the right questions to extract data from a database

  • Evaluate alternative answers to questions to develop better strategies using data

  • Recognize the practical applications of the four Vs of big data

  • Recognize how the web works

  • Visualize the idea behind clustering

  • Compare and contrast classification and decision trees

Students will meet the course objectives through the following actions:

  • Completing learning content pages, which include:

    • Watching videos

    • Reading text and studying charts and tables

    • Listening to audio

  • Completing interactive learning activities

  • Posting to the discussion board

  • Completing eight exams

Course Requirements


  • Internet connection (DSL, LAN, or cable connection desirable)

  • Access to Canvas

  • Ability to read content, watch videos, listen to audio, and complete assignments

  • Microsoft Excel

Course Structure

The content will be delivered in Canvas with a variety of media components. Students will need to be able to play video and download files if indicated.

Course Outline/Schedule

Module 1: Enterprise Database and Data Models

  • Enterprise data and data management systems

  • The role of ETL to create and cleanse multidimensional data

  • The elements of a data model and its role in organizational database design

  • The data structure that supports a multidimensional data store

  • The challenges for selecting the data to be included in multidimensional databases

Module 2: Extracting Data from a Database

  • The four Vs of data

  • Different types of data sources generated by the "Internet of Things"

  • The magnitude of change that analytics can create within an industry and a firm

Module 3: The Data Analysis Landscape and an Overview of Machine Learning

  • The difference between what data analytics is and is not

  • The scientific method

  • What a good question is and what it is not

  • The basics of probabilities

  • Where a probability distribution comes from and what it does

  • The outcome of a choice involving probabilities and utilities

  • The landscape of machine learning and artificial intelligence

Module 4: Getting Data from Social Networks and Geolocalization

  • Web content

  • The structure of a web page

  • What a web crawler is and what uses it has

  • API, or application programming interface

Module 5: Clustering and Understanding the Relation of Things

  • The concept of finding structure within the data

  • The possible outcomes of clustering

  • The number and size of clusters

  • Which data points are similar to each other and which are not

  • The landscape of machine learning and artificial intelligence

Module 6: Black Box vs. White Box Approach and the Implications

  • The differences between a black box and a white box

  • What circumstances create a black or a white box

  • The advantages and disadvantages of either class

  • Legal implications

  • Under which circumstances the user should prefer either class

Module 7: Classification Black Box Methods

  • The components of deep learning

  • When an LSTM should be applied

  • The random forest method

  • The components of an artificial neural network

Module 8: Large-Scale Implementation of Hadoop and MapReduce

  • Hadoop

  • Parallel tasks and their necessity within the Hadoop framework

  • The tools contained within the Hadoop Zoo

  • The components of the MapReduce paradigm

  • The MapReduce paradigm and the word count example

  • The differences between the classic MapReduce implementation and the streaming implementation
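
As a preview of the word count example named in Module 8, the MapReduce paradigm can be sketched in a few lines of plain Python. This is a minimal in-memory illustration, not Hadoop code and not necessarily the example used in the course: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit one (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for a single word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, chosen only to illustrate the flow.
counts = word_count(["the quick brown fox", "the lazy dog", "The end"])
print(counts)
```

In a real Hadoop cluster the map and reduce steps would run in parallel on many machines and the framework would handle the shuffle; here all three steps run in one process so the data flow is easy to follow.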