About the course
The Data Science Masters Program makes you proficient in the tools and systems used by data science professionals. It includes training in Statistics, Data Science, Python, Apache Spark & Scala, TensorFlow and Tableau. The curriculum was determined through extensive research on 5000+ job descriptions across the globe.
Python Statistics for Data Science Course
R Statistics for Data Science Course
Data Science Certification Training
Python Certification Training for Data Science
Apache Spark and Scala Certification Training
Deep Learning with TensorFlow 2.0 Certification Training
Tableau Training & Certification
Data Science Master Program Capstone Project
1. Python Statistics for Data Science Course
Module 1: Understanding the Data
Topics:
Introduction to Data Types
Numerical parameters to represent data
Mean
Mode
Median
Sensitivity
Information Gain
Entropy
Statistical parameters to represent data
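As a rough illustration of the numerical parameters listed in this module, the sketch below computes the mean, median, mode and Shannon entropy with the Python standard library. The sample data and the helper name `shannon_entropy` are assumptions made for this example; they are not course material.

```python
import math
import statistics
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of a sequence of categorical labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample data, chosen only for illustration.
values = [2, 3, 3, 5, 7]
labels = ["yes", "yes", "no", "no"]

mean_value = statistics.mean(values)      # arithmetic mean
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequent value
label_entropy = shannon_entropy(labels)   # evenly split labels give the maximum for two classes
```

Entropy is highest when the labels are evenly split and drops to zero when every label is the same, which is the idea that information gain builds on.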
Module 2: Probability and its uses
Topics:
Uses of probability
Need of probability
Bayesian Inference
Density Concepts
Normal Distribution Curve
Module 3: Statistical Inference
Topics:
Point Estimation
Confidence Margin
Hypothesis Testing
Levels of Hypothesis Testing
Module 4: Testing the Data
Topics:
Understand Parametric and Non-parametric Testing
Learn various types of parametric testing
Discuss experimental designing
Explain A/B testing
Parametric Test
Parametric Test Types
Non-Parametric Test
Experimental Designing
A/B testing
Module 5: Data Clustering
Topics:
Association and Dependence
Causation and Correlation
Covariance
Simpson’s Paradox
Clustering Techniques
Module 6: Regression Modelling
Topics:
Logistic and Regression Techniques
Problem of Collinearity
WOE and IV
Residual Analysis
Heteroscedasticity
Homoscedasticity
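The "WOE and IV" topic above refers to Weight of Evidence and Information Value. A minimal sketch follows, assuming the common convention WOE = ln(% of goods / % of bads) per bin and IV = sum of (% of goods - % of bads) x WOE. Some texts flip the sign of WOE. The function name `woe_iv` and the bin counts are hypothetical, and every bin is assumed to contain at least one good and one bad observation.

```python
import math


def woe_iv(bins):
    """Compute WOE per bin and the total IV.

    bins: list of (good_count, bad_count) tuples, one per bin.
    Assumes every count is non-zero so the logarithm is defined.
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    woe_values = []
    iv = 0.0
    for good, bad in bins:
        pct_good = good / total_good  # share of all goods falling in this bin
        pct_bad = bad / total_bad     # share of all bads falling in this bin
        woe = math.log(pct_good / pct_bad)
        woe_values.append(woe)
        iv += (pct_good - pct_bad) * woe
    return woe_values, iv


# Hypothetical two-bin example: (goods, bads) per bin.
bins = [(80, 20), (20, 80)]
woe_values, iv = woe_iv(bins)
```

Each term of the IV sum is non-negative, so a larger IV indicates a variable that separates the two classes more strongly.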
2. R Statistics for Data Science Course
Module 1: Understanding the Data
Topics:
Introduction to Data Types
Numerical parameters to represent data
Mean
Mode
Median
Sensitivity
Information Gain
Entropy
Statistical parameters to represent data
Module 2: Probability and its Uses
Topics:
Uses of probability
Need of probability
Bayesian Inference
Density Concepts
Normal Distribution Curve
Module 3: Statistical Inference
Topics:
Point Estimation
Confidence Margin
Hypothesis Testing
Levels of Hypothesis Testing
Module 4: Testing the Data
Topics:
Parametric Test
Parametric Test Types
Non-Parametric Test
A/B testing
Module 5: Data Clustering
Topics:
Association and Dependence
Causation and Correlation
Covariance
Simpson’s Paradox
Clustering Techniques
Module 6: Regression Modelling
Topics:
Logistic and Regression Techniques
Problem of Collinearity
WOE and IV
Residual Analysis
Heteroscedasticity
Homoscedasticity
3. Data Science Certification Training
Module 1: Introduction to Data Science
Topics:
What is Data Science?
What does Data Science involve?
Era of Data Science
Business Intelligence vs Data Science
Life cycle of Data Science
Tools of Data Science
Introduction to Big Data and Hadoop
Introduction to R
Introduction to Spark
Introduction to Machine Learning
Module 2: Statistical Inference
Topics:
What is Statistical Inference?
Terminologies of Statistics
Measures of Center
Measures of Spread
Probability
Normal Distribution
Binomial Distribution
Module 3: Data Extraction, Wrangling and Exploration
Topics:
Data Analysis Pipeline
What is Data Extraction?
Types of Data
Raw and Processed Data
Data Wrangling
Exploratory Data Analysis
Visualization of Data
Module 4: Introduction to Machine Learning
Topics:
What is Machine Learning?
Machine Learning Use-Cases
Machine Learning Process Flow
Machine Learning Categories
Supervised Learning algorithms: Linear Regression and Logistic Regression
Module 5: Classification Techniques
Topics:
What is Classification and its use cases?
What is Decision Tree?
Algorithm for Decision Tree Induction
Creating a Perfect Decision Tree
Confusion Matrix
What is Random Forest?
What is Naive Bayes?
Support Vector Machine: Classification
Module 6: Unsupervised Learning
Topics:
What is Clustering & its use cases?
What is K-means Clustering?
What is C-means Clustering?
What is Canopy Clustering?
What is Hierarchical Clustering?
Module 7: Recommender Engines
Topics:
What are Association Rules & their Use Cases?
What is Recommendation Engine & its Workings?
Types of Recommendations
User-Based Recommendation
Item-Based Recommendation
Difference: User-Based and Item-Based Recommendation
Recommendation Use Cases
Module 8: Text Mining
Topics:
The concepts of text-mining
Use cases
Text Mining Algorithms
Quantifying text
TF-IDF
Beyond TF-IDF
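To make the "Quantifying text" and "TF-IDF" topics concrete, here is a small sketch in plain Python. It assumes one common variant: term frequency is the count divided by document length, and inverse document frequency is ln(N / df). Libraries often apply additional smoothing. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized documents.
docs = [["data", "science", "data"], ["data", "mining"]]
weights = tf_idf(docs)
```

In this variant a term that appears in every document, such as "data" here, receives a weight of zero, while terms unique to one document are weighted up.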
Module 9: Time Series
Topics:
What is Time Series data?
Time Series variables
Different components of Time Series data
Visualize the data to identify Time Series Components
Implement ARIMA model for forecasting
Exponential smoothing models
Identify the time series scenarios in which different Exponential Smoothing models can be applied
Implement respective ETS model for forecasting
Module 10: Deep Learning
Topics:
Reinforcement Learning
Reinforcement learning Process Flow
Reinforcement Learning Use cases
Deep Learning
Biological Neural Networks
Understand Artificial Neural Networks
Building an Artificial Neural Network
How ANN works
Important Terminologies of ANNs
4. Python Certification Training for Data Science
Module 1: Introduction to Python
Topics:
Overview of Python
Companies using Python
Different Applications where Python is used
Discuss Python Scripts on UNIX/Windows
Values, Types, Variables
Operands and Expressions
Conditional Statements
Loops
Command Line Arguments
Writing to the screen
Module 2: Sequences and File Operations
Topics:
Python files I/O Functions
Numbers
Strings and related operations
Tuples and related operations
Lists and related operations
Dictionaries and related operations
Sets and related operations
Module 3: Deep Dive – Functions, OOPs, Modules, Errors and Exceptions
Topics:
Functions
Function Parameters
Global Variables
Variable Scope and Returning Values
Lambda Functions
Object-Oriented Concepts
Standard Libraries
The Import Statements
Module Search Path
Package Installation Ways
Errors and Exception Handling
Handling Multiple Exceptions
Module 4: Introduction to NumPy, Pandas and Matplotlib
Topics:
NumPy - arrays
Operations on arrays
Indexing, slicing and iterating
Reading and writing arrays on files
Pandas - data structures & index operations
Reading and Writing data from Excel/CSV formats into Pandas
matplotlib library
Grids, axes, plots
Markers, colors, fonts and styling
Types of plots - bar graphs, pie charts, histograms
Module 5: Association Rules Mining and Recommendation Systems
Topics:
What are Association Rules?
Association Rule Parameters
Calculating Association Rule Parameters
Recommendation Engines
How do Recommendation Engines work?
Collaborative Filtering
Content-Based Filtering
Module 6: Reinforcement Learning
Topics:
What is Reinforcement Learning?
Why Reinforcement Learning?
Elements of Reinforcement Learning
Exploration vs Exploitation dilemma
Epsilon Greedy Algorithm
Markov Decision Process (MDP)
Q values and V values
Q-Learning
α values
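The Epsilon Greedy, Q-Learning and α topics above fit together in a single update rule. The sketch below shows the standard tabular Q-learning update, where α is the learning rate and γ the discount factor. The tiny two-state Q-table, the function names and the parameter values are assumptions made for illustration only.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical Q-table with 2 states and 2 actions.
q_table = [[0.0, 0.0], [0.0, 1.0]]
rng = random.Random(0)

# With epsilon = 0 the policy is purely greedy.
action = epsilon_greedy(q_table[1], epsilon=0.0, rng=rng)

# One learning step: took action 1 in state 0, got reward 1.0, landed in state 1.
q_update(q_table, state=0, action=1, reward=1.0, next_state=1, alpha=0.5, gamma=0.9)
```

A larger α moves the stored Q value further toward the new target on each step, while a larger epsilon makes the agent explore more often.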
Module 7: Time Series Analysis
Topics:
What is Time Series Analysis?
Importance of TSA
Components of TSA
White Noise
AR model
MA model
ARMA model
ARIMA model
Stationarity
ACF & PACF
Module 8: Model Selection and Boosting
Topics:
What is Model Selection?
The need for Model Selection
Cross-Validation
What is Boosting?
How do Boosting Algorithms work?
Types of Boosting Algorithms
Adaptive Boosting
5. Apache Spark and Scala Certification Training
Module 1: Introduction to Big Data Hadoop and Spark
Topics:
What is Big Data?
Big Data Customer Scenarios
Limitations and Solutions of Existing Data Analytics Architecture with Uber Use Case
How does Hadoop Solve the Big Data Problem?
What is Hadoop?
Hadoop’s Key Characteristics
Hadoop Ecosystem and HDFS
Hadoop Core Components
Rack Awareness and Block Replication
YARN and its Advantage
Hadoop Cluster and its Architecture
Hadoop: Different Cluster Modes
Big Data Analytics with Batch & Real-time Processing
Why is Spark needed?
What is Spark?
How does Spark differ from other frameworks?
Spark at Yahoo!
Module 2: Introduction to Scala and Apache Spark
Topics:
What is Scala?
Scala in other Frameworks
Basic Scala Operations
Control Structures in Scala
Collections in Scala - Array
Why Scala for Spark?
Introduction to Scala REPL
Variable Types in Scala
Foreach loop, Functions and Procedures
ArrayBuffer, Map, Tuples, Lists, and more
Module 3: Functional Programming and OOPs Concepts in Scala
Topics:
Functional Programming
Anonymous Functions
Getters and Setters
Properties with only Getters
Singletons
Overriding Methods
Higher Order Functions
Class in Scala
Custom Getters and Setters
Auxiliary Constructor and Primary Constructor
Extending a Class
Traits as Interfaces and Layered Traits
Module 4: Deep Dive into Apache Spark Framework
Topics:
Spark’s Place in Hadoop Ecosystem
Spark Components & its Architecture
Spark Deployment Modes
Introduction to Spark Shell
Writing your first Spark Job Using SBT
Submitting Spark Job
Spark Web UI
Data Ingestion using Sqoop
Module 5: Playing with Spark RDDs
Topics:
Challenges in Existing Computing Methods
Probable Solution & How RDD Solves the Problem
What is RDD, Its Functions, Transformations & Actions?
Data Loading and Saving Through RDDs
Key-Value Pair RDDs
Other Pair RDDs
RDD Lineage
RDD Persistence
WordCount Program Using RDD Concepts
RDD Partitioning & How It Helps Achieve Parallelization
Passing Functions to Spark
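The WordCount program listed above is usually expressed as a flatMap, map and reduceByKey pipeline over an RDD. The sketch below is a plain-Python analogue of those three steps, not Spark code, so it runs without a cluster. The function name `word_count` and the input lines are hypothetical.

```python
from collections import defaultdict


def word_count(lines):
    """Count words using the same three stages as the classic RDD WordCount."""
    # flatMap stage: split every line into words and flatten the result.
    words = [word for line in lines for word in line.split()]
    # map stage: pair each word with an initial count of 1.
    pairs = [(word, 1) for word in words]
    # reduceByKey stage: sum the counts for each distinct word.
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)


# Hypothetical input, standing in for the lines of a text file.
lines = ["spark makes rdds", "rdds are resilient"]
counts = word_count(lines)
```

In Spark the same stages run in parallel across partitions, which is where RDD partitioning comes in.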
Module 6: DataFrames and Spark SQL
Topics:
Need for Spark SQL
What is Spark SQL?
Spark SQL Architecture
SQL Context in Spark SQL
User Defined Functions
Data Frames & Datasets
Interoperating with RDDs
JSON and Parquet File Formats
Loading Data through Different Sources
Spark – Hive Integration
Module 7: Machine Learning using Spark MLlib
Topics:
Why Machine Learning?
What is Machine Learning?
Where is Machine Learning Used?
Use Case: Face Detection
Different Types of Machine Learning Techniques
Introduction to MLlib
Features of MLlib and MLlib Tools
Various ML algorithms supported by MLlib
Module 8: Deep Dive into Spark MLlib
Topics:
Supervised Learning - Linear Regression, Logistic Regression, Decision Tree, Random Forest
Unsupervised Learning - K-Means Clustering & How It Works with MLlib
Analysis on US Election Data using MLlib (K-Means)
Module 9: Understanding Apache Kafka & Apache Flume
Topics:
Need for Kafka
Core Concepts of Kafka
Where is Kafka Used?
What is Kafka?
Kafka Architecture
Understanding the Components of Kafka Cluster
Configuring Kafka Cluster
Need of Apache Flume
What is Apache Flume?
Flume Sources
Flume Channels
Integrating Apache Flume and Apache Kafka
Basic Flume Architecture
Flume Sinks
Flume Configuration
Module 10: Apache Spark Streaming - Processing Multiple Batches
Topics:
Drawbacks in Existing Computing Methods
Why Streaming is Necessary?
What is Spark Streaming?
Spark Streaming Features
Spark Streaming Workflow
How Uber Uses Streaming Data
Streaming Context & DStreams
Transformations on DStreams
Windowed Operators and Why They Are Useful
Important Windowed Operators
Slice, Window and ReduceByWindow Operators
Stateful Operators
Module 11: Apache Spark Streaming - Data Sources
Topics:
Apache Spark Streaming: Data Sources
Streaming Data Source Overview
Apache Flume and Apache Kafka Data Sources
Example: Using a Kafka Direct Data Source
Perform Twitter Sentiment Analysis Using Spark Streaming
Module 12: In Class Project
Module 13: Spark GraphX (Self-Paced)
6. Deep Learning with TensorFlow 2.0 Certification Training
Module 1: Introduction to Deep Learning
Topics:
What is Deep Learning?
Curse of Dimensionality
Machine Learning vs. Deep Learning
Use cases of Deep Learning
Human Brain vs. Neural Network
What is Perceptron?
Learning Rate
Epoch
Batch Size
Activation Function
Single Layer Perceptron
Module 2: Getting Started with TensorFlow 2.0
Topics:
Introduction to TensorFlow 2.x
Installing TensorFlow 2.x
Defining Sequential model layers
Activation Function
Layer Types
Model Compilation
Model Optimizer
Model Loss Function
Model Training
Digit Classification using Simple Neural Network in TensorFlow 2.x
Improving the model
Adding Hidden Layer
Adding Dropout
Using Adam Optimizer
Module 3: Convolutional Neural Network
Topics:
Image Classification Example
What is Convolution?
Convolutional Layer Network
Convolutional Layer
Filtering
ReLU Layer
Pooling
Data Flattening
Fully Connected Layer
Predicting a cat or a dog
Saving and Loading a Model
Face Detection using OpenCV
Module 4: Regional CNN
Topics:
Regional-CNN
Selective Search Algorithm
Bounding Box Regression
SVM in RCNN
Pre-trained Model
Model Accuracy
Model Inference Time
Model Size Comparison
Transfer Learning
Object Detection – Evaluation
mAP
IoU
RCNN – Speed Bottleneck
Fast R-CNN
RoI Pooling
Fast R-CNN – Speed Bottleneck
Faster R-CNN
Feature Pyramid Network (FPN)
Region Proposal Network (RPN)
Mask R-CNN
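IoU (Intersection over Union), listed above under object detection evaluation, measures how well a predicted bounding box overlaps a ground-truth box. A minimal sketch follows, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max). The function name `iou` and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlapping region, clamped at zero when the boxes do not overlap.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    # Guard against degenerate boxes with zero area.
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Metrics such as mAP typically count a detection as correct only when its IoU with a ground-truth box exceeds a chosen threshold.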
Module 5: Boltzmann Machine & Autoencoder
Topics:
What is Boltzmann Machine (BM)?
Identify the issues with BM
Why did RBM come into picture?
Step by step implementation of RBM
Distribution of Boltzmann Machine
Understanding Autoencoders
Architecture of Autoencoders
Brief on types of Autoencoders
Applications of Autoencoders
Module 6: Generative Adversarial Network (GAN)
Module 7: Emotion and Gender Detection
Module 8: Introduction to RNN and GRU
Module 9: LSTM
Module 10: Auto Image Captioning Using CNN LSTM
Topics:
Auto Image Captioning
COCO dataset
Pre-trained model
Inception V3 model
Architecture of Inception V3
Modify last layer of pre-trained model
Freeze model
CNN for image processing
LSTM for text processing
7. Tableau Training & Certification
Topics:
Data Visualization
Business Intelligence tools
Introduction to Tableau
Tableau Architecture
Tableau Server Architecture
VizQL
Introduction to Tableau Prep
Tableau Prep Builder User Interface
Data Preparation techniques using Tableau Prep Builder tool
Module 2: Data Connection with Tableau Desktop
Topics:
Features of Tableau Desktop
Connect to data from File and Database
Types of Connections
Joins and Unions
Data Blending
Tableau Desktop User Interface
Basic project: Create a workbook and publish it on Tableau Online
Module 3: Basic Visual Analytics
Topics:
Visual Analytics
Basic Charts: Bar Chart, Line Chart, and Pie Chart
Hierarchies
Data Granularity
Highlighting
Sorting
Filtering
Grouping
Sets
Module 4: Calculations in Tableau
Topics:
Types of Calculations
Built-in Functions (Number, String, Date, Logical and Aggregate)
Operators and Syntax Conventions
Table Calculations
Level Of Detail (LOD) Calculations
Using R within Tableau for Calculations
Module 5: Advanced Visual Analytics
Topics:
Parameters
Tooltips
Trend lines
Reference lines
Forecasting
Clustering
Module 6: Level of Detail (LOD) Expressions in Tableau
Topics:
Use Case I - Count Customer by Order
Use Case II - Profit per Business Day
Use Case III - Comparative Sales
Use Case IV - Profit Vs Target
Use Case V - Finding the second order date
Use Case VI - Cohort Analysis
Module 7: Geographic Visualizations in Tableau
Topics:
Introduction to Geographic Visualizations
Manually assigning Geographical Locations
Types of Maps
Spatial Files
Custom Geocoding
Polygon Maps
Web Map Services
Background Images
Module 8: Advanced Charts in Tableau
Topics:
Box and Whisker Plot
Bullet Chart
Bar in Bar Chart
Gantt Chart
Waterfall Chart
Pareto Chart
Control Chart
Funnel Chart
Bump Chart
Step and Jump Lines
Word Cloud
Donut Chart
Module 9: Dashboards and Stories
Topics:
Introduction to Dashboards
The Dashboard Interface
Dashboard Objects
Building a Dashboard
Dashboard Layouts and Formatting
Interactive Dashboards with actions
Designing Dashboards for devices
Story Points
Module 10: Get Industry Ready
Topics:
Tableau Tips and Tricks
Choosing the right type of Chart
Format Style
Data Visualization best practices
Prepare for Tableau Interview
Module 11: Exploring Tableau Online
Topics:
Publishing Workbooks to Tableau Online
Interacting with Content on Tableau Online
Data Management through Tableau Catalog
AI-Powered features in Tableau Online (Ask Data and Explain Data)
Understand Scheduling
Managing Permissions on Tableau Online
Data Security with Filters in Tableau Online