Applied ML Course
Python Programming
Resources
Why Python for DS?
Python Installation and Setup - Pre-requisites
Miniconda Installation
Module, Script, Package and Library
Virtual Environments
Python, IPython Shell
Introduction to Jupyter
Keywords and Identifiers
Variables and Data Types
Python Standard IO
Executing Python Script using Terminal
Operators
Strings
Strings - Indexing, Formatting
Control Flow - If else
Control Flow - While loop, Break and Continue
Control Flow - for loop
Lists
Lists - Indexing and Slicing
Tuples
Sets
Dictionary
Dictionary and List Comprehensions
Introduction to Functions
Positional and Keyword arguments
*args, **kwargs
Lambda Functions
Higher Order Functions
Namespaces, Scope
Enclosing Scope and LEGB Rule
Decorators
Intro. to Object Oriented Programming
Classes and Objects
Methods
Inheritance
Exception Handling
Reading and Writing .txt files
Reading and Writing .csv files
SQL
Resources
Introduction to Databases and SQL
Installing MySQL
DDL - CREATE, ALTER, DROP
SQL DataTypes
Setting up the data
Data Retrieval - SELECT, WHERE, COUNT, DISTINCT, LIKE
Data Retrieval - ORDER BY, LIMIT, OFFSET
Get Summary Info. using GROUP BY
GROUP BY - HAVING Clause
Creating additional Columns
Intro. to Joins
JOINS - INNER, LEFT, RIGHT
JOINS - OUTER
Sub Queries, Correlated Sub Queries
Window Functions - RANK, ROW_NUMBER, NTILE, LEAD, LAG
DML - INSERT, DELETE, UPDATE
DCL - GRANT, REVOKE
Python - Reading the data from MySQL Table
Python - Writing data to MySQL Table
Python - Inserting Multiple records at once to MySQL Table
Python for Data Science : NumPy
Resources
Introduction to NumPy
Numpy arrays
Placeholder Functions
Array Indexing and Slicing
Boolean Indexing
Transposing and Flattening
Combining and Splitting of arrays
Broadcasting arrays
Python for Data Science : Pandas
Resources
Introduction to Pandas
DataFrames
Series
DataFrames - Indexing
DataFrame - loc and iloc Operators
DataFrame - Comparison and Filtering
DataFrame - Insert, Concatenate and Delete
DataFrame - Merging
DataFrame - apply, applymap and map
Groupby
Sorting and Ranking
Reading and Writing Data Files
Data Visualization : Matplotlib and Seaborn
Resources
Introduction to Data Visualization
Understanding Matplotlib Object Hierarchy
Adding Color, Line Style, Markers
Labels, Ticks and Legends to Plots
Line Plots
Histogram
Bar Plots
Stacked Bar Chart & Grouped Bar Chart
Scatter Plots
Box and Violin Plots
Seaborn - Distribution plot, Kernel Density Estimate (KDE)
Seaborn - Relational Plot, Joint Plot
Probability and Statistics
Resources
Intro. to Statistics
Types of Data, Sample and Population
Estimates of Location
Estimates of Location - Coding
Estimates of Variability
Coefficient of Variation
Descriptive Statistics - Coding
Intro. to Probability, Random Experiment & Random Variable
Calculating Probability
Conditional Probability
Bayes Theorem
Bayes Theorem Problem
Discrete RV, Probability Mass Function (PMF)
Bernoulli Distribution
Bernoulli Distribution and PMF using Python
Binomial Distribution
Geometric, Hypergeometric Distribution
Continuous RV, Probability Density Function (PDF)
Cumulative Distribution Function (CDF)
Gaussian Distribution
Standard Normal Distribution, Z Score
Normal Distribution - Coding
Normal approximation to Binomial
Log Normal Distribution
Law of Large Numbers
Central Limit Theorem (CLT)
Verifying CLT
Intro. to Confidence Intervals
Confidence Intervals : Margin of Error
Confidence Intervals : t-Distribution
Hypothesis Testing
Z Test
One sample and Two sample t-test
t-test - Implementation
Paired t-test
Chi Square Test
Chi Square Test - Implementation
Covariance
Pearson Correlation
Spearman Rank Correlation
Kendall's Tau Correlation
Exploratory Data Analysis
Resources
Machine Learning Life Cycle
Predictive Modeling Steps
EDA Steps
Variable Types
Variable Identification
Categorical Encoding - Label, Ordinal Encoding
Categorical Encoding - One Hot Encoding
Categorical Encoding - Frequency Encoding
Missing Value Identification
Univariate Analysis - Descriptive Statistics
Univariate Analysis - Data Profiling
What are Outliers?
Impact of Outliers - Why are they bad?
Identifying Outliers - Box Plot approach
Identifying Outliers - Z Score method
Identifying Outliers - Modified Z Score method
Outlier Treatment - Ways to handle Outliers
Multivariate Outlier Identification
Multivariate Outlier Identification - Implementation
Need for Scaling
Standardization and Normalization of data
Intro. to Bivariate Analysis
Continuous-Continuous
Categorical-Categorical : Hypothesis Testing
Categorical-Categorical : Visualizations
Categorical-Continuous
Quantile-Quantile Plot (QQ Plot)
Kolmogorov Smirnov Test (KS Test)
Linear Algebra
Resources
Introduction to Linear Algebra
Vector Operations
Vector Dot Product
Projection of a Vector
Basis, Span and Linear Dependence
System of Linear equations
Solving System of Linear equations
Types of Matrices
Linear Transformations
Eigen Vectors, Eigen Values
Eigen Decomposition
Deriving Eigen Vectors and Values using Python
K Nearest Neighbors
Resources
KNN - Intuition
Failure cases of KNN
Distance Measures - Euclidean, Manhattan, Minkowski
Distance Measures - Hamming distance
Cosine Similarity and Cosine Distance
KNN Implementation
KNN - Breaking a tie
Decision Regions
Overfitting versus Underfitting
KNN - Choosing an Optimal value for K
Need for Cross Validation
Holdout Validation, Stratified Holdout Validation
Stratified Holdout Validation - Data Partitioning
K-Fold Cross Validation, LOOCV
K-Fold Cross Validation Implementation
KNN for Regression Problems
Weighted KNN
Curse of Dimensionality
Bias Variance Tradeoff
Performance Measurement of Classification Models
Resources
Accuracy
Confusion Matrix, TPR, FPR, FNR, TNR
Precision and Recall, F1 Score
Receiver Operating Characteristic Curve (ROC) and AUC
ROC Curve, Precision Recall Curve Implementation
Log-loss
Linear Regression
Resources
Intro. to Linear Regression
Baseline model and SSE
Least Squares Method
Evaluation Metrics for Regression model
Linear Regression Output Interpretation
Multiple Linear Regression and Problems
Assumptions of Linear Regression
Linear Regression Implementation - Problem and Data Pre-processing
Linear Regression Implementation - Linearity, Multi-Collinearity
Linear Regression Implementation - Data Transformations
Linear Regression Implementation - Creating model
Linear Regression Implementation - Model Evaluation
Solving Optimization Problems
Intro. to Differentiation
Derivative Rules
Maxima and Minima
Vector Gradient of Cost Function
Gradient Descent Algorithm
Learning rate
Types of Gradient Descent - Batch, SGD and Mini-Batch
Constrained Optimization, Ridge and Lasso Regression
Polynomial Regression and Problems with Overfitting
L1, L2 Vector Norms
Regularization, Ridge and Lasso Regression
Why does L1 regularization induce Sparsity?
Ridge and Lasso Implementation
Elastic Net
Logistic Regression
Resources
Line, Plane and Hyper Plane
Distance of a point from Plane
Logistic Regression - Geometric Intuition
Sigmoid Function
Mathematical Formulation of Objective Function
Objective Function using MLE
Regularization - Logistic Regression
Parameters and Hyper-Parameters
Grid Search CV and Random Search CV
Why and when Feature Scaling is required
Implementation - Data Pre-processing
Implementation - Missing value Imputation
Implementation - Categorical Feature Encoding
Implementation - Model Building
Support Vector Machines (SVM)
Resources
SVM - Geometric Intuition
Mathematical Formulation of Objective Function
Hard Margin SVM
SVM Soft Margin
Hinge Loss
SVM Dual Formulation
Kernel Trick
RBF Kernel
SVM Kernels, Decision Boundaries
SVM Decision Boundaries with change in hyper parameters
SVM Implementation - Grid Search CV
SVM Implementation - Random Search CV
Principal Component Analysis (PCA)
Resources
Dimensionality Reduction
PCA : Geometric Intuition
PCA : Objective Function
Optimization Solution
PCA : Steps
PCA Implementation using MNIST data
PCA Implementation : Reducing Feature set
PCA : Step by Step Implementation
Decision Trees
Resources
Intro. to Decision Trees
Purity in a decision tree
Gini Impurity
Entropy
Deriving best split using Entropy
Performance Optimization
Decision Tree Implementation
Decision Tree Regression
Decision Tree - Pros and Cons
Ensemble Models
Resources
Intro. to Ensemble Learning
Basic Ensemble Implementation for Classification problem
Basic Ensemble Implementation for Regression problem
Why do Ensemble models work well?
Bias Variance Trade-off (Revisited)
Generalization Error, Bias Variance Decomposition
Bootstrap Aggregation (Bagging) Intuition
Intro. to Random Forest
Random Forest model - Hyper Parameter Tuning
Random Forest Implementation
Extremely Randomized Trees
Boosting Intuition
Pseudo Residual Loss
Gradient Boosting
Gradient Boosting - Regularization by Shrinkage
GBM Implementation
Intro. to XGBoost
XGBoost Implementation
XGBoost CV Implementation
AdaBoost Math
AdaBoost In-depth
AdaBoost Implementation
Model Stacking
Model Stacking Implementation
Project : Microsoft Malware Detection
Resources
Problem Description
Dataset Overview
EDA - Data Pre-processing Part I
EDA - Data Pre-processing Part II
EDA - Data Profiling
EDA - Hypothesis Testing
Feature Engineering
Feature Encoding Approaches
Applying Feature Encoding
Data Pre-processing for ML Model building
Creating a Decision Tree Model
Decision Tree - Model Evaluation
Creating a Random Forest Model
Creating XGBoost Model
Model Performance Assessment - Cumulative Gains Chart
Project : Vesta Fraud Detection
Resources
Problem Description
Data Overview
EDA - Data Pre-processing
EDA - Data Profiling
EDA - Hypothesis Testing
Feature Engineering
Feature Encoding Approaches
Categorical Feature Encoding
Feature Selection Methods, Data Partitioning
Creating Random Forest Model
Cost Sensitive Learning using Class Weights
Feature Selection - Recursive Feature Elimination
Random Forest - Model Evaluation, AUC ROC Curve, Precision Recall Curve
Creating XGBoost Model
Finding Optimal Threshold, Model Evaluation
Cumulative Gain Curve
Creating Light GBM Model
Fundamentals of Natural Language Processing (NLP)
Resources
Intro. to NLP
Project : Customer Review Classification, Data Overview
Data Cleaning
Text to Vector conversion
Bag of Words (BoW)
Terminologies
Text Pre-processing : Stop Word Removal, Stemming, Lemmatization
Text Pre-processing : Code Example
Text Pre-processing : applying to review data
Creating Bag of Words
uni-gram, bi-gram and tri-gram
Creating uni-grams, bi-grams and tri-grams
Term Frequency (TF) and Inverse Document Frequency (IDF)
Why log in IDF?
Creating TF-IDF matrix
Model Building : Classification of Reviews
Word Embeddings : Word2Vec
Average Word2Vec, TFIDF Weighted Word2Vec
Creating Word2Vec
Unsupervised Learning
Resources
Intro. to Clustering
Applications of Clustering
Evaluation Metrics
K-Means Clustering
K-Means: Geometric Intuition
K-Means: Mathematical Formulation
K-Means: Algorithm
Implementing K-Means from scratch
K-Means++: Centroid Initialization
K-Means Implementation - Data Cleaning
K-Means Implementation
Limitations
Choosing Optimal K
Hierarchical Clustering
Agglomerative & Divisive, Dendrograms
Agglomerative Clustering
Inter Cluster Similarity
Hierarchical Clustering Implementation
Grouping Variables using Hierarchical Clustering
DBSCAN (Density based clustering)
Intro. to DBSCAN
MinPts and Eps
Core, Border and Noise points
Density edge and Density connected points
DBSCAN Algorithm
Hyper Parameters : MinPts and Eps
Advantages and Limitations of DBSCAN
Recommender Systems and Matrix Factorization
Resources
Recommender Systems
Types of Recommender Systems
User based Collaborative Filtering
Challenges in User based Collaborative Filtering
Item based Collaborative Filtering
Matrix Factorization : SVD
SVD, Eigen Decomposition Relevance
Matrix Factorization for Collaborative Filtering
Creating Recommender Systems : Non Personalized
Creating Recommender Systems : User based Collaborative Filtering
Creating Recommender Systems : Item based Collaborative Filtering
Creating Recommender Systems : Matrix Factorization
Deep Learning : Fundamentals
Resources
Artificial Intelligence vs Machine Learning vs Deep Learning
What is Deep Learning?
Factors behind Deep Learning popularity
Intro. to Google Colab
Perceptron
Multi Layer Perceptron
Visualizing Neural Network
Neural Network : Notation
Training Single Neuron Model
Training MLP
Backward Propagation
Epoch versus Iteration
Activation Functions : Linear activation
Activation Functions : Sigmoid, Tanh
Activation Functions : Softmax
Vanishing Gradient Problem
Activation Functions : ReLU, Leaky ReLU
Deep Learning : High Performing Neural Nets
Resources
Weight Initialization
Batch Normalization
Model Generalization, Dropouts
Variants of Gradient Descent
Exponential Weighted Average
Momentum
Optimizers : AdaGrad
Optimizers : RMSProp
Optimizers : Adam
Project : Image Classification using Tensorflow and Keras
Resources
Deep Learning Frameworks
Intro. to Image data
Reading Image data
Reading Image Classification Data
Creating MLP : Pre-processing, Model Training
Model Evaluation
Hyper Parameter Tuning
Early Stopping
MLP : Dropout Implementation
MLP : Weight Initialization
MLP : Batch Normalization
Deep Learning : Convolutional Neural Nets
Resources
Why CNN?
Convolution : Filters
Convolution : Edge Detection using Sobel Kernel
Convolution over RGB Images
Local Connectivity and Parameter Sharing
CNN Architecture
Pooling
CNN Forward Propagation
CNN Back Propagation
Creating CNN model
Hyper parameter tuning
Image augmentation
Image augmentation implementation
CNN Image augmentation using Keras
Deep Learning : LSTMs, RNNs
Resources
Sequential Modeling
Recurrent Neural Network
RNN BPTT
Types of RNNs
RNN : Long term dependencies issue
Alternate representation of RNN
LSTM
GRUs
Deep RNN
K-Means Implementation - Data Cleaning