Advanced Data Analysis using Wavelets and Machine Learning
Welcome to my course on Machine Learning and Data Analysis, a course that will teach you how to use advanced algorithms to solve real problems with data. I am Emanuele, a mechanical engineer with a PhD in advanced algorithms, and I will be your instructor for this course.
This course consists of four main parts:
- Part 1: Overview on Fourier Analysis and Wavelets. You will learn the basics of these two powerful mathematical tools for analyzing signals and images in different domains.
- Part 2: Data Analysis with Fourier Series, Transforms and Wavelets. You will learn how to apply these methods to process and explore data efficiently and effectively, both in time and frequency domains.
- Part 3: Machine Learning Methods. You will learn how to use techniques that enable computers to learn from data and make intelligent predictions or decisions, such as linear regression, curve fitting, least squares, gradient descent, Singular Value Decomposition (and more).
- Part 4: Dynamical Systems. You will learn how to model and understand complex and nonlinear phenomena that change over time, using mathematical equations. We will also apply machine learning techniques to dynamical systems, such as the SINDy algorithm.
By the end of this course, you will be able to:
- Understand the principles and applications of Fourier analysis and wavelets
- Use Fourier series and transforms to analyze data in various domains
- Apply machine learning methods to different problems
- Extract features from data using wavelets
- Understand the importance of sparsity of natural data, as well as the revolutionary concept of compressed sensing, with realistic examples.
- Discover the governing equations of a dynamical system from time series data (SINDy algorithm).
I hope you enjoy this course and find it useful for your personal and professional goals.
————————————————————————————————————————————
Let’s provide some more details about the main parts of this course:
Part 1 constitutes a preliminary introduction to Fourier and Wavelet Analysis. Special focus will be put on understanding the most relevant concepts related to these fundamental topics.
In part 2, the Fourier series and the Fourier Transform are introduced. Although the most important mathematical formulae are shown, the focus is not on the mathematics. One of the key points of this part is to show one possible application of the Fourier Transform: the spectral derivative. Then, we introduce the concept of Wavelets in more detail by showing some applications of Multiresolution Analysis.
This is exemplified with Matlab, without using rigorous mathematical formulae. The student can follow and get the intuition even if they have no access to Matlab.
Another important achievement of this part is to convey a simple but thorough explanation of the well-known FFT algorithm.
There are also some extras on the Inverse Wavelet Transform and the Uncertainty principle (these involve more mathematics, but they are extras: feel free to skip them).
In part 3, some machine learning techniques are introduced: the methods of curve-fitting, gradient descent, linear regression, Singular Value Decomposition (SVD), feature extraction, classification, Gaussian Mixture Model (GMM). The objective in this part is to show some practical applications and cast light on their usefulness.
We will also focus on sparsity and compressed sensing, which are related concepts in signal processing. Sparsity means that a signal can be represented by a few non-zero coefficients in some domain, such as frequency or wavelet. Compressed sensing means that a signal can be reconstructed from fewer measurements than the Nyquist–Shannon sampling theorem requires, by exploiting its sparsity and using optimization techniques. These concepts are useful for reducing the dimensionality and complexity of data in machine learning applications, such as image processing or radar imaging.
Part 4 is a self-contained introduction to dynamical models. The models covered in this part are the predator-prey model, a model of epidemics, and the logistic model of population growth.
The student will learn how to implement these models using free and open-source software called Scilab (quite similar to Matlab).
Related to Part 4, there is an application of a machine learning technique called SINDy, which is an acronym for Sparse Identification of Nonlinear Dynamics. It is a machine learning algorithm that can discover the governing equations of a dynamical system from time series data. The main idea is to assume that the system can be described by a sparse set of nonlinear functions, and then use a sparsity-promoting regression technique to find the coefficients of these functions that best fit the data. This way, SINDy can recover interpretable and parsimonious models of complex systems.
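The SINDy idea described above can be sketched in a few lines. The course material uses Matlab; the following Python/numpy version is only an illustrative sketch, where the toy system, the candidate library, the sparsity threshold, and the iteration count are all my own choices rather than the course's implementation.

```python
import numpy as np

# Sketch of the SINDy idea (sequential thresholded least squares) on
# synthetic data from dx/dt = -2x. All numerical choices are illustrative.
t = np.linspace(0, 2, 201)
x = 3.0 * np.exp(-2.0 * t)             # trajectory of dx/dt = -2x with x(0) = 3
dx = np.gradient(x, t)                  # numerical time derivative of the data

# Candidate function library Theta = [1, x, x^2, x^3]
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Sequential thresholded least squares: fit, zero out small coefficients, refit
xi = np.linalg.lstsq(Theta, dx, rcond=None)[0]
for _ in range(10):
    small = np.abs(xi) < 0.1            # sparsity threshold (a tuning choice)
    xi[small] = 0.0
    big = ~small
    xi[big] = np.linalg.lstsq(Theta[:, big], dx, rcond=None)[0]

print(np.round(xi, 3))                  # expect roughly [0, -2, 0, 0]
```

The recovered coefficient vector is sparse: only the term x survives, with a coefficient close to -2, i.e. the sketch rediscovers dx/dt = -2x from the data alone.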
Note: For some of the lectures of the course, I was inspired by S.L. Brunton and J. N. Kutz’s book titled “Data-Driven Science and Engineering”. This book is an excellent source of information to dig deeper on most (although not all) of the topics discussed in the course.
1. Overview of Fourier Analysis (Video lesson)
Fourier analysis is a branch of mathematics that studies how general functions can be decomposed into simpler functions with definite frequencies. These simpler functions are usually trigonometric or exponential functions. Fourier analysis has many applications in physics, engineering, and other fields, because it allows us to solve differential equations, analyze signals, and understand periodic phenomena. In this lecture, we will introduce two types of Fourier analysis: Fourier series and Fourier transforms. Fourier series are used to represent periodic functions as discrete sums of sine and cosine functions. Fourier transforms are used to represent non-periodic functions as continuous integrals of trigonometric or exponential functions. We will also discuss some properties and examples of these methods, and how they relate to each other.
2. Space-Frequency resolution for the Short Time Fourier Transform (Video lesson)
The Short Time Fourier Transform (STFT) is a technique that allows us to analyze the frequency content of a signal as it changes over time (or in space). It works by dividing the signal into short segments, applying a window function to each segment, and computing the Fourier transform of each windowed segment. The result is a two-dimensional representation of the signal in the time-frequency domain, called the spectrogram. However, the STFT has a limitation: it uses the same window size for all segments, which means it has the same resolution for all frequencies (we will see that this is not the case for Wavelets). This can be problematic when the signal has different frequency components that vary at different rates over time. In general, it is possible to improve the space-frequency resolution of the STFT by using different window sizes for different frequency bands. It is also possible to discuss some methods and criteria for choosing optimal window sizes and shapes for different signals and applications. This lecture serves as a conceptual summary of the main concepts.
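The STFT described above can be hand-rolled in a few lines. This is a Python/numpy sketch, not the lecture's Matlab code; the sampling rate, window length, hop size, and test signal are all illustrative choices.

```python
import numpy as np

# Minimal spectrogram sketch: a hand-rolled STFT with a Hann window.
fs = 1000                                # sampling rate in Hz (assumption)
t = np.arange(0, 2, 1 / fs)
# Test signal: 50 Hz during the first second, 200 Hz during the second
sig = np.where(t < 1, np.sin(2*np.pi*50*t), np.sin(2*np.pi*200*t))

nwin, hop = 256, 128                     # one fixed window: same resolution at all frequencies
win = np.hanning(nwin)
frames = [sig[i:i+nwin] * win for i in range(0, len(sig) - nwin, hop)]
S = np.abs(np.fft.rfft(frames, axis=1))  # |STFT|: time-frequency magnitudes
freqs = np.fft.rfftfreq(nwin, 1 / fs)

# Dominant frequency in an early frame and in a late frame
print(freqs[np.argmax(S[2])], freqs[np.argmax(S[-2])])
```

The dominant bin moves from near 50 Hz in the early frames to near 200 Hz in the late ones, which is exactly the time-frequency picture the spectrogram provides; note that the frequency resolution (fs/nwin) is the same everywhere, illustrating the fixed-window limitation discussed above.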
3. Wavelets and Space-Frequency resolution (Video lesson)
Wavelets are wave-like oscillations that are localized in time (or space) and have two basic properties: scale and location. Scale defines how stretched or squished a wavelet is, and location defines where the wavelet is centered. Wavelets can be used to analyze signals at different scales and locations, which is useful for capturing both global and local features of the signal. Wavelet analysis is an alternative to Fourier analysis, which uses sinusoidal functions that are infinite in time and have a fixed frequency. Wavelet analysis can overcome some of the limitations of Fourier analysis, such as the trade-off between time and frequency resolution (or space vs wavenumber resolution). In this lecture, we will introduce the concept of wavelets and how they can be used to decompose a signal into different frequency bands at different resolutions. Especially in the next section, we will also discuss some types of wavelets, such as discrete and continuous wavelets, and some applications of wavelet analysis, such as compression and feature extraction.
4. Summary of Fourier Series and Fourier Transform (Video lesson)
The Fourier Series and Fourier Transform are two mathematical tools that allow us to decompose a signal into simpler components that are sinusoidal functions with different frequencies and phases. These components are called harmonics or frequency components of the signal. By decomposing a signal into its frequency components, we can analyze and manipulate the signal in the frequency domain, which is often easier and more convenient than working in the time domain (or in the space domain). The Fourier Series is used to represent a periodic signal, which is a signal that repeats itself over time, by a discrete sum of complex exponentials. The coefficients of the complex exponentials are called Fourier coefficients and they depend on the shape and amplitude of the signal. The Fourier Transform is used to represent a general, non-periodic signal, which is a signal that does not repeat itself over time, by a continuous integral of complex exponentials. The function that gives the amplitude and phase of the complex exponentials at each frequency is called the Fourier transform of the signal, and it constitutes the signal's spectrum. In this lecture and in the following, we will review the definitions and properties of Fourier Series and Fourier Transform, and how they can be computed using different methods and algorithms. We will also discuss some applications and examples of Fourier Series and Fourier Transform in various fields such as engineering, physics, and signal processing.
5. Notation for the Fourier Transform (Video lesson)
The Fourier Transform is a mathematical operation that converts a function of time or space into a function of frequency or wavenumber. It can be denoted by a capital letter F with a subscript indicating the variable of the original function and parentheses indicating the variable of the transformed function. For example, if f(t) is a function of time, then its Fourier transform with respect to time is denoted by F_t(f)(ω), where ω is the angular frequency. Similarly, if g(x) is a function of space, then its Fourier transform with respect to space is denoted by F_x(g)(k), where k is the wavenumber. The inverse Fourier transform, which converts a function of frequency or wavenumber back into a function of time or space, is denoted by F^-1 with the same subscripts and parentheses as the forward transform. For example, if F_t(f)(ω) is the Fourier transform of f(t), then F^-1_t(F_t(f))(t) = f(t). The notation for the Fourier transform can vary depending on the convention and context. Some common variations are:
Using a script capital F (ℱ) instead of a regular capital F to denote the Fourier transform. This notation follows the ISO 80000-2 standard.
Using a tilde (~) over the transformed function instead of parentheses around the original function. For example, f~(ω) instead of F_t(f)(ω). This notation is more compact and avoids confusion with function evaluation.
Using different constants in front of the integrals that define the Fourier transform and inverse Fourier transform. For example, some authors use 1/√(2π) or 1/2π instead of 1 in front of both integrals, or use different constants for each integral to make them symmetric. These choices affect the scaling and normalization of the transformed function.
Using different signs in the exponentials that define the Fourier transform and inverse Fourier transform.
6. Fourier Transform of the derivative of a function (Video lesson)
The Fourier Transform of the derivative of a function is a property that relates the differentiation of a function in the time or space domain to a multiplication of its Fourier transform in the frequency or wavenumber domain. It is useful for solving differential equations and analyzing signals with varying frequencies. The property can be stated as follows: if f(t) is a function of time with Fourier transform F(ω), then the derivative f'(t) has Fourier transform iωF(ω), where ω is the angular frequency. Similarly, if g(x) is a function of space with Fourier transform G(k), then the derivative g'(x) has Fourier transform ikG(k), where k is the wavenumber. The property can be derived using integration by parts or using the inverse Fourier transform. The property can be generalized to higher-order derivatives and partial derivatives. For example, if f(t) has Fourier transform F(ω), then the second derivative f''(t) has Fourier transform (iω)^2F(ω) = -ω^2F(ω), and if f(x,y) has Fourier transform F(kx,ky), then the mixed partial derivative f_xy(x,y) has Fourier transform (ikx)(iky)F(kx,ky) = -kxkyF(kx,ky).
7. The importance of the Fast Fourier Transform (FFT) (Video lesson)
The Fast Fourier Transform (FFT) is an algorithm that computes the Discrete Fourier Transform (DFT) of a sequence of data, or its inverse (IDFT). The DFT is a mathematical operation that converts a signal from its original domain (often time or space) to a representation in the frequency domain and vice versa. The frequency domain reveals important information about the signal, such as its spectrum, energy, harmonics, and periodicity. The DFT can be computed directly from its definition, but this requires a number of arithmetic operations that grows quadratically with the size of the data. The FFT reduces this complexity to "linearithmic" growth, by exploiting the symmetry and periodicity of the complex exponential functions that are used in the DFT. Because it performs far fewer arithmetic operations, the FFT also tends to accumulate less round-off error than the direct evaluation of the DFT. The FFT is widely used in many fields and applications, such as engineering, physics, mathematics, music, signal processing, image processing, data compression, cryptography, and more. The FFT enables fast and efficient analysis and manipulation of signals in various domains and formats. The FFT was popularized by Cooley and Tukey in 1965, but its origins can be traced back to Gauss in 1805. There are many variants and implementations of the FFT algorithm, depending on the size and shape of the data, the desired accuracy and speed, and the available hardware and software resources.
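A quick way to see that the FFT is "just" a fast DFT is to compare it against the direct quadratic computation. This Python/numpy sketch (the course uses Matlab) builds the DFT matrix explicitly and checks that it matches numpy's FFT; the size N = 256 is an arbitrary choice.

```python
import numpy as np

# A naive O(N^2) DFT versus numpy's FFT: same result, very different cost.
def naive_dft(x):
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)  # N x N matrix of complex exponentials
    return W @ x                                   # O(N^2) matrix-vector product

rng = np.random.default_rng(1)
x = rng.standard_normal(256)
assert np.allclose(naive_dft(x), np.fft.fft(x))    # identical transform
print("naive DFT matches FFT on N = 256")
```

The naive version costs N^2 complex multiplications, while the FFT needs on the order of N log N, which is why the FFT is the practical choice for any nontrivial N.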
8. Spectral derivative (Video lesson)
Spectral derivative is a technique that computes the derivative of a function using its spectral representation, such as the Fourier transform (or also the wavelet transform). The idea is to exploit the properties of the basis functions that are used in the spectral representation, such as complex exponentials or wavelets, to obtain simple expressions for the derivative in the frequency or wavenumber domain. For example, if f(t) is a function of time with Fourier transform F(ω), then its derivative f’(t) has Fourier transform iωF(ω), where ω is the angular frequency. Spectral derivative has several advantages over other methods of numerical differentiation, such as finite difference or polynomial interpolation. It can achieve high accuracy and stability, especially for smooth and periodic functions. It can also handle non-uniform grids and irregular domains. It can be combined with fast algorithms for spectral transforms, such as the fast Fourier transform (FFT) or the fast wavelet transform (FWT), to reduce the computational cost. Spectral derivative can be used to solve differential equations, analyze signals, and perform optimization problems. However, it also has some limitations and challenges, such as dealing with discontinuities, boundaries, noise, and aliasing effects.
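The spectral derivative described above amounts to three steps: transform, multiply by ik, inverse transform. Here is a Python/numpy sketch (the lecture uses Matlab); the grid size and the test function sin(3x) are illustrative choices.

```python
import numpy as np

# Spectral derivative of a smooth periodic function on [0, 2*pi):
# differentiate by multiplying the FFT by i*k, then inverse transform.
N = 128
x = np.linspace(0, 2*np.pi, N, endpoint=False)
f = np.sin(3*x)                                 # test function with known derivative

# Angular wavenumbers k = ..., -2, -1, 0, 1, 2, ... for this grid
k = np.fft.fftfreq(N, d=2*np.pi/N) * 2*np.pi
df_spectral = np.fft.ifft(1j * k * np.fft.fft(f)).real

df_exact = 3*np.cos(3*x)
print(np.max(np.abs(df_spectral - df_exact)))   # error near machine precision
```

For a smooth periodic function the error is near machine precision, far better than a finite-difference scheme on the same grid; this "spectral accuracy" is the main advantage mentioned above, while discontinuities and non-periodic boundaries are where the method struggles.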
9. Wavelets and Multiresolution Analysis (Video lesson)
Wavelets and Multiresolution Analysis are two related concepts that deal with the representation and analysis of signals at different scales and resolutions. Wavelets are functions that have a localized and oscillatory behavior in both time and frequency domains. They can be used to decompose a signal into a linear combination of shifted and scaled versions of a basic wavelet, called the mother wavelet. This decomposition is called the wavelet transform and it provides a sparse and adaptive representation of the signal that captures its features at different levels of detail. Multiresolution Analysis is a mathematical framework that explains how wavelets can be constructed and organized into a hierarchical structure of nested subspaces. Each subspace corresponds to a certain resolution or scale of the signal, and contains the information that is not present in the coarser subspaces. The transition from one subspace to another is achieved by applying operators called scaling and wavelet functions, which act as low-pass and high-pass filters respectively. The scaling function generates the approximation coefficients, which represent the coarse features of the signal, while the wavelet function generates the detail coefficients, which represent the fine features of the signal. Wavelets and Multiresolution Analysis have many applications in various fields, such as signal processing, image processing, data compression, denoising, feature extraction, pattern recognition, numerical analysis, and more. They offer several advantages over other methods, such as flexibility, adaptivity, efficiency, and accuracy.
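One level of the low-pass/high-pass split described above can be written by hand for the simplest wavelet, the Haar wavelet. This Python/numpy sketch (the course demonstrates wavelets in Matlab) uses an arbitrary 8-sample signal.

```python
import numpy as np

# One level of a Haar multiresolution split: pairwise averages give the
# approximation (low-pass) coefficients, pairwise differences give the
# detail (high-pass) coefficients. The 1/sqrt(2) factor keeps the
# transform orthonormal, i.e. energy-preserving.
sig = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])

approx = (sig[0::2] + sig[1::2]) / np.sqrt(2)   # coarse features
detail = (sig[0::2] - sig[1::2]) / np.sqrt(2)   # fine features

# Perfect reconstruction from the two coefficient sets
rec = np.empty_like(sig)
rec[0::2] = (approx + detail) / np.sqrt(2)
rec[1::2] = (approx - detail) / np.sqrt(2)

print(np.allclose(rec, sig))                                              # True
print(np.isclose(np.sum(sig**2), np.sum(approx**2) + np.sum(detail**2)))  # True: energy preserved
```

Applying the same split recursively to the approximation coefficients produces the hierarchy of nested subspaces that Multiresolution Analysis formalizes.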
10. Extra: Why the Dirac delta helps derive the Inverse Fourier Transform (Video lesson)
The Dirac delta function is a mathematical object that is not really a function, but a distribution that has some useful properties. One of them is that it can be used to represent a point source or an impulse in time or space. Another one is that it can be used to sample or extract a value from another function. The Dirac delta function can also be related to the Fourier transform, which is an operation that converts a function from its original domain (often time or space) to a representation in the frequency or wavenumber domain. The Fourier transform reveals important information about the function, such as its spectrum, energy, harmonics, and periodicity. The inverse Fourier transform is the operation that converts a function back from the frequency or wavenumber domain to its original domain. The Dirac delta function helps derive the inverse Fourier transform by using its sampling property and its Fourier transform property. The sampling property says that for any function f(x), we have ∫f(x')δ(x'− x)dx' = f(x). This means that the Dirac delta function can pick out the value of f at x by integrating it against f. The Fourier transform property says that, with the convention F[h](k) = ∫h(x)exp(−ikx)dx, we have F[δ(x − x')] = exp(−ikx')F[δ(x)] = exp(−ikx'), where F denotes the Fourier transform. This means that the Fourier transform of a shifted Dirac delta function is just a phase factor times the Fourier transform of an unshifted Dirac delta function (the latter transform is equal to one). Using these two properties, we can derive the inverse Fourier transform as follows: Let g(k) be a function in the frequency or wavenumber domain, and let f(x) be its inverse Fourier transform. Then we have g(k) = F[f(x)] = ∫f(x')exp(−ikx')dx'. The exponential under the integral is exactly the transform of a shifted Dirac delta, so we can rewrite the integral without changing its value: g(k) = ∫f(x')F[δ(x − x')]dx' = F[∫f(x')δ(x − x')dx']; notice that F[δ(x − x')] transforms the variable x.
Then we can use the sampling property to evaluate the inner integral at x' = x: g(k) = F[f(x)]. These steps show that the Fourier transform and its inverse are related thanks to the properties of the Dirac delta, which acts as a building block for any function in the frequency or wavenumber domain.
11. Extra: Mathematical derivation of the Inverse Wavelet Transform (Video lesson)
The Inverse continuous Wavelet Transform is a way to recover a signal from its wavelet representation. The wavelet representation is obtained by breaking down the signal into different levels of detail using a basic shape called the mother wavelet. The Inverse continuous Wavelet Transform can be done using different methods, but one common method is to use a special type of mother wavelet that has a simple and regular shape in both time and frequency domains. This type of mother wavelet makes it easier to relate the signal and its wavelet representation. Among the wavelet properties, one requirement is to make sure that the wavelet representation preserves the energy of the original signal at each level of detail. This is called L1 normalization. These properties make the Inverse continuous Wavelet Transform simpler and faster to compute: it can be done by adding up the wavelet representation at each level of detail and taking the real part of the result. In this lecture I want to show you the mathematical steps that help derive the Inverse Wavelet Transform. Note: this lecture is an extra, and you can skip it if you are not interested in the mathematical details.
12. Extra: Uncertainty principle - mathematical proof (Video lesson)
The Uncertainty Principle in signal processing is a limit to how well we can localize a signal in both time and frequency domains. The time domain is where we observe the signal as a function of time, and the frequency domain is where we observe the signal as a function of frequency. The frequency domain reveals important information about the signal, such as its spectrum, energy, harmonics, and periodicity. The Uncertainty Principle in signal processing states that the product of the widths of the signal in the time and frequency domains is always greater than or equal to a constant. The mathematical proof of this principle can be done using different methods, but one common method is based on using the properties of functions and inner products in signal processing. A function is a mathematical object that maps an input to an output. An inner product is a way of calculating the similarity or overlap between two functions. The signal in the time domain is represented by a function that maps time to amplitude. The signal in the frequency domain is represented by another function that maps frequency to amplitude. This function is obtained by applying the Fourier transform to the signal in the time domain. The Fourier transform is an operation that converts a function from its original domain to a representation in another domain. The width of the signal in the time or frequency domain is related to how spread out or localized the function is in the corresponding domain. The proof of the Uncertainty Principle in signal processing involves showing that there is a lower bound to how spread out or localized the function can be in both domains simultaneously. This implies that there is an inherent trade-off between localizing the signal in the time and frequency domains with high precision. The more precisely we localize one domain, the less precisely we can localize the other domain. This trade-off is quantified by the Uncertainty Principle in signal processing.
13. Curve fitting (Video lesson)
Curve fitting with polynomials is a technique that finds a polynomial function that best approximates a set of data points. A polynomial function is a function that can be written as a sum of powers of a variable, such as y = a0 + a1x + a2x^2 + … + anx^n, where a0, a1, …, an are the coefficients and n is the degree of the polynomial. The degree of the polynomial determines how complex or flexible the function is. A higher degree polynomial can fit more data points, but it may also overfit the data and produce large errors for new data points. A lower degree polynomial may be simpler and more generalizable, but it may also underfit the data and miss some important features. We will first focus on understanding the concepts, and then we use Matlab to illustrate realistic examples. Matlab is a software environment that can perform curve fitting with polynomials using built-in functions such as polyfit and polyval. For instance, polyfit can find the coefficients of a polynomial that fits a set of data points in a least-squares sense, which means that it minimizes the sum of squared errors between the data points and the polynomial function. Polyval can evaluate the polynomial function at any given point or vector of points. The lecture will show how to use these functions to fit polynomials of different degrees to some example data sets. The lecture will also show how to plot the data points and the polynomial functions, and how to compare the quality of fit using intuition. The lecture will demonstrate how curve fitting with polynomials can help understand the behavior and trends of the data, and how to choose an appropriate degree of polynomial for different situations.
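numpy provides polyfit and polyval functions that mirror the Matlab workflow described above. This is a hedged Python sketch, not the lecture's code: the synthetic quadratic data and the chosen degrees are illustrative.

```python
import numpy as np

# Fit polynomials of increasing degree to noisy quadratic data and compare
# the sum of squared errors (SSE). Data and degrees are made up for illustration.
rng = np.random.default_rng(2)
x = np.linspace(-1, 1, 30)
y = 1.0 + 2.0*x - 3.0*x**2 + 0.05*rng.standard_normal(30)  # noisy quadratic

for deg in (1, 2, 8):
    c = np.polyfit(x, y, deg)          # least-squares polynomial coefficients
    yhat = np.polyval(c, x)            # evaluate the fitted polynomial
    sse = np.sum((y - yhat)**2)
    # Degree 2 already captures the trend; degree 8 mostly fits the noise.
    print(deg, round(sse, 4))
```

The SSE always decreases with degree, which is exactly why a small training error alone cannot justify a high-degree fit: the degree-8 polynomial is chasing noise, the overfitting risk mentioned above.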
14. Example of curve fitting - least squares method (Video lesson)
Curve fitting is the process of finding a mathematical function that best approximates a set of data points. The least squares method is a common technique for curve fitting that minimizes the sum of squared errors between the data points and the function. The errors are usually measured along the vertical direction from the data points to the function. The least squares method can be applied to different types of functions, such as linear, polynomial, exponential, logarithmic, and power functions. The lecture will demonstrate how curve fitting with the least squares method can help understand the behavior and trends of the data, and how to choose an appropriate type and degree of function for different situations.
As an example, we will work out how to fit a line to some data: x = [x1 x2 x3 x4 x5...]; y = [y1 y2 y3 y4 y5...];
A line is a linear function of the form y = ax + b, where a is the slope and b is the intercept. To fit a line to the data using the least squares method, we need to find the values of a and b that minimize the sum of squared errors between y and ax + b. We can write this as an optimization problem:
minimize S(a,b) = sum((y - ax - b)^2)
To solve this problem, we can use calculus and set the partial derivatives of S with respect to a and b equal to zero:
dS/da = -2 sum((y - ax - b)x) = 0, dS/db = -2 sum(y - ax - b) = 0
Solving these equations for a and b yields the best-fitting line. We can plot this line along with the data points and calculate the error if necessary.
We will see that the line fits the data very well.
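The closed-form solution obtained from the two equations above can be coded directly. This Python/numpy sketch uses made-up data values (roughly on the line y = 2x + 1); the course itself works in Matlab.

```python
import numpy as np

# Fit y = a*x + b by solving the normal equations dS/da = 0, dS/db = 0
# in closed form. The data values are illustrative.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])   # roughly y = 2x + 1

n = len(x)
a = (n*np.sum(x*y) - np.sum(x)*np.sum(y)) / (n*np.sum(x**2) - np.sum(x)**2)
b = (np.sum(y) - a*np.sum(x)) / n

print(a, b)   # slope near 2, intercept near 1
```

For these values the formulas give a = 1.96 and b = 1.1, close to the underlying slope 2 and intercept 1.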
15. Gradient descent (Video lesson)
Gradient descent is an optimization technique that finds the minimum of a function by iteratively updating a set of parameters in the opposite direction of the gradient of the function. The gradient of a function is a vector that points to the direction of the steepest ascent of the function. By moving in the opposite direction of the gradient, we can reduce the value of the function until we reach a local or global minimum. The size of the update step is determined by a learning rate parameter, which controls how fast or slow we move towards the minimum. A small learning rate may lead to slow convergence, but a large learning rate may cause overshooting or divergence. The lecture will explain the fundamental concepts and mathematical formulas behind gradient descent, and how it can be applied to different types of functions. The lecture will also show how to implement gradient descent in Matlab using built-in functions. The lecture will demonstrate how gradient descent can help solve various optimization problems, which are fundamental to curve fitting, machine learning, etc.
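The update rule described above, w ← w − lr · f'(w), is easy to see on a one-dimensional example. This Python sketch (the lecture uses Matlab) minimizes f(w) = (w − 3)^2; the learning rate and iteration count are illustrative choices.

```python
# Minimal gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.
def grad(w):
    return 2.0 * (w - 3.0)      # derivative of (w - 3)^2

w, lr = 0.0, 0.1                # starting point and learning rate (tuning choices)
for _ in range(100):
    w -= lr * grad(w)           # step in the direction opposite to the gradient

print(w)                        # converges to the minimizer w = 3
```

With lr = 0.1 each step shrinks the distance to the minimum by a factor 0.8, so convergence is fast; a learning rate above 1.0 would make the iterates overshoot and diverge, illustrating the trade-off described above.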
16. Singular Value Decomposition - SVD (Video lesson)
Singular value decomposition (SVD) is a powerful technique in linear algebra that allows us to factorize any matrix into three simpler matrices. It has many applications in science, engineering, and statistics, such as data compression, dimensionality reduction, image processing, and recommendation systems.
The SVD of a matrix M can be written in the form:
M = UΣV*
where U and V are unitary matrices, Σ is a diagonal matrix (meaning it has non-zero entries only on the main diagonal), and V* is the conjugate transpose of V.
The diagonal entries of Σ are called the singular values of M, and they measure how much each pair of singular vectors contributes to M. The columns of U and V are called the left-singular and right-singular vectors of M, respectively, and they form orthonormal bases for the column and row spaces of M.
The SVD has many useful properties, such as:
The rank of M is equal to the number of non-zero singular values.
The pseudoinverse of M can be computed from the SVD by inverting the non-zero singular values and swapping U and V.
The best low-rank approximation of M can be obtained by keeping only the largest singular values and their corresponding singular vectors.
The SVD can be computed using various numerical methods. The SVD is not unique, but it can be chosen so that the singular values are in descending order. In this case, Σ is uniquely determined by M.
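The properties listed above are easy to check numerically. This Python/numpy sketch (the course works in Matlab) uses an arbitrary 2x2 matrix; numpy returns the singular values already in descending order.

```python
import numpy as np

# SVD in numpy: factor M = U @ diag(s) @ Vh, check the rank property,
# and build the best rank-1 approximation. The matrix is illustrative.
M = np.array([[3.0, 0.0],
              [4.0, 5.0]])
U, s, Vh = np.linalg.svd(M)

assert np.allclose(U @ np.diag(s) @ Vh, M)             # factorization reconstructs M
assert np.sum(s > 1e-10) == np.linalg.matrix_rank(M)   # rank = number of non-zero singular values

# Best rank-1 approximation: keep only the largest singular value and its vectors
M1 = s[0] * np.outer(U[:, 0], Vh[0, :])

print(s)   # singular values in descending order
```

Adding back the second singular triple recovers M exactly, which is the low-rank-approximation property stated above taken to full rank.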
17. Approximation of images with the SVD (Video lesson)
To approximate an image with the SVD using eigenfaces, we need to perform the following steps:
If necessary, preprocess the images by cropping, resizing, and centering them to a common size and shape.
Compute the average image by taking the elementwise mean over all images.
Subtract the average image from every image to obtain centered images.
Compute the SVD of the matrix of centered images, where each column is a vectorized image.
Select a number of singular values and corresponding singular vectors that capture most of the variation among the images. These singular vectors are the eigenfaces.
Project each image onto the subspace spanned by the eigenfaces by taking the dot product of the vectorized image and each eigenface.
Reconstruct each image by adding the average image and a linear combination of the eigenfaces weighted by their projection coefficients.
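The steps above can be sketched end to end. This Python/numpy version is purely illustrative: random arrays stand in for real face images, and the image size, dataset size, and number of retained eigenfaces are all made-up choices.

```python
import numpy as np

# Eigenface pipeline on stand-in data: 20 vectorized 8x8 "images", one per column.
rng = np.random.default_rng(3)
images = rng.standard_normal((64, 20))

mean_img = images.mean(axis=1, keepdims=True)             # step: average image
centered = images - mean_img                              # step: centered images
U, s, Vh = np.linalg.svd(centered, full_matrices=False)   # step: SVD of centered matrix

r = 5                                                     # step: keep r eigenfaces
eigenfaces = U[:, :r]
coeffs = eigenfaces.T @ centered                          # step: projection coefficients
recon = mean_img + eigenfaces @ coeffs                    # step: reconstruction

# Keeping every left-singular vector recovers the images exactly
full = mean_img + U @ (U.T @ centered)
print(np.allclose(full, images))   # True
```

With only r = 5 eigenfaces the reconstruction is an approximation; as r grows the error shrinks, and with all components the images are recovered exactly, since the left-singular vectors span the column space of the centered data.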
18. Supervised machine learning - extraction of features with SVD and Wavelets (Video lesson)
Supervised machine learning (SML) is a branch of artificial intelligence that aims to learn from labeled data and make predictions for new data. SML involves two main steps: feature extraction and classification. Feature extraction is the process of transforming the raw data into a lower-dimensional and more informative representation that captures the relevant patterns and characteristics of the data. Classification is the process of assigning a label to a new data point based on its features and a learned model.
Singular value decomposition (SVD) and wavelets are two powerful techniques for feature extraction that can be applied to various types of data, such as images, audio, signals, and text. SVD is a linear algebra method that decomposes a matrix into three simpler matrices that reveal the most important directions and values of the data. Wavelets are mathematical functions that decompose a signal into different frequency components that capture the local and global features of the data.
In this lecture, we will learn how to use SVD and wavelets for feature extraction in SML.
19. Linear regression: least squares method in matrix form (Video lesson)
Linear regression is a statistical method that models the relationship between a dependent variable y and one or more independent variables x. The goal of linear regression is to find the best-fitting line that minimizes the sum of squared errors (SSE) between the observed values of y and the predicted values of y based on x.
Least squares is a technique that solves the linear regression problem by finding the values of the coefficients β that minimize the SSE. The least squares solution can be expressed in matrix form as:
β = (X’X)^-1 X’y
where X is an n-by-p matrix of observations of the predictors, y is a vector of n observations of the dependent variable, and β is a vector of p coefficients. The system X'Xβ = X'y is known as the normal equations, and X'X is sometimes called the normal (or Gram) matrix.
The matrix form of least squares has several advantages, such as:
It provides a compact and elegant way to write the linear regression problem and its solution.
It allows us to use matrix operations and properties to manipulate and simplify the expressions.
It facilitates the computation and interpretation of various quantities related to linear regression, such as fitted values, residuals, sums of squares, variance, covariance, and confidence intervals.
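The matrix form translates directly into code. Here is a minimal sketch in NumPy with synthetic data (the variable names and values are illustrative):

```python
import numpy as np

# Synthetic data: n = 50 observations, an intercept column plus 2 predictors.
rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + 0.01 * rng.normal(size=n)

# Least squares in matrix form: beta = (X'X)^-1 X'y,
# computed by solving the normal equations X'X beta = X'y.
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values and residuals follow directly from the matrix form.
y_hat = X @ beta
residuals = y - y_hat
```

In practice one would typically call a numerically safer routine such as np.linalg.lstsq, which avoids forming X'X explicitly.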
-
20. Linear regression: sensitivity to outliers in the data (Video lesson)
Linear regression is a statistical method that models the relationship between a dependent variable y and one or more independent variables x using a straight line. The line is fitted by minimizing a loss function that measures the discrepancy between the observed values of y and the predicted values of y based on x.
Outliers are observations that deviate significantly from the general pattern of the data. Outliers can have a large impact on the linear regression fit, as they can affect the slope, intercept, and goodness-of-fit of the line. Outliers can also distort the estimates of the standard errors and confidence intervals of the coefficients.
The l1 norm and the l2 norm are two common choices for the loss function in linear regression. The l1 norm is also known as the least absolute deviations (LAD) or mean absolute error (MAE), and it is defined as:
L1 = sum |y - yhat|
where yhat is the predicted value of y based on x. The l2 norm is also known as the least squares (LS) or mean squared error (MSE), and it is defined as:
L2 = sum (y - yhat)^2
The l1 norm and the l2 norm have different properties and advantages for linear regression, such as:
The l1 norm is more robust to outliers than the l2 norm, as it does not penalize large errors as much as the l2 norm does. The l1 norm can reduce the influence of outliers by giving them smaller weights in the fitting process.
The l2 norm is more sensitive to outliers than the l1 norm, as it penalizes large errors more than the l1 norm does. The l2 norm can amplify the influence of outliers by giving them larger weights in the fitting process.
The l1 norm can produce multiple solutions that have the same minimum value, as it has corners at each coordinate axis. The l1 norm can also induce sparsity in the coefficients, meaning that some coefficients can be exactly zero.
The l2 norm has a unique solution that can be found analytically, as it is smooth and convex. The l2 norm can also induce shrinkage in the coefficients, meaning that some coefficients can be close to zero but not exactly zero.
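The simplest way to feel this difference is the constant model: the value c that minimizes the l2 loss sum (y - c)^2 is the mean, while the value that minimizes the l1 loss sum |y - c| is the median. A small NumPy sketch with made-up data shows how a single outlier drags the l2 fit but barely moves the l1 fit:

```python
import numpy as np

# Made-up sample with one gross outlier.
y = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 100.0])

# For a constant prediction yhat = c:
#   the l2 loss sum((y - c)^2) is minimized by the mean,
#   the l1 loss sum(|y - c|)  is minimized by the median.
c_l2 = y.mean()      # 17.5: the outlier drags the l2 fit far away
c_l1 = np.median(y)  # 1.025: the l1 fit barely moves
```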
-
21. Classification/decision trees (Video lesson)
Classification/decision trees are a type of supervised learning algorithm that can be used to classify data into different categories based on a set of rules. A classification/decision tree is a graphical representation of the rules that split the data into smaller and more homogeneous groups.
The Fisher iris data set is a famous data set that consists of 150 observations of three species of iris flowers: setosa, versicolor, and virginica. Each observation has four measurements: sepal length, sepal width, petal length, and petal width. The goal is to classify each observation into one of the three species based on the measurements.
One way to build a classification/decision tree for the iris data set is to use the fitctree function in MATLAB. The function allows you to train a decision tree model using various algorithms and parameters. You can also visualize the decision tree structure and performance metrics using the view and loss functions.
To use the function, you need to load the iris data set and create a table with the measurements as predictor variables and the species as response variable. You can then call the fitctree function with the table as input and specify different options for decision trees, such as algorithm, split criterion, max depth, min leaf size, or pruning method.
The function will return a decision tree model object that contains information about the structure and performance of the model. You can see how the tree splits the data at each node based on a threshold value for one of the measurements. You can also see how many observations belong to each class at each node and leaf. You can evaluate the accuracy of the model by looking at the resubstitution error, cross-validation error, or test error. You can also see how well the model classifies new observations by using the predict function and computing the confusion matrix.
You can also prune the decision tree by removing some nodes or branches that do not improve the accuracy or generalization of the model. Pruning can help reduce overfitting and complexity of the tree. You can prune the tree manually by specifying a pruning level or automatically by using a pruning criterion, such as alpha or error.
You can also export the decision tree model to MATLAB workspace or generate MATLAB code for further analysis or deployment.
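To make the idea of a split concrete outside MATLAB, here is a small Python sketch (with made-up 1-D data, not the actual iris set) of what a single tree node does: try candidate thresholds and keep the one with the lowest size-weighted Gini impurity:

```python
import numpy as np

# Made-up 1-D feature (think "petal length") with two classes.
x = np.array([1.4, 1.3, 1.5, 1.4, 4.7, 4.5, 4.9, 4.6])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions.
    p = np.bincount(labels, minlength=2) / len(labels)
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    # Try midpoints between consecutive sorted values; keep the threshold
    # with the lowest size-weighted impurity of the two children.
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_score, best_thr = np.inf, None
    for i in range(1, len(xs)):
        thr = (xs[i - 1] + xs[i]) / 2
        score = (i * gini(ys[:i]) + (len(ys) - i) * gini(ys[i:])) / len(ys)
        if score < best_score:
            best_score, best_thr = score, thr
    return best_score, best_thr

score, thr = best_split(x, y)   # a perfect split (score 0) between the classes
```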
-
22. Gaussian Mixture Models (Video lesson)
Gaussian Mixture Models (GMM) are a type of unsupervised learning algorithm that can be used for clustering or density estimation. Clustering is the task of grouping data points based on their similarity or proximity. Density estimation is the task of estimating the probability distribution of the data.
GMM assumes that the data points are generated from a mixture of multiple Gaussian distributions with unknown parameters. A Gaussian distribution is a bell-shaped curve that is characterized by two parameters: mean and covariance. A mixture of Gaussian distributions is a weighted sum of multiple Gaussian distributions, where each distribution represents a cluster or component in the data.
The goal of GMM is to estimate the parameters of the mixture model from the data, such as the number of components, the weights, the means, and the covariances. This can be done using various methods, such as maximum likelihood estimation (MLE), expectation-maximization (EM), or Bayesian inference.
GMM has several advantages and disadvantages for clustering or density estimation, such as:
Advantages: It can handle both univariate and multivariate data. It can capture complex and non-linear shapes of clusters or distributions. It can provide soft clustering assignments, meaning that each data point has a probability of belonging to each cluster or component.
Disadvantages: It assumes that the data follows a mixture of Gaussian distributions, which may not be true in some cases. It requires specifying the number of components beforehand, which may not be known or easy to determine. It can be sensitive to initialization and local optima, meaning that it may not find the best solution depending on the starting point.
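To make the EM idea concrete, here is a minimal NumPy sketch for a one-dimensional, two-component GMM on synthetic data (the parameter values and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic 1-D data drawn from two well-separated Gaussians.
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(6.0, 1.0, 300)])

# Rough initial guesses for the weights, means, and variances.
w = np.array([0.5, 0.5])
mu = np.array([1.0, 5.0])
var = np.array([1.0, 1.0])

def gauss_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

for _ in range(50):
    # E-step: responsibility of each component for each data point.
    dens = w * gauss_pdf(x[:, None], mu, var)      # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances.
    nk = resp.sum(axis=0)
    w = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
```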
-
23. Example of Gaussian mixture model (Video lesson)
MATLAB can be used to create, fit, and evaluate GMMs using the gmdistribution class and its related functions.
In this lecture, we will see an example of how to use GMMs in MATLAB for clustering data on cats and dogs. We will use a synthetic dataset that contains observations of two features taken from cats and dogs. Each observation belongs to either the cat or the dog class.
The steps of the example are as follows:
Load and plot the data.
Fit a GMM model to the data using the fitgmdist function. The fitgmdist function returns a gmdistribution object that contains the estimated parameters of the GMM model.
Plot the contour lines of the fitted GMM using the ezcontour function by evaluating the pdf of the GMM.
Evaluate the performance of the GMM clustering.
-
24. Sparsity and compressed sensing: intro to sparsity (Video lesson)
Sparsity is a property of signals or data that means they have only a few non-zero or significant elements. Sparsity can be expressed in different domains, such as the time domain, the frequency domain, or some other transform domain. Sparsity can be used to reduce the complexity and dimensionality of signals or data, as well as to enhance their features or patterns.
Compressed sensing is a branch of signal processing that exploits sparsity to acquire and reconstruct signals or data using fewer measurements than the traditional methods. Compressed sensing relies on two main principles: sparsity and incoherence. Sparsity means that the signal or data can be represented by a few coefficients in some basis or dictionary. Incoherence means that the measurement matrix and the sparsity basis or dictionary are as uncorrelated as possible.
The goal of compressed sensing is to solve an underdetermined system of linear equations of the form y = Ax, where y is the measurement vector, A is the measurement matrix, and x is the sparse signal or data vector. The solution to this problem is not unique, unless some additional information or constraint is imposed. Compressed sensing uses sparsity as a constraint and seeks the sparsest solution that satisfies the measurements. This can be formulated as an optimization problem that minimizes the l1 norm of x subject to y = Ax.
The lecture will focus on the topic of sparsity.
-
25. Sparsity and compressed sensing: why "natural" signals are compressible (Video lesson)
Signals in nature are compressible because they often have some structure or regularity that can be exploited to reduce their complexity and dimensionality. Compressibility means that a signal can be represented by a few coefficients or parameters in some basis or "dictionary" without losing much information or quality.
One way to measure the compressibility of a signal is to use the concept of sparsity. Sparsity means that a signal has only a few non-zero or significant coefficients in some basis or dictionary. For example, a sinusoidal signal is sparse in the frequency domain, because it has only one non-zero coefficient in the Fourier basis. A piecewise constant signal is sparse in the wavelet domain, because it has only a few non-zero coefficients in the wavelet basis.
Another way to measure the compressibility of a signal is to use the concept of low-rankness. Low-rankness means that a signal can be approximated by a low-rank matrix or tensor, which has fewer degrees of freedom than the original signal. For example, an image can be approximated by a low-rank matrix using singular value decomposition (SVD), which decomposes the image into a product of two smaller matrices and a diagonal matrix. As another example, a video can be approximated by a low-rank tensor using higher-order SVD (HOSVD), which decomposes the video into a product of smaller tensors and a core tensor.
The lecture will cover the topic of compressibility and sparsity with some intuitive examples.
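The low-rankness idea can be demonstrated in a few lines of NumPy: build a matrix with strong low-rank structure (a stand-in for a natural image), truncate its SVD, and check how little is lost. This is an illustrative sketch, not the lecture's example:

```python
import numpy as np

rng = np.random.default_rng(3)
# A matrix that is essentially rank 3 plus a little noise:
# a stand-in for a "natural" image with strong low-rank structure.
A = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 80)) \
    + 0.01 * rng.normal(size=(100, 80))

U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 3
A_k = (U[:, :k] * S[:k]) @ Vt[:k]   # best rank-k approximation (Eckart-Young)

# The first 3 singular values carry almost all of the energy.
rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
```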
-
26. Sparsity and compressed sensing: intro to compressed sensing (Video lesson)
Compressed sensing is a branch of signal processing that exploits the property of sparsity to acquire and reconstruct signals using fewer measurements than the traditional methods. Sparsity means that a signal can be represented by a few coefficients in some basis or dictionary without losing much information or quality.
The lecture will cover the following topics:
The definition and examples of sparsity in different domains and applications
The theory and intuition behind compressed sensing and its main principles
The formulation and solution of the compressed sensing optimization problem using convex relaxation
The advantages and challenges of compressed sensing for signal acquisition and reconstruction
The lecture will introduce the basic concepts and notation of compressed sensing, such as:
The underdetermined system of linear equations y = Cx, where y is the measurement vector, C is the measurement matrix, and x is the sparse signal vector
The sparsity basis or dictionary B, where x = Bs and s is the sparse coefficient vector; B is a matrix that represents a transform (for example, an inverse DCT)
The l1 norm minimization problem min ||s||_1 subject to y = CBs, which seeks the sparsest solution that satisfies the measurements
-
27. Example of compressed sensing (Video lesson)
The lecture will present an example of compressed sensing for a two-tone signal, which is a signal that consists of two sinusoidal components with different frequencies. The example will illustrate the following steps:
Generate a two-tone signal with known frequencies and amplitudes
Sample the signal at a sub-Nyquist rate using a random measurement matrix
Reconstruct the signal using l1 norm minimization with a sparsity basis related to the Discrete Cosine Transform (DCT)
Compare the reconstruction error between the initial signal and the one reconstructed using compressed sensing
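These steps can be sketched in Python with SciPy, posing the l1 minimization as a linear program (writing s = u - v with u, v >= 0). The sizes and the sampling pattern here are illustrative, not the ones used in the lecture; with enough incoherent measurements this kind of program typically recovers the sparse coefficients exactly:

```python
import numpy as np
from scipy.fft import idct
from scipy.optimize import linprog

rng = np.random.default_rng(4)
n, m = 128, 40

# A signal that is exactly 2-sparse in the DCT domain (a "two-tone" signal).
s_true = np.zeros(n)
s_true[7], s_true[30] = 1.0, 0.6
x = idct(s_true, norm='ortho')            # time-domain signal, x = B s

# Sub-Nyquist sampling: keep m randomly chosen samples out of n.
idx = rng.choice(n, size=m, replace=False)
y = x[idx]

# Theta = C B: B is the inverse-DCT matrix, C selects the measured rows.
B = idct(np.eye(n), norm='ortho', axis=0)
Theta = B[idx, :]

# Basis pursuit: min ||s||_1 subject to Theta s = y,
# rewritten as a linear program with s = u - v, u, v >= 0.
res = linprog(c=np.ones(2 * n), A_eq=np.hstack([Theta, -Theta]), b_eq=y,
              bounds=(0, None))
s_hat = res.x[:n] - res.x[n:]
x_hat = idct(s_hat, norm='ortho')         # reconstructed signal
```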
-
28. Definition of the Discrete Cosine Transform (DCT) and its inverse (Video lesson)
The discrete cosine transform (DCT) is a type of Fourier-related transform that converts a signal or data from the spatial or temporal domain to the frequency domain. The DCT expresses a signal or data as a sum of cosine functions with different frequencies and amplitudes. The DCT has several advantages for signal processing and data compression, such as:
It can compact most of the information or energy of a signal or data into a few coefficients, which correspond to the low-frequency components. This property is known as energy compaction or decorrelation.
It can reduce the blocking artifacts or ringing effects that may occur when using other transforms, such as the discrete Fourier transform (DFT) or the discrete wavelet transform (DWT).
It can be computed efficiently using fast algorithms, such as the fast Fourier transform (FFT) or the fast DCT algorithm.
The inverse discrete cosine transform (IDCT) is the inverse operation of the DCT, which converts a signal or data from the frequency domain back to the spatial or temporal domain. The IDCT reconstructs a signal or data from its DCT coefficients. The IDCT preserves the information or energy of the original signal or data, except for some possible rounding errors due to finite precision arithmetic.
There are several types or variants of DCT and IDCT, which differ in their definitions, properties, and applications. The lecture will cover the derivation of the most common type of DCT and IDCT.
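A quick way to see both the perfect invertibility and the energy compaction is with SciPy's dct and idct (this sketch uses an arbitrary smooth test signal, not the lecture's example):

```python
import numpy as np
from scipy.fft import dct, idct

# An arbitrary smooth test signal.
n = 64
t = np.linspace(0, 1, n)
x = np.exp(-t) + 0.5 * t ** 2

# DCT-II (the most common variant) and its inverse.
X = dct(x, type=2, norm='ortho')
x_back = idct(X, type=2, norm='ortho')    # perfect reconstruction

# Energy compaction: keep only the first 8 (low-frequency) coefficients.
Xc = X.copy()
Xc[8:] = 0.0
x_approx = idct(Xc, type=2, norm='ortho')
rel_err = np.linalg.norm(x - x_approx) / np.linalg.norm(x)   # tiny
```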
-
29. Extra: formula which is crucial to finding the Inverse Discrete Cosine Transform (Video lesson)
The inverse discrete cosine transform (IDCT) is a type of Fourier-related transform that converts a signal or data from the frequency domain back to the spatial or temporal domain. The IDCT reconstructs a signal or data from its discrete cosine transform (DCT) coefficients. The DCT expresses a signal or data as a sum of cosine functions with different frequencies and amplitudes.
The lecture will cover the formula that is crucial to deriving the IDCT from the DCT, which we used in the previous lecture.
The formula is based on the discrete sum of a product of two cosines.
This formula helps derive the IDCT from the DCT, and it arises by multiplying both sides of the DCT equation by a cosine function and summing over a certain integer. This results in a summation of products of two cosines, which can be evaluated in closed form using the orthogonality of the cosines. The formula can also be used to show that the DCT and IDCT are orthogonal transforms, meaning that they preserve the inner product (and hence the energy) of the signal or data.
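For reference, a common form of this identity (written here for the DCT-II convention; the indexing used in the lecture may differ slightly) is:

```latex
\sum_{n=0}^{N-1} \cos\!\left(\frac{\pi k (2n+1)}{2N}\right)\cos\!\left(\frac{\pi l (2n+1)}{2N}\right)
= \begin{cases}
0 & k \neq l,\\
N/2 & k = l \neq 0,\\
N & k = l = 0,
\end{cases}
\qquad k, l \in \{0, 1, \dots, N-1\}.
```

The right-hand side being zero whenever k ≠ l is exactly the orthogonality property mentioned above.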
-
30. Introduction to the section on mathematical models (Video lesson)
This video is an introduction to the section on Dynamical models. The most important mathematical models that we will see are the so-called Lotka-Volterra model, which is also known as prey-predator model, the model of epidemics, and the model of population growth. We will analyze and solve these models using free and open-source software called Scilab (quite similar to Matlab). In particular, we will use a tool that is contained in Scilab called Xcos, which will help us construct the mathematical models.
The models presented in this section are very important in applied mathematics because they can explain a variety of phenomena. The Lotka-Volterra model derives its name from the mathematicians who first employed it to explain some real-life phenomena: Lotka used this model to explain the interaction between two molecules, so he was interested in chemical reactions, whereas Volterra was an Italian mathematician who used this model to explain why the number of sharks in the Adriatic Sea had increased substantially during the first world war with respect to the pre-war and the post-war periods.
The discovery about the greater percentage of sharks was made by Volterra's son-in-law, whose name was Umberto D’Ancona. D'Ancona was an Italian biologist who made this observation from the data he had collected, and asked Volterra to analyze this problem mathematically, knowing that Volterra was a respected mathematician. Volterra took the challenge and decided to create a mathematical model, which is now known as the Lotka-Volterra model, or prey-predator model. This model focuses on the interaction between two populations: a population of prey versus a population of predators. In this case, the population of predators is represented by sharks, whereas the population of prey is represented by prey-fish.
Volterra understood that the reason why the number of sharks increased dramatically during the first world war was due to the less intense activity of fishing, which had interfered with the interaction between sharks and prey-fish.
We are going to see this more thoroughly in this section; besides, the prey-predator model can be used to explain other interesting phenomena that I will mention at some point. After that, we are going to study epidemics and we will use the same concepts previously introduced with Scilab.
The mathematics that we need in this section is not difficult; you just need to know what derivatives and functions are, but we are not going to go into the mathematical details of how to solve a model. In fact, I want to focus more on the practical applications.
-
31. Pure prey-predator model (Video lesson)
The pure prey-predator model is a mathematical model that describes the dynamics of two interacting species, one as a prey and the other as a predator. The model assumes that the prey population grows exponentially in the absence of predators, and that the predator population decays exponentially in the absence of prey. The model also assumes that the interaction between prey and predator is proportional to their product, meaning that the rate of predation depends on how often they encounter each other.
The lecture will cover the formulation of the pure prey-predator model using a system of two first-order nonlinear differential equations.
The pure prey-predator model is also known as the Lotka-Volterra model, after Alfred Lotka and Vito Volterra, who independently proposed it in the 1920s. The model is one of the simplest and most influential models in mathematical ecology and population dynamics. The model can be used to study the effects of predation on population regulation, oscillations, coexistence, and biodiversity.
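Although the course builds these models in Scilab/Xcos, the dynamics are easy to preview in Python (the parameter values here are arbitrary, chosen only to show the oscillations):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Pure Lotka-Volterra model (illustrative parameter values):
#   dx/dt =  a*x - b*x*y   (prey: exponential growth minus predation losses)
#   dy/dt = -c*y + d*x*y   (predator: exponential decay plus predation gains)
a, b, c, d = 1.0, 0.5, 1.0, 0.2

def lotka_volterra(t, z):
    x, y = z
    return [a * x - b * x * y, -c * y + d * x * y]

sol = solve_ivp(lotka_volterra, (0, 30), [10.0, 1.0], rtol=1e-9, atol=1e-9)
x, y = sol.y          # both populations oscillate, out of phase
```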
-
32. Equilibrium points and their stability (Video lesson)
In this lecture, we will explore the dynamics of a system that involves two interacting species: prey and predator. We will use a mathematical model that describes how the populations of both species change over time, depending on factors such as their intrinsic growth rates, their carrying capacities, and their functional responses to each other. We will focus on the case of a pure prey-predator model, where the predator depends entirely on the prey for its food source. (Note: in the pure model, the carrying capacities are infinite, so they do not appear in the model)
We will first introduce the basic concepts of equilibrium points and stability analysis, and explain how they can help us understand the behavior of the system. We will then identify the possible equilibrium points, which represent the steady states of the system. We will also classify them as trivial (when one or both species are extinct) or non-trivial (when both species coexist).
In this lecture we will not do it, but it is possible to perform a local stability analysis of each equilibrium point, by using linearization techniques and eigenvalue analysis. In a future lecture, we will determine mathematically the conditions under which each equilibrium point is stable or unstable, and we will also see Scilab simulations which will help clarify the idea and develop the intuition. We will also discuss how to interpret the stability results in terms of ecological implications.
We will show that, under certain additional assumptions (for example, when density-dependent logistic terms are added to the model), the non-trivial equilibrium point is globally asymptotically stable, meaning that any initial condition with both populations present will eventually converge to it; in the pure model, by contrast, the populations simply cycle around this point. We will also illustrate these stability results with numerical simulations.
By the end of this lecture, you should be able to:
Explain what equilibrium points and stability analysis are and why they are useful for studying prey-predator systems.
Derive and identify the equilibrium points of a pure prey-predator model.
-
33. Equilibrium points in the prey-predator model (Video lesson)
In this lecture, we will learn how to calculate the equilibrium points of a pure prey-predator model. We will show how to find the equilibrium points by setting the derivatives equal to zero and solving for the state variables. We will derive and identify the possible equilibrium points: one trivial point where both species are extinct, and one non-trivial point where both species coexist.
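With a common choice of symbols (a: prey growth rate, b: predation rate, c: predator death rate, d: conversion rate; the lecture may use different letters), the calculation looks like this:

```latex
\begin{aligned}
\dot{x} &= a x - b x y = x\,(a - b y) = 0,\\
\dot{y} &= -c y + d x y = y\,(-c + d x) = 0
\end{aligned}
\qquad\Longrightarrow\qquad
(x^{*}, y^{*}) = (0, 0)
\quad\text{or}\quad
(x^{*}, y^{*}) = \left(\frac{c}{d}, \frac{a}{b}\right).
```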
-
34. Introduction to Scilab (Video lesson)
In this lecture, we will introduce Scilab and xcos, two open-source software tools for scientific computing and modeling. We will learn how to install, run and use them for various purposes.
Scilab is a high-level programming language that allows us to perform numerical calculations, data analysis, signal processing, optimization, control engineering, and more. Scilab has a syntax similar to MATLAB and can execute scripts or commands interactively. Scilab also has a rich set of built-in functions and libraries for different domains.
Xcos is a graphical editor and simulator for dynamic systems that are modeled by block diagrams. Xcos is a toolbox of Scilab that can be accessed from the Scilab console or menu. Xcos provides a user-friendly interface to design, load, save, compile and simulate models of continuous, discrete, hybrid or symbolic systems. Xcos also has a palette browser that contains various blocks grouped by categories.
In the subsequent lectures, we will show how to create and edit a simple xcos model using the graphical editor. We will explain how to drag and drop blocks from the palette browser, how to connect them with wires, how to set their parameters and labels, and how to organize them in sub-systems or super-blocks. We will also show how to use scopes, meters and other blocks for data visualization.
We will explain how to set the simulation options, such as the solver type, the integration method, the time range, etc. We will also show how to start, pause, resume or stop the simulation, and how to view the simulation results in graphs or tables.
By the end of this lecture, you should be able to:
Install and run Scilab and xcos on your computer.
Write and execute simple Scilab commands and scripts.
-
35. Constructing the model with Scilab part 1 (Video lesson)
This is the first part of a series of two lectures where we will learn how to construct and simulate a pure prey-predator model using Scilab and xcos.
We will show how to implement the model in xcos using blocks from different palettes, such as sources, sinks, mathematical operations, etc. We will also explain how to set the initial conditions and the values of the parameters using the block properties dialog.
Next, we will show how to compile and simulate the model using the xcos interface. We will explain how to choose the simulation options, such as the solver type, the integration method, the time range, etc. We will also show how to start, pause, resume or stop the simulation, and how to view the simulation results in graphs or tables.
Finally, we will analyze and interpret the simulation results. We will plot the population number of both species versus time and observe their oscillatory behavior. We will also plot the phase portrait of the system and identify its equilibrium points. We will discuss how the parameters affect the dynamics and stability of the system.
By the end of this series of two lectures, you should be able to:
Recognize and write down the general form of a pure prey-predator model.
Implement and edit a pure prey-predator model in xcos using blocks from different palettes.
Compile and simulate a pure prey-predator model in xcos using the interface options.
View and analyze the simulation results using graphs, tables or Scilab commands.
Interpret the simulation results in terms of population dynamics and equilibrium points.
-
36. Constructing the model with Scilab part 2 (Video lesson)
This is the second part of a series of two lectures where we will learn how to construct and simulate a pure prey-predator model using Scilab and xcos.
We will show how to implement the model in xcos using blocks from different palettes, such as sources, sinks, mathematical operations, etc. We will also explain how to set the initial conditions and the values of the parameters using the block properties dialog.
Next, we will show how to compile and simulate the model using the xcos interface. We will explain how to choose the simulation options, such as the solver type, the integration method, the time range, etc. We will also show how to start, pause, resume or stop the simulation, and how to view the simulation results in graphs or tables.
Finally, we will analyze and interpret the simulation results. We will plot the population number of both species versus time and observe their oscillatory behavior. We will also plot the phase portrait of the system and identify its equilibrium points. We will discuss how the parameters affect the dynamics and stability of the system.
By the end of this series of two lectures, you should be able to:
Recognize and write down the general form of a pure prey-predator model.
Implement and edit a pure prey-predator model in xcos using blocks from different palettes.
Compile and simulate a pure prey-predator model in xcos using the interface options.
View and analyze the simulation results using graphs, tables or Scilab commands.
Interpret the simulation results in terms of population dynamics and equilibrium points.
Note: I have attached the pure prey-predator model built during the lecture.
-
37. How parameters affect the output of the model (Video lesson)
In this lecture, we will learn how to explore the effects of changing the parameters of a pure prey-predator model using a numerical computing tool.
The parameters of the (pure) prey-predator model include the intrinsic growth rate of the prey, the intrinsic death rate and conversion efficiency of the predator, and the predation rate coefficient.
Next, we will show how to solve and simulate the model using a numerical computing tool.
Finally, we will analyze and interpret the simulation results using different scenarios. We will vary one or more parameters of the model and observe how they affect the population dynamics and stability of both species. We will discuss how different parameter values can lead to different outcomes, such as extinction, coexistence, oscillation, and so on.
-
38. Influence of fishing on the model (Video lesson)
In this lecture, we will learn how to incorporate the effect of fishing on a pure prey-predator model using a graphical modeling and simulation tool. A pure prey-predator model is a system of two nonlinear differential equations that describe the interactions between two species: one as a prey and the other as a predator. The model assumes that the predator depends entirely on the prey for its food source and that, in the absence of interaction, the prey grows exponentially while the predator declines exponentially.
We will first review the general form and the meaning of the parameters of the pure prey-predator model, also known as the Lotka-Volterra model. We will then show how to modify the model to include fishing terms that represent the harvesting rate of both species by humans.
We will discuss how fishing can lead to extinction or coexistence of both species depending on their growth rates and harvesting rates.
By the end of this lecture, you should be able to:
Recognize and write down the general form of a pure prey-predator model.
Modify and edit a pure prey-predator model in a graphical tool to include fishing terms.
Interpret the equilibrium points in terms of population dynamics under fishing pressure.
-
39. Addition of logistic terms to the model (Video lesson)
In this lecture, we will learn how to improve the realism of a pure prey-predator model by adding logistic terms that account for the carrying capacities of both species. A pure prey-predator model is a system of two nonlinear differential equations that describe the interactions between two species: one as a prey and the other as a predator. The model assumes that the predator depends entirely on the prey for its food source, and that both species have exponential growth rates.
We will first review the general form and the meaning of the parameters of the pure prey-predator model, also known as the Lotka-Volterra model. The parameters include the intrinsic growth rate and death rate of both species, and the predation rate coefficient.
Next, we will show how to modify the model to include logistic terms that represent the density-dependent feedbacks of both species on their own growth rates. We will also explain how to set the values of the carrying capacities of both species using the graphical interface. Then, we will show how to solve and simulate the modified model using a numerical computing tool (xcos).
Finally, we will analyze and interpret the simulation results using different scenarios. We will vary one or more parameters of the model and observe how they affect the population dynamics and stability of both species.
By the end of this lecture, you should be able to:
Recognize and write down the general form of a pure prey-predator model.
Modify and edit a pure prey-predator model to include logistic terms.
Solve and simulate a pure prey-predator model with logistic terms using a numerical computing tool.
Explore and compare different scenarios by changing the parameters of the model with logistic terms.
Interpret the simulation results in terms of population dynamics and stability under different parameter values with logistic terms.
Note: I have attached the prey-predator model with logistic terms built during the lecture.
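As a rough sketch of what such a modification looks like (the symbols and the exact form of the damping terms are an assumption; the model built in the lecture may differ), the prey equation gains a factor involving its carrying capacity K, and an analogous density-dependent term can be added for the predator:

```latex
\begin{aligned}
\dot{x} &= a x \left(1 - \frac{x}{K}\right) - b x y,\\
\dot{y} &= -c y + d x y - e\,y^{2},
\end{aligned}
```

where K is the prey carrying capacity and the e y² term damps predator growth at high predator densities.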
-
40. Model on the evolution of epidemics (Video lesson)
In this lecture, we will learn how to model and simulate the evolution of epidemics using xcos, a graphical editor and simulator for dynamic systems. We will use a mathematical model that describes how the number of susceptible, exposed, infectious and recovered individuals change over time, depending on factors such as the infection rate, the recovery rate, the mortality rate and the immunization rate.
We will first introduce the basic concepts of epidemic modeling and explain how they can help us understand the spread and control of infectious diseases. We will review some common models, such as the SIR model, and their assumptions.
Next, we will show how to implement and edit an epidemic model in xcos using blocks from different palettes, such as sources, sinks, mathematical operations, etc. We will explain how to set the initial conditions and the values of the parameters using the block properties dialog. Then, we will show how to compile and simulate an epidemic model using the xcos interface. We will also show how to start, pause, resume or stop the simulation, and how to view the simulation results in graphs or tables.
Finally, we will analyze and interpret the simulation results using different scenarios. We will vary one or more parameters of the model and observe how they affect the epidemic dynamics and outcomes. We will discuss how different parameter values can lead to different scenarios, such as endemicity, eradication, oscillation, and so on.
By the end of this lecture, you should be able to:
Recognize and write down the general form of an epidemic model.
Implement and edit an epidemic model in xcos using blocks from different palettes.
Compile and simulate an epidemic model in xcos using the interface options.
View and analyze the simulation results using graphs, tables or commands.
Explore and compare different scenarios by changing the parameters of the model.
Interpret the simulation results in terms of epidemic spread and control under different parameter values.
Note: I have attached the epidemics model built during the lecture.
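The compartmental structure described above can also be sketched outside xcos. The following Python snippet (an illustration with made-up parameter values, not the xcos diagram built in the lecture) integrates a SEIR model with SciPy:

```python
import numpy as np
from scipy.integrate import solve_ivp

def seir(t, y, beta, sigma, gamma):
    """SEIR right-hand side: infection, incubation, and recovery flows."""
    S, E, I, R = y
    N = S + E + I + R
    dS = -beta * S * I / N              # new infections leave S
    dE = beta * S * I / N - sigma * E   # exposed become infectious at rate sigma
    dI = sigma * E - gamma * I          # infectious recover at rate gamma
    dR = gamma * I
    return [dS, dE, dI, dR]

# Illustrative parameters: beta = infection rate, sigma = 1/incubation time,
# gamma = recovery rate (all values chosen for the example only)
beta, sigma, gamma = 0.5, 1/5, 1/10
y0 = [990, 0, 10, 0]                    # 990 susceptible, 10 infectious
sol = solve_ivp(seir, (0, 200), y0, args=(beta, sigma, gamma))

peak_infectious = sol.y[2].max()
print(f"Peak infectious: {peak_infectious:.0f}")
```

Varying beta or gamma and re-running is the scripted analogue of the scenario exploration done in the lecture with the xcos block parameters.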
-
41. Mathematical analysis of stability (Video lesson)
In this lecture, we will learn how to use mathematical tools and methods to analyze the stability of equilibrium points of dynamical systems. An equilibrium point of a dynamical system is a state where the system does not change over time. The stability of an equilibrium point refers to the behavior of the system when it is slightly perturbed from that state.
We will first introduce the basic concepts related to stability. Dynamical systems and their equilibrium points come in different types: a system may be linear or nonlinear, for example, and this determines which analysis tools apply.
We will show how to perform a local stability analysis of an equilibrium point using linearization techniques and eigenvalue analysis. We will explain how to find the Jacobian matrix of a nonlinear system at an equilibrium point and how to compute its eigenvalues and eigenvectors. We will also discuss how to classify an equilibrium point as stable or unstable based on the signs and magnitudes of the eigenvalues.
We will also discuss some limitations and challenges of this method.
By the end of this lecture, you should be able to:
Recognize and write down the general form of a dynamical system and its equilibrium points.
Perform a local stability analysis of an equilibrium point using linearization and eigenvalue analysis.
Apply the stability analysis methods to some examples of dynamical systems from different fields.
Know the limitations of the methods explained.
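As a quick illustration of the linearization workflow (a sketch, not the lecture material, using a hypothetical prey-predator system with a logistic term and example coefficients), the following Python/SymPy code computes the Jacobian at the coexistence equilibrium and inspects its eigenvalues:

```python
import sympy as sp

x, y = sp.symbols('x y')
# Illustrative prey-predator system with a logistic term (coefficients made up)
f = sp.Matrix([x*(2 - x) - x*y,   # prey: logistic growth minus predation
               -y + x*y])          # predator: natural decay plus growth from predation

J = f.jacobian([x, y])             # symbolic Jacobian matrix
eq = (1, 1)                        # coexistence equilibrium of this particular system
J_eq = J.subs({x: eq[0], y: eq[1]})
eigs = list(J_eq.eigenvals().keys())
print(J_eq)                        # Matrix([[-1, -1], [1, 0]])
print(eigs)                        # complex pair with real part -1/2 -> stable spiral
```

Both eigenvalues have negative real part, so this equilibrium is locally asymptotically stable; the nonzero imaginary parts indicate that perturbed trajectories spiral back toward it.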
-
42. Simulation and mathematics of the logistic model with one population (Video lesson)
In this lecture, we will learn how to use the logistic model to describe the growth of a population that is limited by environmental factors. We will also learn how to simulate and analyze the model using mathematical tools and software.
The logistic model is a differential equation that relates the rate of change of the population size to the current population size and the carrying capacity of the environment. The carrying capacity is the maximum population size that can be sustained by the available resources. The logistic model assumes that every individual in the population has equal access to resources and an equal chance of survival.
We will first introduce the general form of the logistic model and the meaning of its parameters, such as the intrinsic growth rate and the carrying capacity. Next, we will show how to solve and simulate the logistic model using a numerical computing tool (Scilab/xcos). Finally, we will analyze and interpret the simulation results.
By the end of this lecture, you should be able to:
Recognize and write down the general form of a logistic model.
Find the mathematical solution to the logistic model.
Solve and simulate a logistic model using a numerical computing tool.
Explore and compare different scenarios by changing the parameters of the model.
Interpret the simulation results in terms of population growth and regulation under different parameter values.
Note: I have attached the logistic model built during the lecture.
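The logistic equation dP/dt = r P (1 - P/K) also has a closed-form solution, which makes it a good sanity check for numerical solvers. Here is a short Python sketch (an illustration alongside the lecture's Scilab/xcos material, with made-up parameter values) comparing the two:

```python
import numpy as np
from scipy.integrate import solve_ivp

r, K, P0 = 0.8, 100.0, 5.0        # illustrative growth rate, carrying capacity, initial size

def logistic(t, P):
    return r * P * (1 - P / K)

t_eval = np.linspace(0, 15, 200)
sol = solve_ivp(logistic, (0, 15), [P0], t_eval=t_eval, rtol=1e-8)

# Closed-form solution P(t) = K / (1 + ((K - P0)/P0) * exp(-r t)) for comparison
P_exact = K / (1 + (K - P0) / P0 * np.exp(-r * t_eval))

print(f"max numeric-vs-exact error: {np.abs(sol.y[0] - P_exact).max():.2e}")
```

The numerical trajectory follows the characteristic S-shaped curve and saturates at the carrying capacity K, matching the analytic solution to within the solver tolerance.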
-
43. Dynamical systems and chaos: Lorenz system (Video lesson)
In this lecture, we will learn about the Lorenz system, a famous example of a dynamical system that exhibits chaotic behavior. The Lorenz system is a system of three ordinary differential equations that was originally derived by Edward Lorenz as a simplified model for atmospheric convection.
We will first introduce the general form and the meaning of the parameters of the Lorenz system. We will also review some physical interpretations and applications of the Lorenz system, such as weather prediction, lasers, dynamos, etc.
Next, we will show how to solve and simulate the Lorenz system using a numerical computing tool (Matlab). We will explain how to choose an appropriate solver method, such as ode45, and how to set the initial conditions and the time range. We will also show how to plot and save the simulation results in graphs or tables.
Finally, we will analyze and interpret the simulation results, by exploring some properties and phenomena of chaotic systems, such as sensitivity to initial conditions.
By the end of this lecture, you should be able to:
Recognize and write down the general form of a Lorenz system.
Solve and simulate a Lorenz system using a numerical computing tool (Matlab).
Plot and save the simulation results using graphs, tables or commands.
Explore and compare different scenarios by changing the parameters of the system.
Interpret the simulation results in terms of dynamical systems.
-
44. Machine learning to find dynamical models behind data: the SINDy algorithm (Video lesson)
In this lecture, we will learn how to use machine learning techniques to discover dynamical systems models from data. We will focus on the sparse identification of nonlinear dynamics (SINDy) algorithm, a powerful and versatile method that can identify both explicit and implicit models from noisy and sparse data.
We will first introduce the basic concepts and principles of SINDy, such as sparsity, regularization, library construction, optimization, etc. We will also review one example of a dynamical system that SINDy can discover: the Lorenz system.
Next, we will show how to implement and apply SINDy to different types of data using a numerical computing tool. We will explain how to preprocess and differentiate the data, how to choose an appropriate library of candidate functions, how to solve the sparse regression problem.
By the end of this lecture, you should be able to:
Recognize and write down the general form of a SINDy model.
Implement and apply SINDy to different types of data using a numerical computing tool.
Interpret the results in terms of dynamical systems theory and machine learning methods under different parameter values.
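The core of SINDy is a sparse regression over a library of candidate functions. As a minimal toy sketch (not the lecture's implementation), the following Python code recovers the logistic equation dx/dt = x - x^2 from trajectory data using sequentially thresholded least squares:

```python
import numpy as np

# Generate data from dx/dt = x - x^2 (logistic with r = K = 1); for clarity we use
# the exact derivative here, whereas real data would need numerical differentiation
t = np.linspace(0, 8, 400)
x0 = 0.1
x = 1 / (1 + (1 / x0 - 1) * np.exp(-t))     # closed-form logistic trajectory
dx = x - x**2                                # true derivative values

# Library of candidate functions: [1, x, x^2, x^3]
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Sequentially thresholded least squares: the core loop of SINDy
xi = np.linalg.lstsq(Theta, dx, rcond=None)[0]
for _ in range(10):
    small = np.abs(xi) < 0.05                # threshold promotes sparsity
    xi[small] = 0
    big = ~small
    xi[big] = np.linalg.lstsq(Theta[:, big], dx, rcond=None)[0]

print(np.round(xi, 3))                       # expected: coefficients near [0, 1, -1, 0]
```

The nonzero coefficients land on the x and x^2 terms with values 1 and -1, i.e. the algorithm recovers exactly the governing equation that generated the data.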
