2022 Mathematics & Statistics / Mathematical Fundamentals for Data Science

01. Course Overview

gool 2022. 5. 25. 18:37
Data Science ← Machine Learning, Artificial Intelligence, Big Data

 

 

 

1-1 Brief Overview of Machine Learning

 

 

 

What is Machine Learning?

 

A subfield of Artificial Intelligence that gives computers the ability to learn without being explicitly programmed.

 

"Learning"

▶ Any process by which a system improves its performance from experience

 

 

Applications of Machine Learning

 

1) Systems that can automatically adapt and customize themselves to individual users (personalization)

2) Systems that are too difficult or expensive to construct manually

3) Systems that discover new knowledge from large datasets

4) Systems that mimic humans and take over certain tasks autonomously

 

 

Why Machine Learning?

  • Increasing amount of available data
  • Many basic, effective, and efficient algorithms, readily available in tools such as R and Python

 

* The data is abundant, but the knowledge is expensive and scarce.

Therefore, we use machine learning techniques to generate knowledge.

 

 

 

Categories of Machine Learning

 

1) Supervised Learning

 

Given examples of inputs (x1, x2) and desired outputs (O, X), i.e. labeled data,

extract general rules and predict the outputs for future inputs.

ex) classification, regression, time series prediction

 

* We have training data with correct answers with which to train the algorithm

 

Figuring out whether a certain point is O or X
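
The O/X task above can be sketched as a one-nearest-neighbor classifier. This is only an illustration: the training points and the choice of nearest-neighbor as the learning rule are assumptions, not something the course prescribes.

```python
import math

# Hypothetical labeled training data: input (x1, x2) -> desired output "O" or "X".
training_data = [
    ((1.0, 1.0), "O"),
    ((1.5, 2.0), "O"),
    ((2.0, 1.5), "O"),
    ((5.0, 5.0), "X"),
    ((6.0, 5.5), "X"),
    ((5.5, 6.5), "X"),
]

def predict(point):
    """Return the label of the training example closest to `point`."""
    _, nearest_label = min(
        training_data,
        key=lambda example: math.dist(example[0], point),
    )
    return nearest_label

print(predict((1.2, 1.4)))  # a point near the "O" examples
print(predict((5.8, 6.0)))  # a point near the "X" examples
```

The labels in `training_data` play the role of the "correct answers"; a new point simply inherits the label of its closest labeled neighbor.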

 

 

2) Unsupervised Learning

 

Given only inputs(unlabeled data), automatically discover hidden features.

ex) clustering, outlier detection

 

* Identification of natural groups in data

Feed data into the algorithm without any labeled training data,

and hope that it makes some kind of sense of the data.
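
As a rough sketch of finding natural groups without labels, here is a tiny k-means on one-dimensional data. The data values, the two initial centers, and the use of k-means itself are illustrative assumptions.

```python
def k_means_1d(values, centers, iterations=10):
    """Very small k-means for 1-D data (assumes iterations >= 1)."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [
            sum(g) / len(g) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups

# Hypothetical unlabeled inputs with two visible clumps.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = k_means_1d(values, centers=[0.0, 5.0])
print(groups)
print(centers)
```

No labels are supplied anywhere; the two groups emerge only from the similarity (distance) between the values.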

 

 

 

3) Semi-supervised Learning

 

Given both labeled and unlabeled data, leverage the information in each to improve both the supervised and the unsupervised task.

 

 

4) Reinforcement Learning

 

(1) the agent observes the state of the environment

(2) the agent performs tasks/actions based on the observation

(3) the environment returns scalar rewards or punishments to the agent

 

* Learn to select action sequences in a way that maximizes expected reward
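
The observe/act/reward loop can be sketched with tabular Q-learning on a toy corridor. Everything here is an assumption made for illustration: the five-state corridor, the reward of 1 at the goal, the constants, and Q-learning as the learning rule.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
GAMMA = 0.9           # discount factor
ALPHA = 0.5           # learning rate

def step(state, action):
    """Environment: return (next_state, reward) for the chosen action."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)  # fixed seed so the run is repeatable
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _episode in range(500):
    state = 0                                      # (1) observe the initial state
    for _t in range(100):                          # cap the episode length
        action = rng.choice(ACTIONS)               # (2) act (here: pure exploration)
        next_state, reward = step(state, action)   # (3) reward from the environment
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: move toward reward + discounted future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state                         # (1) observe the new state
        if state == N_STATES - 1:
            break

# Greedy policy: in each non-goal state, pick the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The learned values estimate the discounted reward of each action, so the greedy policy is the action sequence that maximizes expected reward in this toy setting.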

 

 

 

 

Some Types of Machine Learning Algorithms

 

1) Regression(prediction)

 

Predicting a variable from data observations

 

Predict the future value of y from input x
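
A minimal sketch of predicting y from x, assuming a straight-line model fitted by ordinary least squares. The data points are made up and lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted y for a new input x = 5
print(slope, intercept, prediction)
```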

 

 

2) Classification

  • Assigning observations to predefined groups
  • Predicting class from observations

 

 

3) Clustering

  • Splitting observations into groups based on similarity
  • Grouping observations into "meaningful" groups

 

 

4) Association

 

Finding which items often appear together

ex) people who buy diapers also tend to buy beer
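
The diapers-and-beer idea can be sketched by counting co-occurrences in shopping baskets. The baskets below are invented, and the `confidence` helper is a hypothetical name for the usual ratio "baskets with both items / baskets with the first item".

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
    {"diapers", "beer", "bread"},
]

# How often each item, and each unordered pair of items, appears.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair
    for basket in baskets
    for pair in combinations(sorted(basket), 2)
)

def confidence(antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

print(confidence("diapers", "beer"))
print(confidence("beer", "diapers"))
```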

 

 

 

Machine Learning and Other Disciplines

 

(1) Statistics

Inference from a sample

 

(2) Mathematics

Linear algebra and calculus to

  • solve regression problems
  • optimize functions

 

(3) Operations research or Computer science

Efficient algorithms to

  • solve the optimization problem
  • represent and evaluate the model of inference

 

 

* Basically, all mathematics

  • Probability & Statistics
  • Linear algebra
  • Calculus & Optimization
  • Graph theory

 

 

 

 

1-2 Introduction to Mathematical Fundamentals for Data Science

 

 

 

Machine learning theory lies at the intersection of statistics, probability, computer science,

and algorithms.

Mathematical fundamentals are necessary for a good understanding of

  • how the machine learning algorithm works
  • how we can get good results and interpret them properly

 

* This course will be focused on

1) Linear algebra, 2) Probability & statistics, 3) Multivariable calculus & optimization

 

 

 

Linear Algebra

 

Study of vectors and linear functions

 

"Vector"

▶ an object having both a magnitude and a direction

 

We use an arrow to represent a vector a

 

 

  • -a : a vector with the same magnitude as a but pointing in the opposite direction
  • λa : a vector pointing in the same direction as a (for λ > 0), with λ times the magnitude of a

 

Addition of Vectors

 

Addition of vectors a and b
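
The operations above can be checked numerically. This is a small sketch in which vectors are plain tuples of coordinates; the helper names `scale`, `add`, and `magnitude` are made up for the example.

```python
import math

def scale(scalar, v):
    """Multiply every component of v by scalar (λa)."""
    return tuple(scalar * x for x in v)

def add(u, v):
    """Component-wise sum of two vectors (a + b)."""
    return tuple(x + y for x, y in zip(u, v))

def magnitude(v):
    """Length of v."""
    return math.sqrt(sum(x * x for x in v))

a = (3.0, 4.0)
b = (1.0, -2.0)

neg_a = scale(-1, a)        # same magnitude as a, opposite direction
double_a = scale(2, a)      # same direction as a, twice the magnitude
a_plus_b = add(a, b)

print(neg_a, magnitude(neg_a))
print(double_a, magnitude(double_a))
print(a_plus_b)
```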

 

* Things covered in this course

  • Operations on or between vectors and matrices
  • Span, linear independence, basis, dimension, ...
  • Linear transformations
  • Least squares problem

 

 

 

Probability & Statistics

 

Uncertainty is a key concept in Machine Learning

 

Probability theory is the mathematical study of uncertainty.

The design of machine learning algorithms often relies on probabilistic assumptions about the data.

 

* Things covered in this course

  • Random variables, probability distribution, expectation, variance
  • Common distributions : Bernoulli, Binomial, Geometric, Poisson, Exponential, Uniform, Gaussian distributions
  • Joint and conditional distributions, independence, joint Gaussian
  • Basic statistics, confidence interval, hypothesis test, t-test

 

 

 

Optimization

 

Optimization helps find models that explain the data: we fit machine learning models to the data

by choosing the parameters that either maximize or minimize a function.

ex) likelihood of the data, loss function, error obtained by the model on the (training) data
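
As an illustration of choosing a parameter by minimizing a function, the sketch below fits the one-parameter model y = w·x by gradient descent on the mean squared error. The data, the learning rate, and the step count are assumptions chosen so that the toy problem converges.

```python
# Hypothetical training data generated by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def loss(w):
    """Mean squared error of the model y = w * x on the training data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the loss with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # initial guess for the parameter
learning_rate = 0.05
for _ in range(100):
    w -= learning_rate * gradient(w)   # step against the gradient

print(w, loss(w))
```

Each step moves `w` in the direction that decreases the loss, so `w` approaches the value that minimizes the training error.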

 

* Things covered in this course

  • Introduction to mathematical programming modeling
  • Linear programming and duality
  • Basics of multivariable calculus : derivatives, gradient, Hessian
  • Convex set and function
  • Nonlinear programming, Lagrangian relaxation and KKT conditions
  • Numerical optimization algorithms : gradient and Newton's algorithms

 

 

 

Basic Mathematical Concepts & Notations

 

1) Set Theory

 

A set is a collection of objects, which are called the elements of the set

 

ex)

N = {1, 2, -4, 5}, M = {man, woman}

 

  • ℝ : set of real numbers
  • ℤ : set of integers
  • ∅ : empty set
  • |N| : cardinality(size) of a set N, number of elements in N
  • A⊆B : subset, every element in A is also an element of B
  • A⊂B : proper subset, A⊆B and A≠B
  • A∩B : intersection, set of elements which both sets have in common
  • A∪B : union, set of elements which belong to at least one of the sets
  • A×B : Cartesian product, {(a, b) | a ∈ A and b ∈ B}

    ex) A×B = {(a, c), (a, e), (c, c), (c, e)} when A = {a, c}, B = {c, e}
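
The notation above maps closely onto Python's built-in `set` type. A short sketch, reusing the sets A = {a, c} and B = {c, e} from the example:

```python
from itertools import product

A = {"a", "c"}
B = {"c", "e"}

cardinality = len(A)                  # |A|
intersection = A & B                  # A ∩ B
union = A | B                         # A ∪ B
is_subset = A <= union                # A ⊆ (A ∪ B)
is_proper_subset = A < union          # A ⊂ (A ∪ B)
cartesian = set(product(A, B))        # A × B as a set of ordered pairs

print(cardinality, intersection, union)
print(is_subset, is_proper_subset)
print(sorted(cartesian))
```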

 

 

Specification of sets

  • List notation : {a, c, d, f}, {-1, 1, 3, 5, 7}, ...
  • Predicate notation : stating properties of its elements, {x | x ∈ ℤ and x < 7}, ...
  • Recursive rules : defining a set of rules which generates its members

    ex) (a) 2 ∈ A, (b) if x ∈ A, then x + 3 ∈ A, (c) nothing else belongs to A

           A = {2, 5, 8, ... }
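
Predicate notation and recursive rules both have direct analogues in code. In this sketch the infinite sets are cut off at arbitrary bounds (an assumption needed to keep the output finite), and `generate_a` is a hypothetical helper name.

```python
# Predicate notation {x | x ∈ ℤ and x < 7}, restricted to the window -3..9
# because the full set is infinite.
predicate_set = {x for x in range(-3, 10) if x < 7}

def generate_a(limit):
    """Members of A up to `limit`: rule (a) 2 ∈ A, rule (b) x ∈ A implies x + 3 ∈ A."""
    members = []
    x = 2                      # rule (a)
    while x <= limit:
        members.append(x)
        x += 3                 # rule (b)
    return members             # rule (c): nothing else is added

print(sorted(predicate_set))
print(generate_a(10))
```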

 

 

 

2) p-Norms

 

Norms are used to measure the length or magnitude of a vector or a matrix.

 

 

The following are three types of norms that are frequently encountered.

 

  • 1-norm : the sum of the absolute values of all the elements in x
  • 2-norm : the square root of the sum of the squares of the elements

 

* also called the Euclidean norm: the 2-norm of x is the Euclidean distance from the origin to x, and the 2-norm of x - y is the Euclidean distance between x and y.

 

  • infinity-norm : the maximum absolute value entry in x
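
The three norms can be computed directly from their verbal definitions. A minimal sketch for a vector stored as a Python list; the example vector is arbitrary.

```python
import math

def norm_1(x):
    """Sum of the absolute values of the elements."""
    return sum(abs(v) for v in x)

def norm_2(x):
    """Square root of the sum of the squared elements (Euclidean norm)."""
    return math.sqrt(sum(v * v for v in x))

def norm_inf(x):
    """Largest absolute value among the elements."""
    return max(abs(v) for v in x)

x = [3.0, -4.0]
print(norm_1(x), norm_2(x), norm_inf(x))
```

For this vector the values satisfy infinity-norm ≤ 2-norm ≤ 1-norm, an ordering that holds for every vector.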

 

 

 

Some Basic Notations

 

 

 

(1) arg min : argument of the minimum

The arg min of f(x) over the set X

  • the set of x's in X such that f(x) is less than or equal to f(y) for all y's in the set X
  • the collection of x's at which the function f attains its minimum value

 

(2) arg max : argument of the maximum

The arg max of f(x) over the set X

  • the set of x's for which function f attains its largest value
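
Because arg min and arg max are sets of inputs rather than function values, they can contain more than one element. A small sketch over a finite set X; the function f(x) = x² and the set X are arbitrary choices for illustration.

```python
def arg_min(f, X):
    """Set of all x in X at which f attains its smallest value."""
    values = {x: f(x) for x in X}
    lowest = min(values.values())
    return {x for x, v in values.items() if v == lowest}

def arg_max(f, X):
    """Set of all x in X at which f attains its largest value."""
    values = {x: f(x) for x in X}
    highest = max(values.values())
    return {x for x, v in values.items() if v == highest}

def f(x):
    return x * x

X = {-2, -1, 0, 1, 2}
print(arg_min(f, X))   # inputs giving the minimum value of f
print(arg_max(f, X))   # inputs giving the maximum value of f
```

Note the difference from min and max: min f(x) over X is the value 0, while arg min is the set of inputs {0} that produce it.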