Maths for Data Science

Key skills required

  • Modeling a process (physical or informational) by probing the underlying dynamics
  • Constructing hypotheses
  • Rigorously estimating the quality of the data source
  • Quantifying the uncertainty around the data and predictions
  • Identifying the hidden pattern from the stream of information
  • Understanding the limitation of a model
  • Understanding mathematical proof and the abstract logic behind it

Study Areas

Functions

  • Logarithm, exponential, polynomial functions, rational numbers
  • Basic geometry and theorems, trigonometric identities
  • Real numbers and complex numbers (basic properties)
  • Series, Summation, inequalities
  • Graphing and plotting, Cartesian coordinates / polar coordinates, conic sections
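
As a small illustration of the coordinate-system item above, here is a minimal sketch of converting between polar and Cartesian coordinates. The function names `polar_to_cartesian` and `cartesian_to_polar` are made up for this note; only the standard `math` module is used, and angles are assumed to be in radians.

```python
import math

def polar_to_cartesian(r, theta):
    """Convert polar (r, theta in radians) to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_to_polar(x, y):
    """Convert Cartesian (x, y) to polar (r, theta), with theta in [-pi, pi]."""
    # atan2 picks the correct quadrant, unlike atan(y / x).
    return math.hypot(x, y), math.atan2(y, x)

# Round trip: the point at radius 2 and angle 60 degrees.
x, y = polar_to_cartesian(2.0, math.pi / 3)
r, theta = cartesian_to_polar(x, y)
print(x, y)
print(r, theta)
```

Converting there and back should recover the original radius and angle up to floating-point error.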

Statistics

  • Data summaries and descriptive statistics, central tendency, variance, covariance, correlation
  • Basic probability: basic idea, expectation, probability calculus, Bayes' theorem, conditional probability
  • Probability distribution functions: uniform, normal, binomial, chi-square, Student's t-distribution, central limit theorem
  • Sampling, measurement, error, random number generation
  • Hypothesis testing, A/B testing, confidence intervals, p-values
  • ANOVA, t-test
  • Linear regression, regularization
  • Dimensionality reduction, principal component analysis [^1]
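
To make the Bayes' theorem and conditional probability items concrete, here is a minimal sketch of a posterior calculation for a diagnostic test. The numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are invented for illustration, and the helper name `posterior` is hypothetical.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Law of total probability: P(positive) over both cases.
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# Illustrative numbers only: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")  # prints 0.161
```

Even with a fairly accurate test, the posterior stays low because the condition is rare. This is the usual reason to work through the calculation instead of trusting intuition.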

Linear Algebra

  • Basic properties of matrices and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant
  • Inner and outer products, matrix multiplication rule and various algorithms, matrix inverse
  • Special matrices: square matrix, identity matrix, triangular matrix, sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian and unitary matrices
  • Matrix factorization concepts: LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system of equations Ax = b
  • Vector space, basis, span, orthogonality, orthonormality, linear least squares
  • Eigenvalues, eigenvectors, diagonalization, singular value decomposition
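
The Gaussian elimination item can be sketched in a few lines of plain Python. This is a teaching sketch, not a replacement for a numerical library: the function name `solve_linear_system` is made up here, it assumes a square matrix, and it uses `fractions.Fraction` for exact arithmetic so that rounding does not obscure the steps.

```python
from fractions import Fraction

def solve_linear_system(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.

    A is a list of n rows of length n, b is a list of length n.
    Raises ValueError if A is singular.
    """
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]

    for col in range(n):
        # Pick the row with the largest pivot (in absolute value) in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]

    # Back substitution on the resulting upper-triangular system.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = solve_linear_system(A, b)
print([str(v) for v in x])  # prints ['2', '3', '-1']
```

In practice one would reach for a library routine; the point of the sketch is to show the elimination and back-substitution phases separately.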

Calculus

  • Functions of a single variable, limit, continuity, differentiability
  • Mean value theorems, indeterminate forms, L’Hospital’s rule
  • Maxima and minima
  • Product and chain rule
  • Taylor series, infinite series summation/integration concepts
  • Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals
  • Beta and gamma functions
  • Functions of multiple variables, limit, continuity, partial derivatives
  • Basics of ordinary and partial differential equations
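
As an illustration of the Taylor series item, the sketch below approximates e^x by partial sums of its Maclaurin series and compares the result with `math.e`. The helper name `exp_taylor` is made up for this note.

```python
import math

def exp_taylor(x, n_terms):
    """Approximate e**x with the first n_terms of its Maclaurin series."""
    total, term = 0.0, 1.0          # term starts at x**0 / 0! = 1
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)         # next term: x**(k+1) / (k+1)!
    return total

# The error shrinks quickly as more terms are added.
for n in (2, 5, 10, 15):
    approx = exp_taylor(1.0, n)
    print(n, approx, abs(approx - math.e))
```

Each new term is built from the previous one, which avoids recomputing powers and factorials from scratch.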

Discrete maths

  • Set basics: sets, subsets, power sets
  • Counting functions, combinatorics, countability
  • Basic proof techniques: induction, proof by contradiction
  • Basics of inductive, deductive, and propositional logic
  • Basic data structures: stacks, queues, graphs, arrays, hash tables, trees
  • Graph properties: connected components, degree, maximum flow/minimum cut concepts, graph coloring
  • Recurrence relations and equations
  • Growth of functions and big-O notation
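
The graph items above (connected components, degree) combine naturally with the queue data structure. Below is a minimal sketch using breadth-first search on an adjacency-list graph; the function name `connected_components` and the toy graph are assumptions made for illustration.

```python
from collections import deque

def connected_components(graph):
    """Return the connected components of an undirected graph.

    graph maps each vertex to a list of its neighbours.
    """
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search from an unvisited vertex.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(component))
    return components

# Toy graph: a path a-b-c, an edge d-e, and an isolated vertex f.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
    "f": [],
}
components = connected_components(graph)
degrees = {v: len(nbrs) for v, nbrs in graph.items()}
print(components)  # prints [['a', 'b', 'c'], ['d', 'e'], ['f']]
print(degrees)
```

Each vertex and edge is visited a constant number of times, so the search runs in time linear in the size of the graph.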

Optimization and Operations Research

  • Basics of optimization, how to formulate the problem
  • Maxima, minima, convex function, global solution
  • Linear programming, simplex algorithm
  • Integer programming
  • Constraint programming, knapsack problem
  • Randomized optimization techniques: hill climbing, simulated annealing, genetic algorithms
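
For the knapsack problem listed above, here is a minimal sketch of the 0/1 variant solved with dynamic programming. Note that this is a different technique from constraint programming or the randomized methods in the list; it is shown only because it is short and exact for small integer capacities. The function name `knapsack` and the item data are made up for illustration.

```python
def knapsack(values, weights, capacity):
    """Maximum total value of items that fit in capacity (each item used at most once)."""
    best = [0] * (capacity + 1)     # best[c] = best value achievable with capacity c
    for value, weight in zip(values, weights):
        # Iterate capacities downward so each item is counted at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

print(knapsack(values=[60, 100, 120], weights=[10, 20, 30], capacity=50))  # prints 220
```

The best choice here is the second and third items (weight 50, value 220), which a greedy pick by value-to-weight ratio would miss.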
