# Information and Data Sciences (IDS) Undergraduate Courses (2020-21)

IDS 9.
Introduction to Information and Data Systems Research.
1 unit (1-0-0):
second term.
This course will introduce students to research areas in IDS through weekly overview talks by Caltech faculty, aimed at first-year undergraduates. Others may wish to take the course to gain an understanding of the scope of research in computer science. Graded pass/fail. Not offered 2020-21.

ACM/IDS 101 ab.
Methods of Applied Mathematics.
12 units (4-4-4):
first, second terms.
Prerequisites: Ma 2/102 and ACM 95 ab or equivalent.
First term: Brief review of the elements of complex analysis and complex-variable methods. Asymptotic expansions, asymptotic evaluation of integrals (Laplace method, stationary phase, steepest descents), perturbation methods, WKB theory, boundary-layer theory, matched asymptotic expansions with first-order and high-order matching. Method of multiple scales for oscillatory systems. Second term: Applied spectral theory, special functions, generalized eigenfunction expansions, convergence theory. Gibbs and Runge phenomena and their resolution. Chebyshev expansion and Fourier Continuation methods. Review of numerical stability theory for time evolution. Fast spectrally-accurate PDE solvers for linear and nonlinear partial differential equations in general domains. Integral-equation methods for linear partial differential equations in general domains (Laplace, Helmholtz, Schroedinger, Maxwell, Stokes). Homework problems in both 101 a and 101 b include theoretical questions as well as programming implementations of the mathematical and numerical methods studied in class.
Instructor: Bruno.

ACM/IDS 104.
Applied Linear Algebra.
9 units (3-1-5):
first term.
Prerequisites: Ma 1 abc; some familiarity with MATLAB (e.g., ACM 11) is desired.
This is an intermediate linear algebra course aimed at a diverse group of students, including junior and senior majors in applied mathematics, sciences and engineering. The focus is on applications. Matrix factorizations play a central role. Topics covered include linear systems, vector spaces and bases, inner products, norms, minimization, the Cholesky factorization, least squares approximation, data fitting, interpolation, orthogonality, the QR factorization, ill-conditioned systems, discrete Fourier series and the fast Fourier transform, eigenvalues and eigenvectors, the spectral theorem, optimization principles for eigenvalues, singular value decomposition, condition number, principal component analysis, the Schur decomposition, methods for computing eigenvalues, non-negative matrices, graphs, networks, random walks, the Perron-Frobenius theorem, PageRank algorithm.
Instructor: Zuev.

CMS/ACM/IDS 107.
Linear Analysis with Applications.
12 units (3-0-9):
first term.
Prerequisites: ACM/IDS 104 or equivalent, Ma 1 b or equivalent.
Covers the basic algebraic, geometric, and topological properties of normed linear spaces, inner-product spaces, and linear maps. Emphasis is placed both on rigorous mathematical development and on applications to control theory, data analysis and partial differential equations.
Instructor: Stuart.

CMS/ACM/IDS 113.
Mathematical Optimization.
12 units (3-0-9):
first term.
Prerequisites: ACM 11 and ACM 104, or instructor's permission.
This class studies mathematical optimization from the viewpoint of convexity. Topics covered include duality and representation of convex sets; linear and semidefinite programming; connections to discrete, network, and robust optimization; relaxation methods for intractable problems; as well as applications to problems arising in graphs and networks, information theory, control, signal processing, and other engineering disciplines.
Instructor: Chandrasekaran.

ACM/EE/IDS 116.
Introduction to Probability Models.
9 units (3-1-5):
first term.
Prerequisites: Ma 3; some familiarity with MATLAB (e.g., ACM 11) is desired.
This course introduces students to the fundamental concepts, methods, and models of applied probability and stochastic processes. The course is application oriented and focuses on the development of probabilistic thinking and an intuitive feel for the subject rather than on a more traditional formal approach based on measure theory. The main goal is to equip science and engineering students with the necessary probabilistic tools they can use in future studies and research. Topics covered include sample spaces, events, probabilities of events, discrete and continuous random variables, expectation, variance, correlation, joint and marginal distributions, independence, moment generating functions, law of large numbers, central limit theorem, random vectors and matrices, random graphs, Gaussian vectors, branching, Poisson, and counting processes, general discrete- and continuous-time processes, auto- and cross-correlation functions, stationary processes, power spectral densities.
Instructor: Zuev.

CS/IDS 121.
Relational Databases.
9 units (3-0-6):
second term.
Prerequisites: CS 1 or equivalent.
Introduction to the basic theory and usage of relational database systems. It covers the relational data model, relational algebra, and the Structured Query Language (SQL). The course introduces the basics of database schema design and covers the entity-relationship model, functional dependency analysis, and normal forms. Additional topics include other query languages based on the relational calculi, data-warehousing and dimensional analysis, writing and using stored procedures, working with hierarchies and graphs within relational databases, and an overview of transaction processing and query evaluation. Extensive hands-on work with SQL databases.
Instructor: Hovik.

IDS/Ec/PS 126.
Applied Data Analysis.
9 units (3-0-6):
first term.
Prerequisites: Ma 3/103 or ACM/EE/IDS 116, Ec 122 or IDS/ACM/CS 157 or Ma 112 a.
Fundamentally, this course is about making arguments with numbers and data. Data analysis for its own sake is often quite boring, but becomes crucial when it supports claims about the world. A convincing data analysis starts with the collection and cleaning of data, continues with a thoughtful and reproducible statistical analysis of it, and ends with the graphical presentation of the results. This course will provide students with the necessary practical skills, chiefly revolving around statistical computing, to conduct their own data analysis. This course is not an introduction to statistics or computer science. Students are assumed to be familiar with at least basic probability and statistical concepts up to and including regression.
Instructor: Katz.

EE/Ma/CS/IDS 127.
Error-Correcting Codes.
9 units (3-0-6):
second term.
Prerequisites: Ma 2.
This course develops from first principles the theory and practical implementation of the most important techniques for combating errors in digital transmission or storage systems. Topics include algebraic block codes, e.g., Hamming, BCH, Reed-Solomon (including a self-contained introduction to the theory of finite fields); and the modern theory of sparse graph codes with iterative decoding, e.g., LDPC codes, turbo codes. The students will become acquainted with encoding and decoding algorithms, design principles and performance evaluation of codes. Not offered 2020-21.
Instructor: Kostina.

EE/Ma/CS/IDS 136.
Topics in Information Theory.
9 units (3-0-6):
third term.
Prerequisites: Ma 3 or ACM/EE/IDS 116 or CMS 117 or Ma/ACM/IDS 140 a.
This class introduces information measures such as entropy, information divergence, mutual information, information density from a probabilistic point of view, and discusses the relations of those quantities to problems in data compression and transmission, statistical inference, language modeling, game theory and control. Topics include information projection, data processing inequalities, sufficient statistics, hypothesis testing, single-shot approach in information theory, large deviations.
Instructor: Kostina.

CMS/CS/IDS 139.
Analysis and Design of Algorithms.
12 units (3-0-9):
second term.
Prerequisites: Ma 2, Ma 3, Ma/CS 6 a, CS 21, CS 38/138, and ACM/EE/IDS 116 or CMS/ACM/IDS 113 or equivalent.
This course develops core principles for the analysis and design of algorithms. Basic material includes mathematical techniques for analyzing performance in terms of resources, such as time, space, and randomness. The course introduces the major paradigms for algorithm design, including greedy methods, divide-and-conquer, dynamic programming, linear and semidefinite programming, randomized algorithms, and online learning.
Instructor: Mahadev.

Ma/ACM/IDS 140 ab.
Probability.
9 units (3-0-6):
first, second terms.
Prerequisites: For 140 a, Ma 108 b is strongly recommended.
Overview of measure theory. Random walks and the strong law of large numbers via the theory of martingales and Markov chains. Characteristic functions and the central limit theorem. Poisson process and Brownian motion. Topics in statistics.
Instructors: Tamuz, Ouimet.

CS/IDS 142.
Distributed Computing.
9 units (3-2-4):
first term.
Prerequisites: CS 24, CS 38.
Programming distributed systems. Mechanics for cooperation among concurrent agents. Programming sensor networks and cloud computing applications. Applications of machine learning and statistics by using parallel computers to aggregate and analyze data streams from sensors. Not offered 2020-21.

CS/EE/IDS 143.
Communication Networks.
9 units (3-3-3):
first term.
Prerequisites: Ma 2, Ma 3, CS 24 and CS 38, or instructor permission.
This course focuses on the link layer (two) through the transport layer (four) of Internet protocols. It has two distinct components, analytical and systems. In the analytical part, after a quick summary of basic mechanisms on the Internet, we will focus on congestion control and explain: (1) How to model congestion control algorithms? (2) Is the model well defined? (3) How to characterize the equilibrium points of the model? (4) How to prove the stability of the equilibrium points? We will study basic results in ordinary differential equations, convex optimization, Lyapunov stability theorems, passivity theorems, gradient descent, contraction mapping, and Nyquist stability theory. We will apply these results to prove equilibrium and stability properties of the congestion control models and explore their practical implications. In the systems part, the students will build a software simulator of Internet routing and congestion control algorithms. The goal is not only to expose students to basic analytical tools that are applicable beyond congestion control, but also to demonstrate in depth the entire process of understanding a physical system, building mathematical models of the system, analyzing the models, exploring the practical implications of the analysis, and using the insights to improve the design.
Instructors: Low, Ralph.

CMS/CS/EE/IDS 144.
Networks: Structure & Economics.
12 units (3-4-5):
second term.
Prerequisites: Ma 2, Ma 3, Ma/CS 6 a, and CS 38, or instructor permission.
Social networks, the web, and the internet are essential parts of our lives, and we depend on them every day. This course studies how they work and the "big" ideas behind our networked lives. Questions explored include: What do networks actually look like (and why do they all look the same)?; How do search engines work?; Why do memes spread the way they do?; How does web advertising work? For all these questions and more, the course will provide a mixture of both mathematical analysis and hands-on labs. The course expects students to be comfortable with graph theory, probability, and basic programming.
Instructor: Wierman.

CS/IDS 150 ab.
Probability and Algorithms.
9 units (3-0-6):
first and third terms.
Prerequisites: part a: CS 38 and Ma 5 abc; part b: part a or another introductory course in discrete probability.
Part a: The probabilistic method and randomized algorithms. Deviation bounds, k-wise independence, graph problems, identity testing, derandomization and parallelization, metric space embeddings, local lemma. Part b: Further topics such as weighted sampling, epsilon-biased sample spaces, advanced deviation inequalities, rapidly mixing Markov chains, analysis of boolean functions, expander graphs, and other gems in the design and analysis of probabilistic algorithms. Parts a & b are offered in alternate years.
Instructor: Schulman.

CS/IDS 153.
Current Topics in Theoretical Computer Science.
9 units (3-0-6):
third term.
Prerequisites: CS 21 and CS 38, or instructor's permission.
May be repeated for credit, with permission of the instructor. Students in this course will study an area of current interest in theoretical computer science. The lectures will cover relevant background material at an advanced level and present results from selected recent papers within that year's chosen theme. Students will be expected to read and present a research paper. Not offered 2020-21.

ACM/IDS 154.
Inverse Problems and Data Assimilation.
9 units (3-0-6):
first term.
Prerequisites: Basic differential equations, linear algebra, probability and statistics: ACM/IDS 104, ACM/EE 106 ab, ACM/EE/IDS 116, IDS/ACM/CS 157 or equivalent.
Models in applied mathematics often have input parameters that are uncertain; observed data can be used to learn about these parameters and thereby to improve predictive capability. The purpose of the course is to describe the mathematical and algorithmic principles of this area. The topic lies at the intersection of fields including inverse problems, differential equations, machine learning and uncertainty quantification. Applications will be drawn from the physical, biological and data sciences. Not offered 2020-21.

CMS/CS/CNS/EE/IDS 155.
Machine Learning & Data Mining.
12 units (3-3-6):
second term.
Prerequisites: CS/CNS/EE 156 a.
A sufficient background in algorithms, linear algebra, calculus, probability, and statistics is highly recommended. This course will cover popular methods in machine learning and data mining, with an emphasis on developing a working understanding of how to apply these methods in practice. The course will focus on basic foundational concepts underpinning and motivating modern machine learning and data mining approaches. We will also discuss recent research developments.
Instructor: Pachter.

IDS/ACM/CS 157.
Statistical Inference.
9 units (3-2-4):
third term.
Prerequisites: ACM/EE/IDS 116, Ma 3.
Statistical Inference is a branch of mathematical engineering that studies ways of extracting reliable information from limited data for learning, prediction, and decision making in the presence of uncertainty. This is an introductory course on statistical inference. The main goals are: develop statistical thinking and intuitive feel for the subject; introduce the most fundamental ideas, concepts, and methods of statistical inference; and explain how and why they work, and when they don't. Topics covered include summarizing data, fundamentals of survey sampling, statistical functionals, jackknife, bootstrap, methods of moments and maximum likelihood, hypothesis testing, p-values, the Wald, Student's t-, permutation, and likelihood ratio tests, multiple testing, scatterplots, simple linear regression, ordinary least squares, interval estimation, prediction, graphical residual analysis.
Instructor: Zuev.

IDS/ACM/CS 158.
Fundamentals of Statistical Learning.
9 units (3-3-3):
third term.
Prerequisites: Ma 3 or ACM/EE/IDS 116, IDS/ACM/CS 157.
The main goal of the course is to provide an introduction to the central concepts and core methods of statistical learning, an interdisciplinary field at the intersection of statistics, machine learning, and the information and data sciences. The course focuses on the mathematics and statistics of methods developed for learning from data. Students will learn what methods for statistical learning exist, how and why they work (not just what tasks they solve and in what built-in functions they are implemented), and when they are expected to perform poorly. The course is oriented toward upper-level undergraduate students in IDS, ACM, and CS and graduate students from other disciplines who have sufficient background in probability and statistics. The course can be viewed as a statistical analog of CMS/CS/CNS/EE/IDS 155. Topics covered include supervised and unsupervised learning, regression and classification problems, linear regression, subset selection, shrinkage methods, logistic regression, linear discriminant analysis, resampling techniques, tree-based methods, support-vector machines, and clustering methods. Not offered 2020-21.

CS/CNS/EE/IDS 159.
Advanced Topics in Machine Learning.
9 units (3-0-6):
third term.
Prerequisites: CS 155; strong background in statistics, probability theory, algorithms, and linear algebra; background in optimization is a plus as well.
This course focuses on current topics in machine learning research. This is a paper reading course, and students are expected to understand material directly from research articles. Students are also expected to present in class, and to do a final project. Not offered 2020-21.

EE/CS/IDS 160.
Fundamentals of Information Transmission and Storage.
9 units (3-0-6):
second term.
Basics of information theory: entropy, mutual information, source and channel coding theorems. Basics of coding theory: error-correcting codes for information transmission and storage, block codes, algebraic codes, sparse graph codes. Basics of digital communications: sampling, quantization, digital modulation, matched filters, equalization.
Instructor: Kostina.

CS/IDS 162.
Data, Algorithms and Society.
9 units (3-0-6):
second term.
Prerequisites: CS 38 and CS 155 or 156 a.
This course examines algorithms and data practices in fields such as machine learning, privacy, and communication networks through a social lens. We will draw upon theory and practices from art, media, computer science and technology studies to critically analyze algorithms and their implementations within society. The course includes projects, lectures, readings, and discussions. Students will learn mathematical formalisms, critical thinking and creative problem solving to connect algorithms to their practical implementations within social, cultural, economic, legal and political contexts. Enrollment by application. Taught concurrently with VC 72 and can only be taken once, as VC 72 or CS/IDS 162.
Instructors: Mushkin, Ralph.

CS/CNS/EE/IDS 165.
Foundations of Machine Learning and Statistical Inference.
12 units (3-3-6):
second term.
Prerequisites: CMS/ACM/IDS 113, ACM/EE/IDS 116, CS 156 a, IDS/ACM/CS 157 or instructor's permission.
The course assumes students are comfortable with analysis, probability, statistics, and basic programming. This course will cover core concepts in machine learning and statistical inference. The ML concepts covered are spectral methods (matrices and tensors), non-convex optimization, probabilistic models, neural networks, representation theory, and generalization. In statistical inference, the topics covered are detection and estimation, sufficient statistics, Cramer-Rao bounds, Rao-Blackwell theory, variational inference, and multiple testing. In addition to covering the core concepts, the course encourages students to ask critical questions such as: How relevant is theory in the age of deep learning? What are the outstanding open problems? Assignments will include exploring failure modes of popular algorithms, in addition to traditional problem-solving type questions.
Instructor: Anandkumar.

EE/CS/IDS 167.
Introduction to Data Compression and Storage.
9 units (3-0-6):
third term.
Prerequisites: Ma 3 or ACM/EE/IDS 116.
The course will introduce the students to the basic principles and techniques of codes for data compression and storage. The students will master the basic algorithms used for lossless and lossy compression of digital and analog data and the major ideas behind coding for flash memories. Topics include the Huffman code, the arithmetic code, Lempel-Ziv dictionary techniques, scalar and vector quantizers, transform coding; codes for constrained storage systems. Given in alternate years; not offered 2020-21.
Instructor: Kostina.

ACM/EE/IDS 170.
Mathematics of Signal Processing.
12 units (3-0-9):
third term.
Prerequisites: ACM/IDS 104, CMS/ACM/IDS 113, and ACM/EE/IDS 116; or instructor's permission.
This course covers classical and modern approaches to problems in signal processing. Problems may include denoising, deconvolution, spectral estimation, direction-of-arrival estimation, array processing, independent component analysis, system identification, filter design, and transform coding. Methods rely heavily on linear algebra, convex optimization, and stochastic modeling. In particular, the class will cover techniques based on least-squares and on sparse modeling. Throughout the course, a computational viewpoint will be emphasized.
Instructor: Hassibi.

CS/IDS 178.
Numerical Algorithms and their Implementation.
9 units (3-3-3):
third term.
Prerequisites: CS 2.
This course gives students the understanding necessary to choose and implement basic numerical algorithms as needed in everyday programming practice. Concepts include: sources of numerical error, stability, convergence, ill-conditioning, and efficiency. Algorithms covered include solution of linear systems (direct and iterative methods), orthogonalization, SVD, interpolation and approximation, numerical integration, solution of ODEs and PDEs, transform methods (Fourier, Wavelet), and low rank approximation such as multipole expansions.
Instructor: Desbrun.

IDS 197.
Undergraduate Reading in the Information and Data Sciences.
Units are assigned in accordance with work accomplished:
first, second, third terms.
Prerequisites: Consent of supervisor is required before registering.
Supervised reading in the information and data sciences by undergraduates. The topic must be approved by the reading supervisor and a formal final report must be presented on completion of the term. Graded pass/fail.
Instructor: Staff.

IDS 198.
Undergraduate Projects in Information and Data Sciences.
Units are assigned in accordance with work accomplished:
first, second, third terms.
Prerequisites: Consent of supervisor is required before registering.
Supervised research in the information and data sciences. The topic must be approved by the project supervisor and a formal report must be presented upon completion of the research. Graded pass/fail.
Instructor: Staff.

IDS 199.
Undergraduate Thesis in the Information and Data Sciences.
9 units (1-0-8):
first, second, third terms.
Prerequisites: instructor's permission, which should be obtained sufficiently early to allow time for planning the research.
Individual research project, carried out under the supervision of a faculty member and approved by the option representative. Projects must include significant design effort, and a written report is required. Open only to upperclass students. Not offered on a pass/fail basis.
Instructor: Staff.

### Please Note

The online version of the Caltech Catalog is provided as a convenience; however, the printed version is the only authoritative source of information about course offerings, option requirements, graduation requirements, and other important topics.