NUMERICAL LINEAR ALGEBRA IN DATA EXPLORATION
CSci 8363 -- Fall 2015
Linear Algebra has contributed many methods for handling very large quantities
of numerical data.
Here we examine many of these linear algebra methods
and how they have been applied to the exploration and analysis of very large
data sets.
After a brief review of some basic concepts in linear algebra,
most of the class will be devoted to how these linear
algebra methods have been used in information retrieval,
and the like.
Examples of methods we will examine are Latent Semantic Indexing,
least squares fitting (possibly under a sparsity constraint),
Support Vector Machines,
and recent ideas on sparse approximation methods using L1 regularization.
A collection of basic
research papers, some of a tutorial nature, will be used for the class.
Examples will be taken from vision recognition systems, biological gene
analysis, and document retrieval.
Students should be familiar with basic linear algebra concepts and methods
such as Gaussian elimination for systems of linear equations, plus more advanced
concepts such as matrix eigenvalues, singular values, and least squares problems,
though some time will be spent reviewing these latter topics.
Basic concepts in optimization like first order optimality conditions
and duality will also be useful.
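As a sense of the background assumed, the following sketch (NumPy, with made-up data, purely illustrative) solves an overdetermined least squares problem via the SVD and checks it against NumPy's built-in solver:

```python
import numpy as np

# Illustrative example: solve min_x ||A x - b||_2 via the SVD.
# A is a tall matrix (overdetermined system); the data here is synthetic.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)

# Thin SVD: A = U diag(s) V^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Least squares solution: x = V diag(1/s) U^T b
x_svd = Vt.T @ ((U.T @ b) / s)

# Should agree with NumPy's built-in least squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_svd, x_lstsq))  # True
```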
Students will be expected to do the following.
- Students are to present [at least] one research
or tutorial paper during the course of the
semester, by rotation.
Talks should highlight the main points and summarize the theoretical and
experimental results in the paper being presented. For very
theoretically technical papers, you should at least be able to explain what
the main results are and what they mean, even if the detailed derivations are
too complex for a short presentation.
Long papers may be split up into parts presented separately (e.g.,
basic results/algorithms and examples/applications).
(30% of the final grade)
- The papers listed here are a mix of historical foundational papers and recent
papers showing variations on the original ideas or how the abstract methods
have been applied to specific applications. Students should either select
from among these papers, or may select another similar relevant paper after
first checking with the instructor.
- Submit a weekly synopsis of each week's material, with your own reactions.
This should be limited to a paragraph or two pointing out the main
take-away message you got from each lecture.
(10% of the final grade)
- Also submit, as a separate item, some comments on how the lecture
was presented, to be passed on to the speaker (without attribution).
- Develop and carry out a research project based on one or more recent
research papers devoted to topics studied in this class.
A research project can be a literature survey, an experimental study of
methods proposed in a paper, or a study of an application of one of the methods
covered in this class. To give an approximate scale of the effort required, you
should expect to devote about 50 hours during the course of the
semester. Within about 3 weeks, you should submit a 1 page description of
your proposed project.
(10% of the final grade)
- Write a 10-15 page report on your research project at a level that would
be appropriate for publication in a workshop or conference
in Computer Science.
(40% of the final grade)
- Give a short presentation on your project during the last 2 weeks of the semester.
(10% of the final grade)
- Your project will count toward the Project Requirements for a Plan C MS degree.
- Extra effort on the project or active participation in class will count as a boost to your final grade.
- All submissions should be via
Link to Current Semester Plan and List of Papers
(Link to Old Semester Plan and List of Papers from the previous offering of this class in 2012)
- Review of SVD and related topics
- Classical Dimensionality Reduction
  - Latent Semantic Indexing
  - Principal Component Analysis
  - Tensor decompositions
  - Concept Decomposition
  - Randomized Matrix Approximations
  - Specialized Matrix Approximations in applied statistics
  - Fisher Linear Discriminant Analysis (LDA)
  - Multi-Dimensional Scaling (MDS)
  - Isomap and Locally Linear Embedding (LLE)
  - Non-Negative Matrix Factorization (NMF)
- Graphs and Link Analysis → Web Searches
  - Power Law Distribution
  - Importance of Nodes/Links via PageRank, Commute Times
  - Hubs and Authorities (HITS)
  - Connectivity, Almost Shortest Paths, Partitioning
  - Spectral Graph Partitioning
  - Dynamics - Consensus Computations
  - Theoretically fast matrix inversion
- Sparse Approximations and Optimization
  - Review of First Order Optimality Conditions in Optimization
  - Basis pursuit, compressed sensing
  - L1 regularization (LASSO)
  - Group lasso → sparse basis
  - Variations: low rank (nuclear norm), low variation (image processing)
  - Clustering with constraints
  - Iterative algorithms, proximal methods
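To give a flavor of the dimensionality reduction topics above: the truncated SVD is the workhorse behind LSI and PCA, and the Eckart–Young theorem says it gives the best rank-k approximation. A minimal sketch (NumPy, synthetic data):

```python
import numpy as np

# Rank-k approximation of a matrix via the truncated SVD
# (the computation underlying LSI and PCA).  Data is synthetic.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 8)) @ rng.standard_normal((8, 30))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 4
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]   # best rank-k approximation

# Eckart-Young: the 2-norm error of the best rank-k approximation
# equals the (k+1)-st singular value of A.
print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))  # True
```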
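Similarly, the PageRank computation in the link analysis topics reduces to a power iteration on a damped, row-stochastic matrix. A toy sketch (NumPy, 4-node made-up graph, no dangling-node handling):

```python
import numpy as np

# Power iteration for PageRank on a tiny made-up 4-node graph.
# adj[i, j] = 1 means a link from node i to node j.
adj = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# Row-stochastic transition matrix: each node spreads rank over its out-links.
P = adj / adj.sum(axis=1, keepdims=True)

alpha = 0.85                 # damping factor
n = P.shape[0]
r = np.full(n, 1.0 / n)      # start from the uniform distribution

for _ in range(100):
    r = alpha * (r @ P) + (1 - alpha) / n

# r is a probability vector; node 2 (most in-links) ranks highest here.
print(r)
```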
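Finally, the L1/LASSO and proximal-method topics can be illustrated in a few lines with ISTA (iterative soft thresholding). Everything here — the data, regularization weight, and step size — is a made-up toy, not a production solver:

```python
import numpy as np

# ISTA (proximal gradient) for the LASSO:
#   min_x  0.5*||A x - b||_2^2 + lam*||x||_1
# on a toy problem with a known sparse ground truth.
rng = np.random.default_rng(2)
n, p = 60, 20
A = rng.standard_normal((n, p))
x_true = np.zeros(p)
x_true[:3] = [2.0, -1.5, 1.0]        # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(n)

lam = 0.5
L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
t = 1.0 / L                          # step size

def soft_threshold(v, tau):
    """Proximal operator of tau*||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

x = np.zeros(p)
for _ in range(500):
    grad = A.T @ (A @ x - b)
    x = soft_threshold(x - t * grad, t * lam)

# The recovered x is sparse, with its support on the first 3 coordinates.
print(np.nonzero(np.abs(x) > 1e-6)[0])
```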
For Further Information
Contact Daniel Boley, 6-209 KHKH, firstname.lastname@example.org.