
Sparse & Functional Principal Components Analysis Genevera I. Allen Department of Statistics and Electrical and Computer Engineering, Rice University, Department of Pediatrics-Neurology, Baylor College of Medicine, Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital.

April 4, 2014

G. I. Allen (Rice & BCM)

Sparse & Functional PCA

April 4, 2014

1 / 38

Outline

1. Motivation
2. Background & Challenges: Regularized PCA
3. Sparse & Functional PCA Model
4. Sparse & Functional PCA Algorithm
5. Simulation Studies
6. Case Study: EEG Data

Structured Big-Data

Structured data = data associated with locations:
- Time series, longitudinal & spatial data.
- Image data & network data.
- Tree data; object-oriented data.

Examples of massive structured data — neuroimaging and neural recordings: MRI, fMRI, EEG, MEG, PET, DTI, direct neuronal recordings (spike trains), optogenetics.


Data matrix X (L × T): L brain locations by T time points.
Goal: unsupervised analysis of spatial brain-activation patterns and temporal neural-activation patterns.
Principal Components Analysis provides:
- Exploratory data analysis.
- Dimension reduction.
- Pattern recognition.
- Data visualization.


Review: PCA Models

1. Covariance: leading eigenspace of a Gaussian covariance.
   - Model: X ~ N(0, I ⊗ Σ_{p×p}); estimate the leading eigenspace of Σ.
   - Empirical optimization problem:
     maximize_{v_k} v_k^T X^T X v_k subject to v_k^T v_k = 1 and v_k^T v_j = 0 for all k > j.

2. Matrix factorization: low-rank mean structure.
   - Model: X = M + E, for a low-rank mean matrix M and iid additive noise E.
   - Empirical optimization problem:
     minimize_{U,D,V} || X − U D V^T ||_F^2 subject to U^T U = I and V^T V = I.
   - Solution: the singular value decomposition (SVD).

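The factorization model and its SVD solution can be checked numerically; a minimal NumPy sketch (sizes, rank, and noise level are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 50, 40, 3

# Low-rank mean M plus iid Gaussian noise E, as in the model X = M + E.
M = rng.standard_normal((n, K)) @ rng.standard_normal((K, p))
X = M + 0.1 * rng.standard_normal((n, p))

# The SVD solves: minimize ||X - U D V^T||_F^2 s.t. U^T U = I, V^T V = I.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = (U[:, :K] * d[:K]) @ Vt[:K]   # best rank-K approximation of X

# With modest noise, the rank-K SVD fit recovers most of the low-rank mean.
rel_err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
```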

Review: Regularized PCA

In big-data settings, regularizing the leading eigenvectors leads to:

1. Functional PCA.
   - Encourages PC factors to be smooth with respect to known data structure (Rice & Silverman, 1991; Silverman, 1996; Ramsay, 2006; Huang et al., 2008).
   - Leads to consistent PC estimates (Silverman, 1996).

2. Sparse PCA.
   - Automatic feature / variable selection (Jolliffe et al., 2003; Zou et al., 2006; d'Aspremont et al., 2007; Shen & Huang, 2008).
   - Leads to consistent PC estimates (Johnstone & Lu, 2009; Amini & Wainwright, 2009; Shen et al., 2012; Vu & Lei, 2013).


Why Sparse & Functional PCA?

1. Applied motivation (neuroimaging).
2. Improved signal recovery, feature selection, interpretation, and data visualization.
3. Question: is there a general mathematical framework for regularization in the context of PCA?


Why Sparse & Functional PCA?

Objectives:
(i) Formulate a (good) optimization framework to achieve SFPCA.
(ii) Develop a scalable algorithm to fit SFPCA.
(iii) Carefully study the properties of the model and algorithm from an optimization perspective.


Preview


Setup

Matrix factorization: low-rank mean model,

X_{n×p} = Σ_{k=1}^K d_k u_k v_k^T + E.

- Assume the data X have been centered.
- The PC factors v and/or u are sparse and/or smooth.
- E is iid noise; d_k is fixed, but unknown.
- Rows and/or columns of X arise from discretized curves or other features associated with locations.


Functional PCA & Two-Way FPCA

Encouraging smoothness:
- Continuous functions: second derivatives quantify curvature.
- Penalty: ∫ f''(t) f''(t) dt penalizes average squared curvature.
- Discrete extension: squared second differences between adjacent variables. In matrix form,
  β^T Ω β = Σ_j (β_{j+1} − 2β_j + β_{j−1})^2,
  where Ω_{p×p} ⪰ 0 is the second-differences matrix.
- Other choices of Ω that encourage smoothness are possible (Ramsay, 2006).

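The discrete penalty can be built explicitly: with D the (p−2) × p second-difference operator, Ω = DᵀD is positive semi-definite and βᵀΩβ is exactly the sum of squared second differences. A small NumPy sketch (the curve and noise level are illustrative):

```python
import numpy as np

p = 200

# Second-difference operator: (D beta)_j = beta_{j+1} - 2 beta_j + beta_{j-1}.
D = np.zeros((p - 2, p))
for j in range(p - 2):
    D[j, j], D[j, j + 1], D[j, j + 2] = 1.0, -2.0, 1.0

Omega = D.T @ D                               # p x p, positive semi-definite

beta = np.sin(np.linspace(0, np.pi, p))       # smooth curve
noisy = beta + 0.1 * np.random.default_rng(1).standard_normal(p)

pen = lambda b: b @ Omega @ b                 # beta^T Omega beta
```

Note that Ω is only positive semi-definite: constant and linear trends have zero second differences and incur no penalty, which is the usual behavior for a curvature smoother.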

Functional PCA & Two-Way FPCA

Functional PCA (Silverman, 1996):
maximize_v v^T X^T X v subject to v^T (I + α_v Ω_v) v = 1.

Equivalent to a regression approach (Huang et al., 2008):
  û = argmin_u || X v̂ − u ||_2^2,
  v̂ = argmin_v { || X^T û − v ||_2^2 + α v^T Ω v }.
Interpretation: half-smoothing.

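The regression form has closed-form updates: the u-step is ordinary least squares and the v-step is a ridge-type smoother, v̂ = (I + αΩ)⁻¹ Xᵀû. A minimal NumPy sketch of the alternation (the toy data and α are illustrative; the per-step normalizations are a practical addition, not part of the displayed updates):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, alpha = 30, 100, 10.0

# Toy data with one smooth right factor.
t = np.linspace(0, 1, p)
X = np.outer(rng.standard_normal(n), np.sin(2 * np.pi * t)) \
    + 0.1 * rng.standard_normal((n, p))

# Second-difference smoothing matrix Omega = D^T D.
D = np.diff(np.eye(p), n=2, axis=0)
Omega = D.T @ D

v = np.linalg.svd(X)[2][0]            # initialize at leading right singular vector
for _ in range(20):
    u = X @ v                         # u-step: argmin ||X v - u||_2^2
    u /= np.linalg.norm(u)
    # v-step: argmin ||X^T u - v||_2^2 + alpha v^T Omega v  (ridge smoother)
    v = np.linalg.solve(np.eye(p) + alpha * Omega, X.T @ u)
    v /= np.linalg.norm(v)
```

Because a smooth factor like sin(2πt) has tiny second differences, the smoother barely shrinks it while strongly damping rough noise directions.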

Functional PCA & Two-Way FPCA

Two-Way Functional PCA (Huang et al., 2009):
maximize_{u,v} u^T X v − (1/2) u^T (I + α_u Ω_u) u · v^T (I + α_v Ω_v) v.

- Equivalent to two-way half-smoothing.
- Related to two-way ℓ2-penalized regression.


Sparse PCA & Two-Way SPCA

Sparse PCA via penalized regression (Shen & Huang, 2008):
  û = argmin_u || X v̂ − u ||_2^2,
  v̂ = argmin_v { || X^T û − v ||_2^2 + λ || v ||_1 }.

Other SPCA approaches: semidefinite programming; covariance thresholding.

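With the ℓ1 penalty, the v-step has a closed form: soft-thresholding of Xᵀû. A NumPy sketch of the alternation on toy data with a sparse right factor (sizes and λ are illustrative):

```python
import numpy as np

def soft(x, lam):
    """Soft-thresholding: the closed-form lasso solution for an identity design."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

rng = np.random.default_rng(3)
n, p, lam = 50, 100, 2.0

# Toy data: sparse right factor supported on the first 10 variables.
v_true = np.zeros(p)
v_true[:10] = 1.0 / np.sqrt(10)
X = np.outer(5 * rng.standard_normal(n), v_true) + 0.1 * rng.standard_normal((n, p))

v = np.linalg.svd(X)[2][0]         # initialize at leading right singular vector
for _ in range(20):
    u = X @ v                      # u-step: least squares
    u /= np.linalg.norm(u)
    v = soft(X.T @ u, lam)         # v-step: lasso with identity design
    if np.linalg.norm(v) > 0:
        v /= np.linalg.norm(v)

support = np.nonzero(v)[0]         # selected variables
```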

Sparse PCA & Two-Way SPCA

Two-Way Sparse PCA:
maximize_{u,v} u^T X v − λ_u || u ||_1 − λ_v || v ||_1
subject to u^T u ≤ 1 and v^T v ≤ 1.

- Allen et al. (2011); the Lagrangian form of Witten et al. (2009); related to Lee et al. (2010).
- The SPCA of Shen & Huang (2008) is a special case.
- Related to two-way penalized regression.


Two-Way Regularized PCA

Alternating penalized regressions:
  û = argmin_u { || X v̂ − u ||_2^2 + λ_u P_u(u) },
  v̂ = argmin_v { || X^T û − v ||_2^2 + λ_v P_v(v) }.

Questions:
1. Are sparse AND smoothing ℓ2 penalties permitted?
2. What penalty types lead to convergent solutions?
3. What optimization problem is this class of algorithms solving?


We consider . . .

Objective: flexible combinations of smoothness and/or sparsity on the row and/or column PC factors.

Four penalties:
- Row sparsity: λ_u P_u(u), for example λ_u || u ||_1.
- Row smoothness: α_u u^T Ω_u u, with Ω_u (n × n) ⪰ 0.
- Column sparsity: λ_v P_v(v), for example λ_v || v ||_1.
- Column smoothness: α_v v^T Ω_v v, with Ω_v (p × p) ⪰ 0.

Approach: iteratively solve for the best rank-one solution in a greedy manner.


Formulating an Optimization Problem

We want . . .
1. To generalize existing methods for PCA, SPCA, FPCA, and two-way SPCA and FPCA.
   - These should all be special cases when the corresponding regularization parameters are active.
2. Desirable numerical properties:
   - Identifiable PC factors.
   - Non-degenerate, well-scaled solutions and PC factors.
   - Balanced regularization (NO regularization masking).
     (Regularization masking: ∃ λ, α > 0 for which the solution does not depend on both λ and α.)
3. A computationally tractable algorithm in big-data settings.


Optimization Approaches: Natural Extensions?

1. Why not add ℓ1 and smoothing ℓ2 penalties to the Frobenius-norm loss of the SVD problem?
   minimize_{u: u^T u ≤ 1, v: v^T v ≤ 1} || X − d u v^T ||_F^2 + λ_u || u ||_1 + λ_v || v ||_1 + α_u u^T Ω_u u + α_v v^T Ω_v v.
   Unidentifiable, degenerate, does not generalize.

2. Why not add ℓ1 penalties to the two-way FPCA problem of Huang et al. (2009)?
   maximize_{u: u^T u ≤ 1, v: v^T v ≤ 1} u^T X v − (1/2) u^T (I + α_u Ω_u) u · v^T (I + α_v Ω_v) v − λ_u || u ||_1 − λ_v || v ||_1.

3. Why not add smoothing ℓ2 penalties to the two-way SPCA problem of Witten et al. (2009)?
   maximize_{u: u^T u ≤ 1, v: v^T v ≤ 1} u^T X v − λ_u || u ||_1 − λ_v || v ||_1 − α_u u^T Ω_u u − α_v v^T Ω_v v.

   Regularization masking; computationally challenging.


Assumptions on Penalties

A1. Ω_u ⪰ 0 and Ω_v ⪰ 0.
A2. The P()'s are positive and homogeneous of order one, i.e. P(c x) = c P(x) for all c > 0.
   - Sparse penalties: ℓ1-norm, SCAD, MC+, etc.
   - Structured sparse penalties: group lasso, fused lasso, generalized lasso, etc.
A3. If a P() is non-convex, then it can be decomposed into the difference of two convex functions, P(x) = P_1(x) − P_2(x) for P_1() and P_2() convex.
   - Includes common non-convex penalties such as SCAD, MC+, log-concave, etc.


SFPCA Optimization Problem

Rank-one Sparse & Functional PCA is the solution to:

maximize_{u,v} u^T X v − λ_u P_u(u) − λ_v P_v(v)
subject to u^T (I + α_u Ω_u) u ≤ 1 and v^T (I + α_v Ω_v) v ≤ 1.

Penalty parameters: λ_u, λ_v, α_u, α_v ≥ 0.


Relationship to Other PCA Approaches

Theorem
(i) If λ_u, λ_v, α_u, α_v = 0, then SFPCA is equivalent to PCA / the SVD of X.
(ii) If λ_u, α_u, α_v = 0, then it is equivalent to Sparse PCA (Shen & Huang, 2008).
(iii) If α_u, α_v = 0, then it is equivalent to a special case of two-way Sparse PCA (Allen et al., 2011), which is the Lagrangian of Witten et al. (2009) and closely related to Lee et al. (2010).
(iv) If λ_u, λ_v, α_u = 0, then it is equivalent to the Functional PCA solution of Silverman (1996) and Huang et al. (2008).
(v) If λ_u, λ_v = 0, then it is equivalent to the two-way FPCA solution of Huang et al. (2009).


Desirable Numerical Properties

Theorem
1. u and v are identifiable up to a sign change.
2. Balanced regularization (no regularization masking):
   - ∃ λ_u^max such that u* = 0.
   - (u*, v*) depend on all non-zero regularization parameters.
3. Well-scaled and non-degenerate:
   - either u*^T (I + α_u Ω_u) u* = 1 and v*^T (I + α_v Ω_v) v* = 1,
   - or u* = 0 and v* = 0.



SFPCA Optimization Problem

Rank-one Sparse & Functional PCA is the solution to:

maximize_{u,v} u^T X v − λ_u P_u(u) − λ_v P_v(v)
subject to u^T (I + α_u Ω_u) u ≤ 1 and v^T (I + α_v Ω_v) v ≤ 1.

- A non-convex, non-differentiable QCQP.
- P() convex ⟹ bi-convex: convex in v with u fixed, and conversely.
- Idea: alternating optimization.


Relationship to Penalized Regression

Theorem
The solution to the SFPCA problem with respect to u is given by:
  û = argmin_u { (1/2) || X v − u ||_2^2 + λ_u P_u(u) + (α_u/2) u^T Ω_u u },
  u* = û / || û ||_{I + α_u Ω_u} if || û ||_{I + α_u Ω_u} > 0, and u* = 0 otherwise.

- Analogous to a re-scaled Elastic Net problem (Zou & Hastie, 2005).
- The result holds because of A2, the order-one penalties.


Relationship to Penalized Regression

minimize_u (1/2) || X v − u ||_2^2 + λ_u P_u(u) + (α_u/2) u^T Ω_u u

- Can be solved by (accelerated) proximal gradient descent.
- Geometric convergence, O(k log k), for convex penalties.
- A3, the difference-of-convex decomposition, ensures convergence for non-convex penalties.
- Closed-form proximal operator for many penalties:
  prox_P(y, λ) = argmin_x { (1/2) || x − y ||_2^2 + λ P(x) },
  typically some form of thresholding (e.g., soft-thresholding for the ℓ1 penalty).


SFPCA Algorithm

Rank-One Algorithm
1. Initialize u and v to the rank-1 SVD of X. Set S_u = I + α_u Ω_u and L_u = λ_max(S_u); set S_v = I + α_v Ω_v and L_v = λ_max(S_v).
2. Repeat until convergence:
   (a) Estimate û; repeat until convergence:
       u^{(t+1)} = prox_{P_u}( u^{(t)} + (1/L_u)(X v* − S_u u^{(t)}), λ_u / L_u ).
   (b) Set u* = û / || û ||_{S_u} if || û ||_{S_u} > 0, else u* = 0.
   (c) Estimate v̂; repeat until convergence:
       v^{(t+1)} = prox_{P_v}( v^{(t)} + (1/L_v)(X^T u* − S_v v^{(t)}), λ_v / L_v ).
   (d) Set v* = v̂ / || v̂ ||_{S_v} if || v̂ ||_{S_v} > 0, else v* = 0.
3. Return u = u* / || u* ||_2, v = v* / || v* ||_2, and d = u^T X v.
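The rank-one algorithm above translates almost line-for-line into code. This is an independent NumPy sketch (the released software is in Matlab and is not reproduced here), using soft-thresholding as the prox for ℓ1 penalties and fixed iteration counts in place of convergence checks:

```python
import numpy as np

def soft(x, lam):
    """Proximal operator of lam * ||.||_1: soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sfpca_rank1(X, Om_u, Om_v, lam_u, lam_v, al_u, al_v,
                outer_iters=50, inner_iters=50):
    """Rank-one SFPCA via alternating proximal gradient (l1 penalties)."""
    n, p = X.shape
    Su = np.eye(n) + al_u * Om_u           # S_u = I + alpha_u * Omega_u
    Sv = np.eye(p) + al_v * Om_v
    Lu = np.linalg.eigvalsh(Su)[-1]        # L_u = lambda_max(S_u)
    Lv = np.linalg.eigvalsh(Sv)[-1]
    U0, _, Vt0 = np.linalg.svd(X, full_matrices=False)
    u, v = U0[:, 0], Vt0[0]                # initialize at the rank-1 SVD
    for _ in range(outer_iters):
        for _ in range(inner_iters):       # proximal gradient steps in u
            u = soft(u + (X @ v - Su @ u) / Lu, lam_u / Lu)
        nu = np.sqrt(u @ Su @ u)           # ||u||_{S_u}
        u = u / nu if nu > 0 else np.zeros(n)
        for _ in range(inner_iters):       # proximal gradient steps in v
            v = soft(v + (X.T @ u - Sv @ v) / Lv, lam_v / Lv)
        nv = np.sqrt(v @ Sv @ v)           # ||v||_{S_v}
        v = v / nv if nv > 0 else np.zeros(p)
    un, vn = np.linalg.norm(u), np.linalg.norm(v)
    u = u / un if un > 0 else u            # final rescaling to unit l2 norm
    v = v / vn if vn > 0 else v
    return u, v, u @ X @ v                 # d = u^T X v
```

In practice the inner loops iterate to a tolerance rather than a fixed count; the fixed counts here keep the sketch short.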

Convergence

Theorem (convergence of rank-one SFPCA). For P_u and P_v convex, the SFPCA algorithm converges to a stationary point of the SFPCA problem. The solution is unique given an initial starting point.


Convergence

Multi-rank solutions can be computed in a greedy manner (power method). Several deflation schemes are available (Mackey, 2009); subtraction deflation is the most common:
- Fit rank-one SFPCA to X to estimate the 1st PC factors.
- Subtract the rank-one fit, X − d_1 u_1 v_1^T, and apply rank-one SFPCA to the residual to estimate the 2nd PC factors.

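Subtraction deflation is easy to sketch generically. Here the rank-one fitter is a plain SVD stand-in so the example is self-contained; in practice it would be replaced by a rank-one SFPCA fit:

```python
import numpy as np

def rank1_svd(X):
    """Stand-in rank-one fitter: the leading singular triplet of X."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    return d[0], U[:, 0], Vt[0]

def greedy_deflation(X, K):
    """Greedy multi-rank fit: estimate a factor, subtract it, repeat."""
    R, factors = X.copy(), []
    for _ in range(K):
        d, u, v = rank1_svd(R)
        factors.append((d, u, v))
        R = R - d * np.outer(u, v)     # subtraction deflation
    return factors, R
```

For the SVD stand-in, the residual after K deflation steps has Frobenius norm equal to that of the trailing singular values, which makes the scheme easy to sanity-check.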

Extensions: Non-Negativity Constraints

maximize_{u,v} u^T X v − λ_u P_u(u) − λ_v P_v(v)
subject to u^T (I + α_u Ω_u) u ≤ 1, v^T (I + α_v Ω_v) v ≤ 1, u ≥ 0, and v ≥ 0.

Corollary
Replace prox_P() with the positive proximal operator,
prox_P^+(y, λ) = argmin_{x ≥ 0} { (1/2) || x − y ||_2^2 + λ P(x) }.
The same convergence guarantees hold.


Selecting Regularization Parameters

- Cross-validation (CV), generalized CV, BIC, etc.
- Nested vs. grid search.

Ideas:
- Select α_u and λ_u together.
- Exploit the connection to the Elastic Net to compute degrees of freedom for BIC. Example for the ℓ1 penalty: let A(u) be the active set of u; then
  df(α_u, λ_u) = trace[ ( I_{A(u)} + (α_u/2) Ω_u(A(u), A(u)) )^{-1} ].

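The degrees-of-freedom formula can be evaluated directly. This sketch assumes the reading df(α_u, λ_u) = trace[(I_{A(u)} + (α_u/2) Ω_u(A(u), A(u)))^{-1}] with A(u) the ℓ1 active set; with α_u = 0 it reduces to |A(u)|, the familiar lasso degrees of freedom, and smoothing shrinks it below |A(u)|:

```python
import numpy as np

def df_sfpca(u, Omega_u, alpha_u):
    """BIC degrees of freedom for an l1-penalized, smoothed factor u.

    Assumed form: trace of the inverse of (I + (alpha_u/2) * Omega_u)
    restricted to the active set A(u).
    """
    A = np.nonzero(u)[0]                   # active set A(u)
    if A.size == 0:
        return 0.0
    S = np.eye(A.size) + 0.5 * alpha_u * Omega_u[np.ix_(A, A)]
    return np.trace(np.linalg.inv(S))
```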


Simulation I

Setup: rank-3 model with sparse & smooth right factors,
X_{n×p} = Σ_{k=1}^K d_k u_k v_k^T + E, with ε_ij ~ iid N(0, 1).
- u_k: random orthonormal vectors of length n; D = diag([n/4, n/5, n/6]^T).
- v_k: fixed signal vectors of length p = 200.

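The simulation design can be reproduced schematically. The exact signal shapes v_k come from a figure not reproduced here, so this sketch substitutes hypothetical sparse, smooth half-sine bumps while keeping the stated D = diag(n/4, n/5, n/6), random orthonormal u_k, and iid N(0, 1) noise:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, K = 100, 200, 3
t = np.arange(p)

# Hypothetical sparse & smooth signals: localized half-sine bumps
# (stand-ins for the talk's unspecified v_k).
def bump(center, width):
    v = np.zeros(p)
    idx = (t >= center - width) & (t <= center + width)
    v[idx] = np.sin(np.pi * (t[idx] - (center - width)) / (2 * width))
    return v / np.linalg.norm(v)

V = np.column_stack([bump(40, 30), bump(100, 25), bump(160, 20)])

# Random orthonormal u_k via QR of a Gaussian matrix.
U, _ = np.linalg.qr(rng.standard_normal((n, K)))
d = np.array([n / 4, n / 5, n / 6])

X = U @ np.diag(d) @ V.T + rng.standard_normal((n, p))  # X = sum_k d_k u_k v_k^T + E
```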


Simulation I

Table: n = 100 results. (Only r∠ and rSE were reported for TWFPCA.)

             TWFPCA    SSVD     PMD   SGPCA   SFPCA
v1   TP          --   0.897   0.568   0.768   0.935
     FP          --   0.323   0.001   0.006   0.052
     r∠       0.153   0.625   2.220   0.726   0.189
v2   TP          --   0.783   0.657   0.445   0.713
     FP          --   0.320   0.106   0.002   0.047
     r∠       5.980   0.549   0.597   0.829   0.438
v3   TP          --   0.771   0.514   0.499   0.883
     FP          --   0.316   0.066   0.004   0.054
     r∠       3.660   0.855   1.270   1.010   0.468
rSE           0.668   0.760   1.000   0.737   0.450

Simulation I

Table: n = 300 results. (Only r∠ and rSE were reported for TWFPCA.)

             TWFPCA    SSVD      PMD   SGPCA   SFPCA
v1   TP          --   0.973    0.509   0.921   0.987
     FP          --   0.322    0.000   0.005   0.068
     r∠       0.768   0.487   15.700   0.553   0.152
v2   TP          --   0.919    0.773   0.839   0.967
     FP          --   0.319    0.000   0.038   0.048
     r∠      52.300   0.428    1.310   0.488   0.320
v3   TP          --   0.943    0.530   0.849   0.972
     FP          --   0.314    0.000   0.015   0.060
     r∠      33.100   0.545    5.940   0.631   0.131
rSE           1.170   0.790    3.380   0.809   0.655

Simulation I

SFPCA also improves feature selection . . .


Simulation II

Setup: rank-2 model with sparse & smooth spatial (25 × 25 grid) and temporal (length-200) factors.



EEG Predisposition to Alcoholism

Data: EEG measures electrical signals in the active brain over time, sampled from 64 channels at 256 Hz.
- Consider the 1st alcoholic subject over epochs relating to non-matching image stimuli.
- Data matrix: 57 × 5376, channel locations by epoch time points (21 epochs of 256 time points each).
- Ω_u: weighted squared second-differences matrix using spherical distances between channel locations.
- Ω_v: squared second-differences matrix between time points.


EEG Results

[Figures: PCA results; Independent Component Analysis (ICA) results; Penalized Matrix Decomposition & two-way FPCA results; Sparse & Functional PCA results; and a side-by-side comparison of PCA, ICA, and SFPCA.]

EEG Results

SFPCA notes:
- 3.28 seconds to converge (software entirely in Matlab).
- BIC selected λ_u = 0 (no spatial sparsity) for the first 5 components.
- BIC selected α_u = 10–12, α_v = 0.5–10, and λ_v = 1–2.5 for the first 5 components.
- Flexible, data-driven selection of the appropriate amount of regularization.


Summary & Future Work

Summary:
- SFPCA generalizes much of the existing literature on regularized PCA via alternating regressions.
- SFPCA has the flexibility to permit many types of regularization in a data-driven manner.
- SFPCA yields better signal recovery, more interpretable factors, and improved feature selection.

Future statistical work:
- Statistical consistency, especially in high-dimensional settings.
- Extensions to other multivariate methods: CCA, PLS, LDA, clustering, etc.


The Bigger Picture: Modern Multivariate Analysis

Goal: flexible, data-driven approaches for analyzing complex big data.
Approach: an alternating penalized-regressions framework with deflation, applicable to any eigenvalue or singular value problem. Any of the following can be mixed and matched:
- Generalizations that permit non-iid noise: Generalized PCA (Allen et al., 2013).
- Non-negativity constraints (today's talk; Zaas, ; Allen and Maletic-Savatic, 2011).
- Higher-order data and multi-way arrays (Allen, 2012; Allen, 2013).
- Structured signal: sparsity and/or smoothness (today's talk).


Acknowledgments

Funding: National Science Foundation, Division of Mathematical Sciences, 1209017 & 1264058.

Software available at: http://www.stat.rice.edu/~gallen/software.html


References

- G. I. Allen, "Sparse and Functional Principal Components Analysis", arXiv:1309.2895, 2013.
- G. I. Allen, L. Grosenick, and J. Taylor, "A Generalized Least Squares Matrix Decomposition", Journal of the American Statistical Association, Theory & Methods, 2014 (to appear).
- G. I. Allen and M. Maletic-Savatic, "Sparse Non-negative Generalized PCA with Applications to Metabolomics", Bioinformatics, 27:21, 3029-3035, 2011.
- G. I. Allen, "Sparse Higher-Order Principal Components Analysis", Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, 2012.
- G. I. Allen, C. Peterson, M. Vannucci, and M. Maletic-Savatic, "Regularized Partial Least Squares with an Application to NMR Spectroscopy", Statistical Analysis and Data Mining, 6:4, 302-314, 2013.
- G. I. Allen, "Multi-way Functional Principal Components Analysis", IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, 2013.
- J. Huang, H. Shen, and A. Buja, "The Analysis of Two-Way Functional Data Using Two-Way Regularized Singular Value Decompositions", Journal of the American Statistical Association, Theory & Methods, 104:488, 2009.
- D. M. Witten, R. Tibshirani, and T. Hastie, "A Penalized Matrix Decomposition, with Applications to Sparse Principal Components and Canonical Correlation Analysis", Biostatistics, 10:3, 515-534, 2009.
