Sources

  • Matrix in R tutorial by Prof. Giovanni Petris.
  • doParallel guide by Weston and Calaway.
  • Miscellaneous internet resources.
  • Before you run a module or chunk, make sure that you have all the required packages installed, as in the snippet below.
  • We need the R packages ‘Matrix’ and ‘doParallel’.
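  • A minimal setup chunk (a sketch; install.packages() only runs if a package is missing):
if (!requireNamespace("Matrix", quietly = TRUE)) install.packages("Matrix")
if (!requireNamespace("doParallel", quietly = TRUE)) install.packages("doParallel")
library(Matrix)
library(doParallel)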

Matrices

  • To arrange values into a matrix, we use the matrix() function:
a <- matrix(1 : 6, nrow = 2, ncol = 3)
a
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
  • Individual entries can be referred to using a pair of indices. For example, the element in the second row, third column can be printed as:
a[2, 3]
## [1] 6

Rows, Columns

  • An entire row or column can be retrieved by specifying its index and leaving the other index empty:
a[2, ]
## [1] 2 4 6
  • Similarly to the way R works with vectors, you can use negative indices to exclude rows or columns of a matrix:
a[-2,]
## [1] 1 3 5

Matrix indices

  • Note how the values 1,2,3,4,5,6 are used to create the matrix: the first two are used to fill the first column, then the next two to fill the second column, and so on.

  • R allows matrices to be indexed by a single number, e.g.,

a[5]
## [1] 5
  • The fifth element of a is the fifth element of the vector obtained by “unrolling” the matrix, column by column.
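  • A quick way to check this equivalence (a sketch): unroll the matrix with as.vector() and compare.
as.vector(a)[5] == a[5]
## [1] TRUE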

Fill by row

  • Sometimes you need to fill a matrix row by row, instead of column by column.
  • You can change the default behavior as follows:
a <- matrix(1 : 6, nrow = 2, ncol = 3, byrow = TRUE)
a
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6

Binding rows and columns

  • We have seen rbind() and cbind() before.
  • For example, we can use cbind() to construct a 3 x 3 Hilbert matrix, whose (i, j) entry is 1/(i + j - 1).
H3 <- 1 / cbind(1 : 3, 2 : 4, 3 : 5)
H3
##           [,1]      [,2]      [,3]
## [1,] 1.0000000 0.5000000 0.3333333
## [2,] 0.5000000 0.3333333 0.2500000
## [3,] 0.3333333 0.2500000 0.2000000
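  • Equivalently (a sketch, not part of the original construction), outer() builds the same matrix directly from the formula for the (i, j) entry:
H3alt <- outer(1:3, 1:3, function(i, j) 1 / (i + j - 1))
all.equal(H3alt, H3)
## [1] TRUE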

Diagonals and Determinants

  • The diagonal of a matrix can be extracted using diag().
diag(H3)
## [1] 1.0000000 0.3333333 0.2000000
  • The determinant of a square matrix can be obtained with det().
det(H3)
## [1] 0.000462963

Triangular

  • The functions lower.tri() and upper.tri() can be used to obtain the lower and upper triangular parts of matrices. The output of the functions is a matrix of logical elements, with TRUE representing the relevant triangular elements. For example,
lower.tri(H3)
##       [,1]  [,2]  [,3]
## [1,] FALSE FALSE FALSE
## [2,]  TRUE FALSE FALSE
## [3,]  TRUE  TRUE FALSE

Triangular

  • A typical use of these functions is to set the upper or lower triangular part of a matrix to zero, thus constructing a triangular matrix.
Htri <- H3
Htri[lower.tri(Htri)] <- 0
Htri
##      [,1]      [,2]      [,3]
## [1,]    1 0.5000000 0.3333333
## [2,]    0 0.3333333 0.2500000
## [3,]    0 0.0000000 0.2000000
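  • Analogously (a sketch), zeroing the entries above the diagonal yields a lower triangular matrix:
Ltri <- H3
Ltri[upper.tri(Ltri)] <- 0  # keep only the diagonal and the lower triangle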

Multiplication

  • Matrix multiplication in R is performed by the operator %*%. For example,
A <- matrix(c(1, 2, 0, 1), 2)
B <- matrix(c(1, 3, 3, 1, 4, 5), nr = 2)
A
##      [,1] [,2]
## [1,]    1    0
## [2,]    2    1
B
##      [,1] [,2] [,3]
## [1,]    1    3    4
## [2,]    3    1    5
  • Matrices must be conformable, i.e. the number of columns of \(A\) must equal the number of rows of \(B\).

Multiplication

A %*% B
##      [,1] [,2] [,3]
## [1,]    1    3    4
## [2,]    5    7   13
dim(A)
## [1] 2 2
ncol(A)
## [1] 2
nrow(B)
## [1] 2

Multiplication

  • With \(A\) and \(B\) defined as above, the matrix product \(BA\) is not defined, but the product \(B^T A\) is (\(B^T\) denotes the transpose of \(B\)).
  • The transpose of a matrix can be obtained with the function t().

t(B) %*% A
##      [,1] [,2]
## [1,]    7    3
## [2,]    5    1
## [3,]   14    5

Crossprod

  • A much more efficient way of computing the same matrix product is via the function crossprod().
all.equal(crossprod(B, A), t(B) %*% A)
## [1] TRUE
  • A similar function, tcrossprod(), computes the matrix product \(AB^T\) whenever the product is well defined.
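  • For example, with a single argument tcrossprod(A) computes \(A A^T\) (a quick check):
all.equal(tcrossprod(A), A %*% t(A))
## [1] TRUE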

Exercise (Try at home)

  • Consider the matrix
X <- cbind(1, seq(-1, 1, length = 11))
X
##       [,1] [,2]
##  [1,]    1 -1.0
##  [2,]    1 -0.8
##  [3,]    1 -0.6
##  [4,]    1 -0.4
##  [5,]    1 -0.2
##  [6,]    1  0.0
##  [7,]    1  0.2
##  [8,]    1  0.4
##  [9,]    1  0.6
## [10,]    1  0.8
## [11,]    1  1.0
  • Use crossprod() and tcrossprod() to find \(X^T X\) and \(X X^T\).
  • Note the dimensions of the resulting matrices.
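  • If you want to verify your answer afterwards, here is a sketch of one solution:
XtX <- crossprod(X)   # t(X) %*% X, a 2 x 2 matrix
XXt <- tcrossprod(X)  # X %*% t(X), an 11 x 11 matrix
dim(XtX)
## [1] 2 2
dim(XXt)
## [1] 11 11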

Matrix inversion

  • The inverse of a matrix can be found using solve():
Ainv <- solve(A)
Ainv
##      [,1] [,2]
## [1,]    1    0
## [2,]   -2    1
all.equal(Ainv %*% A, diag(2))
## [1] TRUE

Solve linear system

  • Matrix inversion in R uses a QR decomposition, which we might briefly discuss at the very end (if time permits)!

  • For linear systems we don’t need to compute inverses. We can use Gaussian Elimination.

b <- c(5, 3)
solve(A, b) # use this: solves the system directly
## [1]  5 -7
Ainv %*% b # don't use this: forming the inverse first is slower and less accurate
##      [,1]
## [1,]    5
## [2,]   -7

Time comparison

  • Compare execution times for the different methods of solving the Least Squares problem when the number of observations is large.

  • Let us start by making up our own random data: an \(n \times p\) design matrix \(X\) and an \(n \times 1\) response vector \(y\).

  • Take \(n = 5 \times 10^5\).

set.seed(123)
n <- 5e5; p <- 20
X <- matrix(rnorm(n * p, mean = 1 : p, sd = 10), nr = n, nc = p, byrow = TRUE)
y <- rowSums(X) + rnorm(n)

OLS

  • The naive approach to solve the LS problem is to compute: \(\hat{\beta} = (X^T X)^{-1}X^T y\)
system.time(bHat1 <- solve(t(X) %*% X) %*% t(X) %*% y)
##    user  system elapsed 
##    0.44    0.03    0.47
  • Not so naive: use crossprod.
system.time(bHat2 <- solve(crossprod(X), crossprod(X, y)))
##    user  system elapsed 
##     0.2     0.0     0.2
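  • As a sanity check (a sketch), the two estimates should agree up to numerical error; a QR-based solver such as qr.solve() is another alternative worth timing:
all.equal(c(bHat1), c(bHat2))       # expected: TRUE
system.time(bHat3 <- qr.solve(X, y))  # least squares via the QR decomposition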

More advanced topics

Inverting a matrix using Cholesky helps

  • A symmetric positive definite matrix can be factorized as: \[ A = L L^T \]
p <- 1000; A <- matrix(rnorm(p * p), p, p)
A <- crossprod(A)  # t(A) %*% A is symmetric positive definite
ptm <- proc.time(); B1 <- solve(A); proc.time() - ptm
##    user  system elapsed 
##    0.81    0.00    0.81
ptm <- proc.time(); B2 <- chol2inv(chol(A)); proc.time() - ptm
##    user  system elapsed 
##    0.22    0.00    0.22
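  • A quick correctness check (a sketch); note that chol2inv() must be given the Cholesky factor chol(A), not A itself:
all.equal(B1, B2)  # both are the inverse of A; expected TRUE up to numerical error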

Storage of sparse matrices

  • Sparse matrices appear in many areas of statistics and machine learning.
    1. For large Markov chains, the transition matrix becomes sparse.
    2. Large design matrices / contingency tables.
    3. Modern datasets with 0 imputed for missing values.
  • If the matrix \(A_{n\times n}\) has on average \(k\) non-zero entries per row and \(k \ll n\), then storage reduces from \(O(n^2)\) to \(O(n)\).

Storage of unstructured sparse matrices

# Demo 2: storing a matrix in sparse format can save a lot of memory
library(Matrix)
m1 <- matrix(0, nrow = 1000, ncol = 1000)                 # dense: stores all 10^6 entries
m2 <- Matrix(0, nrow = 1000, ncol = 1000, sparse = TRUE)  # sparse: stores only non-zeros
object.size(m1)
## 8000136 bytes
object.size(m2)
## 5072 bytes
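  • A sketch of constructing an unstructured sparse matrix with sparseMatrix(), specifying only the non-zero entries (the indices and values below are made up for illustration):
m3 <- sparseMatrix(i = c(1, 500, 1000), j = c(2, 500, 999),
                   x = c(1.5, -2, 3), dims = c(1000, 1000))
object.size(m3)  # still only a few KB: just the non-zero entries are stored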