Our intention is to use sparse matrix multiplications to help reduce computation time.

Given a large sparse matrix `X`

, we want to compute some matrix multiplications associated with a scaled \(\tilde{X}\). We notice that after scaling, `X`

becomes a dense matrix and is not possible for a sparse matrix multiplication. So we construct formulae to apply sparse matrix multiplication first on a standardized `X`

since standardization does not affect its sparsity. Then we perform centering to get the same result.

There are two types of matrix multiplications we want to investigate:

Compute \(\tilde{X}b\), where \(\tilde{X}\) is an n by p scaled matrix and \(b\) is a p vector.

Compute \(\tilde{X}^Ty\), where \(\tilde{X}\) is an n by p scaled matrix and \(y\) is an n vector.

This strategy has a decent performance when computing both \(\tilde{X}b\) and \(\tilde{X}^Ty\), compared to simple matrix multiplication `%*%`

.

Suppose we want to compute \(\tilde{X}b\), where \(\tilde{X}\) is a scaled n by p matrix and \(b\) is a p vector. Our goal is to express \(\tilde{X}b\) into a term involving unscaled \(X\) matrix multiplication to achieve sparse matrix operation.

\[\begin{equation} \begin{aligned} \tilde{X}b &= \sum_{j=1}^{p} \tilde{X}_{\cdot j} b_j \\ &= \sum_{j=1}^{p} \frac{X_{\cdot j}-\mu_j}{\sigma_j}b_j \\ &= \sum_{j=1}^{p}\frac{X_{\cdot j}}{\sigma_j}b_j - \sum_{j=1}^{p} \frac{\mu_j}{\sigma_j}b_j \\ &= X b / \sigma - \mu^Tb/\sigma, \end{aligned} \end{equation}\]

where \(\mu\) is a *p*-vector of column means, and \(\sigma\) is a *p*-vector of column standard deviations.

Suppose we want to compute \(\tilde{X}^Ty\), where \(\tilde{X}\) is a scaled n by p matrix and \(y\) is an n vector. Similarly, we express \(\tilde{X}^Ty\) using unscaled \(X\) so that we can perform sparse matrix multiplication. We have the following:

\[\begin{equation} \begin{aligned} \tilde{X}^Ty &= \sum_{i=1}^{n} \tilde{X}_{i.}y_i \\ &= \sum_{i=1}^{n} \frac{X_{i.} - \mu}{\sigma}y_i \\ &= \frac{1}{\sigma}\sum_{i=1}^{n}X_{i.}y_i - \frac{\mu}{\sigma}\sum_{i=1}^{n} y_i \\ &= \frac{1}{\sigma}(X^Ty) - \frac{\mu}{\sigma}y^T 1, \end{aligned} \end{equation}\]

where \(\mu\) is a *p*-vector of column means, and \(\sigma\) is a *p*-vector of columnwise standard deviations.

We simulate an `n = 1000`

by `p = 10000`

matrix `X`

at sparsity \(99\%\), i.e. \(99\%\) entries are zeros. We compare results between normal matrix computation and our sparse strategy as well as comparing speed using microbenchmark.

```
create_sparsity_mat <- function(sparsity, n, p) {
nonzero <- round(n*p*(1-sparsity))
nonzero.idx <- sample(n*p, nonzero)
mat <- numeric(n*p)
mat[nonzero.idx] <- 1
mat <- matrix(mat, nrow=n,ncol=p)
return(mat)
}
n <- 1000
p <- 10000
```

```
X.dense <- create_sparsity_mat(0.99,n,p)
X.sparse <- as(X.dense,"dgCMatrix")
X.tilde <- susieR:::set_X_attributes(X.dense) #returns a scaled X if input is a dense matrix
X <- susieR:::set_X_attributes(X.sparse) #return an unsacled sparse X if input is a sparse matrix
#but computes column means and standard deviations
```

The final results of two methods when computing \(\tilde{X}b\) are very close.

```
compute_Xb_benchmark <- microbenchmark(
dense = (use.normal.Xb <- X.tilde%*%b),
sparse = (use.sparse.Xb <- susieR:::compute_Xb(X,b)),
times = 20,unit = "s")
```

Our sparse strategy demonstrates an obvious advantage over the normal matrix multiplication in computing \(\tilde{X}b\).

```
autoplot(compute_Xb_benchmark)
# Coordinate system already present. Adding new coordinate system, which will replace the existing one.
```

The final results of two methods when computing \(\tilde{X}^Ty\) are almost the same.

```
compute_Xty_benchmark = microbenchmark(
dense = (use.normal.Xty <- t(X.tilde)%*%y),
sparse = (use.sparse.Xty <- susieR:::compute_Xty(X, y)),
times = 20,unit = "s")
```

Our sparse strategy evidently has a better performance than the normal method in computing \(\tilde{X}^Ty\).

```
autoplot(compute_Xty_benchmark)
# Coordinate system already present. Adding new coordinate system, which will replace the existing one.
```