Linear modeling of high-dimensional designed data based on ASCA/APCA family of methods
This package provides a method to fit linear models to high-dimensional designed data. The method handles unbalanced designs. More features should be included in the future (e.g. generalized linear models, random effects).
The core functions of the package are:
data2LmpDataList
Converts data to an lmpDataList, the input argument for lmpModelMatrix.
lmpModelMatrix
Creates the model matrix \(\mathbf{X}\) from the design matrix and the model formula.
lmpEffectMatrices
Estimates the model by OLS based on the outcomes and model matrices provided in the outputs of the lmpModelMatrix function, and calculates the estimated effect matrices \(\hat{\mathbf{M}}_0, \hat{\mathbf{M}}_1, \ldots, \hat{\mathbf{M}}_F\) and the residual matrix \(\hat{\mathbf{E}}\). It also calculates the type III percentage of variance explained by each effect.
lmpBootstrapTests
Tests the significance of one or a combination of the model effects using bootstrap. This function is based on the outputs of the lmpEffectMatrices function.
lmpPcaEffects
Performs a PCA on each of the effect matrices from the outputs of lmpEffectMatrices. It has an option to choose the method applied: ASCA, APCA or ASCA-E. Combined effects (i.e. linear combinations of original effect matrices) can also be created and decomposed by PCA.
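The steps described above (model matrix, OLS estimation, effect matrices, PCA per effect) can be illustrated with a minimal base-R sketch. It does not call limpca and is not the package's internal implementation: the simulated design, the factor names A and B, the sum coding of the factors and the helper effectMatrix are all assumptions made for the example, and only a plain ASCA-style PCA of one effect matrix is shown.

```r
# Illustrative base-R sketch only; limpca's own functions are not used here.
set.seed(1)

# Hypothetical balanced design: factors A (2 levels) and B (3 levels), 2 replicates.
design <- expand.grid(A = factor(c("a1", "a2")),
                      B = factor(c("b1", "b2", "b3")),
                      rep = 1:2)
n <- nrow(design)

# Hypothetical outcomes matrix Y: n observations by 5 response variables.
Y <- matrix(rnorm(n * 5), nrow = n, ncol = 5)

# Model matrix X built from the design and a model formula (sum coding assumed).
X <- model.matrix(~ A + B, data = design,
                  contrasts.arg = list(A = "contr.sum", B = "contr.sum"))

# OLS estimate of the parameter matrix: Theta = (X'X)^{-1} X'Y.
Theta <- solve(crossprod(X), crossprod(X, Y))

# Columns of X belonging to each model term: 0 = intercept, 1 = A, 2 = B.
term <- attr(X, "assign")

# Effect matrix of term k: the columns of X for that term times the
# matching rows of Theta (hypothetical helper, defined for this example).
effectMatrix <- function(k) {
  cols <- which(term == k)
  X[, cols, drop = FALSE] %*% Theta[cols, , drop = FALSE]
}

M0 <- effectMatrix(0)  # intercept (overall mean) matrix
MA <- effectMatrix(1)  # effect matrix of factor A
MB <- effectMatrix(2)  # effect matrix of factor B
E  <- Y - X %*% Theta  # residual matrix

# PCA of one effect matrix by singular value decomposition.
svdB     <- svd(MB)
scoresB  <- svdB$u %*% diag(svdB$d)
loadingB <- svdB$v
```

By construction the effect matrices and the residual matrix add up to Y, and the effect matrix of B has rank at most 2 because B has three levels.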
The functions for visualising the linear model results are:
lmpScreePlot
Provides a barplot of the percentage of variance associated with the PCs of the effect matrices, ordered by importance, based on the outputs of lmpContributions.
lmpContributions
Reports the contribution of each effect to the total variance, as well as the contribution of each PC to the total variance per effect. These contributions are also summarized in a barplot.
lmpScorePlot
Draws the score plots of each effect matrix provided in the lmpPcaEffects function output.
lmpLoading1dPlot or lmpLoading2dPlot
Plots the loadings as a line plot (1D) or in 2D as a scatterplot.
lmpScoreScatterPlotM
Plots the scores of all model effects simultaneously in a scatterplot matrix. By default, the first PC only is kept for each model effect.
lmpEffectPlot
Plots the ASCA scores by effect levels for a given model effect and for one PC at a time. This graph is especially useful for interpreting interactions or combined effects.
Other useful functions to visualise the multivariate data and explore them by PCA are:
plotDesign
Provides a graphical representation of the experimental design. It can be used to visualize factor levels and check the design balance.
plotScatter
Produces a plot describing the relationship between two columns of the outcomes matrix \(\mathbf{Y}\). Colors and symbols can be chosen for the levels of the design factors. Ellipses, polygons or segments can be added to group different sets of points on the graph.
plotScatterM
Produces a scatter plot matrix between the selected columns of the outcomes matrix \(\mathbf{Y}\) choosing specific colors and symbols for up to four factors from the design on the upper and lower diagonals.
plotMeans
Draws, for a given response variable, a plot of the response means by levels of up to three categorical factors from the design. When the design is balanced, it can be used to visualize main effects or interactions for the response of interest. For unbalanced designs, this plot must be used with caution.
plotLine
Generates the response profile of one or more observations, i.e. plots of one or more rows of the outcomes matrix on the y-axis against the m response variables on the x-axis. Depending on the response type (spectra, gene expression...), point, line or segment plots can be used.
pcaBySvd
Performs a principal component analysis on the \(\mathbf{Y}\) outcome/response matrix by a singular value decomposition. Outputs can be visualised with the functions pcaScorePlot, pcaLoading1dPlot, pcaLoading2dPlot and pcaScreePlot.
pcaScorePlot
Produces score plots from the pcaBySvd output.
pcaLoading1dPlot or pcaLoading2dPlot
Plots the PCA loadings as a line plot (1D) or in 2D as a scatterplot.
pcaScreePlot
Returns a bar plot of the percentage of variance explained by each Principal Component (PC) calculated by pcaBySvd.
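As an illustration of PCA by singular value decomposition and of the quantities shown in score, loading and scree plots, here is a minimal base-R sketch. It is not the code of pcaBySvd: the simulated matrix, the column-centring step and all object names are assumptions made for the example.

```r
# Illustrative base-R sketch only; it does not reproduce pcaBySvd itself.
set.seed(2)

# Hypothetical outcomes matrix Y: 20 observations by 6 response variables.
Y <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)

# Centre each column (assumed pre-processing step for this example).
Yc <- scale(Y, center = TRUE, scale = FALSE)

# Singular value decomposition: Yc = U D V'.
sv <- svd(Yc)

scores   <- sv$u %*% diag(sv$d)  # observation coordinates on the PCs
loadings <- sv$v                 # variable weights defining each PC

# Percentage of variance explained by each PC (what a scree plot displays).
pctVar <- 100 * sv$d^2 / sum(sv$d^2)
print(round(pctVar, 1))
```

The scores multiplied by the transposed loadings give back the centred matrix, and the percentages of variance are in decreasing order and sum to 100.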
Details
Package: limpca
Type: Package
License: GPL-2
See the package vignettes (vignette(package = "limpca")) for detailed case studies.
References
Thiel, M., Benaiche, N., Martin, M., Franceschini, S., Van Oirbeek, R., Govaerts, B. (2023). limpca: An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods. Journal of Chemometrics. e3482. https://doi.org/10.1002/cem.3482
Martin, M. (2020). Uncovering informative content in metabolomics data: from pre-processing of 1H NMR spectra to biomarkers discovery in multifactorial designs. Prom.: Govaerts, B. PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, UCLouvain, Belgium. http://hdl.handle.net/2078.1/227671
Thiel, M., Feraud, B., Govaerts, B. (2017). ASCA+ and APCA+: Extensions of ASCA and APCA in the analysis of unbalanced multifactorial designs. Journal of Chemometrics. 31:e2895. https://doi.org/10.1002/cem.2895