Linear modeling of high-dimensional designed data based on ASCA/APCA family of methods
This package aims to provide a method for fitting linear models to high-dimensional designed data. The method handles unbalanced designs. More features should be included in the future (e.g. generalized linear models, random effects, ...).
The core functions of the package are:
data2LmpDataList: Converts data to an lmpDataList, the input argument for lmpModelMatrix.
lmpModelMatrix: Creates the model matrix \(\mathbf{X}\) from the design matrix and the model formula.
lmpEffectMatrices: Estimates the model by OLS based on the outcomes and model matrices provided in the outputs of the lmpModelMatrix function and calculates the estimated effect matrices \(\hat{\mathbf{M}}_0, \hat{\mathbf{M}}_1, \ldots, \hat{\mathbf{M}}_F\) and the residual matrix \(\hat{\mathbf{E}}\). It also calculates the type III percentage of variance explained by each effect.
lmpBootstrapTests: Tests the significance of one or a combination of the model effects using bootstrap. This function is based on the outputs of the lmpEffectMatrices function.
lmpPcaEffects: Performs a PCA on each of the effect matrices from the outputs of lmpEffectMatrices. It has an option to choose the method applied: ASCA, APCA or ASCA-E. Combined effects (i.e. linear combinations of original effect matrices) can also be created and decomposed by PCA.
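The chain described above (model matrix, OLS fit, effect matrices, PCA per effect) can be illustrated with a minimal base-R sketch. It does not call the package and is not its implementation: the toy design, the simulated outcomes and all object names (design, Y, X, B_hat, effects, E_hat) are assumptions made for illustration, and sum-to-zero coding is chosen here as one common way to obtain separable effect matrices.

```r
# Toy balanced 2 x 2 design with 3 replicates and 5 simulated responses
# (made-up data, for illustration only).
set.seed(1)
design <- expand.grid(A = factor(c("a1", "a2")),
                      B = factor(c("b1", "b2")),
                      rep = 1:3)
n <- nrow(design)
m <- 5
Y <- matrix(rnorm(n * m), nrow = n, ncol = m)
# Add an artificial effect of factor A on the first two responses.
Y[design$A == "a2", 1:2] <- Y[design$A == "a2", 1:2] + 2

# Model matrix X with sum-to-zero coding for both factors.
X <- model.matrix(~ A * B, data = design,
                  contrasts.arg = list(A = "contr.sum", B = "contr.sum"))

# OLS estimate of the parameter matrix (one column per response).
B_hat <- solve(crossprod(X), crossprod(X, Y))

# One effect matrix per model term: the columns of X belonging to the term
# times the matching rows of B_hat.
term_id <- attr(X, "assign")  # 0 = intercept, 1 = A, 2 = B, 3 = A:B
term_names <- c("Intercept", attr(terms(~ A * B), "term.labels"))
effects <- lapply(sort(unique(term_id)), function(k) {
  cols <- which(term_id == k)
  X[, cols, drop = FALSE] %*% B_hat[cols, , drop = FALSE]
})
names(effects) <- term_names

# Residual matrix.
E_hat <- Y - X %*% B_hat

# ASCA-style step: PCA of the effect matrix of A by SVD.
sv <- svd(effects[["A"]])
scores_A <- sweep(sv$u, 2, sv$d, "*")
loadings_A <- sv$v

# APCA-style variant: decompose the effect matrix augmented with residuals.
sv_apca <- svd(effects[["A"]] + E_hat)
```

In this sketch the effect matrices and the residuals add up to Y, and the effect matrix of the two-level factor A has rank one, so a single PC carries all of its variance.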
The functions allowing the visualisation of the Linear Models results are:
lmpScreePlot: Provides a barplot of the percentage of variance associated with the PCs of the effect matrices, ordered by importance, based on the outputs of lmpContributions.
lmpContributions: Reports the contribution of each effect to the total variance, as well as the contribution of each PC to the total variance per effect. These contributions are also summarized in a barplot.
lmpScorePlot: Draws the score plots of each effect matrix provided in the lmpPcaEffects function output.
lmpLoading1dPlot or lmpLoading2dPlot: Plots the loadings as a line plot (1D) or in 2D as a scatterplot.
lmpScoreScatterPlotM: Plots the scores of all model effects simultaneously in a scatterplot matrix. By default, only the first PC is kept for each model effect.
lmpEffectPlot: Plots the ASCA scores by effect levels for a given model effect and for one PC at a time. This graph is especially useful for interpreting interactions or combined effects.
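As a rough illustration of the quantities behind the contribution and scree plots, the base-R sketch below computes the share of total variance carried by each effect, and the share of each PC within one effect, for a balanced toy design. It is an assumption-laden sketch and not the package's code: the data are simulated, the object names are hypothetical, and the simple sums of squares used here only coincide with type III results because the design is balanced.

```r
# Balanced toy design: factor A (2 levels), factor B (3 levels), 2 replicates.
set.seed(2)
design <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:2)
Y <- matrix(rnorm(nrow(design) * 4), ncol = 4)  # 4 simulated responses

# Additive model with sum-to-zero coding, fitted by least squares.
X <- model.matrix(~ A + B, data = design,
                  contrasts.arg = list(A = "contr.sum", B = "contr.sum"))
B_hat <- qr.solve(X, Y)
term_id <- attr(X, "assign")  # 0 = intercept, 1 = A, 2 = B

# Effect matrices and residuals.
M_A <- X[, term_id == 1, drop = FALSE] %*% B_hat[term_id == 1, , drop = FALSE]
M_B <- X[, term_id == 2, drop = FALSE] %*% B_hat[term_id == 2, , drop = FALSE]
E_hat <- Y - X %*% B_hat

# Percentage of the total (column-centred) variation carried by each part.
ss <- function(M) sum(M^2)
Yc <- scale(Y, center = TRUE, scale = FALSE)
pct_effect <- 100 * c(A = ss(M_A), B = ss(M_B), Residuals = ss(E_hat)) / ss(Yc)

# Within effect B: percentage of the effect's variation carried by each PC.
d_B <- svd(M_B)$d
pct_pc_B <- 100 * d_B^2 / sum(d_B^2)
```

Calling barplot(pct_effect) or barplot(pct_pc_B) gives a display in the spirit of the plots listed above. For unbalanced designs the effect sums of squares no longer add up in this simple way.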
Other useful functions to visualise the multivariate data and explore them by PCA are:
plotDesign: Provides a graphical representation of the experimental design. It allows visualizing factor levels and checking the design balance.
plotScatter: Produces a plot describing the relationship between two columns of the outcomes matrix \(\mathbf{Y}\). Colors and symbols can be chosen for the levels of the design factors. Ellipses, polygons or segments can be added to group different sets of points on the graph.
plotScatterM: Produces a scatter plot matrix of the selected columns of the outcomes matrix \(\mathbf{Y}\), with specific colors and symbols for up to four factors from the design on the upper and lower diagonals.
plotMeans: Draws, for a given response variable, a plot of the response means by levels of up to three categorical factors from the design. When the design is balanced, it allows visualizing main effects or interactions for the response of interest. For unbalanced designs, this plot must be used with caution.
plotLine: Generates the response profile of one or more observations, i.e. plots of one or more rows of the outcomes matrix on the y-axis against the m response variables on the x-axis. Depending on the response type (spectra, gene expression...), point, line or segment plots can be used.
pcaBySvd: Performs a principal component analysis on the \(\mathbf{Y}\) outcome/response matrix by a singular value decomposition. Outputs can be visualised with the functions pcaScorePlot, pcaLoading1dPlot, pcaLoading2dPlot and pcaScreePlot.
pcaScorePlot: Produces score plots from the pcaBySvd output.
pcaLoading1dPlot or pcaLoading2dPlot: Plots the PCA loadings as a line plot (1D) or in 2D as a scatterplot.
pcaScreePlot: Returns a bar plot of the percentage of variance explained by each Principal Component (PC) calculated by pcaBySvd.
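For readers unfamiliar with PCA computed through a singular value decomposition, here is a minimal base-R sketch of the idea. It uses simulated data and hypothetical object names, and it makes no claim about how pcaBySvd is implemented or what its output looks like. Column centring before the decomposition is an assumption of this sketch.

```r
# Simulated outcome matrix: 20 observations, 6 response variables.
set.seed(3)
Y <- matrix(rnorm(20 * 6), nrow = 20)

# Centre each column, then decompose Yc = U D V'.
Yc <- scale(Y, center = TRUE, scale = FALSE)
sv <- svd(Yc)

scores <- sweep(sv$u, 2, sv$d, "*")       # U D: coordinates of the observations
loadings <- sv$v                          # V: directions in variable space
pct_var <- 100 * sv$d^2 / sum(sv$d^2)     # percentage of variance per PC
```

Plotting the first two columns of scores gives a score plot, the columns of loadings give loading plots, and barplot(pct_var) gives a scree-type display. Up to the sign of each component, the scores match those returned by prcomp(Y)$x.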
Details
| Package: | limpca | 
| Type: | Package | 
| License: | GPL-2 | 
See the package vignettes (vignette(package = "limpca")) for detailed case studies.
References
Thiel, M., Benaiche, N., Martin, M., Franceschini, S., Van Oirbeek, R., Govaerts, B. (2023). limpca: An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods. Journal of Chemometrics. e3482. https://doi.org/10.1002/cem.3482
Martin, M. (2020). Uncovering informative content in metabolomics data: from pre-processing of 1H NMR spectra to biomarkers discovery in multifactorial designs. Prom.: Govaerts, B. PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, UCLouvain, Belgium. http://hdl.handle.net/2078.1/227671
Thiel, M., Feraud, B., Govaerts, B. (2017). ASCA+ and APCA+: Extensions of ASCA and APCA in the analysis of unbalanced multifactorial designs. Journal of Chemometrics. 31:e2895. https://doi.org/10.1002/cem.2895