Package ubic.basecode.math.linearmodels
Class LeastSquaresFit
java.lang.Object
ubic.basecode.math.linearmodels.LeastSquaresFit
Performs "bulk" linear model fits, and also offers simple methods for univariate and multivariate
regression on a single vector of dependent variables (data). Supports eBayes-like shrinkage of variance.
Data with missing values is handled, but less memory-efficiently and somewhat more slowly. The main cost is that only when there are no missing values can a single QR decomposition be performed and reused for all of the data.
Author: paul
-
Constructor Summary
Constructors
Constructor / Description
LeastSquaresFit(DoubleMatrix1D vectorA, DoubleMatrix1D vectorB)
    Least squares fit between two vectors.
LeastSquaresFit(DoubleMatrix1D vectorA, DoubleMatrix1D vectorB, DoubleMatrix1D weights)
    Stripped-down interface for simple use.
LeastSquaresFit(DoubleMatrix2D A, DoubleMatrix2D b)
    ANOVA not possible (use the other constructors)
LeastSquaresFit(DoubleMatrix2D A, DoubleMatrix2D b, DoubleMatrix2D weights)
    Weighted least squares fit between two matrices
LeastSquaresFit(ObjectMatrix<String, String, Object> sampleInfo, DenseDoubleMatrix2D data)
LeastSquaresFit(ObjectMatrix<String, String, Object> sampleInfo, DenseDoubleMatrix2D data, boolean interactions)
LeastSquaresFit(ObjectMatrix<String, String, Object> design, DoubleMatrix<String, String> b)
    NamedMatrix allows easier handling of the results.
LeastSquaresFit(ObjectMatrix<String, String, Object> design, DoubleMatrix<String, String> data, boolean interactions)
    NamedMatrix allows easier handling of the results.
LeastSquaresFit(DesignMatrix designMatrix, DoubleMatrix2D b, DoubleMatrix2D weights)
    Preferred interface for weighted least squares fit between two matrices
LeastSquaresFit(DesignMatrix designMatrix, DoubleMatrix<String, String> data)
    Preferred interface if you want control over how the design is set up.
LeastSquaresFit(DesignMatrix designMatrix, DoubleMatrix<String, String> data, DoubleMatrix2D weights)
    Weighted least squares fit between two matrices
-
Method Summary
Modifier and Type / Method / Description
protected List<GenericAnovaResult> anova()
    Compute ANOVA based on the model fit (Type I SSQ, sequential)
protected void ebayesUpdate(double d, double v, DoubleMatrix1D vp)
    Provide results of limma eBayes algorithm.
getCoefficients
    The matrix of coefficients x for Ax = b (parameter estimates).
double getDfPrior()
int getResidualDof()
double getVarPrior()
boolean isHasBeenShrunken()
boolean isHasMissing()
summarize(boolean anova)
summarizeByKeys(boolean anova)
-
Constructor Details
-
LeastSquaresFit
Preferred interface if you want control over how the design is set up.
Parameters:
designMatrix -
data -
-
-
LeastSquaresFit
public LeastSquaresFit(DesignMatrix designMatrix, DoubleMatrix<String, String> data, DoubleMatrix2D weights)
Weighted least squares fit between two matrices.
Parameters:
designMatrix -
data -
weights - to be used in modifying the influence of the observations in data.
-
LeastSquaresFit
Preferred interface for weighted least squares fit between two matrices.
Parameters:
designMatrix -
b - the data
weights - to be used in modifying the influence of the observations in b.
-
LeastSquaresFit
Least squares fit between two vectors. Always adds an intercept!
Parameters:
vectorA - Design
vectorB - Data
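As an illustration of what a two-vector fit with an automatically added intercept computes, here is a minimal standalone sketch. It does not use basecode; the class `SimpleFitSketch` and its method `fit` are hypothetical names chosen for this example, and the closed-form solution shown is the textbook one, not necessarily how the library computes it.

```java
// Hypothetical helper, not part of basecode: ordinary least squares of y on x,
// with the intercept added implicitly (mirroring "Always adds an intercept!").
final class SimpleFitSketch {

    /** Returns {intercept, slope} minimizing sum((y[i] - intercept - slope * x[i])^2). */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equal-length vectors with at least 2 points");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Centered sums of squares and cross-products.
        double sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxx += dx * dx;
            sxy += dx * (y[i] - meanY);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x is constant; slope is not identifiable");
        }
        double slope = sxy / sxx;
        return new double[] { meanY - slope * meanX, slope };
    }

    public static void main(String[] args) {
        double[] c = fit(new double[] { 1, 2, 3, 4 }, new double[] { 3, 5, 7, 9 });
        System.out.println("intercept=" + c[0] + " slope=" + c[1]);
    }
}
```

The result has two parameters even though only one design vector is supplied, which is the practical consequence of the intercept being added for you.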
-
LeastSquaresFit
Stripped-down interface for simple use. Least squares fit between two vectors. Always adds an intercept!
Parameters:
vectorA - Design
vectorB - Data
weights - to be used in modifying the influence of the observations in vectorB.
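To make the role of the weights concrete, the following standalone sketch fits a weighted line with an intercept. It assumes the common convention that each weight multiplies the squared residual of its observation; whether basecode applies weights in exactly this way is an assumption here. `WeightedFitSketch` and `fit` are hypothetical names, not library API.

```java
// Hypothetical helper, not part of basecode: weighted least squares of y on x
// with an implicit intercept. Assumes weights multiply the squared residuals.
final class WeightedFitSketch {

    /** Returns {intercept, slope} minimizing sum(w[i] * (y[i] - intercept - slope * x[i])^2). */
    static double[] fit(double[] x, double[] y, double[] w) {
        int n = x.length;
        if (y.length != n || w.length != n) {
            throw new IllegalArgumentException("x, y and w must have the same length");
        }
        double sumW = 0, meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            if (w[i] < 0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            sumW += w[i];
            meanX += w[i] * x[i];
            meanY += w[i] * y[i];
        }
        if (sumW == 0) {
            throw new IllegalArgumentException("all weights are zero");
        }
        meanX /= sumW; // weighted means
        meanY /= sumW;

        double sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxx += w[i] * dx * dx;
            sxy += w[i] * dx * (y[i] - meanY);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("weighted x is constant; slope is not identifiable");
        }
        double slope = sxy / sxx;
        return new double[] { meanY - slope * meanX, slope };
    }

    public static void main(String[] args) {
        // The last observation is an outlier, but its weight of 0 removes its influence.
        double[] c = fit(new double[] { 1, 2, 3, 4 }, new double[] { 3, 5, 7, 100 },
                new double[] { 1, 1, 1, 0 });
        System.out.println("intercept=" + c[0] + " slope=" + c[1]);
    }
}
```

Under this convention a weight of zero drops an observation entirely, and equal weights reproduce the unweighted fit.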
-
LeastSquaresFit
ANOVA not possible (use the other constructors)
Parameters:
A - Design matrix, which will be used directly in least squares regression
b - Data matrix, containing data in rows.
-
LeastSquaresFit
Weighted least squares fit between two matrices.
Parameters:
A - Design
b - Data
weights - to be used in modifying the influence of the observations in b. If null, will be ignored.
-
LeastSquaresFit
Parameters:
sampleInfo - information that will be converted to a design matrix; intercept term is added.
data - Data matrix
-
LeastSquaresFit
public LeastSquaresFit(ObjectMatrix<String, String, Object> sampleInfo, DenseDoubleMatrix2D data, boolean interactions)
Parameters:
sampleInfo -
data -
interactions - add interaction term (two-way only is supported)
-
LeastSquaresFit
NamedMatrix allows easier handling of the results.
Parameters:
design - information that will be converted to a design matrix; intercept term is added.
b - Data matrix
-
LeastSquaresFit
public LeastSquaresFit(ObjectMatrix<String, String, Object> design, DoubleMatrix<String, String> data, boolean interactions)
NamedMatrix allows easier handling of the results.
Parameters:
design - information that will be converted to a design matrix; intercept term is added.
data - Data matrix
-
-
Method Details
-
getCoefficients
The matrix of coefficients x for Ax = b (parameter estimates). Each column represents one fitted model (e.g., one gene); there is a row for each parameter.
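The orientation of the coefficient matrix (parameters in rows, fitted models in columns) can be illustrated with a small standalone sketch of a "bulk" fit. It fits every row of a data matrix against one shared predictor and stores the results in the same layout. `BulkFitSketch` and `fitRows` are hypothetical names and plain arrays stand in for the library's matrix types; this is not the basecode implementation.

```java
// Hypothetical helper, not part of basecode: fit each data row against a shared
// predictor and return coefficients with parameters in rows, models in columns.
final class BulkFitSketch {

    /** coef[p][j]: parameter p (0 = intercept, 1 = slope) of the model fitted to data row j. */
    static double[][] fitRows(double[] x, double[][] data) {
        int n = x.length;
        // Quantities that depend only on the design are computed once and reused
        // for every row, which is what makes the bulk case cheap.
        double meanX = 0;
        for (double v : x) {
            meanX += v;
        }
        meanX /= n;
        double sxx = 0;
        for (double v : x) {
            sxx += (v - meanX) * (v - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x is constant; slope is not identifiable");
        }

        double[][] coef = new double[2][data.length];
        for (int j = 0; j < data.length; j++) {
            double[] y = data[j];
            if (y.length != n) {
                throw new IllegalArgumentException("row " + j + " has the wrong length");
            }
            double meanY = 0;
            for (double v : y) {
                meanY += v;
            }
            meanY /= n;
            double sxy = 0;
            for (int i = 0; i < n; i++) {
                sxy += (x[i] - meanX) * (y[i] - meanY);
            }
            double slope = sxy / sxx;
            coef[0][j] = meanY - slope * meanX; // intercept row
            coef[1][j] = slope;                 // slope row
        }
        return coef;
    }

    public static void main(String[] args) {
        double[][] coef = fitRows(new double[] { 0, 1, 2 },
                new double[][] { { 1, 3, 5 }, { 4, 4, 4 } });
        System.out.println(java.util.Arrays.deepToString(coef));
    }
}
```

Reading down a column gives all parameters of one model; reading across a row gives one parameter for every model.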
-
getDfPrior
public double getDfPrior()
-
getFitted
-
getResidualDof
public int getResidualDof()
-
getResidualDofs
-
getResiduals
-
getStudentizedResiduals
Returns:
externally studentized residuals (assumes we have only one QR)
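For readers unfamiliar with the term, the standard definition of an externally studentized residual can be sketched as follows: each residual is scaled by a variance estimate that leaves that observation out. The sketch takes the raw residuals and the leverages (diagonal of the hat matrix) as inputs; `StudentizedResidualSketch` and its method are hypothetical names, and this is the textbook formula rather than a copy of the library code.

```java
// Hypothetical helper, not part of basecode: textbook externally studentized
// residuals from raw residuals and leverages (hat-matrix diagonal).
final class StudentizedResidualSketch {

    /**
     * t[i] = e[i] / sqrt(s2WithoutI * (1 - h[i])), where s2WithoutI is the residual
     * variance estimated with observation i left out.
     */
    static double[] externallyStudentized(double[] residuals, double[] leverages, int numParams) {
        int n = residuals.length;
        int dof = n - numParams - 1; // one fewer than the usual residual dof
        if (leverages.length != n || dof <= 0) {
            throw new IllegalArgumentException("inconsistent inputs or too few observations");
        }
        double rss = 0;
        for (double e : residuals) {
            rss += e * e;
        }
        double[] t = new double[n];
        for (int i = 0; i < n; i++) {
            double oneMinusH = 1.0 - leverages[i];
            // Residual sum of squares after deleting observation i, divided by its dof.
            double s2WithoutI = (rss - residuals[i] * residuals[i] / oneMinusH) / dof;
            t[i] = residuals[i] / Math.sqrt(s2WithoutI * oneMinusH);
        }
        return t;
    }

    public static void main(String[] args) {
        // Residuals and leverages of a straight-line fit to x = {0,1,2,3}, y = {0,1,2,4}.
        double[] t = externallyStudentized(new double[] { 0.2, -0.1, -0.4, 0.3 },
                new double[] { 0.7, 0.3, 0.3, 0.7 }, 2);
        System.out.println(java.util.Arrays.toString(t));
    }
}
```

The leverages depend only on the design, which is presumably why the method assumes a single QR decomposition shared by all fits.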
-
getVarPost
-
getVarPrior
public double getVarPrior()
-
getWeights
-
isHasBeenShrunken
public boolean isHasBeenShrunken()
-
isHasMissing
public boolean isHasMissing()
-
summarize
Returns:
summaries. ANOVA will not be computed. If ebayesUpdate has been run, variance and degrees of freedom estimated using the limma eBayes algorithm will be used.
-
summarize
Parameters:
anova - if true, ANOVA will be computed
-
summarizeByKeys
Parameters:
anova - perform ANOVA, otherwise only basic summarization will be done. If ebayesUpdate has been run, variance and degrees of freedom estimated using the limma eBayes algorithm will be used.
-
anova
Compute ANOVA based on the model fit (Type I SSQ, sequential). The idea is to add up the sums of squares (and dof) for all parameters associated with a particular factor.
This code is more or less ported from R summary.aov.
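The idea of sequential (Type I) sums of squares grouped by factor can be sketched in a few lines. Each design column is orthogonalized against the columns before it, the squared projection of the data onto what remains is that column's sum of squares, and columns belonging to the same factor are added together. The sketch below uses Gram-Schmidt orthogonalization in place of a QR decomposition; `SequentialAnovaSketch`, `sequentialSs` and the `assign` array (named after the attribute R uses to map columns to terms) are illustrative assumptions, not basecode API.

```java
// Hypothetical helper, not part of basecode: sequential (Type I) sums of squares,
// accumulated per factor. Uses Gram-Schmidt instead of a QR decomposition.
final class SequentialAnovaSketch {

    /**
     * design[i][j] is the value of model column j for observation i; assign[j] is the
     * index of the factor that column j belongs to (index 0 can hold the intercept).
     * Returns {ss, dof}, where ss[f] and dof[f] are the totals for factor f.
     */
    static double[][] sequentialSs(double[][] design, int[] assign, double[] y, int numFactors) {
        int n = y.length;
        int p = assign.length;
        double[][] q = new double[p][]; // orthonormal directions found so far
        double[] ss = new double[numFactors];
        double[] dof = new double[numFactors];
        for (int j = 0; j < p; j++) {
            double[] v = new double[n];
            for (int i = 0; i < n; i++) {
                v[i] = design[i][j];
            }
            // Remove whatever the earlier columns already explain.
            for (int k = 0; k < j; k++) {
                if (q[k] == null) {
                    continue;
                }
                double dot = 0;
                for (int i = 0; i < n; i++) {
                    dot += q[k][i] * v[i];
                }
                for (int i = 0; i < n; i++) {
                    v[i] -= dot * q[k][i];
                }
            }
            double norm = 0;
            for (double x : v) {
                norm += x * x;
            }
            norm = Math.sqrt(norm);
            if (norm < 1e-10) {
                continue; // crude check: column is redundant, contributes no dof
            }
            double proj = 0;
            for (int i = 0; i < n; i++) {
                v[i] /= norm;
                proj += v[i] * y[i];
            }
            q[j] = v;
            ss[assign[j]] += proj * proj; // add this parameter's SS to its factor
            dof[assign[j]] += 1;
        }
        return new double[][] { ss, dof };
    }

    public static void main(String[] args) {
        double[][] design = { { 1, 0 }, { 1, 0 }, { 1, 1 }, { 1, 1 } }; // intercept + group
        double[][] r = sequentialSs(design, new int[] { 0, 1 }, new double[] { 1, 3, 5, 7 }, 2);
        System.out.println(java.util.Arrays.deepToString(r));
    }
}
```

Because the sums of squares are sequential, the value attributed to a factor depends on the order of the columns: each factor is credited only with what earlier columns did not already explain.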
-
ebayesUpdate
Provide results of limma eBayes algorithm. These will be used next time summarize is called on this.
Parameters:
d - dfPrior
v - varPrior
vp - varPost
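As background on how the three quantities relate, limma's variance shrinkage defines the posterior variance as a degrees-of-freedom-weighted average of the prior variance and each model's residual variance. The sketch below shows that relationship only; it does not estimate the prior itself, and `EbayesSketch` and its methods are hypothetical names rather than part of basecode or limma.

```java
// Hypothetical helper, not part of basecode: the relationship between dfPrior,
// varPrior and varPost in limma-style variance shrinkage.
final class EbayesSketch {

    /** Posterior variance: average of prior and sample variance, weighted by degrees of freedom. */
    static double posteriorVariance(double dfPrior, double varPrior, double dfResidual, double sampleVar) {
        if (Double.isInfinite(dfPrior)) {
            return varPrior; // infinite prior dof: every variance is replaced by the prior
        }
        return (dfPrior * varPrior + dfResidual * sampleVar) / (dfPrior + dfResidual);
    }

    /**
     * Moderated t-statistic: the posterior variance replaces the residual variance.
     * It is referred to a t distribution with dfPrior + dfResidual degrees of freedom.
     */
    static double moderatedT(double coefficient, double unscaledStdev, double varPost) {
        return coefficient / (unscaledStdev * Math.sqrt(varPost));
    }

    public static void main(String[] args) {
        double varPost = posteriorVariance(4, 1.0, 6, 2.0);
        System.out.println("varPost=" + varPost + " t=" + moderatedT(2.0, 0.5, varPost));
    }
}
```

A larger dfPrior pulls each posterior variance more strongly toward varPrior, which is the shrinkage the class description refers to.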
-