Class AbstractMultiAssayExpressionDataMatrix<T>
- All Implemented Interfaces:
BulkExpressionDataMatrix<T>, ExpressionDataMatrix<T>, MultiAssayBulkExpressionDataMatrix<T>
- Direct Known Subclasses:
EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
Implementation note: The underlying DoubleMatrixNamed is indexed by Integers, which are in turn mapped to BioAssays etc. held here. Thus the 'names' of the underlying matrix are just numbers.
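The indirection described in the implementation note can be pictured with a small sketch. This is not Gemma's code: `IntegerIndexedMatrixSketch` and its String labels are hypothetical stand-ins for the underlying matrix and the BioAssay objects. It only illustrates how integer 'names' on a matrix can be mapped back to domain objects held next to it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a matrix whose columns are "named" by integers,
// with side tables mapping each integer to a domain object and back.
class IntegerIndexedMatrixSketch {
    private final double[][] values;
    // column index -> label (a String stands in for a BioAssay)
    private final List<String> assayByColumn = new ArrayList<>();
    // label -> column index, for reverse lookups
    private final Map<String, Integer> columnByAssay = new HashMap<>();

    IntegerIndexedMatrixSketch(double[][] values, List<String> assays) {
        this.values = values;
        for (String assay : assays) {
            columnByAssay.put(assay, assayByColumn.size());
            assayByColumn.add(assay);
        }
    }

    String assayForColumn(int column) {
        return assayByColumn.get(column);
    }

    // Returns -1 for an unknown assay (an assumption of this sketch).
    int columnIndex(String assay) {
        return columnByAssay.getOrDefault(assay, -1);
    }

    double get(int row, String assay) {
        int column = columnIndex(assay);
        if (column < 0) {
            throw new IllegalArgumentException("Unknown assay: " + assay);
        }
        return values[row][column];
    }
}
```

Callers never see the integer keys directly; they ask by assay and the class translates to a column index.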
- Author:
- pavlidis
-
Constructor Summary
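Constructors (Modifier, Constructor, Description)
- protected AbstractMultiAssayExpressionDataMatrix(AbstractMultiAssayExpressionDataMatrix<T> sourceMatrix): Copy constructor.
- protected AbstractMultiAssayExpressionDataMatrix: Create a simple multi-assay matrix.
- protected AbstractMultiAssayExpressionDataMatrix(ExpressionExperiment ee, Collection<BioAssayDimension> dimension)
-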
Method Summary
Methods (Modifier and Type, Method, Description)
- protected void addToRowMaps(BulkExpressionDataVector vector)
- protected void addToRowMaps(CompositeSequence designElement, QuantitationType qt, BioAssayDimension dim): Add a design element to the row maps.
- int columns(): Obtain the total number of columns.
- int columns (overload taking a design element): Number of columns that use the given design element.
- protected abstract String format(int row, int column): Format the value at the provided indices of the matrix.
- protected String formatRepresentation: Produce a string representation of the type of values held in the matrix.
- get(CompositeSequence designElement, BioAssay bioAssay): Access a single value of the matrix.
- getBestBioAssayDimension: Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
- getBioAssayDimension: Obtain the dimension for the columns of this matrix.
- getBioAssayDimension(CompositeSequence designElement): Produce a BioAssayDimension representing the matrix columns for a specific row.
- getBioAssayDimensions: Obtain all the BioAssayDimensions that are used in this matrix.
- getBioAssayForColumn(int index): Obtain an assay corresponding to a given column.
- getBioAssaysForColumn(int index)
- getBioMaterialForColumn(int index): Obtain a biomaterial corresponding to a column.
- getBioMaterials
- T[] getColumn: Access a single column of the matrix.
- int getColumnIndex(BioAssay bioAssay): Obtain the column index of a given assay.
- int getColumnIndex(BioMaterial bioMaterial)
- protected String getColumnLabel(int j): Obtain a label suitable for describing a column of the matrix.
- getDesignElementForRow(int index): Return a design element for a given index.
- getDesignElements: Obtain all the design elements in this data matrix.
- getExpressionExperiment: Return the expression experiment this matrix is holding data for, if known.
- getQuantitationType: Obtain the quantitation type for this matrix.
- getQuantitationType(CompositeSequence designElement): Return the quantitation type used for data from the given design element.
- getQuantitationTypes: Return the quantitation types for this matrix.
- T[] getRow(CompositeSequence designElement): Return a row that 'came from' the given design element.
- getRowElement(int index)
- getRowElements
- int getRowIndex(CompositeSequence designElement)
- int[] getRowIndices(CompositeSequence designElement): Obtain all the rows that correspond to the given design element, or null if the design element is not found.
- protected String getRowLabel(int i): Obtain a label suitable for describing a row of the matrix.
- int rows()
- protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors): Selects all the vectors passed in (uses them to initialize the data)
- protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
- protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
- protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, QuantitationType quantitationType)
- protected void setUpColumnElements(): Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension.
- protected void setUpColumnElements(LinkedHashMap<BioMaterial, Set<BioAssay>> bioMaterialMap)
Methods inherited from class ubic.gemma.core.datastructure.matrix.AbstractExpressionDataMatrix
format, format, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface ubic.gemma.core.datastructure.matrix.BulkExpressionDataMatrix
getRawMatrix, hasMissingValues, sliceColumns, sliceColumns
Methods inherited from interface ubic.gemma.core.datastructure.matrix.ExpressionDataMatrix
get, getColumn, getRow, sliceRows
-
Constructor Details
-
AbstractMultiAssayExpressionDataMatrix
Create a simple multi-assay matrix. Add rows with addToRowMaps(CompositeSequence, QuantitationType, BioAssayDimension) and call setUpColumnElements() once you are done, during the constructor of the subclass.
-
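The order of calls described for the simple constructor (register rows first, then set up the columns, all inside the subclass constructor) might look roughly like the sketch below. Every name in it is hypothetical: `RowsThenColumnsSketch`, `addRow` and `setUpColumns` only stand in for the real base class, addToRowMaps and setUpColumnElements, and the check that rejects late rows is an assumption of the sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified base class (not the Gemma API): rows are registered
// first, then the columns are fixed once, all from the subclass constructor.
abstract class RowsThenColumnsSketch {
    private final List<String> rowElements = new ArrayList<>();
    private final List<String> columnElements = new ArrayList<>();
    private boolean columnsSetUp = false;

    // Stands in for addToRowMaps(CompositeSequence, QuantitationType, BioAssayDimension).
    protected void addRow(String designElement) {
        if (columnsSetUp) {
            // Assumption of this sketch: rows may not be added after column setup.
            throw new IllegalStateException("Columns were already set up");
        }
        rowElements.add(designElement);
    }

    // Stands in for setUpColumnElements().
    protected void setUpColumns(List<String> samples) {
        columnElements.addAll(samples);
        columnsSetUp = true;
    }

    int rows() {
        return rowElements.size();
    }

    int columns() {
        return columnElements.size();
    }
}

// A concrete subclass that follows the documented order in its constructor.
class SimpleMatrixSketch extends RowsThenColumnsSketch {
    SimpleMatrixSketch(List<String> designElements, List<String> samples) {
        for (String designElement : designElements) {
            addRow(designElement);
        }
        setUpColumns(samples);
    }
}
```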
AbstractMultiAssayExpressionDataMatrix
protected AbstractMultiAssayExpressionDataMatrix(@Nullable ExpressionExperiment ee, Collection<BioAssayDimension> dimension)
-
AbstractMultiAssayExpressionDataMatrix
protected AbstractMultiAssayExpressionDataMatrix(AbstractMultiAssayExpressionDataMatrix<T> sourceMatrix)
Copy constructor.
-
-
Method Details
-
getBioAssayDimensions
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain all the BioAssayDimensions that are used in this matrix.
- Specified by:
getBioAssayDimensions in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the dimension for the columns of this matrix.
- Specified by:
getBioAssayDimension in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBestBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
- Specified by:
getBestBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
- Returns:
the best BioAssayDimension for this matrix, or Optional.empty() if no such dimension exists
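One way to read "the largest dimension that covers all the biomaterials" is sketched below. This is an interpretation, not the actual implementation: `BestDimensionSketch` is a hypothetical name, and each dimension is modelled as a plain list of biomaterial names rather than a BioAssayDimension.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: a "dimension" is modelled as a list of biomaterial names.
class BestDimensionSketch {
    // Among the dimensions containing every biomaterial of the matrix, pick the
    // largest; the result is empty when no dimension covers them all.
    static Optional<List<String>> best(Collection<List<String>> dimensions,
                                       Set<String> matrixBioMaterials) {
        return dimensions.stream()
                .filter(dimension -> dimension.containsAll(matrixBioMaterials))
                .max(Comparator.comparingInt(List::size));
    }
}
```

Because the result is an Optional, callers are pushed to handle the "no covering dimension" case explicitly, for example with `orElseThrow` or `ifPresent`.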
-
getBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Produce a BioAssayDimension representing the matrix columns for a specific row. The designElement argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
- Specified by:
getBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
designElement - de
- Returns:
the dimension applicable to the design element or null if the design element is not present in the matrix
-
getBioMaterials
- Specified by:
getBioMaterials in interface BulkExpressionDataMatrix<T>
-
columns
public int columns()
Description copied from interface: ExpressionDataMatrix
Obtain the total number of columns.
- Specified by:
columns in interface ExpressionDataMatrix<T>
-
columns
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
- Specified by:
columns in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
el - el
- Returns:
int
-
getBioAssaysForColumn
- Specified by:
getBioAssaysForColumn in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index - i
- Returns:
bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
-
getBioAssayForColumn
Description copied from interface: BulkExpressionDataMatrix
Obtain an assay corresponding to a given column.
- Specified by:
getBioAssayForColumn in interface BulkExpressionDataMatrix<T>
-
getBioMaterialForColumn
Description copied from interface: BulkExpressionDataMatrix
Obtain a biomaterial corresponding to a column.
- Specified by:
getBioMaterialForColumn in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioMaterialForColumn in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index - i
- Returns:
BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
-
getColumn
Description copied from interface: BulkExpressionDataMatrix
Access a single column of the matrix.
- Specified by:
getColumn in interface BulkExpressionDataMatrix<T>
- Returns:
a vector for the given column, or null if the column is not present
-
getColumnIndex
- Specified by:
getColumnIndex in interface BulkExpressionDataMatrix<T>
- Specified by:
getColumnIndex in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
bioMaterial - bm
- Returns:
the index of the column for the data for the bioMaterial, or -1 if missing
-
rows
public int rows()
- Specified by:
rows in interface ExpressionDataMatrix<T>
- Returns:
int
-
getRow
Description copied from interface: ExpressionDataMatrix
Return a row that 'came from' the given design element.
- Specified by:
getRow in interface ExpressionDataMatrix<T>
- Parameters:
designElement - de
- Returns:
the corresponding row or null if the design element is not found in the matrix
-
getDesignElements
Description copied from interface: ExpressionDataMatrix
Obtain all the design elements in this data matrix.
- Specified by:
getDesignElements in interface ExpressionDataMatrix<T>
-
getDesignElementForRow
Description copied from interface: ExpressionDataMatrix
Return a design element for a given index.
- Specified by:
getDesignElementForRow in interface ExpressionDataMatrix<T>
-
get
Description copied from interface: BulkExpressionDataMatrix
Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, it is possible for this method to retrieve data that does not come from the bioAssay and/or designElement arguments.
- Specified by:
get in interface BulkExpressionDataMatrix<T>
- Parameters:
designElement - de
bioAssay - ba
- Returns:
the value at the given design element and bioassay, or null if the value is missing
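The caveat about retrieving data that did not come from the requested bioassay follows from several assays sharing one column. The sketch below illustrates that effect only. `SharedCellLookupSketch` and its String keys are hypothetical, and returning null for an unknown row or column is an assumption of the sketch; the documentation only states that null is returned when the value is missing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: two assays that share a sample resolve to the same
// column, so a lookup by either assay reads the same cell.
class SharedCellLookupSketch {
    private final Map<String, Integer> rowByElement = new HashMap<>();
    private final Map<String, Integer> columnByAssay = new HashMap<>();
    private final Double[][] values;

    SharedCellLookupSketch(Double[][] values) {
        this.values = values;
    }

    void mapRow(String designElement, int row) {
        rowByElement.put(designElement, row);
    }

    void mapColumn(String assay, int column) {
        columnByAssay.put(assay, column);
    }

    // Returns the cell value, or null when the row, the column or the value
    // itself is missing (the first two cases are assumptions of this sketch).
    Double get(String designElement, String assay) {
        Integer row = rowByElement.get(designElement);
        Integer column = columnByAssay.get(assay);
        if (row == null || column == null) {
            return null;
        }
        return values[row][column];
    }
}
```

If `assayP1` and `assayP2` are both mapped to column 0, asking for a probe's value under `assayP2` returns whatever is stored in that column, even if it was measured by `assayP1`.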
-
getExpressionExperiment
Description copied from interface: ExpressionDataMatrix
Return the expression experiment this matrix is holding data for, if known.
- Specified by:
getExpressionExperiment in interface ExpressionDataMatrix<T>
-
getQuantitationTypes
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Return the quantitation types for this matrix. Often (usually) there will be just one.
- Specified by:
getQuantitationTypes in interface MultiAssayBulkExpressionDataMatrix<T>
-
getQuantitationType
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the quantitation type for this matrix. In the case of multi-assay matrices, more than one quantitation type may be present. When possible, those are merged with QuantitationTypeUtils.mergeQuantitationTypes(Collection).
- Specified by:
getQuantitationType in interface ExpressionDataMatrix<T>
- Specified by:
getQuantitationType in interface MultiAssayBulkExpressionDataMatrix<T>
-
getQuantitationType
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Return the quantitation type used for data from the given design element.
- Specified by:
getQuantitationType in interface MultiAssayBulkExpressionDataMatrix<T>
- Returns:
the quantitation type applicable for the row or null if the design element is not present in the matrix
-
getRowElements
- Specified by:
getRowElements in interface ExpressionDataMatrix<T>
- Returns:
list of elements representing the row 'labels'.
-
getRowIndex
- Specified by:
getRowIndex in interface ExpressionDataMatrix<T>
- Returns:
the index for the given design element, or -1 if not found
-
getRowIndices
Description copied from interface: ExpressionDataMatrix
Obtain all the rows that correspond to the given design element, or null if the design element is not found.
- Specified by:
getRowIndices in interface ExpressionDataMatrix<T>
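The row-lookup methods use two different "not found" conventions: getRowIndices is documented to return null, while getRowIndex is documented to return -1. The sketch below mirrors those conventions with a hypothetical `RowIndexSketch` keyed by Strings. It models the general interface contract, where a design element may own several rows; the duplicate check documented for addToRowMaps suggests that in this class the array would typically hold a single index. Returning the first row from getRowIndex is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two "not found" conventions for row lookups.
class RowIndexSketch {
    private final Map<String, List<Integer>> rowsByElement = new HashMap<>();
    private int nextRow = 0;

    // Append a row for the element; an element may own several rows here.
    void addRow(String designElement) {
        rowsByElement.computeIfAbsent(designElement, k -> new ArrayList<>()).add(nextRow++);
    }

    // All rows for the element, or null if the element is unknown.
    int[] getRowIndices(String designElement) {
        List<Integer> rows = rowsByElement.get(designElement);
        if (rows == null) {
            return null;
        }
        return rows.stream().mapToInt(Integer::intValue).toArray();
    }

    // First row for the element, or -1 if the element is unknown.
    int getRowIndex(String designElement) {
        int[] rows = getRowIndices(designElement);
        return rows == null ? -1 : rows[0];
    }
}
```

Callers of the array-returning method need a null check before iterating, whereas callers of the int-returning method need to test for a negative index.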
-
getRowElement
- Specified by:
getRowElement in interface ExpressionDataMatrix<T>
-
getColumnIndex
Obtain the column index of a given assay.
- Specified by:
getColumnIndex in interface BulkExpressionDataMatrix<T>
- Returns:
the index, or -1 if not found
-
format
Format the value at the provided indices of the matrix.
- Specified by:
format in class AbstractExpressionDataMatrix<T>
-
addToRowMaps
-
addToRowMaps
protected void addToRowMaps(CompositeSequence designElement, QuantitationType qt, BioAssayDimension dim)
Add a design element to the row maps.
- Parameters:
qt - The quantitation type for this design element.
dim - The dimension for this design element.
- Throws:
IllegalStateException - if the row or design element is already mapped.
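The documented failure mode (an IllegalStateException when the row or the design element is already mapped) can be illustrated with a two-way map guard. This is a sketch under assumptions: `RowMapSketch` is hypothetical, it takes the row number explicitly, and it ignores the quantitation type and dimension that the real method also records.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a two-way row map that rejects duplicate mappings.
class RowMapSketch {
    private final Map<String, Integer> rowByElement = new HashMap<>();
    private final Map<Integer, String> elementByRow = new HashMap<>();

    void addToRowMaps(int row, String designElement) {
        // Check both directions before touching either map, so a failed call
        // leaves the maps unchanged.
        if (elementByRow.containsKey(row)) {
            throw new IllegalStateException("Row " + row + " is already mapped");
        }
        if (rowByElement.containsKey(designElement)) {
            throw new IllegalStateException(designElement + " is already mapped");
        }
        elementByRow.put(row, designElement);
        rowByElement.put(designElement, row);
    }

    int rows() {
        return elementByRow.size();
    }
}
```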
-
setUpColumnElements
protected void setUpColumnElements()
Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension. Thus this code is overly complex. If an experiment has multiple BioAssayDimensions (due to multiple arrays), we merge the vectors (e.g., needed in the last case shown below). However, the issue of having multiple "BioMaterials" per "BioAssay" still exists.
Deals with the fact that the bioassay dimensions can vary in size, and don't even need to overlap in the biomaterials used. In the case where there is a single BioAssayDimension this reduces to simply associating each column with a bioassay (though we are forced to use an integer under the hood).
For example, in the following diagram "-" indicates a biomaterial, while "*" indicates a bioassay. Each row of "*" indicates samples run on a different microarray design (a different bio assay material). In the examples we assume there is just a single biomaterial dimension.
    ---------------
    *****              -- only a few samples run on this platform
    **********         -- ditto
               ****    -- these samples were not run on any of the other platforms
A simpler case:
    ---------------
    ***************
    ***********
    *******
A more typical and easy case (one microarray design used):
    ----------------
    ****************
If every sample was run on two different array designs:
    ----------------
    ****************
    ****************
Every sample was run on a different array design:
    -----------------------
    ******
          *********
                   ********
Because there can be limited or no overlap between the bioassay dimensions, we cannot assume the dimensions of the matrix will be defined by the longest BioAssayDimension. Note that later in processing, this possible lack of overlap is fixed by sample matching or vector merging; this class has to deal with the general case.
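The general case described above, where dimensions differ in size and may not overlap, can be handled by giving each biomaterial one column and letting every assay of that biomaterial share it. The sketch below shows that idea only and is not the actual implementation. `ColumnSetupSketch` is hypothetical, each dimension is modelled as an ordered assay-to-biomaterial map, and columns are numbered in first-seen order, which is an assumption. The intermediate map has the same shape as the `LinkedHashMap<BioMaterial, Set<BioAssay>>` argument of the second setUpColumnElements overload.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: derive matrix columns from several assay dimensions.
class ColumnSetupSketch {
    // Each dimension is an ordered assay -> biomaterial map (a stand-in for a
    // BioAssayDimension). The result groups assays by the biomaterial they used.
    static LinkedHashMap<String, Set<String>> assaysByBioMaterial(
            List<LinkedHashMap<String, String>> dimensions) {
        LinkedHashMap<String, Set<String>> result = new LinkedHashMap<>();
        for (LinkedHashMap<String, String> dimension : dimensions) {
            for (Map.Entry<String, String> entry : dimension.entrySet()) {
                result.computeIfAbsent(entry.getValue(), k -> new LinkedHashSet<>())
                        .add(entry.getKey());
            }
        }
        return result;
    }

    // One column per biomaterial, in first-seen order; all assays of the same
    // biomaterial share that column.
    static Map<String, Integer> columnByAssay(LinkedHashMap<String, Set<String>> grouped) {
        Map<String, Integer> columns = new LinkedHashMap<>();
        int column = 0;
        for (Set<String> assays : grouped.values()) {
            for (String assay : assays) {
                columns.put(assay, column);
            }
            column++;
        }
        return columns;
    }
}
```

With this scheme the number of columns equals the number of distinct biomaterials across all dimensions, so it does not depend on the longest dimension.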
-
setUpColumnElements
-
selectVectors
protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors)
Selects all the vectors passed in (uses them to initialize the data)
-
selectVectors
protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
-
selectVectors
protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
-
selectVectors
protected List<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, QuantitationType quantitationType)
-
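Only the single-argument selectVectors overload is described in this page. A plausible reading of the overloads that also take quantitation types is that they keep the vectors whose quantitation type is among those given. The sketch below illustrates that reading and nothing more: the filtering rule is an assumption, and `VectorSelectionSketch` and `VectorStub` are hypothetical stand-ins for the real class and for BulkExpressionDataVector.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch of selecting vectors, optionally by quantitation type.
class VectorSelectionSketch {
    // Stand-in for BulkExpressionDataVector: just a design element and a
    // quantitation type name.
    static final class VectorStub {
        final String designElement;
        final String quantitationType;

        VectorStub(String designElement, String quantitationType) {
            this.designElement = designElement;
            this.quantitationType = quantitationType;
        }
    }

    // Selects all the vectors passed in.
    static List<VectorStub> selectVectors(Collection<? extends VectorStub> vectors) {
        return new ArrayList<>(vectors);
    }

    // Assumed behaviour: keep only vectors whose quantitation type is requested.
    static List<VectorStub> selectVectors(Collection<? extends VectorStub> vectors,
                                          Collection<String> qTypes) {
        List<VectorStub> selected = new ArrayList<>();
        for (VectorStub vector : vectors) {
            if (qTypes.contains(vector.quantitationType)) {
                selected.add(vector);
            }
        }
        return selected;
    }
}
```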
formatRepresentation
Description copied from class: AbstractExpressionDataMatrix
Produce a string representation of the type of values held in the matrix.
- Overrides:
formatRepresentation in class AbstractExpressionDataMatrix<T>
-
getRowLabel
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a row of the matrix.
- Specified by:
getRowLabel in class AbstractExpressionDataMatrix<T>
-
getColumnLabel
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a column of the matrix.
- Specified by:
getColumnLabel in class AbstractExpressionDataMatrix<T>
-