Class GeoConverterImpl
- java.lang.Object
  - ubic.gemma.core.loader.expression.geo.GeoConverterImpl
- All Implemented Interfaces:
GeoConverter, Converter<GeoData,Object>
@Component @Scope("prototype") public class GeoConverterImpl extends Object implements GeoConverter
Converts GEO domain objects into Gemma objects. Conversion is usually triggered by passing in GeoSeries objects. GEO has four basic kinds of objects: Platforms (ArrayDesigns), Samples (BioAssays), Series (Experiments) and DataSets (which are curated Experiments). Note that a sample can belong to more than one series, and a series can include more than one dataset. GEO also supports the concept of a superseries. See http://www.ncbi.nlm.nih.gov/projects/geo/info/soft2.html.
A curated expression data set is at first represented by a GEO "GDS" number (a curated dataset), which maps to a series (GSE). However, multiple datasets may go together to form a series (GSE). This can happen when the "A" and "B" arrays were both run on the same samples. For this reason, conversion normally goes by GSE.
This service can be used in database-aware or database-unaware states. It has prototype scope because it keeps some 'global' data structures that are used during processing.
- Author:
- keshav, pavlidis
-
-
Constructor Summary
GeoConverterImpl()
-
Method Summary
Modifier and Type, Method, Description
void clear() - Remove old results.
Collection<Object> convert(Collection<? extends GeoData> geoObjects)
Object convert(GeoData geoObject)
byte[] convertData(List<Object> vector, QuantitationType qt) - Convert a vector of strings into a byte[] for saving in the database.
void convertSubsetToExperimentalFactor(ExpressionExperiment expExp, GeoSubset geoSubSet) - Converts Geo subsets to experimental factors.
Taxon getPrimaryArrayTaxon(Collection<Taxon> platformTaxa, Collection<String> probeTaxa) - Determines the primary taxon on the array; there are 4 main branches of logic.
protected String makeTitle(String title, String appendix) - Form the title (which will be the experiment name) and ensure it is a valid length.
void setElementLimitForStrictness(int tooManyElements)
void setForceConvertElements(boolean forceConvertElements)
void setSplitByPlatform(boolean splitByPlatform)
String toString()
-
-
-
Method Detail
-
clear
public void clear()
Description copied from interface: GeoConverter
Remove old results. Call this prior to starting conversion of a full dataset.
- Specified by:
clear in interface GeoConverter
-
convert
public Collection<Object> convert(Collection<? extends GeoData> geoObjects)
-
convertSubsetToExperimentalFactor
public void convertSubsetToExperimentalFactor(ExpressionExperiment expExp, GeoSubset geoSubSet)
Description copied from interface: GeoConverter
Converts Geo subsets to experimental factors. This adds a new factor value to the experimental factor of an experimental design, and adds the factor value to each BioMaterial of a specific BioAssay.
- Specified by:
convertSubsetToExperimentalFactor in interface GeoConverter
- Parameters:
expExp
- experiment
geoSubSet
- geo subset
-
getPrimaryArrayTaxon
public Taxon getPrimaryArrayTaxon(Collection<Taxon> platformTaxa, Collection<String> probeTaxa) throws IllegalArgumentException
Determines the primary taxon on the array. There are four main branches of logic:
1. If only one platform taxon is defined on the GEO submission, that taxon is the primary taxon.
2. If multiple taxa are given for the platform, they are checked for a common parent; if they share one, that parent is the primary taxon (e.g. salmonid, where Atlantic salmon and rainbow trout are given).
3. Otherwise the probeTaxa are examined, and the most common probe taxon is taken as the primary taxon.
4. If no taxon is found, an error is thrown.
- Specified by:
getPrimaryArrayTaxon in interface GeoConverter
- Parameters:
platformTaxa
- Collection of taxa that were given on the GEO array submission as platform taxa
probeTaxa
- Collection of taxa strings defining the taxon of each probe on the array.
- Returns:
- Primary taxon of array as determined by this method
- Throws:
IllegalArgumentException
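The branching described for getPrimaryArrayTaxon can be illustrated with a minimal sketch. This is not the Gemma implementation: taxa are modelled as plain strings instead of Taxon objects, and the class name `PrimaryTaxonSketch`, the method `primaryTaxon` and the `parentOf` lookup map are hypothetical names introduced only for illustration. Tie-breaking among equally common probe taxa is also an assumption (the first one encountered wins).

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class PrimaryTaxonSketch {

    /**
     * Hypothetical stand-in for the documented logic. Taxa are plain strings and
     * parentOf maps a taxon name to its parent taxon name (an assumption).
     */
    static String primaryTaxon(Collection<String> platformTaxa,
                               Collection<String> probeTaxa,
                               Map<String, String> parentOf) {
        Set<String> distinct = new LinkedHashSet<>(platformTaxa);

        // Branch 1: exactly one platform taxon, so it is the primary taxon.
        if (distinct.size() == 1) {
            return distinct.iterator().next();
        }

        // Branch 2: several platform taxa that all share one known parent.
        if (distinct.size() > 1) {
            Set<String> parents = new HashSet<>();
            for (String taxon : distinct) {
                parents.add(parentOf.get(taxon)); // null when no parent is known
            }
            if (parents.size() == 1 && !parents.contains(null)) {
                return parents.iterator().next();
            }
        }

        // Branch 3: fall back to the most common probe taxon.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String taxon : probeTaxa) {
            counts.merge(taxon, 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        if (best != null) {
            return best;
        }

        // Branch 4: nothing usable was found.
        throw new IllegalArgumentException("No primary taxon could be determined");
    }
}
```

Under these assumptions, a call with a single platform taxon never consults the probe taxa, which mirrors the ordering of the branches in the description.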
-
setSplitByPlatform
public void setSplitByPlatform(boolean splitByPlatform)
- Specified by:
setSplitByPlatform in interface GeoConverter
- Parameters:
splitByPlatform
- If true, and the series uses more than one platform, split it up. This often isn't necessary/desirable. This is overridden if the series uses more than one species, in which case it is always split up.
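One way to read the splitByPlatform rule is as a small decision function. The sketch below is an interpretation of the parameter description only; the class `SplitDecisionSketch`, the method `shouldSplit` and its count parameters are hypothetical and do not appear in GeoConverterImpl.

```java
public class SplitDecisionSketch {

    /**
     * Hypothetical reading of the documented rule: a multi-species series is
     * always split; otherwise the flag only matters when there are several platforms.
     */
    static boolean shouldSplit(boolean splitByPlatform, int platformCount, int speciesCount) {
        if (speciesCount > 1) {
            return true; // the flag is overridden for multi-species series
        }
        return splitByPlatform && platformCount > 1;
    }
}
```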
-
convertData
public byte[] convertData(List<Object> vector, QuantitationType qt)
Convert a vector of strings into a byte[] for saving in the database. Blanks (missing values) are treated as NaN (double), 0 (integer), false (boolean) or empty strings (string). Other invalid values are treated the same way as missing data, which keeps the parser from failing on unusual GEO files that have values like "Error" for an expression value.
- Specified by:
convertData in interface GeoConverter
- Parameters:
vector
- of Strings to be converted to primitive values (double, int etc)
qt
- The quantitation type for the values to be converted.
- The quantitation type for the values to be converted.
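The treatment of blank and invalid values can be sketched for the double case. This is an illustrative sketch, not Gemma's code: the class `ConvertDataSketch` and its methods are hypothetical, the QuantitationType parameter is left out, and the big-endian 8-byte layout produced by `java.nio.ByteBuffer` is an assumption about encoding, not a statement of how Gemma stores vectors.

```java
import java.nio.ByteBuffer;
import java.util.List;

public class ConvertDataSketch {

    /** Packs each value as an 8-byte double; blanks and unparsable values become NaN. */
    static byte[] doublesToBytes(List<Object> vector) {
        ByteBuffer buffer = ByteBuffer.allocate(vector.size() * Double.BYTES);
        for (Object raw : vector) {
            buffer.putDouble(parseOrNaN(raw));
        }
        return buffer.array();
    }

    /** Returns NaN for null, blank, or invalid input such as "Error". */
    static double parseOrNaN(Object raw) {
        if (raw == null) {
            return Double.NaN;
        }
        String text = raw.toString().trim();
        if (text.isEmpty()) {
            return Double.NaN;
        }
        try {
            return Double.parseDouble(text);
        } catch (NumberFormatException e) {
            return Double.NaN; // invalid values are handled like missing data
        }
    }
}
```

Reading the result back with `ByteBuffer.wrap(bytes).getDouble(index * Double.BYTES)` recovers each value, with NaN marking the positions that were blank or invalid.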
-
setForceConvertElements
public void setForceConvertElements(boolean forceConvertElements)
- Specified by:
setForceConvertElements in interface GeoConverter
- Parameters:
forceConvertElements
- Set the behaviour when a platform that normally would not be loaded in detail is encountered, such as an Exon array.
-
setElementLimitForStrictness
public void setElementLimitForStrictness(int tooManyElements)
- Specified by:
setElementLimitForStrictness in interface GeoConverter
- Parameters:
tooManyElements
- This is here for tests only; the default value should be fine otherwise.
-
-