Class GeoConverterImpl

java.lang.Object
ubic.gemma.core.loader.expression.geo.GeoConverterImpl
All Implemented Interfaces:
GeoConverter, Converter<GeoData, Identifiable>

@Component @Scope("prototype") public class GeoConverterImpl extends Object implements GeoConverter
Convert GEO domain objects into Gemma objects.

Usually we trigger this by passing in GeoSeries objects.

GEO has four basic kinds of objects: Platforms (ArrayDesign), Samples (BioMaterial), Series (ExpressionExperiment) and DataSets (which are curated ExpressionExperiment). Note that a sample can belong to more than one series. A series can include more than one dataset. GEO also supports the concept of a super-series. See SOFT submission instructions.

A curated expression data set is at first represented by a GEO "GDS" number (a curated dataset), which maps to a series (GSE). However, multiple datasets may together form a single series (GSE), for example when the "A" and "B" arrays were both run on the same samples. For this reason we normally work at the GSE level.

This service can be used in database-aware or database-unaware states. It has prototype scope because it keeps some 'global' data structures that are used during processing.

Author:
kesv, pavlidis
  • Constructor Details

    • GeoConverterImpl

      public GeoConverterImpl()
  • Method Details

    • clear

      public void clear()
      Description copied from interface: GeoConverter
      Clear the state of the converter.

      Call this prior to starting conversion of a full dataset.

      Specified by:
      clear in interface GeoConverter
    • convert

      public Collection<Identifiable> convert(Collection<? extends GeoData> geoObjects)
      Specified by:
      convert in interface Converter<GeoData, Identifiable>
      Specified by:
      convert in interface GeoConverter
    • convert

      public <T extends Identifiable> Collection<T> convert(Collection<? extends GeoData> geoObjects, Class<T> dataType)
      Description copied from interface: GeoConverter
      Convert a collection of GeoData objects, retaining only elements of the specified data type.
      Specified by:
      convert in interface GeoConverter
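The "retaining only elements of the specified data type" behaviour can be illustrated with a minimal sketch. It works on plain objects rather than Gemma's GeoData and Identifiable types, and the class and method names (TypeFilterSketch, retainOfType) are hypothetical; how GeoConverterImpl actually filters is not stated here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration only; not part of the Gemma API.
final class TypeFilterSketch {

    // Keep only the elements that are instances of dataType, preserving order.
    static <T> Collection<T> retainOfType(Collection<?> converted, Class<T> dataType) {
        List<T> result = new ArrayList<>();
        for (Object o : converted) {
            if (dataType.isInstance(o)) {
                // Class.cast is a checked cast, so no unchecked warning is needed.
                result.add(dataType.cast(o));
            }
        }
        return result;
    }
}
```

With this shape, a caller asking for ExpressionExperiment.class would get back a Collection<ExpressionExperiment> without casting each element.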
    • convert

      public Identifiable convert(GeoData geoObject)
      Specified by:
      convert in interface Converter<GeoData, Identifiable>
      Specified by:
      convert in interface GeoConverter
    • convert

      public ArrayDesign convert(GeoPlatform geoPlatform)
      Specified by:
      convert in interface GeoConverter
    • convert

      public Collection<ExpressionExperiment> convert(GeoSeries geoSeries)
      Specified by:
      convert in interface GeoConverter
    • convert

      public Collection<ExpressionExperiment> convert(GeoSeries geoSeries, boolean skipDataVectors)
      Specified by:
      convert in interface GeoConverter
    • convert

      public ExpressionExperiment convert(GeoDataset geoDataset, boolean skipDataVectors)
      Specified by:
      convert in interface GeoConverter
    • convertSubsetToExperimentalFactor

      public void convertSubsetToExperimentalFactor(ExpressionExperiment expExp, GeoSubset geoSubSet)
      Description copied from interface: GeoConverter
      Converts GEO subsets to experimental factors. This adds a new factor value to the experimental factor of an experimental design, and adds the factor value to each BioMaterial of a specific BioAssay.
      Specified by:
      convertSubsetToExperimentalFactor in interface GeoConverter
      Parameters:
      expExp - experiment
      geoSubSet - geo subset
    • getPrimaryArrayTaxon

      public Taxon getPrimaryArrayTaxon(Collection<Taxon> platformTaxa, Collection<String> probeTaxa) throws IllegalArgumentException
      Determines the primary taxon of the array. There are four main branches of logic:
      1. If only one platform taxon is defined on the GEO submission, that is the primary taxon.
      2. If multiple taxa are given for the platform, they are checked for a common parent; if they share one, that parent is the primary taxon (e.g. salmonid where atlantic salmon and rainbow trout are given).
      3. Otherwise, the probeTaxa are examined and the most common probe taxon is taken as the primary taxon.
      4. If no taxon is found, an error is thrown.
      Specified by:
      getPrimaryArrayTaxon in interface GeoConverter
      Parameters:
      platformTaxa - Collection of taxa that were given on the GEO array submission as platform taxa
      probeTaxa - Collection of taxa strings defining the taxon of each probe on the array.
      Returns:
      Primary taxon of array as determined by this method
      Throws:
      IllegalArgumentException
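The four branches described for getPrimaryArrayTaxon can be sketched roughly as follows. This is an illustration under assumptions, not the Gemma implementation: SimpleTaxon is a hypothetical stand-in for Taxon with a nullable parent, the known map from probe taxon strings to taxa is invented for the example, and ties between equally common probe taxa go to whichever reaches the top count first.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of the Gemma API.
final class PrimaryTaxonSketch {

    /** Stand-in for a taxon; parent may be null. */
    static final class SimpleTaxon {
        final String name;
        final SimpleTaxon parent;

        SimpleTaxon(String name, SimpleTaxon parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    static SimpleTaxon pick(Collection<SimpleTaxon> platformTaxa,
                            Collection<String> probeTaxa,
                            Map<String, SimpleTaxon> known) {
        // Branch 1: a single platform taxon is the primary taxon.
        if (platformTaxa.size() == 1) {
            return platformTaxa.iterator().next();
        }
        // Branch 2: several platform taxa that all share one parent.
        if (platformTaxa.size() > 1) {
            Set<SimpleTaxon> parents = new HashSet<>();
            boolean allHaveParent = true;
            for (SimpleTaxon t : platformTaxa) {
                if (t.parent == null) {
                    allHaveParent = false;
                } else {
                    parents.add(t.parent);
                }
            }
            if (allHaveParent && parents.size() == 1) {
                return parents.iterator().next();
            }
        }
        // Branch 3: the most common probe taxon among those we can resolve.
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String p : probeTaxa) {
            if (!known.containsKey(p)) {
                continue; // unknown strings are ignored in this sketch
            }
            int c = counts.merge(p, 1, Integer::sum);
            if (c > bestCount) {
                best = p;
                bestCount = c;
            }
        }
        if (best != null) {
            return known.get(best);
        }
        // Branch 4: nothing found.
        throw new IllegalArgumentException("No taxon could be determined");
    }
}
```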
    • setSplitByPlatform

      public void setSplitByPlatform(boolean splitByPlatform)
      Specified by:
      setSplitByPlatform in interface GeoConverter
      Parameters:
      splitByPlatform - If true, and the series uses more than one platform, split it up. This often isn't necessary/desirable. This is overridden if the series uses more than one species, in which case it is always split up.
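The interaction between splitByPlatform and the species override can be written as a small predicate. This is a sketch of the rule as described above, with hypothetical names (SplitDecisionSketch, shouldSplit) and plain counts standing in for whatever the converter inspects on the series.

```java
// Hypothetical illustration only; not part of the Gemma API.
final class SplitDecisionSketch {

    static boolean shouldSplit(boolean splitByPlatform, int platformCount, int speciesCount) {
        // More than one species always forces a split, regardless of the flag.
        if (speciesCount > 1) {
            return true;
        }
        // Otherwise split only when requested and there is more than one platform.
        return splitByPlatform && platformCount > 1;
    }
}
```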
    • setForceConvertElements

      public void setForceConvertElements(boolean forceConvertElements)
      Specified by:
      setForceConvertElements in interface GeoConverter
      Parameters:
      forceConvertElements - Set the behaviour when a platform that normally would not be loaded in detail is encountered, such as an Exon array.
    • setElementLimitForStrictness

      public void setElementLimitForStrictness(int tooManyElements)
      Specified by:
      setElementLimitForStrictness in interface GeoConverter
      Parameters:
      tooManyElements - this is here for tests only. The default value should be okay otherwise.
    • makeTitle

      protected String makeTitle(String title, String appendix)
      Forms the title (which will be the experiment name) and ensures it is a valid length.
      Parameters:
      appendix - can be null; e.g. species or platform name added when we are splitting up a record.
      Returns:
      title
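One way such a title could be formed is sketched below. Everything specific here is an assumption made for illustration: the 255-character limit, the " - " separator, and the choice to shorten the title rather than the appendix are not taken from the Gemma source.

```java
// Hypothetical illustration only; not part of the Gemma API.
final class TitleSketch {

    // Assumed maximum length; the real limit is not stated in this documentation.
    static final int MAX_LENGTH = 255;

    static String makeTitle(String title, String appendix) {
        // A null or empty appendix contributes nothing.
        String suffix = (appendix == null || appendix.isEmpty()) ? "" : " - " + appendix;
        // Shorten the title first so the appendix (e.g. species or platform) survives.
        int room = Math.max(0, MAX_LENGTH - suffix.length());
        String head = title.length() > room ? title.substring(0, room) : title;
        String full = head + suffix;
        // Guard against an appendix that is itself longer than the limit.
        return full.length() > MAX_LENGTH ? full.substring(0, MAX_LENGTH) : full;
    }
}
```

Keeping the appendix intact matters when a record is split, since the appendix is what distinguishes the resulting experiments from one another.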