public interface ArrayDesignSequenceProcessingService
Modifier and Type | Method and Description |
---|---|
void | assignSequencesToDesignElements(Collection<CompositeSequence> designElements, Collection<BioSequence> sequences)<br>Associate sequences with an array design. |
void | assignSequencesToDesignElements(Collection<CompositeSequence> designElements, File fastaFile)<br>Associate sequences with an array design. |
void | assignSequencesToDesignElements(Collection<CompositeSequence> designElements, InputStream fastaFile) |
Collection<BioSequence> | processAffymetrixDesign(ArrayDesign arrayDesign, InputStream probeSequenceFile, Taxon taxon)<br>Use this to add sequences to an existing Affymetrix design. |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceFile, InputStream sequenceIdentifierFile, SequenceType sequenceType, Taxon taxon)<br>Read from a FASTA file when the sequence file lacks any way to link the sequences back to the probes. |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceFile, SequenceType sequenceType)<br>The sequence file must provide an unambiguous way to associate the sequences with design elements on the array. |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceFile, SequenceType sequenceType, Taxon taxon)<br>The sequence file must provide an unambiguous way to associate the sequences with design elements on the array. |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceIdentifierFile, String[] databaseNames, String blastDbHome, Taxon taxon, boolean force)<br>Intended for use with array designs that use sequences that are in GenBank, but the accessions need to be assigned after the array is already in the system. |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceIdentifierFile, String[] databaseNames, String blastDbHome, Taxon taxon, boolean force, FastaCmd fc) |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, String[] databaseNames, boolean force) |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, String[] databaseNames, String blastDbHome, boolean force)<br>For the case where the sequences are retrieved simply by the GenBank accession. |
Collection<BioSequence> | processArrayDesign(ArrayDesign arrayDesign, String[] databaseNames, String blastDbHome, boolean force, FastaCmd fc)<br>Provided primarily for testing. |
BioSequence | processSingleAccession(String sequenceId, String[] databaseNames, String blastDbHome, boolean force)<br>Update a single sequence in the system. |
Taxon | validateTaxon(Taxon taxon, ArrayDesign arrayDesign) |
void assignSequencesToDesignElements(Collection<CompositeSequence> designElements, Collection<BioSequence> sequences)
Parameters:
sequences - for Affymetrix, these should be the collapsed probe sequences.
designElements - design elements

void assignSequencesToDesignElements(Collection<CompositeSequence> designElements, File fastaFile) throws IOException
Parameters:
designElements - design elements
fastaFile - fasta file
Throws:
IOException - when IO problems occur.

void assignSequencesToDesignElements(Collection<CompositeSequence> designElements, InputStream fastaFile) throws IOException
Throws:
IOException

Collection<BioSequence> processAffymetrixDesign(ArrayDesign arrayDesign, InputStream probeSequenceFile, Taxon taxon) throws IOException
Parameters:
arrayDesign - An existing ArrayDesign that already has compositeSequences filled in.
probeSequenceFile - InputStream from a tab-delimited probe sequence file.
taxon - validated taxon
Throws:
IOException - when IO problems occur.

Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceFile, SequenceType sequenceType) throws IOException
Parameters:
sequenceFile - FASTA format
sequenceType - e.g., SequenceType.DNA (generic), SequenceType.AFFY_PROBE, or SequenceType.OLIGO.
arrayDesign - platform
Throws:
IOException - when IO problems occur.
See Also:
FastaParser

Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceFile, SequenceType sequenceType, Taxon taxon) throws IOException
Parameters:
sequenceFile - FASTA, Affymetrix or tabbed format (depending on the type)
sequenceType - e.g., SequenceType.DNA (generic), SequenceType.AFFY_PROBE, or SequenceType.OLIGO.
taxon - if null, attempt to determine it from the array design.
arrayDesign - platform
Throws:
IOException - when IO problems occur.
See Also:
FastaParser
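To illustrate what "an unambiguous way to associate the sequences with design elements" can mean for FASTA input, the sketch below reads FASTA records into a map keyed by identifier and rejects duplicate identifiers. It is not Gemma's FastaParser. The class `FastaSketch`, the method `readFasta`, and the rule that the identifier is the first whitespace-delimited token of the header line are all assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Gemma: reads FASTA records keyed by identifier. */
final class FastaSketch {

    /**
     * Returns identifier -> sequence in file order. The identifier is assumed to be
     * the first whitespace-delimited token after '>' (an assumption of this sketch).
     */
    static Map<String, String> readFasta(InputStream in) throws IOException {
        Map<String, StringBuilder> records = new LinkedHashMap<>();
        StringBuilder current = null;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                if (line.startsWith(">")) {
                    String id = line.substring(1).trim().split("\\s+")[0];
                    if (id.isEmpty()) {
                        throw new IOException("FASTA header without an identifier");
                    }
                    if (records.containsKey(id)) {
                        // A repeated identifier would make the mapping ambiguous.
                        throw new IOException("Duplicate identifier: " + id);
                    }
                    current = new StringBuilder();
                    records.put(id, current);
                } else {
                    if (current == null) {
                        throw new IOException("Sequence data before the first header");
                    }
                    current.append(line); // sequences may span several lines
                }
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        records.forEach((id, seq) -> result.put(id, seq.toString()));
        return result;
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">probe_1 example record\nACGT\nTT\n>probe_2\nGGCC\n";
        InputStream in = new ByteArrayInputStream(fasta.getBytes(StandardCharsets.UTF_8));
        System.out.println(readFasta(in));
    }
}
```

Keying the map by identifier makes the association with design elements a simple lookup, provided the identifiers match the probe names on the platform.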
Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceFile, InputStream sequenceIdentifierFile, SequenceType sequenceType, Taxon taxon) throws IOException
Parameters:
arrayDesign - platform
sequenceFile - FASTA
sequenceIdentifierFile - two columns of probe ids and sequence IDs (the same ones in the sequenceFile)
taxon - if null, attempt to determine it from the array design
Throws:
IOException
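The two-column identifier file described above can be pictured as a simple probe-to-sequence lookup. The following is a minimal sketch of reading such a file, not the parser Gemma uses. The class `IdentifierFileSketch`, the method `readProbeToSequenceId`, the tab delimiter, and the identifiers in the sample data are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Gemma: reads a two-column probe/sequence id file. */
final class IdentifierFileSketch {

    /**
     * Returns probe id -> sequence id in file order. Columns are assumed to be
     * tab-delimited; if a probe id repeats, the last line wins.
     */
    static Map<String, String> readProbeToSequenceId(InputStream in) throws IOException {
        Map<String, String> probeToSequenceId = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split("\t");
                if (fields.length < 2 || fields[0].trim().isEmpty() || fields[1].trim().isEmpty()) {
                    throw new IOException("Line " + lineNumber
                            + ": expected a probe id and a sequence id separated by a tab");
                }
                probeToSequenceId.put(fields[0].trim(), fields[1].trim());
            }
        }
        return probeToSequenceId;
    }

    public static void main(String[] args) throws IOException {
        // Made-up identifiers, for illustration only.
        String content = "probe_1\tseq_A\nprobe_2\tseq_B\n";
        InputStream in = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        System.out.println(readProbeToSequenceId(in));
    }
}
```

The sequence ids in the second column would then be looked up among the records read from the sequence file.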
Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceIdentifierFile, String[] databaseNames, String blastDbHome, Taxon taxon, boolean force) throws IOException
Parameters:
sequenceIdentifierFile - Sequence file has two columns: column 1 is a probe id, column 2 is a GenBank accession or sequence name, delimited by tab. Sequences will be fetched from BLAST databases if possible; ones missing will be sought directly in Gemma.
force - If true and a matching BioSequence already exists in the system, any existing sequence information in that BioSequence will be overwritten.
arrayDesign - platform
taxon - taxon
blastDbHome - blast db home
databaseNames - database names
Throws:
IOException - when IO problems occur.

Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, InputStream sequenceIdentifierFile, String[] databaseNames, String blastDbHome, Taxon taxon, boolean force, FastaCmd fc) throws IOException
Throws:
IOException
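The effect of the `force` flag can be summarized as a small decision rule. The sketch below is one reading of the parameter description, not Gemma's implementation: the class `ForceSketch` and the method `chooseSequence` are hypothetical, and two behaviours are assumed rather than documented here, namely that an existing non-empty sequence is kept when `force` is false and that nothing changes when no new sequence was retrieved.

```java
/** Hypothetical helper, not part of Gemma: decides which sequence value to keep. */
final class ForceSketch {

    /**
     * @param existing sequence already stored, possibly null or empty
     * @param fetched  newly retrieved sequence, possibly null or empty
     * @param force    whether an existing non-empty sequence may be overwritten
     * @return the sequence value that should be stored afterwards
     */
    static String chooseSequence(String existing, String fetched, boolean force) {
        boolean hasExisting = existing != null && !existing.isEmpty();
        boolean hasFetched = fetched != null && !fetched.isEmpty();
        if (!hasFetched) {
            return existing; // assumption: nothing retrieved, so nothing changes
        }
        if (hasExisting && !force) {
            return existing; // assumption: keep stored data unless forced
        }
        return fetched; // empty record, or overwrite requested with force
    }

    public static void main(String[] args) {
        System.out.println(chooseSequence("AAAA", "CCCC", false));
        System.out.println(chooseSequence("AAAA", "CCCC", true));
        System.out.println(chooseSequence("", "CCCC", false));
    }
}
```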
Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, String[] databaseNames, boolean force)

Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, String[] databaseNames, String blastDbHome, boolean force)
Parameters:
databaseNames - the names of the BLAST-formatted databases to search (e.g., nt, est_mouse)
blastDbHome - where to find the blast databases for sequence retrieval
force - If true, then when an existing BioSequence contains a non-empty sequence value, it will be overwritten with a new one.
arrayDesign - platform

Collection<BioSequence> processArrayDesign(ArrayDesign arrayDesign, String[] databaseNames, String blastDbHome, boolean force, FastaCmd fc)
Parameters:
databaseNames - the names of the BLAST-formatted databases to search (e.g., nt, est_mouse)
blastDbHome - where to find the blast databases for sequence retrieval
force - If true, then when an existing BioSequence contains a non-empty sequence value, it will be overwritten with a new one.
arrayDesign - platform
fc - fasta command

BioSequence processSingleAccession(String sequenceId, String[] databaseNames, String blastDbHome, boolean force)
Parameters:
force - If true and a matching BioSequence already exists in the system, any existing sequence information in that BioSequence will be overwritten.
databaseNames - database names
blastDbHome - blast db home
sequenceId - sequence id

Taxon validateTaxon(Taxon taxon, ArrayDesign arrayDesign) throws IllegalArgumentException
Throws:
IllegalArgumentException
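This page gives no description for validateTaxon, so the following is only a sketch of one plausible rule, based on the `taxon` parameter notes above ("if null, attempt to determine it from the array design") and the declared IllegalArgumentException. The class `TaxonSketch`, the method `resolveTaxon`, the use of plain strings for taxa, and the choice to fail unless the platform has exactly one taxon are assumptions, not Gemma's documented behaviour.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

/** Hypothetical helper, not part of Gemma: resolves the taxon to use for a platform. */
final class TaxonSketch {

    /**
     * @param requestedTaxon taxon supplied by the caller, or null
     * @param platformTaxa   taxa associated with the platform (stand-in for ArrayDesign)
     * @return the taxon to use
     * @throws IllegalArgumentException if no taxon is supplied and none can be inferred
     */
    static String resolveTaxon(String requestedTaxon, Collection<String> platformTaxa) {
        if (requestedTaxon != null) {
            return requestedTaxon; // caller's choice wins
        }
        if (platformTaxa != null && platformTaxa.size() == 1) {
            return platformTaxa.iterator().next(); // unambiguous, so infer it
        }
        // Assumption: zero or several taxa cannot be resolved automatically.
        throw new IllegalArgumentException(
                "Taxon was not supplied and could not be determined from the platform");
    }

    public static void main(String[] args) {
        System.out.println(resolveTaxon(null, Collections.singletonList("mouse")));
        System.out.println(resolveTaxon("human", Arrays.asList("human", "mouse")));
    }
}
```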
Copyright © 2005–2023 Pavlidis lab, Michael Smith Laboratories and Department of Psychiatry, University of British Columbia. All rights reserved.