Package ubic.gemma.core.goldenpath
Class GoldenPathSequenceAnalysis
- java.lang.Object
-
- ubic.gemma.core.goldenpath.GoldenPath
-
- ubic.gemma.core.goldenpath.GoldenPathSequenceAnalysis
-
public class GoldenPathSequenceAnalysis extends GoldenPath
Using the Goldenpath databases for comparing sequence alignments to gene locations.
- Author:
- pavlidis
-
-
Field Summary
-
Fields inherited from class ubic.gemma.core.goldenpath.GoldenPath
log
-
-
Constructor Summary
Constructor
GoldenPathSequenceAnalysis(Taxon taxon)
-
Method Summary
Modifier and Type, Method
- Description
Collection<BlatAssociation> findAssociations(String chromosome, Long queryStart, Long queryEnd, String starts, String sizes, String strand, ThreePrimeDistanceMethod method, ProbeMapperConfig config)
- Given a physical location, identify overlapping genes or predicted genes.
Gene findClosestGene(String chromosome, Long queryStart, Long queryEnd, String strand, int maxWindow)
- Given a location, find the nearest gene on the same strand, including only "known", "refseq" or "ensembl" transcripts.
Collection<Gene> findESTs(String chromosome, Long regionStart, Long regionEnd, String strand)
- Check to see if there are ESTs that overlap with this region.
Collection<GeneProduct> findKnownGenesByLocation(String chromosome, Long start, Long end, String strand)
- Find "Known" genes contained in or overlapping a region.
Collection<GeneProduct> findRefGenesByLocation(String chromosome, Long start, Long end, String strand)
- Find RefSeq genes contained in or overlapping a region.
Collection<Gene> findRNAs(String chromosome, Long regionStart, Long regionEnd, String strand)
- Check to see if there are mRNAs that overlap with this region.
Collection<BlatResult> findSequenceLocations(String identifier)
Collection<BioSequence2GeneProduct> getThreePrimeDistances(String identifier, ThreePrimeDistanceMethod method)
- Uses default mapping settings.
Collection<? extends BioSequence2GeneProduct> getThreePrimeDistances(BlatResult br, ThreePrimeDistanceMethod method)
- Given a physical location, find how close it is to the 3' end of a gene it is in, using default mapping settings.
-
-
-
Constructor Detail
-
GoldenPathSequenceAnalysis
public GoldenPathSequenceAnalysis(Taxon taxon)
-
-
Method Detail
-
findAssociations
public Collection<BlatAssociation> findAssociations(String chromosome, Long queryStart, Long queryEnd, String starts, String sizes, String strand, ThreePrimeDistanceMethod method, ProbeMapperConfig config)
Given a physical location, identify overlapping genes or predicted genes.
- Parameters:
chromosome
- The chromosome name (the organism is set by the constructor)
queryStart
- The start base of the region to query (the start of the alignment to the genome)
queryEnd
- The end base of the region to query (the end of the alignment to the genome)
starts
- Locations of alignment block starts in target (comma-delimited, from blat)
sizes
- Sizes of alignment blocks (comma-delimited, from blat)
strand
- Either + or - indicating the strand to look on, or null to search both strands.
method
- The constant representing the method to use to locate the 3' distance.
config
- configuration
- Returns:
- A list of BioSequence2GeneProduct objects. The distance stored by a ThreePrimeData will be 0 if the sequence overhangs the found genes (rather than providing a negative distance). If no genes are found, the result is null. These are transient instances, not from Gemma's database.
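The starts and sizes arguments arrive as comma-delimited strings taken from BLAT output, which typically end with a trailing comma. The following is a minimal sketch of pairing them up into alignment blocks. BlatBlocks and Block are hypothetical names introduced for illustration; they are not part of Gemma, and the way this class parses the strings internally is not documented here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Gemma: pairs BLAT block starts with block sizes.
final class BlatBlocks {

    /** One alignment block on the target, covering [start, start + size). */
    static final class Block {
        final long start;
        final long size;

        Block(long start, long size) {
            this.start = start;
            this.size = size;
        }

        long end() {
            return start + size;
        }
    }

    /** Parses strings such as "100,250," and "50,30," into a list of blocks. */
    static List<Block> parse(String starts, String sizes) {
        // String.split drops trailing empty strings, so the trailing comma is harmless.
        String[] s = starts.split(",");
        String[] z = sizes.split(",");
        if (s.length != z.length) {
            throw new IllegalArgumentException("starts and sizes must have the same number of entries");
        }
        List<Block> blocks = new ArrayList<>();
        for (int i = 0; i < s.length; i++) {
            blocks.add(new Block(Long.parseLong(s[i].trim()), Long.parseLong(z[i].trim())));
        }
        return blocks;
    }
}
```

Keeping starts and sizes together in one object makes it harder to mismatch the two lists when later checking each block against gene coordinates.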
-
findClosestGene
public Gene findClosestGene(String chromosome, Long queryStart, Long queryEnd, String strand, int maxWindow)
Given a location, find the nearest gene on the same strand, including only "known", "refseq" or "ensembl" transcripts.
- Parameters:
chromosome
- chromosome
queryStart
- start
queryEnd
- end
strand
- Either '+' or '-'
maxWindow
- the number of bases on each side to look, at most, in addition to looking inside the given region.
- Returns:
- the Gene closest to the given location. This is a transient instance, not from Gemma's database.
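The windowed search described above can be sketched in isolation as follows. This is an illustration under stated assumptions, not Gemma's implementation: Locus is a hypothetical stand-in for a gene location, candidates are supplied as an in-memory list rather than queried from a database, and returning null when nothing lies within maxWindow is an assumption.

```java
import java.util.List;

// Hypothetical sketch, not Gemma code: nearest same-strand locus within a window.
final class ClosestGeneSketch {

    /** Minimal stand-in for a gene location; not Gemma's Gene class. */
    static final class Locus {
        final String name;
        final long start;
        final long end;
        final String strand;

        Locus(String name, long start, long end, String strand) {
            this.name = name;
            this.start = start;
            this.end = end;
            this.strand = strand;
        }
    }

    /** Gap in bases between the query region and the locus; 0 when they overlap. */
    static long gap(long queryStart, long queryEnd, Locus g) {
        if (g.end < queryStart) {
            return queryStart - g.end;
        }
        if (g.start > queryEnd) {
            return g.start - queryEnd;
        }
        return 0;
    }

    /** Returns the nearest locus on the given strand within maxWindow bases, or null (assumed) if none. */
    static Locus closest(List<Locus> candidates, long queryStart, long queryEnd, String strand, int maxWindow) {
        Locus best = null;
        long bestGap = Long.MAX_VALUE;
        for (Locus g : candidates) {
            if (!g.strand.equals(strand)) {
                continue; // only the same strand is considered
            }
            long d = gap(queryStart, queryEnd, g);
            if (d <= maxWindow && d < bestGap) {
                best = g;
                bestGap = d;
            }
        }
        return best;
    }
}
```

A locus overlapping the query has a gap of 0, so it always wins over anything found in the flanking window.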
-
findESTs
public Collection<Gene> findESTs(String chromosome, Long regionStart, Long regionEnd, String strand)
Check to see if there are ESTs that overlap with this region. We provisionally promote the ESTs to the status of genes for this purpose.
- Parameters:
chromosome
- chromosome
regionStart
- the start of the region to be checked
regionEnd
- end
strand
- the strand
- Returns:
- The ESTs that overlap the query region (using the all_est table).
-
findKnownGenesByLocation
public Collection<GeneProduct> findKnownGenesByLocation(String chromosome, Long start, Long end, String strand)
Find "Known" genes contained in or overlapping a region. Note that the NCBI symbol may be blank when the gene is not a RefSeq gene.
- Parameters:
chromosome
- chromosome
start
- start
end
- end
strand
- strand
- Returns:
- This is a collection of transient instances, not from Gemma's database.
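"Contained in or overlapping" comes down to a standard interval test. The sketch below assumes half-open [start, end) coordinates, the convention UCSC uses in its tables; whether this method applies exactly that convention is not stated here. RegionOverlap is a hypothetical name, not a Gemma class.

```java
// Hypothetical sketch, not Gemma code: interval overlap on half-open [start, end) coordinates.
final class RegionOverlap {

    /** True when the two regions share at least one base; full containment also counts. */
    static boolean overlaps(long aStart, long aEnd, long bStart, long bEnd) {
        return aStart < bEnd && bStart < aEnd;
    }

    /** Number of shared bases, or 0 when the regions do not overlap. */
    static long overlapLength(long aStart, long aEnd, long bStart, long bEnd) {
        return Math.max(0, Math.min(aEnd, bEnd) - Math.max(aStart, bStart));
    }
}
```

For example, a query at 150 to 250 against a gene at 100 to 200 overlaps by 50 bases, while regions that merely touch (100 to 200 and 200 to 300) do not overlap under the half-open assumption.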
-
findRefGenesByLocation
public Collection<GeneProduct> findRefGenesByLocation(String chromosome, Long start, Long end, String strand)
Find RefSeq genes contained in or overlapping a region.
- Parameters:
chromosome
- chromosome
start
- start
end
- end
strand
- strand
- Returns:
- This is a collection of transient instances, not from Gemma's database.
-
findRNAs
public Collection<Gene> findRNAs(String chromosome, Long regionStart, Long regionEnd, String strand)
Check to see if there are mRNAs that overlap with this region. We promote the mRNAs to the status of genes for this purpose.
- Parameters:
chromosome
- chromosome
regionStart
- the start of the region to be checked
regionEnd
- end
strand
- the strand
- Returns:
- The mRNAs that overlap the query region.
-
findSequenceLocations
public Collection<BlatResult> findSequenceLocations(String identifier)
- Parameters:
identifier
- A Genbank accession referring to an EST or mRNA. For other types of queries this will not return any results.
- Returns:
- Set containing Lists of PhysicalLocation representing places GoldenPath says the sequence referred to by the identifier aligns. If no results are found, the Set will be empty.
-
getThreePrimeDistances
public Collection<? extends BioSequence2GeneProduct> getThreePrimeDistances(BlatResult br, ThreePrimeDistanceMethod method)
Given a physical location, find how close it is to the 3' end of a gene it is in, using default mapping settings.
- Parameters:
br
- BlatResult holding the parameters needed.
method
- The constant representing the method to use to locate the 3' distance.
- Returns:
- a collection of distances
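As a rough illustration of what a 3' distance means, the sketch below measures from the 3'-most edge of the query to the 3' end of the gene and returns 0 on overhang, matching the behaviour described under findAssociations. Measuring from that particular edge is an assumption: the ThreePrimeDistanceMethod constant selects the actual reference point, and its values are not listed on this page. ThreePrimeDistanceSketch is a hypothetical name, not a Gemma class.

```java
// Hypothetical sketch, not Gemma code: distance from a query region to a gene's 3' end.
final class ThreePrimeDistanceSketch {

    /**
     * On the '+' strand the gene's 3' end is its highest coordinate; on '-' it is its lowest.
     * Returns 0 when the query extends past the 3' end instead of a negative value.
     */
    static long distance(long queryStart, long queryEnd, long geneStart, long geneEnd, String strand) {
        long raw = "+".equals(strand)
                ? geneEnd - queryEnd      // assumed: measure from the query's right edge
                : queryStart - geneStart; // assumed: measure from the query's left edge
        return Math.max(0, raw);
    }
}
```

With a gene at 500 to 2000 and a query at 1000 to 1100, this gives 900 on the '+' strand and 500 on the '-' strand.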
-
getThreePrimeDistances
public Collection<BioSequence2GeneProduct> getThreePrimeDistances(String identifier, ThreePrimeDistanceMethod method)
Uses default mapping settings.
- Parameters:
identifier
- identifier
method
- the method
- Returns:
- a collection of BioSequence2GeneProduct objects
-
-