Package ubic.basecode.math
Class Distance
- java.lang.Object
-
- ubic.basecode.math.Distance
-
public class Distance extends Object
Alternative distance and similarity metrics for vectors.
- Author:
- Paul Pavlidis
-
-
Constructor Summary
Constructor Description
Distance()
-
Method Summary
Modifier and Type Method Description
static double correlationOfStandardized(double[] xe, double[] ye)
Highly optimized implementation of the Pearson correlation.
static double correlationOfStandardized(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Like correlationofNormedFast, but takes DoubleArrayLists as inputs, handles missing values correctly, and does more error checking.
static double euclDistance(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Calculate the Euclidean distance between two vectors.
static double manhattanDistance(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Calculate the Manhattan distance between two vectors.
static double spearmanRankCorrelation(cern.colt.list.DoubleArrayList x)
Convenience function to compute the rank correlation when we just want to know if the values are "in order".
static double spearmanRankCorrelation(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Spearman Rank Correlation.
-
-
-
Method Detail
-
correlationOfStandardized
public static double correlationOfStandardized(double[] xe, double[] ye)
Highly optimized implementation of the Pearson correlation. The inputs must be standardized (mean zero, variance one) and must not contain any missing values.
- Parameters:
xe - A standardized vector
ye - A standardized vector
- Returns:
- Pearson correlation coefficient.
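Why standardization makes a fast path possible can be shown with a small, dependency-free sketch. It is not the library's source: the class name StandardizedPearsonSketch and the helper standardize are hypothetical, and the use of the population variance (dividing by n) is an assumption, since the documentation above does not say which variance is meant.

```java
final class StandardizedPearsonSketch {

    // Convert a vector to z scores. Dividing by n (population variance)
    // is an assumption made for this sketch.
    static double[] standardize(double[] v) {
        int n = v.length;
        double mean = 0.0;
        for (double d : v) {
            mean += d;
        }
        mean /= n;
        double ss = 0.0;
        for (double d : v) {
            ss += (d - mean) * (d - mean);
        }
        double sd = Math.sqrt(ss / n);
        double[] z = new double[n];
        for (int i = 0; i < n; i++) {
            z[i] = (v[i] - mean) / sd;
        }
        return z;
    }

    // With both inputs standardized this way, the Pearson correlation
    // reduces to the mean of the element-wise products.
    static double correlationOfStandardized(double[] xe, double[] ye) {
        if (xe.length != ye.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sxy = 0.0;
        for (int i = 0; i < xe.length; i++) {
            sxy += xe[i] * ye[i];
        }
        return sxy / xe.length;
    }

    public static void main(String[] args) {
        double[] x = standardize(new double[] {1, 2, 3, 4});
        double[] y = standardize(new double[] {2, 4, 6, 8});
        // y is a linear function of x, so the result is 1 up to rounding.
        System.out.println(correlationOfStandardized(x, y));
    }
}
```

The single pass over the data with no means or variances to recompute is what the standardization precondition buys; if the inputs are not actually standardized, the value returned is not a correlation.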
-
correlationOfStandardized
public static double correlationOfStandardized(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Like correlationofNormedFast, but takes DoubleArrayLists as inputs, handles missing values correctly, and does more error checking. Assumes the data has been converted to z scores already.
- Parameters:
x - A standardized vector
y - A standardized vector
- Returns:
- The Pearson correlation between x and y.
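One plausible reading of "handles missing values" is pairwise deletion: a position is skipped when either value is NaN. The sketch below illustrates that reading on plain double[] arrays so it runs without Colt. The class and method names are hypothetical, and both the NaN-skipping rule and the NaN return for "no usable pairs" are assumptions, not confirmed library behavior.

```java
final class PairwiseCompleteCorrelationSketch {

    // Mean of products over the positions where both values are present.
    // Inputs are assumed to be z scores already.
    static double correlationSkippingMissing(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sxy = 0.0;
        int used = 0;
        for (int i = 0; i < x.length; i++) {
            if (Double.isNaN(x[i]) || Double.isNaN(y[i])) {
                continue; // skip the pair if either side is missing
            }
            sxy += x[i] * y[i];
            used++;
        }
        // Assumption: signal "not computable" with NaN when nothing is left.
        return used == 0 ? Double.NaN : sxy / used;
    }
}
```

Note that once pairs are dropped, the remaining values are in general no longer exactly mean zero and variance one, so a result computed this way is an approximation of the Pearson correlation of the retained pairs.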
-
euclDistance
public static double euclDistance(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Calculate the Euclidean distance between two vectors.
- Parameters:
x - DoubleArrayList
y - DoubleArrayList
- Returns:
- Euclidean distance between x and y
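For reference, the Euclidean distance is the square root of the summed squared differences. A minimal sketch on plain arrays follows; the class name is hypothetical, and it does no missing-value handling, since the documentation above does not describe any for this method.

```java
final class EuclideanDistanceSketch {

    // sqrt( sum_i (x_i - y_i)^2 )
    static double euclDistance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sumSq = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            sumSq += d * d;
        }
        return Math.sqrt(sumSq);
    }
}
```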
-
manhattanDistance
public static double manhattanDistance(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Calculate the Manhattan distance between two vectors.
- Parameters:
x - DoubleArrayList
y - DoubleArrayList
- Returns:
- Manhattan distance between x and y
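The Manhattan (city-block) distance sums absolute differences instead of squared ones. The sketch below uses a hypothetical class name and plain arrays, and like any such illustration it makes no claim about how the library treats NaN values.

```java
final class ManhattanDistanceSketch {

    // sum_i |x_i - y_i|
    static double manhattanDistance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.abs(x[i] - y[i]);
        }
        return sum;
    }
}
```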
-
spearmanRankCorrelation
public static double spearmanRankCorrelation(cern.colt.list.DoubleArrayList x)
Convenience function to compute the rank correlation when we just want to know if the values are "in order". Values in perfect ascending order give a correlation of 1; values in perfect descending order give -1.
- Parameters:
x
- Returns:
-
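The "in order" check can be understood as a Spearman correlation between the values and their positions. The sketch below illustrates that idea and is not the library's code: the class and method names are hypothetical, it assumes all values are distinct (no ties) and non-NaN, and it uses the classic formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between each value's rank and its position.

```java
import java.util.Arrays;

final class InOrderRankCorrelationSketch {

    static double inOrderCorrelation(double[] x) {
        int n = x.length;
        if (n < 2) {
            return Double.NaN; // not defined for fewer than two values
        }
        // order[r] is the original position of the value with rank r (0-based).
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(x[a], x[b]));

        // Sum of squared differences between rank and position.
        double sumSq = 0.0;
        for (int r = 0; r < n; r++) {
            double d = r - order[r];
            sumSq += d * d;
        }
        return 1.0 - 6.0 * sumSq / ((double) n * ((double) n * n - 1.0));
    }
}
```

With ties present this shortcut formula is no longer exact; computing the Pearson correlation of tie-averaged ranks is the usual alternative.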
spearmanRankCorrelation
public static double spearmanRankCorrelation(cern.colt.list.DoubleArrayList x, cern.colt.list.DoubleArrayList y)
Spearman Rank Correlation. The data are rank-transformed internally. Only mutually non-NaN values are used.
- Parameters:
x - DoubleArrayList
y - DoubleArrayList
- Returns:
- Spearman's rank correlation between x and y, or NaN if it could not be computed.
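The steps described above (keep mutually non-NaN pairs, rank-transform, correlate) can be sketched without the Colt dependency as follows. The class and helper names are hypothetical; assigning tied values the average of their ranks, and returning NaN when fewer than two pairs remain or one side is constant, are assumptions of this sketch rather than documented behavior.

```java
import java.util.Arrays;

final class SpearmanSketch {

    // 1-based ranks; tied values share the average of their ranks (assumption).
    static double[] rank(double[] v) {
        int n = v.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(v[a], v[b]));
        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && v[order[end + 1]] == v[order[start]]) {
                end++;
            }
            double avg = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = avg;
            }
            start = end + 1;
        }
        return ranks;
    }

    // Plain Pearson correlation; yields NaN if either input is constant (0/0).
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double ma = 0.0, mb = 0.0;
        for (int i = 0; i < n; i++) {
            ma += a[i];
            mb += b[i];
        }
        ma /= n;
        mb /= n;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - ma;
            double db = b[i] - mb;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    static double spearman(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        // Keep only positions where both values are present.
        int used = 0;
        for (int i = 0; i < x.length; i++) {
            if (!Double.isNaN(x[i]) && !Double.isNaN(y[i])) {
                used++;
            }
        }
        if (used < 2) {
            return Double.NaN;
        }
        double[] xs = new double[used];
        double[] ys = new double[used];
        int k = 0;
        for (int i = 0; i < x.length; i++) {
            if (!Double.isNaN(x[i]) && !Double.isNaN(y[i])) {
                xs[k] = x[i];
                ys[k] = y[i];
                k++;
            }
        }
        return pearson(rank(xs), rank(ys));
    }
}
```

Filtering before ranking matters: ranking first and dropping pairs afterwards would leave gaps in the ranks and change the result.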
-
-