|
Synopsis
This is a simple program for identifying genes that show an expression pattern of interest in a data file.
In many cases, it is probably best used as an adjunct to other statistical methods such as ANOVA. In
the simplest cases, this program effectively performs a t-test between two groups in your data. Because the
method is so simple, you can easily find situations where it is unable to identify genes that you would
be interested in. The page discussing templates explains the method in much more
detail.
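The link to the t-test can be checked numerically. The sketch below is not the program's own code: it assumes the correlation is a Pearson correlation and uses a hypothetical two-level template (0 for one group, 1 for the other) with made-up expression values. Under those assumptions, the t statistic derived from the correlation matches the pooled-variance two-sample t statistic.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_correlation(r, n):
    """t statistic with n - 2 degrees of freedom for a correlation r."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def pooled_t(group0, group1):
    """Two-sample t statistic using a pooled variance estimate."""
    n0, n1 = len(group0), len(group1)
    m0 = sum(group0) / n0
    m1 = sum(group1) / n1
    ss = sum((v - m0) ** 2 for v in group0) + sum((v - m1) ** 2 for v in group1)
    pooled_var = ss / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1.0 / n0 + 1.0 / n1))

# Hypothetical template and expression values for one gene (illustrative only).
template = [0, 0, 0, 1, 1, 1]
expression = [1.2, 0.9, 1.1, 2.3, 2.8, 2.5]

r = pearson(template, expression)
t_corr = t_from_correlation(r, len(template))
t_two_sample = pooled_t(expression[:3], expression[3:])
print(r, t_corr, t_two_sample)
```

The two t values agree to within floating-point error, which is why a two-group template gives the same ranking of genes as a t-test.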
patternmatch
[-a: use absolute value of the correlation;
-r: rdb format line is present;
-d: p-values for negative correlations are not set to 1.0]
<pattern> <data>
Inputs
- A pattern file, which is a space-delimited file containing a template definition on its first line. The page
discussing templates has much more detail.
- A data file in RDB-like format (watch out for the format line: use the -r option
if necessary)
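As an illustration of why the format line matters, here is a minimal reader sketch. It is not taken from patternmatch. It assumes a tab-delimited table with a header row, gene identifiers in the first column, and numeric values in the remaining columns. The function name `read_rdb_like`, the `has_format_line` parameter, and the sample data are all hypothetical.

```python
import io

def read_rdb_like(handle, has_format_line=False):
    """Return (sample_names, {gene: [values]}) from a tab-delimited table.

    When has_format_line is true, the second line is treated as an rdb
    format line and skipped instead of being parsed as data.
    """
    lines = [ln.rstrip("\n") for ln in handle if ln.strip()]
    header = lines[0].split("\t")
    start = 2 if has_format_line else 1
    rows = {}
    for line in lines[start:]:
        fields = line.split("\t")
        rows[fields[0]] = [float(v) for v in fields[1:]]
    return header[1:], rows

# Made-up example with an rdb-style format line as the second row.
example = (
    "gene\ts1\ts2\ts3\n"
    "10S\t5N\t5N\t5N\n"
    "geneA\t1.0\t2.0\t3.0\n"
    "geneB\t0.5\t0.4\t0.6\n"
)
samples, data = read_rdb_like(io.StringIO(example), has_format_line=True)
print(samples, data)
```

Without the skip, the format line would be parsed as a gene row and the conversion of "5N" to a number would fail, which is the situation the -r option is meant to avoid.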
Outputs
- A file listing the genes in the first column, their correlations with the pattern in the second column, and the
corresponding p-values in the third column. The output file name is derived from the data and pattern file names, in
the form "datafile-pattern-correlpvals.txt"
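A sketch of post-processing the three-column output follows. It assumes whitespace-delimited columns and no header line, neither of which is confirmed above. The function `read_results`, the 0.05 cutoff, and the example rows are hypothetical and shown only to illustrate the column layout.

```python
def read_results(lines):
    """Parse (gene, correlation, p-value) rows, skipping malformed lines."""
    results = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # not a gene/correlation/p-value row
        gene, corr, pval = fields
        results.append((gene, float(corr), float(pval)))
    return results

# Made-up rows standing in for the contents of an output file.
example_output = [
    "geneA 0.97 0.001",
    "geneB -0.40 1.0",
    "geneC 0.62 0.04",
]
results = read_results(example_output)

# Keep genes below an arbitrary p-value cutoff, most significant first.
hits = sorted((row for row in results if row[2] < 0.05), key=lambda row: row[2])
for gene, corr, pval in hits:
    print(gene, corr, pval)
```

With a real output file, the list of lines would come from `open(...)` on a name of the form "datafile-pattern-correlpvals.txt".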
Options
- -a: Use the absolute value of the correlation
- -r: The input file is in rdb format, so the second line (the format line) is ignored.
- -d: P-values for negative correlations are not set to 1.0. If this option is not set, the software
assumes that negative correlations are not of interest and assigns them a p-value of 1.0, which avoids
giving high scores to expression patterns that match the opposite of the template used. If you use the
'absolute value of the correlation coefficient' switch (-a), all correlations are positive and this
option has no effect.
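The interaction between -a and -d can be summarised in a few lines. This is a sketch of the behaviour described above, not the script's actual code. The function name `score` and its parameters are hypothetical, and it assumes the p-value itself is left unchanged when the absolute value is taken.

```python
def score(correlation, pvalue, use_absolute=False, keep_negative=False):
    """Return the (correlation, p-value) pair that would be reported.

    use_absolute mirrors -a and keep_negative mirrors -d.
    """
    if use_absolute:
        # All correlations become positive, so -d has nothing to act on.
        return abs(correlation), pvalue
    if correlation < 0 and not keep_negative:
        # Default: a match to the opposite of the template is not of interest.
        return correlation, 1.0
    return correlation, pvalue

print(score(-0.9, 0.002))                      # default behaviour
print(score(-0.9, 0.002, keep_negative=True))  # as with -d
print(score(-0.9, 0.002, use_absolute=True))   # as with -a
```

In the default case a strong negative correlation is reported with a p-value of 1.0, so it never ranks alongside genes that match the template itself.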
Dependencies
Problems/bugs
- None known. Note that the p-value calculation was combined into this script only recently.
Version history
Script
References
|