Layout files

Many of the scripts doing statistical analysis require a 'layout' file. This is a simple file that describes the organization of the data with respect to experimental conditions. This page documents how to set up a layout file for a new data file.

Step 1: Determine what the experimental variables and levels are

A variable is a factor in your experiment that is treated as an 'independent variable'. Examples are genotype, drug treatment, time, and age. A level is a value that a variable can take. For the variables just listed, example levels are "wildtype", "mutant"; "untreated", "treated"; "0 minutes", "1 hour", "3 hours"; "2 years old", "10 years old". In principle there can be any number of variables and levels, but the software provided here only handles a restricted range of possibilities (described elsewhere).

Step 2: Determine which data columns correspond to which conditions

The data file is a tab-delimited text file in which each row holds the dependent variable measurements for one set of observations. For microarray analysis, this means that each row holds the expression measurements for one gene, and each column holds the measurements from one array that was run. Later analysis is easier if the data columns are arranged by condition: for example, put all the "mutant" columns together, followed by all the "wildtype" columns. The top of your data file might then look like this:

gene	mutant	mutant	mutant	wildtype	wildtype	wildtype
100001_at	-36.3	77.8	64.4	89.4	126.6	86.2
100002_at	1504.2	1512	944.5	1157.9	1652	1358.9
100003_at	845.9	966.5	1057.4	987.4	764.1	878.5

(etc, for many lines)
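
The layout file built in Step 3 refers to data columns by number, counted from zero, so it can help to print the numbering before writing anything down. The following Python sketch does this for the header row shown above. The function name number_data_columns and its id_columns parameter are illustrative only and are not part of the software documented here; the sketch also assumes that the leading "gene" column is an identifier and is not counted, as the layout examples on this page suggest.

```python
import csv
import io


def number_data_columns(header_line, id_columns=1):
    """Return (number, label) pairs for the data columns of a header row.

    Assumption: the first `id_columns` fields are row identifiers (such as
    the gene name) and are skipped, so the first data column is number 0.
    """
    fields = next(csv.reader(io.StringIO(header_line), delimiter="\t"))
    return list(enumerate(fields[id_columns:]))


header = "gene\tmutant\tmutant\tmutant\twildtype\twildtype\twildtype"
for number, label in number_data_columns(header):
    print(number, label)
```

For this header the three "mutant" columns are numbered 0 to 2 and the three "wildtype" columns 3 to 5.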

Step 3: Build the layout file

Now we just assemble this information into one small text file. Like all other input files used by this software, it must be saved as plain text.

The name of each variable goes on its own line, preceded by a "=" (equals sign). Each level of that variable follows on its own line, preceded by a "%" (percent sign). These symbols cannot be part of the names of your variables or levels, or the software will get confused. After the name of a level, list the numbers of the data columns that have that level of the variable, one number per line. The numbering starts from 0. Note that in the example below, column 0 is the first data column; the gene identifier column is not counted.

An example will make this clear:

	=genotype
	%mutant
	0
	1
	2
	%wildtype
	3
	4
	5
	

The actual file is here.
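
To make the format concrete, here is a minimal Python sketch of how a layout like the one above could be read into a nested dictionary. It is not the parser used by the software documented here; the function name parse_layout is hypothetical, and the sketch assumes one item per line, with surrounding whitespace and blank lines ignored.

```python
def parse_layout(text):
    """Parse layout text into {variable: {level: [column numbers]}}."""
    layout = {}
    variable = None
    level = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("="):
            # "=" starts a new variable
            variable = line[1:].strip()
            layout[variable] = {}
            level = None
        elif line.startswith("%"):
            # "%" starts a new level of the current variable
            if variable is None:
                raise ValueError("level given before any variable: " + line)
            level = line[1:].strip()
            layout[variable][level] = []
        else:
            # anything else must be a column number for the current level
            if level is None:
                raise ValueError("column number given before any level: " + line)
            layout[variable][level].append(int(line))
    return layout


example = """
=genotype
%mutant
0
1
2
%wildtype
3
4
5
"""
print(parse_layout(example))
# {'genotype': {'mutant': [0, 1, 2], 'wildtype': [3, 4, 5]}}
```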

A more complex example with two variables might look like this:

	=genotype
	%mutant
	0
	1
	2
	3
	4
	5
	%wildtype
	6
	7
	8
	9
	10
	11
	=age
	%juvenile
	0
	1
	2
	6
	7
	8
	%aged
	3
	4
	5
	9
	10
	11
	

The actual file is here.

In the second example, there are 12 columns in the data file. The experiment examined four types of animals:

  • mutant juveniles (columns 0,1,2) (remember the columns are counted starting from zero)
  • mutant aged (columns 3,4,5)
  • wildtype juveniles (columns 6,7,8)
  • wildtype aged (columns 9,10,11)
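
The four groups listed above can be recovered mechanically by intersecting the column lists of one level from each variable. The Python sketch below illustrates the idea; the dictionary is a hand-written copy of the second example, and the function name condition_columns is made up for this illustration rather than taken from the software.

```python
from itertools import product

# The two-variable layout from the second example, written as a dictionary.
layout = {
    "genotype": {
        "mutant": [0, 1, 2, 3, 4, 5],
        "wildtype": [6, 7, 8, 9, 10, 11],
    },
    "age": {
        "juvenile": [0, 1, 2, 6, 7, 8],
        "aged": [3, 4, 5, 9, 10, 11],
    },
}


def condition_columns(layout):
    """Map each combination of levels to the columns that have all of them."""
    variables = list(layout)
    groups = {}
    # product() over the level names yields one level per variable
    for combo in product(*(layout[v] for v in variables)):
        shared = set.intersection(
            *(set(layout[v][lvl]) for v, lvl in zip(variables, combo))
        )
        groups[combo] = sorted(shared)
    return groups


for combo, columns in condition_columns(layout).items():
    print(combo, columns)
# ('mutant', 'juvenile') [0, 1, 2]
# ('mutant', 'aged') [3, 4, 5]
# ('wildtype', 'juvenile') [6, 7, 8]
# ('wildtype', 'aged') [9, 10, 11]
```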

Summing up

Hopefully this page has provided enough information for you to create your own layout files. Be sure to remember the following important rules:

  • The layout file must be saved as plain text.
  • The numbering of columns starts at zero, not one.
  • The numbering given in the layout file must correspond to the order in your data file. If you change the data file column order, the layout will no longer be correct.
  • For each variable, every column must be listed exactly once. Be careful not to omit a column or to list it twice for the same variable.
  • In principle you can have as many variables and levels as you want, but the software that uses the layouts can only handle certain situations.
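
The rule that every column appears exactly once per variable is easy to check automatically. Here is a minimal Python sketch of such a check. It assumes the layout has already been read into a dictionary of the form {variable: {level: [column numbers]}}; the function name check_layout and that dictionary form are illustrative and not part of the software documented here.

```python
def check_layout(layout, n_columns):
    """Return a list of problems found in the layout.

    An empty list means that, for every variable, each of the columns
    0 .. n_columns - 1 is listed exactly once.
    """
    problems = []
    for variable, levels in layout.items():
        # Flatten the column numbers of all levels of this variable.
        seen = [col for cols in levels.values() for col in cols]
        for col in range(n_columns):
            count = seen.count(col)
            if count == 0:
                problems.append(f"{variable}: column {col} is missing")
            elif count > 1:
                problems.append(f"{variable}: column {col} is listed {count} times")
        for col in sorted(set(seen)):
            if not 0 <= col < n_columns:
                problems.append(f"{variable}: column {col} is out of range")
    return problems


good = {"genotype": {"mutant": [0, 1, 2], "wildtype": [3, 4, 5]}}
bad = {"genotype": {"mutant": [0, 1, 1], "wildtype": [3, 4, 5]}}

print(check_layout(good, 6))  # []
print(check_layout(bad, 6))
# ['genotype: column 1 is listed 2 times', 'genotype: column 2 is missing']
```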
