BdClust parameters

BdClust

Input
Expression data	Input file in seltag format
Fields select	List of fields - List of expression fields (tissues) used to calculate the correlation between gene expression profiles, given as field indices in the data format of the input file, starting from 1 (column numbering does not depend on the Case Names option). Examples of input: 1;2;3-7;12; 1-12; Selection data - Filename for the field selection in XML format. This is an alternative way to set the list of fields.
Genes for select	Genes for select - List of genes used to calculate the correlation, given as gene indices in the data set of the input file, starting from 1 (column numbering does not depend on the Case Names option). Examples of input: 1;2;3-7;12; 1-12; Gene list - Filename for the gene selection in XML format for Gene List 1. This is an alternative way to set the list of genes.
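The index syntax used by the Fields select and Genes for select parameters can be illustrated with a short sketch. This is not BdClust's own parser: the function name parse_index_list is hypothetical, and the handling of stray spaces and the trailing semicolon is an assumption based only on the example inputs above.

```python
def parse_index_list(spec):
    """Expand a selection such as '1;2;3-7;12;' into a list of 1-based indices."""
    indices = []
    for token in spec.split(";"):
        token = token.strip()
        if not token:
            # Assumption: the trailing ';' in the examples yields an empty token.
            continue
        if "-" in token:
            # A range such as '3-7' includes both end points.
            start_text, end_text = token.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {token!r}")
            indices.extend(range(start, end + 1))
        else:
            indices.append(int(token))
    if any(i < 1 for i in indices):
        raise ValueError("indices start from 1")
    return indices


fields = parse_index_list("1;2;3-7;12;")
all_twelve = parse_index_list("1-12;")
```

Because the indices start from 1, subtract 1 from each value before using it to index a Python list or array.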
Output
Result	Name of output file
Options
Select clustering objects	Select clustering objects: genes or samples.
Type of distance	Type of distance between expression profiles. Several types of distance are possible: 1-r_ij; 1-|r_ij|; 1+r_ij; Squared Euclidean distance; Euclidean distance; Manhattan distance; Chebyshev distance.
Type of correlation	Type of correlation coefficient. Three types of correlations are possible: Pearson's r, Spearman rank correlation and Kendall tau correlation.
Type of distance threshold	Type of distance threshold for clustering: User-specified or Average distance.
Threshold Value	The threshold value, used when the user-specified threshold type is set.
Clustering speed	This parameter sets the clustering speed: Fast mode stores the distance matrix in memory (needs more memory for large data); Slow mode recalculates the distance for each gene pair (no memory limitations, appropriate for very large data).
Missing data treatment	Option to treat missing data. Several options are possible: Substitute by means (missing data are substituted by the mean expression in the corresponding field); Case-wise deletion (correlations/distances are calculated by excluding cases that have missing data for any of the selected variables, so all correlations are based on the same set of data); Pair-wise deletion (correlations/distances between each pair of profiles are calculated from all fields/samples having valid data for those two profiles).
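The missing data treatments and distance types listed above can be sketched in plain Python. This is an illustration under stated assumptions, not BdClust's implementation: missing values are represented by None, all function names and the `kind` labels are hypothetical, and only Pearson's r is shown for the correlation-based distances.

```python
import math

MISSING = None  # hypothetical marker for a missing expression value


def substitute_by_means(profiles):
    """Replace each missing value with the mean of its field (column)."""
    means = []
    for j in range(len(profiles[0])):
        present = [p[j] for p in profiles if p[j] is not MISSING]
        means.append(sum(present) / len(present))  # assumes each field has data
    return [[means[j] if v is MISSING else v for j, v in enumerate(p)]
            for p in profiles]


def casewise_deletion(profiles):
    """Keep only the fields that have valid data in every profile."""
    keep = [j for j in range(len(profiles[0]))
            if all(p[j] is not MISSING for p in profiles)]
    return [[p[j] for j in keep] for p in profiles]


def pairwise_deletion(x, y):
    """Keep the fields that have valid data in both profiles of one pair."""
    pairs = [(a, b) for a, b in zip(x, y)
             if a is not MISSING and b is not MISSING]
    return [a for a, _ in pairs], [b for _, b in pairs]


def pearson(x, y):
    """Pearson's r; raises ZeroDivisionError for a constant profile."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance(x, y, kind):
    """Distance between two complete profiles; the `kind` labels are made up."""
    if kind == "1-r":
        return 1 - pearson(x, y)
    if kind == "1-|r|":
        return 1 - abs(pearson(x, y))
    if kind == "1+r":
        return 1 + pearson(x, y)
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if kind == "squared_euclidean":
        return sum(d * d for d in diffs)
    if kind == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if kind == "manhattan":
        return sum(diffs)
    if kind == "chebyshev":
        return max(diffs)
    raise ValueError(f"unknown distance type: {kind!r}")


# Three profiles over four fields, with two missing values.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, MISSING, 6.0, 8.0],
    [4.0, 3.0, MISSING, 1.0],
]
x, y = pairwise_deletion(profiles[0], profiles[1])
print(distance(x, y, "1-r"))
```

In this sketch, case-wise deletion leaves every pair with the same two fields, while pair-wise deletion keeps three fields for the first two profiles. That illustrates the trade-off described above: pair-wise deletion uses more of the data, but each distance may be based on a different set of fields.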