|
BdClust |
Input | |
Expression data | Input file in seltag format |
Fields select | List of fields - List of expression fields (tissues) used to calculate the correlation between gene expression profiles, given as field indices in the input file, starting from 1 (column numbering does not depend on the Case Names option). Examples of input: 1;2;3-7;12; 1-12; Selection data - File name for field selection in XML format. This is another way to set the list of fields. |
Genes for select | Genes for select - List of genes used to calculate the correlation, given as gene indices in the input data set, starting from 1 (column numbering does not depend on the Case Names option). Examples of input: 1;2;3-7;12; 1-12; Gene list - File name for gene selection in XML format for Gene List 1. This is another way to set the list of genes. |
Output | |
Result | Name of output file |
Options | |
Select clustering objects | Select clustering objects: genes or samples. |
Type of distance | Type of distance between expression profiles. Several types of distance are available: 1-rij; 1-|rij|; 1+rij (where rij is the correlation between profiles i and j); Squared Euclidean distance; Euclidean distance; Manhattan distance; Chebyshev distance. |
Type of correlation | Type of correlation coefficient. Three types of correlations are possible: Pearson's r, Spearman rank correlation and Kendall tau correlation. |
Type of distance threshold | Type of distance threshold for clustering: User-specified; Average distance. |
Threshold Value | The threshold value, used if the User-specified threshold type is selected. |
Clustering speed | This parameter sets the clustering speed: Fast mode stores the distance matrix in memory (needs more memory for large data sets); Slow mode recalculates the distance between each gene pair (no memory limitations, appropriate for very large data sets). |
Missing data treatment | How missing data are treated. Several options are available: Substitute by means (missing data are substituted by the expression mean of the corresponding field); Case-wise deletion (correlations/distances are calculated by excluding cases that have missing data for any of the selected variables, so all correlations are based on the same set of data); Pair-wise deletion (correlations/distances between each pair of profiles are calculated from all fields/samples having valid data for those two profiles). |
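
The index-list syntax and the correlation-based distances described in the table can be illustrated with a short Python sketch. This is not BdClust code: the function names `parse_index_list`, `pearson_pairwise` and `correlation_distance` are hypothetical, the list syntax is inferred only from the examples 1;2;3-7;12; and 1-12;, and representing a missing value as `None` is an assumption made for the illustration. Only Pearson's r with Pair-wise deletion is shown.

```python
import math


def parse_index_list(spec):
    """Expand a 1-based index list such as '1;2;3-7;12;' into sorted unique ints.

    Assumed syntax (inferred from the examples in the table): items are
    separated by ';', an item is either a single index or a 'start-end' range.
    """
    indices = set()
    for token in spec.split(";"):
        token = token.strip()
        if not token:
            continue  # tolerate the trailing ';' shown in the examples
        if "-" in token:
            start, end = (int(part) for part in token.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {token!r}")
            indices.update(range(start, end + 1))
        else:
            indices.add(int(token))
    if any(i < 1 for i in indices):
        raise ValueError("indices start from 1")
    return sorted(indices)


def pearson_pairwise(x, y):
    """Pearson's r using only positions where both profiles have a value.

    This mirrors the Pair-wise deletion option; None marks a missing value
    (an assumption of this sketch, not the tool's file format).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(r, kind):
    """Turn a correlation r into one of the three correlation-based distances."""
    if kind == "1-r":
        return 1.0 - r       # anti-correlated profiles are far apart
    if kind == "1-|r|":
        return 1.0 - abs(r)  # correlated and anti-correlated are both close
    if kind == "1+r":
        return 1.0 + r       # anti-correlated profiles are close
    raise ValueError(f"unknown distance type: {kind!r}")


if __name__ == "__main__":
    fields = parse_index_list("1;2;3-7;12;")
    print(fields)
    r = pearson_pairwise([1.0, 2.0, None, 4.0], [2.0, 4.0, 100.0, 8.0])
    for kind in ("1-r", "1-|r|", "1+r"):
        print(kind, correlation_distance(r, kind))
```

The three distance types differ only in how they treat the sign of the correlation, so the choice decides whether anti-correlated profiles end up in the same cluster (1-|rij|, 1+rij) or far apart (1-rij).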