Replace missing values from normal distribution


Cox Lab


November 15, 2023

1 General

2 Brief description

Missing values are replaced by random numbers drawn from a normal distribution. The parameters of this distribution can be tuned to simulate the abundance range that the missing values would likely have had if they had been measured. In the absence of any a priori knowledge, the distribution of random numbers should be similar to that of the valid values. Often, however, missing values represent low-abundance measurements, and the default values are chosen to mimic this case.

3 Parameters

3.1 Width

Defines the width of the Gaussian distribution relative to the standard deviation of the measured values (default: 0.3). A value of 0.5 means that the width of the distribution used for drawing random numbers is half the standard deviation of the data.

3.2 Down shift

Specifies the amount by which the distribution used for the random numbers is shifted downwards, in units of the standard deviation of the valid data (default: 1.8).
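
As an illustration of how the width and down shift parameters interact, the following is a minimal sketch in Python with NumPy. It is not the Perseus implementation. The function name impute_from_normal is hypothetical, and two details are assumptions not stated in this document: the shift is taken relative to the mean of the valid values, and the sample standard deviation (ddof=1) is used.

```python
import numpy as np

def impute_from_normal(values, width=0.3, down_shift=1.8, rng=None):
    """Replace NaN entries with draws from a shifted, narrowed normal distribution.

    Hypothetical helper for illustration only, not the Perseus code.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    missing = np.isnan(values)
    valid = values[~missing]

    mean = valid.mean()
    sd = valid.std(ddof=1)  # assumption: sample standard deviation

    result = values.copy()
    result[missing] = rng.normal(
        loc=mean - down_shift * sd,  # assumption: shift is relative to the valid mean
        scale=width * sd,            # width as a fraction of the valid-data spread
        size=missing.sum(),
    )
    return result

# Example: log-scale intensities with two missing entries
data = np.array([20.0, 22.0, np.nan, 24.0, 26.0, np.nan, 28.0])
filled = impute_from_normal(data, rng=np.random.default_rng(0))
```

With the defaults, the imputed values cluster in a narrow band well below the centre of the measured values, which matches the low-abundance case described in the brief description.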

3.3 Mode

Specifies whether the replacement of missing values should be applied to each expression column separately (default) or to the whole matrix at once (“Total matrix”).
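
The difference between the two modes can be sketched as follows in Python with NumPy. This is an illustrative sketch under stated assumptions, not the Perseus implementation: the function name impute_matrix and the per_column flag are hypothetical, the distribution is assumed to be centred on the mean of the valid values minus the down shift, and the sample standard deviation is assumed. In the column-wise mode the mean and standard deviation are computed for each column; in the whole-matrix mode they are computed once from all valid values.

```python
import numpy as np

def impute_matrix(matrix, width=0.3, down_shift=1.8, per_column=True, rng=None):
    """Replace NaNs column by column or using whole-matrix statistics.

    Hypothetical helper for illustration only, not the Perseus code.
    """
    rng = np.random.default_rng() if rng is None else rng
    result = np.array(matrix, dtype=float)  # np.array copies, so the input is untouched
    missing = np.isnan(result)

    if per_column:
        # Separate mode: each column gets its own distribution parameters.
        for j in range(result.shape[1]):
            col_missing = missing[:, j]
            valid = result[~col_missing, j]
            mean, sd = valid.mean(), valid.std(ddof=1)
            result[col_missing, j] = rng.normal(
                mean - down_shift * sd, width * sd, size=col_missing.sum()
            )
    else:
        # Whole-matrix mode: one set of parameters from all valid values.
        valid = result[~missing]
        mean, sd = valid.mean(), valid.std(ddof=1)
        result[missing] = rng.normal(
            mean - down_shift * sd, width * sd, size=missing.sum()
        )
    return result

# Example: two columns on very different scales
m = np.array([[10.0, 100.0],
              [11.0, 101.0],
              [12.0, np.nan],
              [np.nan, 103.0],
              [14.0, 104.0]])
separate = impute_matrix(m, per_column=True, rng=np.random.default_rng(1))
total = impute_matrix(m, per_column=False, rng=np.random.default_rng(1))
```

When columns differ strongly in scale, as in this example, the column-wise mode keeps imputed values close to the low end of their own column, whereas the whole-matrix mode draws all imputed values from a single shared distribution.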

3.4 Columns

The expression columns in which missing values should be replaced (default: all expression columns are selected).

4 Parameter window