MAR multivariate: mMAR Class

mMAR

mMAR(X: DataFrame, y: ndarray, n_xmiss: int = 2, missTarget: bool = False, n_Threads: int = 1)

A class to generate missing data in a dataset based on the Missing At Random (MAR) mechanism for multiple features simultaneously.

Args:
    X (pd.DataFrame): The dataset to receive the missing data.
    y (np.ndarray): The label values of the dataset.
    n_xmiss (int, optional): The number of features in the dataset that will receive missing values. Default is 2.
    missTarget (bool, optional): Whether to also generate missing values in the target. Default is False.
    n_Threads (int, optional): The number of threads to use. Default is 1.

Example Usage:

# Create an instance of the mMAR class
generator = mMAR(X, y, n_xmiss=4)

# Generate missing values using the random strategy
data_md = generator.random(missing_rate=20)
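
For intuition about what a MAR mechanism means, the following is a minimal sketch in plain pandas. It is not the library's implementation: `mar_lowest_values_sketch` and its arguments are hypothetical names, and reading `missing_rate` as a percentage is an assumption based on the example above. Values in one feature are removed in the rows where a second, fully observed feature takes its lowest values, so the missingness depends only on observed data.

```python
import numpy as np
import pandas as pd


def mar_lowest_values_sketch(df, target_col, observed_col, missing_rate=10):
    """Hypothetical helper (not part of mMAR): blank out `target_col`
    in the rows where `observed_col` has its lowest values."""
    # Assumption: missing_rate is a percentage of rows.
    n_missing = int(round(len(df) * missing_rate / 100))
    # Rows with the smallest observed values drive the missingness.
    idx = df[observed_col].nsmallest(n_missing).index
    out = df.copy()
    out.loc[idx, target_col] = np.nan
    return out


rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])
X_md = mar_lowest_values_sketch(X, target_col="a", observed_col="b", missing_rate=20)

# Fraction of missing values per column; only "a" should be affected.
print(X_md.isna().mean())
```

The same `isna().mean()` check can be applied to any DataFrame returned by a generator to confirm the share of missing values per feature.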

random

random(missing_rate: int = 10) -> pd.DataFrame

Generate missing data using parallel processing.

correlated

correlated(missing_rate: int = 10) -> pd.DataFrame

Generate missing data using parallel processing.

median

median(missing_rate: int = 10) -> pd.DataFrame

Generate missing data using parallel processing.

pattern_missingness

pattern_missingness(patterns: List[Dict] = None, missing_rate: int = 10, std: bool = True, verbose: bool = False, seed: Optional[int] = None, lower_range: float = -3, upper_range: float = 3, max_diff_with_target: float = 0.001, max_iter: int = 100)

Generate missing data using pattern-based multivariate amputation.

References:

[2] van Buuren, S., J. P. L. Brand, C. G. M. Groothuis-Oudshoorn, and D. B. Rubin. Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12):1049–1064, 2006.

[3] Schouten, R. M., P. Lugtig, and G. Vink. Generating missing values for simulation purposes: a multivariate amputation procedure. Journal of Statistical Computation and Simulation, 88(15):2909–2930, 2018.
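
The general idea behind pattern-based multivariate amputation, as described by Schouten et al. [3], can be sketched as follows: compute a weighted sum score per row from the features that stay observed, map the scores to probabilities with a logistic function, and shift that function until the expected share of amputated rows matches the requested rate. The code below is an illustration under stated assumptions, not the implementation behind `pattern_missingness`. The function `amputate_pattern_sketch`, its `incomplete_cols` and `weights` arguments, and the single-pattern simplification are hypothetical. The names `lower_range`, `upper_range`, `max_diff_with_target` and `max_iter` are borrowed from the signature above, and that they play these roles in the library is an assumption.

```python
import numpy as np
import pandas as pd


def amputate_pattern_sketch(df, incomplete_cols, weights, missing_rate=10,
                            lower_range=-3.0, upper_range=3.0,
                            max_diff_with_target=0.001, max_iter=100, seed=None):
    """Hypothetical single-pattern amputation sketch (not the mMAR code)."""
    rng = np.random.default_rng(seed)
    target = missing_rate / 100.0  # assumption: rate given as a percentage

    # Standardize features, then build one weighted sum score per row.
    z = (df - df.mean()) / df.std(ddof=0)
    w = np.array([weights.get(col, 0.0) for col in df.columns])
    scores = z.to_numpy() @ w
    if scores.std() == 0:
        raise ValueError("weights must give non-constant scores")
    scores = (scores - scores.mean()) / scores.std()

    # Bisection on the logistic shift: the mean probability grows with the
    # shift, so halve the interval until it is close enough to the target.
    lo, hi = lower_range, upper_range
    for _ in range(max_iter):
        shift = (lo + hi) / 2.0
        probs = 1.0 / (1.0 + np.exp(-(scores + shift)))
        diff = probs.mean() - target
        if abs(diff) < max_diff_with_target:
            break
        if diff > 0:
            hi = shift
        else:
            lo = shift

    # Draw which rows are amputated and blank the pattern's features.
    mask = rng.random(len(df)) < probs
    out = df.copy()
    out.loc[mask, incomplete_cols] = np.nan
    return out


rng = np.random.default_rng(42)
df = pd.DataFrame(rng.normal(size=(2000, 3)), columns=["x1", "x2", "x3"])
# MAR: the score uses only "x3", which itself stays fully observed.
amputed = amputate_pattern_sketch(df, incomplete_cols=["x1", "x2"],
                                  weights={"x3": 1.0}, missing_rate=10, seed=0)
print(amputed.isna().mean())
```

Giving zero weight to the features being amputated keeps the mechanism MAR, because the probability of missingness then depends only on values that remain observed.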