MAR multivariate: mMAR Class
mMAR
A class to generate missing data in a dataset based on the Missing At Random (MAR) mechanism for multiple features simultaneously.
Args:
    X (pd.DataFrame): The dataset to receive the missing data.
    y (np.array): The label values from the dataset.
    n_xmiss (int): The number of features in the dataset that will receive missing values. Default is 2.
    missTarget (bool, optional): A flag to also generate missing values in the target.
Example Usage:
# Create an instance of the mMAR class
generator = mMAR(X, y, n_xmiss=4)
# Generate missing values using the random strategy
data_md = generator.random(missing_rate=20)
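To make the MAR idea concrete, here is a minimal sketch that does not use mMAR at all. The helper `mar_sketch` and its `targets` and `drivers` arguments are hypothetical names introduced for illustration, and the lowest-values rule is just one possible choice. Each target column loses its values in the rows where a paired, fully observed driver column is smallest, so missingness depends only on observed data.

```python
import numpy as np
import pandas as pd


def mar_sketch(X: pd.DataFrame, targets, drivers, missing_rate: int = 20) -> pd.DataFrame:
    """Hypothetical helper, not part of mMAR: mask each target column
    in the rows where its paired driver column is smallest."""
    out = X.copy()
    n_missing = int(round(len(X) * missing_rate / 100))
    for target, driver in zip(targets, drivers):
        # Rows with the lowest observed driver values lose the target value,
        # so missingness depends only on observed data (MAR).
        rows = X[driver].nsmallest(n_missing).index
        out.loc[rows, target] = np.nan
    return out


rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 4)), columns=["a", "b", "c", "d"])
data_md = mar_sketch(X, targets=["a", "b"], drivers=["c", "d"], missing_rate=20)
missing_per_column = data_md.isna().sum()
```

In this sketch the driver columns `c` and `d` stay complete, which is what allows the missingness in `a` and `b` to be explained by observed values alone.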
random
Generate missing data using parallel processing.
correlated
Generate missing data using parallel processing.
median
Generate missing data using parallel processing.
pattern_missingness
pattern_missingness(patterns: List[Dict] = None, missing_rate: int = 10, std: bool = True, verbose: bool = False, seed: Optional[int] = None, lower_range: float = -3, upper_range: float = 3, max_diff_with_target: float = 0.001, max_iter: int = 100)
Generate missing data using pattern-based multivariate amputation.
References:
    [2] van Buuren, S., J. P. L. Brand, C. G. M. Groothuis-Oudshoorn, and D. B. Rubin. Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12):1049–1064, 2006.
    [3] Schouten, R. M., P. Lugtig, and G. Vink. Generating missing values for simulation purposes: a multivariate amputation procedure. Journal of Statistical Computation and Simulation, 88(15):2909–2930, 2018.
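As a rough illustration of pattern-based amputation in the spirit of [3], the sketch below handles a single pattern with NumPy only. It is not the library's implementation: the function `ampute_one_pattern` and its `incomplete_cols` and `weights` arguments are hypothetical. The names `lower_range`, `upper_range`, `max_diff_with_target`, and `max_iter` are borrowed from the signature above, and their role here (bounding a bisection search for a logistic shift) is an assumption.

```python
import numpy as np


def ampute_one_pattern(X, incomplete_cols, weights, missing_rate=10,
                       lower_range=-3.0, upper_range=3.0,
                       max_diff_with_target=0.001, max_iter=100, seed=None):
    """Hypothetical single-pattern amputation sketch (not the mMAR code)."""
    rng = np.random.default_rng(seed)
    target = missing_rate / 100
    # Standardize columns so the weights are comparable across features.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Weighted sum score per row; zero weight means the column is ignored.
    scores = Z @ np.asarray(weights, dtype=float)
    lo, hi = lower_range, upper_range
    for _ in range(max_iter):
        # Bisection on the shift of a logistic curve until the mean
        # missingness probability is close enough to the target rate.
        shift = (lo + hi) / 2
        probs = 1.0 / (1.0 + np.exp(-(scores + shift)))
        diff = probs.mean() - target
        if abs(diff) <= max_diff_with_target:
            break
        if diff > 0:
            hi = shift
        else:
            lo = shift
    # Draw which rows receive the pattern, then blank the pattern's columns.
    rows = np.flatnonzero(rng.random(len(X)) < probs)
    out = X.astype(float).copy()
    out[np.ix_(rows, incomplete_cols)] = np.nan
    return out, probs


rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
# Columns 0 and 1 become incomplete; only the observed column 2 drives the score.
amputed, probs = ampute_one_pattern(X, incomplete_cols=[0, 1],
                                    weights=[0, 0, 1], missing_rate=20, seed=1)
```

Because the weights are zero for the columns that become incomplete, the probability of missingness depends only on the observed column, which keeps this example within the MAR mechanism.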