Welcome to miceforest’s documentation!
miceforest imputes missing data using LightGBM in an iterative method known as Multiple Imputation by Chained Equations (MICE). It was designed to be:
- Fast
  - Uses LightGBM as a backend
  - Has efficient mean matching solutions
  - Can utilize GPU training
- Flexible
  - Can impute pandas dataframes and numpy arrays
  - Handles categorical data automatically
  - Fits into a scikit-learn pipeline
  - Lets the user customize every aspect of the imputation process
- Production Ready
  - Can impute new, unseen datasets very quickly
  - Compresses kernels efficiently during saving and loading
  - Can impute data in place to save memory
  - Can build models on non-missing data
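
To make the chained-equations idea from the introduction more concrete, here is a minimal, standard-library-only sketch of the iterative loop. It is not miceforest's implementation: a simple least-squares line stands in for LightGBM, there is no mean matching, and only a single imputed dataset is produced. The helper names `fit_line` and `chained_imputation` and the two-column toy data are assumptions made purely for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def chained_imputation(rows, iterations=10):
    """Fill None cells in two-column rows and return a new list of rows.

    Illustrative only: assumes each column has at least one observed value.
    """
    data = [list(row) for row in rows]  # work on a copy, leave the input intact
    missing = [
        [i for i, row in enumerate(rows) if row[col] is None] for col in (0, 1)
    ]

    # Step 1: start every missing cell at its column's observed mean.
    for col in (0, 1):
        observed = [row[col] for row in rows if row[col] is not None]
        col_mean = sum(observed) / len(observed)
        for i in missing[col]:
            data[i][col] = col_mean

    # Step 2: repeatedly re-model each column from the other one.
    for _ in range(iterations):
        for target in (0, 1):
            predictor = 1 - target
            skip = set(missing[target])
            # Train only on rows where the target was actually observed.
            train = [row for i, row in enumerate(data) if i not in skip]
            intercept, slope = fit_line(
                [row[predictor] for row in train],
                [row[target] for row in train],
            )
            # Overwrite the previously imputed cells with fresh predictions.
            for i in missing[target]:
                data[i][target] = intercept + slope * data[i][predictor]
    return data


# Toy data following y = 2x + 1, with one value removed from each column.
rows = [[float(x), 2.0 * x + 1.0] for x in range(10)]
rows[3][1] = None
rows[7][0] = None
filled = chained_imputation(rows)
```

Each pass refits one column using the current imputations of the other, so the filled cells in `filled[3]` and `filled[7]` move away from the initial column means and toward the values implied by the linear relation. A library such as miceforest replaces the line fit with a gradient-boosted model per variable and repeats the process for several datasets.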
Extensive beginner and advanced tutorials are available in the GitHub README. Below is a table of contents for the topics covered: