Welcome to miceforest’s documentation!

miceforest imputes missing data using LightGBM in an iterative method known as Multiple Imputation by Chained Equations (MICE). It was designed to be:

Fast
- Uses lightgbm as a backend
- Has efficient mean matching solutions.
- Can utilize GPU training
Flexible
- Can impute pandas dataframes and numpy arrays
- Handles categorical data automatically
- Fits into a sklearn pipeline
- User can customize every aspect of the imputation process
Production Ready
- Can impute new, unseen datasets very quickly
- Kernels are efficiently compressed during saving and loading
- Data can be imputed in place to save memory
- Can build models on non-missing data

There are very extensive beginner and advanced tutorials on the github readme. Below is a table of contents for the topics covered:

Contents: