fusilli.fusionmodels.tabularfusion.mcvae_model
This module implements the MCVAE (multi-channel variational autoencoder) model for fusing two types of tabular data.
Functions
mcvae_early_stopping_tol | Simple early stopping function for the MCVAE model's training.
Classes
MCVAESubspaceMethod | Class for creating the MCVAE (multi-channel variational autoencoder) joint latent space.
MCVAE_tab | This class implements a model that fuses the two types of tabular data using the MCVAE approach.
- class MCVAESubspaceMethod(datamodule, k=None, max_epochs=5000, train_subspace=True)
Bases: object
Class for creating the MCVAE (multi-channel variational autoencoder) joint latent space.
To change the tolerance or patience for early stopping, pass extra keyword arguments to the function prepare_fusion_data. For example:
"mcvae_patience": value, "mcvae_tolerance": value,
where the value for mcvae_patience is the number of epochs to wait and the value for mcvae_tolerance is the tolerance on the loss.
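As a rough sketch of the keyword arguments described above, the two entries could be collected in a dictionary and unpacked into the call. The values 20 and 5 are made up for illustration, and the commented-out call only indicates where the dictionary would go; the other arguments of prepare_fusion_data are not listed on this page, so they are left out.

```python
# Illustrative values only (assumptions, not library defaults).
mcvae_kwargs = {
    "mcvae_patience": 20,   # number of epochs to wait before stopping
    "mcvae_tolerance": 5,   # tolerance on the loss
}

# Assumed usage: unpack the dictionary alongside the function's other
# arguments, which are omitted here because this page does not list them.
# datamodule = prepare_fusion_data(<other arguments>, **mcvae_kwargs)
```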
- datamodule
Datamodule object containing the data.
- Type: datamodule object
- num_latent_dims
Number of latent dimensions.
- Type: int
- fit_model
Mcvae object containing the fitted model.
- Type: Mcvae object
- __init__(datamodule, k=None, max_epochs=5000, train_subspace=True)
- Parameters:
datamodule (datamodule object) – Datamodule object containing the data.
k (int, optional) – Number of latent dimensions, by default None.
max_epochs (int, optional) – Maximum number of epochs, by default 5000.
train_subspace (bool, optional) – Whether to train the subspace model, by default True.
- convert_to_latent(test_dataset)
Converts the test dataset to the latent space.
- Parameters:
test_dataset (list) – List containing the two types of tabular data.
- Returns:
test_mean_latents (torch.Tensor) – Tensor containing the mean latents of the dataset.
labels (pd.DataFrame) – Dataframe containing the labels of the dataset.
[self.num_latent_dims, None, None] (list) – List containing the dimensions of the data.
- get_latents(dataset)
Gets the latent representations of the multimodal dataset. The two latent spaces are averaged to form the joint latent space.
- Parameters:
dataset (list) – List containing the two types of tabular data.
- Returns:
mean_latents – Array containing the mean latents of the dataset.
- Return type: np.array
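The averaging step mentioned in get_latents can be illustrated with a small NumPy sketch. This is not the library's code: the function name average_latents and the toy arrays are assumptions, and the sketch only shows an element-wise mean of two per-modality latent arrays of equal shape.

```python
import numpy as np


def average_latents(latents_a, latents_b):
    """Element-wise mean of two latent arrays (hypothetical helper)."""
    a = np.asarray(latents_a, dtype=float)
    b = np.asarray(latents_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both latent arrays must have the same shape")
    # Stack along a new leading axis, then average over that axis.
    return np.mean(np.stack([a, b]), axis=0)


# Toy latents: 2 samples, 2 latent dimensions per modality.
z_a = np.array([[0.0, 2.0], [4.0, 6.0]])
z_b = np.array([[2.0, 4.0], [0.0, 2.0]])
joint = average_latents(z_a, z_b)
```

Each row of the result is one sample in the joint latent space, with the same number of columns as the per-modality latents.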
- load_ckpt(checkpoint_path)
Loads the checkpoint.
- Parameters:
checkpoint_path (list) – List containing the path to the checkpoint.
- Return type: None
- train(train_dataset, val_dataset=None)
Trains the model.
- Parameters:
train_dataset (list) – List containing the two types of tabular data.
val_dataset (list, optional) – List containing the two types of tabular data, by default None.
- Returns:
mean_latents (torch.Tensor) – Tensor containing the mean latents of the dataset.
labels (pd.DataFrame) – Dataframe containing the labels of the dataset.
- class MCVAE_tab(prediction_task, data_dims, multiclass_dimensions)
Bases: ParentFusionModel, Module
This class implements a model that fuses the two types of tabular data using the MCVAE (multi-channel variational autoencoder) approach.
The MCVAE creates a joint latent space of the two types of tabular data based on a joint latent prior and joint decoding.
References
- Antelmi, L., Ayache, N., Robert, P., & Lorenzi, M. (2019). Sparse Multi-Channel Variational Autoencoder for the Joint Analysis of Heterogeneous Data. Proceedings of the 36th International Conference on Machine Learning, 302–311. https://proceedings.mlr.press/v97/antelmi19a.html
- subspace_method
Class of the subspace method: MCVAESubspaceMethod
- Type: class
- latent_space_layers
Dictionary containing the layers of the first type of tabular data. Here, the first type of tabular data is the joint latent space created by the MCVAESubspaceMethod class.
- Type: dict
- fused_dim
Number of features of the fused layers. This is the flattened output size of the latent space layers.
- Type: int
- fused_layers
Sequential layer containing the fused layers.
- Type: nn.Sequential
- final_prediction
Sequential layer containing the final prediction layers.
- Type: nn.Sequential
- __init__(prediction_task, data_dims, multiclass_dimensions)
- Parameters:
prediction_task (str) – Type of prediction to be performed.
data_dims (list) – List containing the dimensions of the data.
multiclass_dimensions (int) – Number of classes in the multiclass classification task.
- forward(x)
Forward pass of the model.
- Parameters:
x (torch.Tensor) – Input data: the joint latent space of the two types of tabular data.
- Returns:
out_pred – List containing the predictions.
- Return type: list
- fusion_type = 'subspace'
Type of fusion.
- Type: str
- method_name = 'MCVAE Tabular'
Name of the method.
- Type: str
- modality_type = 'tabular_tabular'
Type of modality.
- Type: str
- subspace_method
alias of MCVAESubspaceMethod
- mcvae_early_stopping_tol(patience, tolerance, loss_logs, verbose=False)
Simple early stopping function for the MCVAE model's training.
- Parameters:
patience (int) – Number of epochs to wait before stopping.
tolerance (int) – Tolerance for loss.
loss_logs (list) – List of loss logs.
verbose (bool) – Whether to print out information.
- Returns:
i – Epoch to stop at.
- Return type: int
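To make the roles of patience and tolerance concrete, here is a minimal sketch of one common plateau-based stopping rule. It is an assumption about how such a function might behave, not the library's implementation: the name early_stopping_epoch, the exact stopping criterion (loss change below tolerance for patience consecutive epochs), and the example losses are all hypothetical.

```python
def early_stopping_epoch(patience, tolerance, loss_logs, verbose=False):
    """Return the epoch index at which training would stop.

    Hypothetical rule: stop once the loss has changed by less than
    `tolerance` for `patience` consecutive epochs.
    """
    if not loss_logs:
        raise ValueError("loss_logs must contain at least one value")

    stalled = 0  # consecutive epochs with a small change in loss
    for i in range(1, len(loss_logs)):
        if abs(loss_logs[i] - loss_logs[i - 1]) < tolerance:
            stalled += 1
            if stalled >= patience:
                if verbose:
                    print(f"Stopping at epoch {i}")
                return i
        else:
            # A large change resets the plateau counter.
            stalled = 0
    # No plateau found: fall back to the last recorded epoch.
    return len(loss_logs) - 1


# Made-up loss curve that flattens out after the third epoch.
losses = [100.0, 60.0, 40.0, 39.5, 39.4, 39.3, 39.2]
stop_epoch = early_stopping_epoch(patience=3, tolerance=1, loss_logs=losses)
```

Under this rule, a larger patience waits longer on a flat loss curve, and a larger tolerance treats bigger fluctuations as a plateau.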