{% hint style="info" %} This page provides an overview of the Federated Learning module in MEDomicsLab, offering insights into both the application's interface and the backend package employed for conducting experiments. {% endhint %}
The Federated Learning Module in MEDomicsLab simulates the process of federated learning and allows for training models in a decentralized manner using multiple datasets. This approach preserves privacy and enhances data security by ensuring that data never leaves its original location.
- Decentralized Training: Models are trained across multiple nodes without transferring raw data.
- Privacy Preservation: Techniques such as differential privacy ensure data confidentiality.
- Hyperparameter Optimization: Tools to automatically tune and optimize model hyperparameters for improved performance.
- Transfer Learning: Pre-trained models can be used to initialize the central server and improve model performance.
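The decentralized training idea can be illustrated with a minimal federated-averaging (FedAvg) sketch. This is a standalone toy example, independent of MEDfl: all function and variable names here are illustrative assumptions, and the "local training" step is a stand-in for real client-side SGD. The key point is that clients only share model weights, never raw data.

```python
import numpy as np

def local_train(weights, data, lr=0.1):
    """Stand-in for one round of local training on a client's private data.

    The 'model' is a single weight vector nudged toward the client's data
    mean; a real client would run SGD on its own dataset instead.
    """
    target = data.mean(axis=0)
    return weights + lr * (target - weights)

def fed_avg(client_weights, client_sizes):
    """FedAvg aggregation: a weighted average of client updates,
    proportional to each client's dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two simulated clients whose raw data never leaves their local scope
rng = np.random.default_rng(0)
clients = [rng.normal(0.0, 1.0, size=(50, 3)),
           rng.normal(2.0, 1.0, size=(100, 3))]

global_w = np.zeros(3)
for _ in range(5):
    # Each client trains locally and sends back only its updated weights
    updates = [local_train(global_w, data) for data in clients]
    # The server aggregates the updates into the new global model
    global_w = fed_avg(updates, [len(d) for d in clients])
```

After a few rounds, the global weights drift toward a size-weighted compromise between the clients, without either dataset ever being transferred.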
{% embed url="https://youtu.be/7vQFheeRvp4" %}
The Federated Learning module in the MEDomicsLab application is powered by MEDfl, a standalone Python package designed for simulating federated learning.
You can also use MEDfl independently from the app to create your networks and pipelines directly with code. Below is a brief example demonstrating how to do that.
```shell
# Install MEDfl
pip install MEDfl
```

```python
# Import MEDfl
import MEDfl
# ... import the rest of the dependencies here
# (base_url, config and server_rounds are assumed to be defined elsewhere)

# Create a network
Net_1 = Network(name="Auto_Net")
Net_1.create_network()

# Create a MasterDataset from Net_1
Net_1.create_master_dataset()

# Create the FL setup
autoFl = FLsetup(name="Flsetup_1",
                 description="The first fl setup",
                 network=Net_1)
autoFl.create()

# Create a node and upload its dataset
hospital = Node(name="hospital_1", train=1)
Net_1.add_node(hospital)
hospital.upload_dataset("hospital_1_dataset",
                        base_url + "/notebooks/data/nodesData/output_1.csv")

# Create the federated dataset
fl_dataset = autoFl.create_federated_dataset(
    output="deceased",
    fit_encode=[],
    to_drop=["deceased"],
)

# Load the pre-trained model used to initialize the central server
global_model = Model.load_model(
    "../../notebooks/.ipynb_checkpoints/trainedModels/grid_search_classifier.pth")

# Create the aggregation strategy
aggreg_algo = Strategy(config["aggreg_algo"],
                       fraction_fit=1.0,
                       fraction_evaluate=1.0,
                       min_fit_clients=2,
                       min_evaluate_clients=2,
                       min_available_clients=2,
                       initial_parameters=global_model.get_parameters())
aggreg_algo.create_strategy()

# Create the server
server = FlowerServer(global_model,
                      strategy=aggreg_algo,
                      num_rounds=server_rounds,
                      num_clients=len(fl_dataset.trainloaders),
                      fed_dataset=fl_dataset,
                      diff_privacy=config["dp_activate"],
                      # Adjust the resources allocated to each client
                      # based on your machine
                      client_resources={"num_cpus": 1.0, "num_gpus": 0.0})

# Create the pipeline
ppl_1 = FLpipeline(name="fl_pipeline_1",
                   description="This is our first FL pipeline",
                   server=server)

# Run the training
history = ppl_1.server.run()

# Test the model
report = ppl_1.auto_test()
```
For more detailed examples, you can check the tutorials on the GitHub repository.
The interface of the MEDfl module in the MEDomicsLab application provides a user-friendly space where you can visually manage and connect multiple nodes to create your federated learning pipelines. Each node type in the interface has a specific role and attributes, allowing you to build and customize your federated learning networks seamlessly.
Below is a table explaining the role and attributes of each node:
| Node | Description | Input | Output |
|---|---|---|---|
| ![]() | The Dataset Node is where you specify the master dataset for your experiment. The master dataset is used differently depending on the type of network you create. To select a master dataset, click the "Select Dataset" button, choose the file, and specify the target of the dataset. | / | Dataset |
| ![]() | The Network Node is responsible for creating the federated network. A new screen appears when you click on it, displaying additional node types: the Client Node and the Server Node. You can add multiple clients and a central server that aggregates the results. | Dataset | Network |
| ![]() | The FL Setup Node configures the federated learning setup. The user only needs to specify the name and description of the setup. | Network | FL setup |
| ![]() | The FL Dataset Node creates the federated dataset, generating train, test, and validation loaders from the clients' datasets. To create a federated dataset, the user must specify two parameters. | FL setup | FL dataset |
| ![]() | The Model Node creates the model that initializes the federated learning process. The user has several options depending on whether transfer learning is activated. | FL dataset | Model |
| ![]() | The Optimize Node handles hyperparameter optimization. Users can choose among several optimization methods. For more details on Optuna, you can find additional information here. | Model + Dataset | Model |
| ![]() | The FL Strategy Node creates the server strategy used to aggregate the clients' results and manage the network. | Model | FL strategy |
| ![]() | The Train Model Node defines the client resources for training, specifying whether to use GPU or CPU resources during the training process. | FL strategy | Train results |
| ![]() | The Save Results Node saves the training results of the pipeline. | Train results | Save results |
| ![]() | The Merge Results Node merges two or more results files into one file. | Save results | / |
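To give a feel for what the Optimize Node automates, here is a minimal, self-contained random-search loop over a small hyperparameter space. The objective function and search ranges are hypothetical stand-ins, not the node's actual internals; Optuna performs this kind of search with smarter samplers and pruning.

```python
import random

def validation_score(lr, hidden):
    """Stand-in for training a model and returning a validation loss.

    Purely illustrative: the optimum is placed at lr=0.01, hidden=64.
    """
    return (lr - 0.01) ** 2 + ((hidden - 64) ** 2) * 1e-4

def random_search(n_trials=200, seed=42):
    """Sample random hyperparameter configurations and keep the best one."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)   # log-uniform learning rate
        hidden = rng.randint(16, 128)    # hidden layer size
        score = validation_score(lr, hidden)
        if best is None or score < best[0]:
            best = (score, {"lr": lr, "hidden": hidden})
    return best

score, params = random_search()
```

A framework like Optuna replaces the hand-rolled loop with a study object that records trials, supports log-scaled and conditional search spaces, and can stop unpromising trials early.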