%% Cell type:markdown id:ea5b6890 tags:
# Scalable Methods of Artificial Intelligence
Dr. Charlotte Debus (charlotte.debus@kit.edu)
Dr. Markus Götz (markus.goetz@kit.edu)
Dr. Marie Weiel (marie.weiel@kit.edu)
Dr. Kaleb Phipps (kaleb.phipps@kit.edu)
## Exercise 1 on November 19, 2024: Parallel k-Means Clustering
In this first exercise, we look at k-means cluster analysis and possible parallelization approaches (see the lecture from November 7, 2024). We use the [Cityscapes](https://www.cityscapes-dataset.com/) dataset, which provides, among other things, 5000 high-resolution images of street scenes from 50 different cities.
Each image consists of 2048 x 1024 pixels with three 8-bit RGB color channels (256 intensity levels each) per pixel. Flattened, the images form a "short-fat" matrix with 5000 x 6,291,456 serial entries: 5000 images x (3 channels x 2048 pixels x 1024 pixels) = 5000 x 6,291,456.
For our task, we use 300 of these samples. You can find them on the bwUniCluster in the workspace `VL-ScalableAI` under the following path:
`/pfs/work7/workspace/scratch/ku4408-VL-ScalableAI/data/cityscapes_300.h5`
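To get a first feel for the data layout, the following minimal sketch (an illustration only; it assumes `h5py` is installed and uses the dataset key `cityscapes_data` that also appears in the code below) opens the file read-only and prints the shape and dtype of the dataset:
``` python
import h5py

path = "/pfs/work7/workspace/scratch/ku4408-VL-ScalableAI/data/cityscapes_300.h5"
with h5py.File(path, "r") as handle:  # Open the HDF5 container read-only.
    dset = handle["cityscapes_data"]  # Datasets are accessed like dictionary entries.
    print(dset.shape, dset.dtype)     # Expect 300 samples x 6,291,456 features.
```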
### Task 1
Below you find a serial implementation of the k-means algorithm in Python 3 using the machine learning library [PyTorch](https://pytorch.org/).
Run the code on a CPU-based node as well as on a GPU of the bwUniCluster. Note that the code must be adapted for GPU usage. Compare the runtimes. What do you notice?
*Hint: First load the required modules on the bwUniCluster. Then set up a Python virtual environment in which you install the required Python packages. Based on the code below, create a Python script that you submit to the cluster via SLURM using a bash script (see the exercise from November 5, 2024). Below you find a template of the submit script for the CPU job, including the required modules. For GPU usage, the #SBATCH options must be adapted accordingly. You can find further information [here](https://wiki.bwhpc.de/e/BwUniCluster_2.0_Slurm_common_Features).*
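As a starting point for the GPU adaptation, here is a minimal sketch of the device selection (an illustration only, assuming a CUDA-capable GPU is allocated to the job); tensors created with `device=device`, as in the data loading below, then live on the GPU, and the distance computations in `fit` run there as well:
``` python
import torch

# Pick the GPU if one is visible to the job, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running on {device}.")
```
The job script itself is submitted with `sbatch`, e.g., `sbatch submit_kmeans_cpu.sh` (the file name here is just an example).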
%% Cell type:code id:27a9e43b tags:
``` python
#!/bin/bash
#SBATCH --job-name=kmeans_cpu              # job name
#SBATCH --partition=single                 # queue for resource allocation
#SBATCH --time=30:00                       # wall-clock time limit
#SBATCH --mem=40000                        # memory
#SBATCH --nodes=1                          # number of nodes to be used
#SBATCH --mail-type=ALL                    # Notify user by email when certain event types occur.
#SBATCH --mail-user=u????@student.kit.edu  # notification email address

export VENVDIR=<path/to/your/venv/folder>  # Export path to your Python3.11 virtual environment.
export PYDIR=<path/to/your/python/script>  # Export path to directory containing Python script.

# Set up modules.
module purge                   # Unload all currently loaded modules.
module load compiler/gnu/13.3  # Load required modules.
module load mpi/openmpi/4.1
module load devel/cuda/12.4
module load lib/hdf5/1.14.4-gnu-13.3-openmpi-4.1

source ${VENVDIR}/bin/activate  # Activate your virtual environment.

python -u ${PYDIR}/kmeans.py  # Run your Python script.
```
%% Cell type:code id:4113e8f9-128b-4d10-add7-4a64d470456e tags:
``` python
"""
Serial implementation of k-means clustering in PyTorch
"""
import time

import h5py
import torch


class KMeans:
    """
    Serial k-means clustering in PyTorch.

    Attributes
    ----------
    n_clusters : int
        The number of clusters, i.e., k.
    max_iter : int
        The maximum number of iterations to perform.
    tol : float
        The tolerance for the convergence criterion.
    _centroids : Union[None, torch.Tensor]
        The current centroids.
    _matching_centroids : Union[None, torch.Tensor]
        Assigned centroids for all samples in dataset.
    _inertia : float
        The inertia (quantity to be checked for convergence).

    Methods
    -------
    _initialize_centroids(x)
        Randomly initialize centroids.
    _fit_to_cluster(x)
        Get the closest centroid for each sample in dataset.
    fit(x)
        Perform k-means clustering.
    """

    def __init__(
        self, n_clusters: int = 8, max_iter: int = 300, tol: float = -1.0
    ) -> None:
        """
        Configure k-means clustering algorithm.

        Parameters
        ----------
        n_clusters : int
            The number of clusters, i.e., k.
        max_iter : int
            The maximum number of iterations to be performed.
        tol : float
            The tolerance for the convergence criterion.
        """
        self.n_clusters = n_clusters  # Number of clusters
        self.max_iter = max_iter  # Maximum number of iterations
        self._centroids = None
        self._matching_centroids = None
        self.tol = tol  # Tolerance for convergence criterion
        self._inertia = float("nan")

    def _initialize_centroids(self, x: torch.Tensor) -> None:
        """
        Randomly initialize the centroids.

        Parameters
        ----------
        x : torch.Tensor
            The dataset to be clustered.
        """
        # Shuffle data and choose first `n_clusters` samples as initial centroids.
        self._centroids = x[torch.randperm(x.shape[0])[: self.n_clusters]]

    def _fit_to_cluster(self, x: torch.Tensor) -> torch.Tensor:
        """
        Determine the closest centroid for each sample in dataset as measured by the Euclidean distance.

        Parameters
        ----------
        x : torch.Tensor
            The dataset to be clustered.

        Returns
        -------
        torch.Tensor
            Indices of matching centroids for each sample in dataset.
        """
        distances = torch.cdist(
            x, self._centroids
        )  # Calculate Euclidean distance of each data sample to each current centroid.
        return distances.argmin(
            dim=1, keepdim=True
        )  # Return index of the closest centroid for each sample.

    def fit(self, x: torch.Tensor) -> "KMeans":
        """
        Perform k-means clustering of given dataset.

        Parameters
        ----------
        x : torch.Tensor
            The dataset to cluster.

        Returns
        -------
        KMeans
            The fitted KMeans object containing final centroids.
        """
        self._initialize_centroids(x)  # Initialize centroids.
        new_cluster_centers = self._centroids.clone()
        # Iteratively fit points to centroids.
        for idx in range(self.max_iter):
            # Determine index of the closest centroid for each sample in dataset.
            print(f"Iteration {idx}...")
            self._matching_centroids = self._fit_to_cluster(
                x
            )  # Array of length `n_samples` providing index of closest centroid for each sample in dataset.
            # Update centroids.
            for i in range(self.n_clusters):  # Loop over clusters.
                # Determine all points in current cluster.
                selection_mask = (self._matching_centroids == i).type(torch.int64)
                # Array of length `n_samples` with binary encoding of whether each sample belongs to cluster i or not.
                assigned_points = (x * selection_mask).sum(
                    axis=0, keepdim=True
                )  # Compute vectorial sum of all points in current cluster.
                points_in_cluster = selection_mask.sum(axis=0, keepdim=True).clamp(
                    1, torch.iinfo(torch.int64).max
                )  # Compute number of points in current cluster.
                new_cluster_centers[i : i + 1, :] = (
                    assigned_points / points_in_cluster
                )  # Compute new centroids.
            # Check whether centroid movement has converged.
            self._inertia = (
                (self._centroids - new_cluster_centers) ** 2
            ).sum()  # Update inertia.
            self._centroids = new_cluster_centers.clone()
            if (
                self.tol is not None and self._inertia <= self.tol
            ):  # Check whether inertia is smaller than tolerance.
                break
        return self


if __name__ == "__main__":
    print(
        "##############################\n"
        "# PyTorch k-Means Clustering #\n"
        "##############################"
    )
    path = "/pfs/work7/workspace/scratch/ku4408-VL-ScalableAI/data/cityscapes_300.h5"
    dataset = "cityscapes_data"

    ## ADAPT CODE HERE TO ENABLE GPU USAGE:
    device = torch.device("cpu")

    print(f"Loading dataset from {path}[{dataset}]...")
    # Data is available in HDF5 format.
    # An HDF5 file is a container for two kinds of objects:
    # - datasets: array-like collections of data
    # - groups: folder-like containers holding datasets and other groups
    # Most fundamental thing to remember when using h5py is:
    # Groups work like dictionaries, and datasets work like NumPy arrays.
    # Open file for reading. We use the Cityscapes dataset.
    with h5py.File(path, "r") as handle:
        print("Open h5 file...")
        data = torch.tensor(
            handle[dataset][:300], device=device
        )  # Default device is "cpu"; set device to "cuda" for GPU usage.
    print("Torch tensor created.")

    # k-means hyperparameters
    num_clusters = 8
    num_iterations = 20

    kmeans_clusterer = KMeans(n_clusters=num_clusters, max_iter=num_iterations)
    print("Start fitting the data...")
    start = time.perf_counter()  # Start timer.
    kmeans_clusterer.fit(data)  # Perform k-means clustering.
    print(f"DONE.\nRun time:\t{time.perf_counter() - start} s")  # Measure and print runtime.
```
%% Cell type:markdown id:6c7900f7 tags:
### Task 2
Starting from the code above, implement a sample-parallel version of the k-means algorithm. Use the serial implementation above as a guide.
The interface, i.e., the way the class is used in the actual execution part of the code, should stay the same. For the parallelization, you need a correspondingly parallelized dataloader, which you can find in the code fragment below. Test your code on four nodes of the bwUniCluster. A matching bash submit script is given below.
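As a rough orientation (not the official solution), here is a minimal sketch of the global centroid update, assuming each rank has already computed per-cluster feature sums and point counts for its local chunk of samples exactly as in the serial code, and assuming CPU tensors that can be handed to mpi4py's buffer-based `Allreduce`:
``` python
import numpy as np
import torch
from mpi4py import MPI


def allreduce_centroid_update(comm: MPI.Comm, local_sums: torch.Tensor, local_counts: torch.Tensor) -> torch.Tensor:
    """Combine per-rank cluster sums and counts into globally updated centroids."""
    # local_sums: (k, n_features) sum of all local samples assigned to each cluster.
    # local_counts: (k, 1) number of local samples assigned to each cluster.
    global_sums = np.zeros_like(local_sums.numpy())
    global_counts = np.zeros_like(local_counts.numpy())
    comm.Allreduce(local_sums.numpy(), global_sums, op=MPI.SUM)      # Element-wise sum over all ranks.
    comm.Allreduce(local_counts.numpy(), global_counts, op=MPI.SUM)  # Total number of points per cluster.
    counts = torch.from_numpy(global_counts).clamp(min=1)            # Avoid division by zero for empty clusters.
    return torch.from_numpy(global_sums) / counts                    # New centroids, identical on all ranks.
```
Note that the initial centroids must also agree across ranks, e.g., by determining them on rank 0 and broadcasting them, so that every rank assigns its local samples against the same cluster centers.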
%% Cell type:code id:691eeb17 tags:
``` python
#!/bin/bash
#SBATCH --job-name=kmeans_sample           # job name
#SBATCH --partition=multiple               # queue for the resource allocation
#SBATCH --time=30:00                       # wall-clock time limit
#SBATCH --mem=40000                        # memory per node
#SBATCH --nodes=4                          # number of nodes to be used
#SBATCH --cpus-per-task=40                 # number of CPUs required per MPI task
#SBATCH --ntasks-per-node=1                # maximum count of tasks per node
#SBATCH --mail-type=ALL                    # Notify user by email when certain event types occur.
#SBATCH --mail-user=u????@student.kit.edu  # notification email address

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export VENVDIR=<path/to/your/venv/folder>  # Export path to your Python3.11 virtual environment.
export PYDIR=<path/to/your/python/script>  # Export path to directory containing Python script.

# Set up modules.
module purge                   # Unload all currently loaded modules.
module load compiler/gnu/13.3  # Load required modules.
module load mpi/openmpi/4.1
module load devel/cuda/12.4
module load lib/hdf5/1.14.4-gnu-13.3-openmpi-4.1

source ${VENVDIR}/bin/activate  # Activate your virtual environment.

mpirun python ${PYDIR}/kmeans_sample_parallel.py  # Run your Python script in parallel.
```
%% Cell type:code id:eac3eac5 tags:
``` python
"""
Sample-parallel implementation of k-means clustering in PyTorch using MPI
"""
import time

import h5py
import torch
from mpi4py import MPI


class KMeans:
    """
    Sample-parallel k-means clustering in PyTorch using MPI.
    """

    def __init__(
        self,
        comm: MPI.Comm = MPI.COMM_WORLD,
        n_clusters: int = 8,
        max_iter: int = 300,
        tol: float = -1.0,
    ) -> None:
        """Configure sample-parallel k-means clustering algorithm."""
        self.comm = comm  # The communicator used
        pass

    ## IMPLEMENT SAMPLE-PARALLEL K-MEANS CLUSTERING VERSION HERE!


if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    rank, size = comm.rank, comm.size

    if rank == 0:
        print(
            "#################################################\n"
            "# Sample-Parallel k-Means Clustering in PyTorch #\n"
            "#################################################"
        )

    path = "/pfs/work7/workspace/scratch/ku4408-VL-ScalableAI/data/cityscapes_300.h5"
    dataset = "cityscapes_data"

    if rank == 0:
        print(f"Loading dataset from {path}[{dataset}]...")

    # Dataset is split along the sample axis.
    # Each rank loads exclusive chunk of original dataset.
    with h5py.File(path, "r") as handle:
        chunk = int(handle[dataset].shape[0] / size)
        if rank == size - 1:
            data = torch.tensor(handle[dataset][rank * chunk :])
        else:
            data = torch.tensor(handle[dataset][rank * chunk : (rank + 1) * chunk])

    print("\t[OK]")

    # k-means hyperparameters
    num_clusters = 8
    num_iterations = 20

    kmeans_clusterer = KMeans(comm=comm, n_clusters=num_clusters, max_iter=num_iterations)

    if rank == 0:
        print("Start fitting the data...")
    start = time.perf_counter()  # Start runtime measurement.
    kmeans_clusterer.fit(data)  # Perform actual k-means clustering.

    if rank == 0:
        print(f"DONE.\nRun time:\t{time.perf_counter() - start} s")  # Measure and print runtime.
```
%% Cell type:markdown id:d5379ea7 tags:
### Task 3
Starting from the code above, implement a feature-parallel version of the k-means algorithm. You can find the correspondingly parallelized dataloader in the code fragment below. Test your code on four nodes of the bwUniCluster.
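As a rough orientation (not the official solution), here is a minimal sketch of the cluster assignment step in the feature-parallel setting, assuming every rank holds all samples but only its own contiguous slice of the features (as produced by the dataloader below), so the squared partial distances have to be summed over all ranks before taking the argmin:
``` python
import numpy as np
import torch
from mpi4py import MPI


def assign_clusters_feature_parallel(comm: MPI.Comm, x_local: torch.Tensor, centroids_local: torch.Tensor) -> torch.Tensor:
    """Assign each sample to its closest centroid when the features are split over ranks."""
    # x_local: (n_samples, n_local_features), centroids_local: (k, n_local_features).
    partial_sq_dist = torch.cdist(x_local, centroids_local) ** 2  # Squared distances over the local feature slice.
    global_sq_dist = np.zeros_like(partial_sq_dist.numpy())
    comm.Allreduce(partial_sq_dist.numpy(), global_sq_dist, op=MPI.SUM)  # Sum partial squared distances over all feature chunks.
    return torch.from_numpy(global_sq_dist).argmin(dim=1)  # Closest centroid per sample, identical on all ranks.
```
Since every rank then knows the same global assignments, the centroid update itself can remain local: each rank only recomputes the coordinates of the new centroids for its own feature slice.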
%% Cell type:code id:217173aa tags:
``` python
"""
Feature-parallel implementation of k-means clustering in PyTorch using MPI
"""
import time

import h5py
import torch
from mpi4py import MPI


class KMeans:
    """
    Feature-parallel k-means clustering in PyTorch using MPI.
    """

    def __init__(
        self,
        comm: MPI.Comm = MPI.COMM_WORLD,
        n_clusters: int = 8,
        max_iter: int = 300,
        tol: float = -1.0,
    ) -> None:
        """Configure feature-parallel k-means clustering algorithm."""
        self.comm = comm  # The communicator used.
        pass

    ## IMPLEMENT FEATURE-PARALLEL K-MEANS CLUSTERING VERSION HERE!


if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    rank, size = comm.rank, comm.size

    if rank == 0:
        print(
            "##################################################\n"
            "# Feature-Parallel k-Means Clustering in PyTorch #\n"
            "##################################################"
        )

    path = "/pfs/work7/workspace/scratch/ku4408-VL-ScalableAI/data/cityscapes_300.h5"
    dataset = "cityscapes_data"

    if rank == 0:
        print(f"Loading dataset from {path}[{dataset}]...")

    # Dataset is split along the feature axis.
    # Each rank loads exclusive chunk of original dataset.
    with h5py.File(path, "r") as handle:
        chunk = int(handle[dataset].shape[1] / size)
        if rank == size - 1:
            data = torch.tensor(handle[dataset][:, rank * chunk :])
        else:
            data = torch.tensor(handle[dataset][:, rank * chunk : (rank + 1) * chunk])

    print("\t[OK]")

    # k-means hyperparameters
    num_clusters = 8
    num_iterations = 20

    kmeans_clusterer = KMeans(comm=comm, n_clusters=num_clusters, max_iter=num_iterations)

    if rank == 0:
        print("Start fitting the data...")
    start = time.perf_counter()  # Start runtime measurement.
    kmeans_clusterer.fit(data)  # Perform actual k-means clustering.

    if rank == 0:
        print(f"DONE.\nRun time:\t{time.perf_counter() - start} s")
```