In this tutorial, we demonstrate a realistic data poisoning attack by manipulating labels in the CIFAR-10 dataset and observing the impact on model behavior. We build a clean and a poisoned training pipeline side by side, using a ResNet-style convolutional network to ensure stable, comparable learning dynamics. By selectively flipping a fraction of samples from a target class to a malicious class during training, we show how subtle corruption in the data pipeline can propagate into systematic misclassification at inference time.
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Dataset
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report
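CONFIG = {
    "batch_size": 128,
    "epochs": 10,
    "lr": 0.001,
    "target_class": 1,       # CIFAR-10 class index 1 (automobile)
    "malicious_label": 9,    # CIFAR-10 class index 9 (truck)
    "poison_ratio": 0.4,
    # Use the GPU when available, otherwise fall back to the CPU.
    "device": "cuda" if torch.cuda.is_available() else "cpu",
}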
torch.manual_seed(42)
np.random.seed(42)
We set up the core environment required for the experiment and define all global configuration parameters in one place. We ensure reproducibility by fixing random seeds across PyTorch and NumPy. We also explicitly select the compute device so the tutorial runs efficiently on both CPU and GPU.
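As a small illustration of why the seeds matter here, the sketch below reseeds NumPy before drawing a set of indices without replacement, which is the same kind of draw used later to pick poisoned samples. The helper name `sample_poison_indices` and the sizes are assumptions chosen for illustration, not part of the tutorial code.

```python
import numpy as np

def sample_poison_indices(n_candidates, n_poison, seed):
    """Hypothetical helper: reseed NumPy, then draw indices without replacement."""
    np.random.seed(seed)
    return np.random.choice(np.arange(n_candidates), n_poison, replace=False)

# Two draws with the same seed and one with a different seed.
first = sample_poison_indices(5000, 2000, seed=42)
second = sample_poison_indices(5000, 2000, seed=42)
other = sample_poison_indices(5000, 2000, seed=7)

print(np.array_equal(first, second))  # True: same seed, same selection
print(np.array_equal(first, other))   # compare against a different seed
```

Fixing the seed means the same samples are poisoned on every run, so differences between runs are not caused by a different poisoned subset.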
class PoisonedCIFAR10(Dataset):
    """Wraps a CIFAR-10 dataset and optionally flips labels of one class."""

    def __init__(self, original_dataset, target_class, malicious_label, ratio, is_train=True):
        self.dataset = original_dataset
        # np.array copies the label list, so the wrapped dataset is never modified.
        self.targets = np.array(original_dataset.targets)
        self.is_train = is_train
        if is_train and ratio > 0:
            # Relabel a random fraction of the target class as the malicious class.
            indices = np.where(self.targets == target_class)[0]
            n_poison = int(len(indices) * ratio)
            poison_indices = np.random.choice(indices, n_poison, replace=False)
            self.targets[poison_indices] = malicious_label

    def __getitem__(self, index):
        # Images come from the wrapped dataset; only the label may differ.
        img, _ = self.dataset[index]
        return img, self.targets[index]

    def __len__(self):
        return len(self.dataset)
We implement a custom dataset wrapper that enables controlled label poisoning during training. We selectively flip a configurable fraction of samples from the target class to a malicious class while keeping the test data untouched. We preserve the original image data so that only label integrity is compromised.
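To make the flipping step concrete, here is a minimal NumPy-only sketch of the same logic applied to a synthetic label array. The function `flip_labels`, its `seed` parameter, and the toy labels are assumptions for illustration; the tutorial itself performs this step inside `PoisonedCIFAR10.__init__`.

```python
import numpy as np

def flip_labels(targets, target_class, malicious_label, ratio, seed=0):
    """Hypothetical helper: return a poisoned copy of `targets` and the flipped indices."""
    rng = np.random.default_rng(seed)
    poisoned = np.array(targets, copy=True)           # leave the input untouched
    indices = np.where(poisoned == target_class)[0]   # positions of the target class
    n_poison = int(len(indices) * ratio)
    chosen = rng.choice(indices, n_poison, replace=False)
    poisoned[chosen] = malicious_label
    return poisoned, chosen

# Synthetic labels: 10 classes with 50 samples each (not CIFAR-10 data).
toy_targets = np.repeat(np.arange(10), 50)
poisoned_targets, flipped_idx = flip_labels(
    toy_targets, target_class=1, malicious_label=9, ratio=0.4
)

n_changed = int((poisoned_targets != toy_targets).sum())
print(f"flipped {n_changed} of 50 class-1 labels")
```

With the configuration above and 5,000 training images per class in CIFAR-10, a ratio of 0.4 relabels 2,000 automobile images as trucks while every other label stays as it was.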
def get_model():
    model = torchvision.models.resnet18(num_classes=10)
    # Adapt the stem to 32x32 inputs: 3x3 stride-1 convolution and no max pooling.
    model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    model.maxpool = nn.Identity()
    return model.to(CONFIG["device"])

def train_and_evaluate(train_loader, description):
    model = get_model()
    optimizer = optim.Adam(model.parameters(), lr=CONFIG["lr"])
    criterion = nn.CrossEntropyLoss()
    for _ in range(CONFIG["epochs"]):
        model.train()
        for images, labels in train_loader:
            images = images.to(CONFIG["device"])
            labels = labels.to(CONFIG["device"])
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
    return model
We define a lightweight ResNet-based model tailored for CIFAR-10 and implement the full training loop. We train the network using standard cross-entropy loss and Adam optimization to ensure stable convergence. We keep the training logic identical for clean and poisoned data to isolate the effect of data poisoning.
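The reason for replacing the first convolution and the max pooling layer can be seen with a little arithmetic. The sketch below uses the standard output-size formula for convolution and pooling layers and compares the default torchvision ResNet-18 stem (7x7 convolution with stride 2 and padding 3, then 3x3 max pooling with stride 2 and padding 1) against the modified stem in `get_model()`. The helper `conv_out_size` is a hypothetical name introduced only for this illustration.

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

cifar_size = 32  # CIFAR-10 images are 32x32

# Default stem: 7x7 conv (stride 2, padding 3) followed by 3x3 max pool (stride 2, padding 1).
default_stem = conv_out_size(conv_out_size(cifar_size, 7, 2, 3), 3, 2, 1)

# Modified stem: 3x3 conv (stride 1, padding 1), max pool replaced by an identity layer.
cifar_stem = conv_out_size(cifar_size, 3, 1, 1)

print(default_stem, cifar_stem)  # 8 32
```

The default stem would shrink a 32x32 image to 8x8 before the first residual stage, whereas the modified stem keeps the full 32x32 resolution.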
def get_predictions(model, loader):
    model.eval()
    preds, labels_all = [], []
    with torch.no_grad():
        for images, labels in loader:
            images = images.to(CONFIG["device"])
            outputs = model(images)
            # The predicted class is the index of the largest logit.
            _, predicted = torch.max(outputs, 1)
            preds.extend(predicted.cpu().numpy())
            labels_all.extend(labels.numpy())
    return np.array(preds), np.array(labels_all)

def plot_results(clean_preds, clean_labels, poisoned_preds, poisoned_labels, classes):
    fig, ax = plt.subplots(1, 2, figsize=(16, 6))
    for i, (preds, labels, title) in enumerate([
        (clean_preds, clean_labels, "Clean Model Confusion Matrix"),
        (poisoned_preds, poisoned_labels, "Poisoned Model Confusion Matrix")
    ]):
        # Rows are true classes, columns are predicted classes.
        cm = confusion_matrix(labels, preds)
        sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", ax=ax[i],
                    xticklabels=classes, yticklabels=classes)
        ax[i].set_title(title)
    plt.tight_layout()
    plt.show()
We run inference on the test set and collect predictions for quantitative analysis. We compute confusion matrices to visualize class-wise behavior for both the clean and the poisoned model. We use these visual diagnostics to highlight the targeted misclassification patterns introduced by the attack.
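One number worth reading off the confusion matrix is the fraction of true target-class samples that the model assigns to the malicious label. The following NumPy sketch computes it on synthetic predictions; the function names `confusion_counts` and `flip_rate` and the toy arrays are assumptions for illustration and are not results from the tutorial.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with true classes as rows and predicted classes as columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def flip_rate(cm, target_class, malicious_label):
    """Fraction of true `target_class` samples predicted as `malicious_label`."""
    row_total = cm[target_class].sum()
    return cm[target_class, malicious_label] / row_total if row_total else 0.0

# Synthetic labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 9, 9, 0, 0]
y_pred = [1, 9, 9, 1, 9, 9, 0, 1]

cm = confusion_counts(y_true, y_pred, n_classes=10)
rate = flip_rate(cm, target_class=1, malicious_label=9)
print(rate)  # 0.5
```

Comparing this value between the clean and the poisoned model gives a single scalar summary of the off-diagonal cell that the heatmaps highlight.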
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2023, 0.1994, 0.2010))
])
# Evaluation uses no random augmentation so both models see identical test images.
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2023, 0.1994, 0.2010))
])
base_train = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
base_test = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=test_transform)
classes = base_train.classes  # class names in label-index order
clean_ds = PoisonedCIFAR10(base_train, CONFIG["target_class"], CONFIG["malicious_label"], ratio=0)
poison_ds = PoisonedCIFAR10(base_train, CONFIG["target_class"], CONFIG["malicious_label"], ratio=CONFIG["poison_ratio"])
clean_loader = DataLoader(clean_ds, batch_size=CONFIG["batch_size"], shuffle=True)
poison_loader = DataLoader(poison_ds, batch_size=CONFIG["batch_size"], shuffle=True)
test_loader = DataLoader(base_test, batch_size=CONFIG["batch_size"], shuffle=False)
clean_model = train_and_evaluate(clean_loader, "Clean Training")
poisoned_model = train_and_evaluate(poison_loader, "Poisoned Training")
c_preds, c_true = get_predictions(clean_model, test_loader)
p_preds, p_true = get_predictions(poisoned_model, test_loader)
plot_results(c_preds, c_true, p_preds, p_true, classes)
# Report only the targeted class; target_names must match the labels list one to one.
target = CONFIG["target_class"]
print(classification_report(c_true, c_preds, labels=[target], target_names=[classes[target]]))
print(classification_report(p_true, p_preds, labels=[target], target_names=[classes[target]]))
We prepare the CIFAR-10 dataset, construct clean and poisoned dataloaders, and execute both training pipelines end to end. We evaluate the trained models on a shared test set to ensure a fair comparison. We finalize the analysis by reporting class-specific precision and recall to expose the impact of poisoning on the targeted class.
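For readers who want to see what the per-class numbers in the report mean, here is a minimal sketch that computes precision and recall for one class directly from label arrays. The function `precision_recall` and the toy arrays are assumptions for illustration; they are not the tutorial's results, and the tutorial itself relies on `classification_report`.

```python
import numpy as np

def precision_recall(y_true, y_pred, cls):
    """Hypothetical helper: precision and recall for a single class index."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(((y_true == cls) & (y_pred == cls)).sum())  # correctly predicted as cls
    fp = int(((y_true != cls) & (y_pred == cls)).sum())  # wrongly predicted as cls
    fn = int(((y_true == cls) & (y_pred != cls)).sum())  # cls samples that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Synthetic example in which half of the class-1 samples are predicted as class 9.
y_true = [1, 1, 1, 1, 9, 9, 0, 0]
y_pred = [1, 9, 9, 1, 9, 9, 0, 1]

target_precision, target_recall = precision_recall(y_true, y_pred, cls=1)
malicious_precision, malicious_recall = precision_recall(y_true, y_pred, cls=9)
print(target_recall, malicious_precision)  # 0.5 0.5
```

In this toy case the misrouted samples lower recall for the target class and precision for the malicious label, which is the pattern to look for when comparing the two reports.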
In conclusion, we observed how label-level data poisoning degrades class-specific performance without necessarily destroying overall accuracy. We analyzed this behavior using confusion matrices and per-class classification reports, which reveal the targeted failure modes introduced by the attack. This experiment reinforces the importance of data provenance, validation, and monitoring in real-world machine learning systems, especially in safety-critical domains.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
