This model was trained on HAM10000 for research/educational purposes only and is not a medical device. By requesting access you confirm that you will not use these weights for clinical decision-making, diagnosis, or any patient-facing application.

ConvNeXt-Tiny: HAM10000 7-class skin-lesion classifier (lesion-disjoint)

This model is convnext_tiny.fb_in22k_ft_in1k fine-tuned on HAM10000 with a lesion-disjoint train/val/test split, so no lesion contributes images to more than one split. This is the honest evaluation condition: a purely image-random split puts about 88% of test lesions into the training set as well, which inflates the reported metrics.
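A lesion-disjoint split can be sketched in plain Python as follows. This is a minimal illustration, not the code used to build this model's splits: the function names, the 70/15/15 fractions, and the toy records are all assumptions made for the example, and it assumes every lesion carries a single diagnosis.

```python
import random
from collections import defaultdict


def lesion_disjoint_split(records, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split (image_id, lesion_id, diagnosis) records so each lesion lands in one split.

    Lesions are shuffled within each diagnosis, so the split is stratified
    by diagnosis at the lesion level. Fractions are illustrative defaults.
    """
    images_by_lesion = defaultdict(list)
    lesions_by_dx = defaultdict(set)
    for image_id, lesion_id, dx in records:
        images_by_lesion[lesion_id].append(image_id)
        lesions_by_dx[dx].add(lesion_id)

    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for dx in sorted(lesions_by_dx):
        lesions = sorted(lesions_by_dx[dx])
        rng.shuffle(lesions)
        n_train = round(fractions[0] * len(lesions))
        n_val = round(fractions[1] * len(lesions))
        parts = {
            "train": lesions[:n_train],
            "val": lesions[n_train:n_train + n_val],
            "test": lesions[n_train + n_val:],
        }
        # Every image of a lesion follows that lesion into its split.
        for name, chosen in parts.items():
            for lesion_id in chosen:
                splits[name].extend(images_by_lesion[lesion_id])
    return splits


def leaked_lesion_fraction(train_ids, test_ids, lesion_of):
    """Fraction of test lesions that also have at least one image in train."""
    train_lesions = {lesion_of[i] for i in train_ids}
    test_lesions = {lesion_of[i] for i in test_ids}
    return len(test_lesions & train_lesions) / len(test_lesions)


# Toy data (hypothetical): 40 lesions, 3 images each, two diagnoses.
records = [(f"img_{l}_{k}", f"lesion_{l}", "nv" if l % 2 else "mel")
           for l in range(40) for k in range(3)]
lesion_of = {img: les for img, les, _ in records}

splits = lesion_disjoint_split(records)
disjoint_leak = leaked_lesion_fraction(splits["train"], splits["test"], lesion_of)
print(disjoint_leak)  # 0.0 by construction

# For contrast, an image-random split of the same toy data leaks lesions.
image_ids = [img for img, _, _ in records]
random.Random(0).shuffle(image_ids)
cut = int(0.85 * len(image_ids))
random_leak = leaked_lesion_fraction(image_ids[:cut], image_ids[cut:], lesion_of)
print(random_leak)
```

Running the same leakage check on a real image-random split of the dataset is how a figure like the ~88% quoted above would be measured.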

Results

Held-out test set (n=2008 images, fp32, lesion-disjoint):

Metric             Value
Accuracy           0.815
Balanced accuracy  0.778
Macro F1           0.748

Class                          Precision  Recall  F1    Support
actinic_keratoses              0.55       0.84    0.66  57
basal_cell_carcinoma           0.68       0.76    0.72  101
benign_keratosis-like_lesions  0.67       0.77    0.72  231
dermatofibroma                 0.86       0.75    0.80  24
melanocytic_Nevi               0.94       0.86    0.90  1343
melanoma                       0.53       0.61    0.56  226
vascular_lesions               0.92       0.85    0.88  26
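As a sanity check, the summary metrics can be recomputed from the per-class rows above. The sketch below uses the standard definitions (balanced accuracy as the unweighted mean of per-class recall, macro F1 as the unweighted mean of per-class F1, accuracy as the support-weighted mean of recall). Because the inputs are the two-decimal values from the table, the results match the reported figures only to within rounding.

```python
# (recall, F1, support) per class, copied from the per-class table.
per_class = {
    "actinic_keratoses": (0.84, 0.66, 57),
    "basal_cell_carcinoma": (0.76, 0.72, 101),
    "benign_keratosis-like_lesions": (0.77, 0.72, 231),
    "dermatofibroma": (0.75, 0.80, 24),
    "melanocytic_Nevi": (0.86, 0.90, 1343),
    "melanoma": (0.61, 0.56, 226),
    "vascular_lesions": (0.85, 0.88, 26),
}

n_classes = len(per_class)
n_images = sum(n for _, _, n in per_class.values())

# Unweighted means treat every class equally, regardless of support.
balanced_accuracy = sum(r for r, _, _ in per_class.values()) / n_classes
macro_f1 = sum(f for _, f, _ in per_class.values()) / n_classes

# recall * support is the number of correct predictions for a class,
# so the support-weighted mean of recall is overall accuracy.
accuracy = sum(r * n for r, _, n in per_class.values()) / n_images

print(n_images)                       # 2008
print(f"{balanced_accuracy:.3f}")     # 0.777 (reported: 0.778)
print(f"{macro_f1:.3f}")              # 0.749 (reported: 0.748)
print(f"{accuracy:.3f}")              # 0.814 (reported: 0.815)
```

The gap between accuracy and balanced accuracy reflects the class imbalance: melanocytic_Nevi supplies 1343 of the 2008 test images and has one of the highest recalls.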

Files

  • backbone.pt: timm model state_dict (build the model with timm.create_model, then call load_state_dict)
  • run_config.json: model name and input preprocessing (size/mean/std/crop)
  • label_map.json: {id2label, label2id}
  • split_manifest.json: lesion and image IDs for each split (for reproducibility)
  • test_metrics.json, test_report.txt: summary and per-class test metrics

Usage

import json, timm, torch
import torch.nn.functional as F
from PIL import Image
from timm.data import create_transform, resolve_data_config
from huggingface_hub import hf_hub_download

REPO = "andy279/convnext-tiny-ham10000-lesion-disjoint"

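# Download the run config and label map from the Hub.
with open(hf_hub_download(REPO, "run_config.json")) as f:
    cfg = json.load(f)
with open(hf_hub_download(REPO, "label_map.json")) as f:
    labels = json.load(f)
id2label = {int(k): v for k, v in labels["id2label"].items()}  # JSON keys are strings

# Build the architecture without ImageNet weights, then load the fine-tuned state_dict.
model = timm.create_model(cfg["model_name"], pretrained=False, num_classes=len(id2label))
model.load_state_dict(torch.load(hf_hub_download(REPO, "backbone.pt"), map_location="cpu"))
model.eval()

# Eval-time preprocessing resolved from the model's timm pretrained config.
tf = create_transform(**resolve_data_config({}, model=model), is_training=False)
img = Image.open("my_lesion.jpg").convert("RGB")
with torch.no_grad():  # inference only, so skip building the autograd graph
    probs = F.softmax(model(tf(img).unsqueeze(0)), dim=-1)[0]

# Print the three most likely classes.
for p, i in zip(*torch.topk(probs, 3)):
    print(f"{float(p)*100:5.1f}%  {id2label[int(i)]}")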

Training details

  • Dataset: HAM10000 via marmal88/skin_cancer mirror, regrouped by lesion_id into 9305 / 2041 / 2008 train/val/test images across 5235 / 1122 / 1113 unique lesions, stratified by diagnosis.
  • Backbone: timm.create_model("convnext_tiny.fb_in22k_ft_in1k", ...)
  • Loss: cross-entropy with inverse-frequency class weights (computed from the train split after regrouping).
  • Optimizer: AdamW, lr 3e-4, weight decay 0.05, cosine schedule, 5% warmup.
  • 8 epochs, batch size 64, bf16 on one H100.
  • Train loop: transformers.Trainer wrapping a thin nn.Module around the timm model so the forward signature returns {"loss", "logits"}.
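Two parts of this recipe, the inverse-frequency class weights and the warmup-plus-cosine learning-rate schedule, can be sketched in plain Python. This is an illustration under stated assumptions, not the training code: the normalization w_c = N / (K * n_c) is one common form of inverse-frequency weighting (the card does not say which form was used), the test-set supports stand in for the train-split counts (which are not listed here), and the step count assumes one optimizer step per batch of 64 with no gradient accumulation. The schedule follows the usual linear-warmup, cosine-decay-to-zero shape.

```python
import math


def inverse_frequency_weights(class_counts):
    """Return w_c = N / (K * n_c), so rarer classes get larger weights.

    Assumed normalization; with it, sum(w_c * n_c) equals N.
    """
    total = sum(class_counts.values())
    k = len(class_counts)
    return {c: total / (k * n) for c, n in class_counts.items()}


def lr_at(step, total_steps, base_lr=3e-4, warmup_frac=0.05):
    """Linear warmup from 0 to base_lr, then cosine decay to 0."""
    warmup_steps = math.ceil(warmup_frac * total_steps)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Stand-in counts: test-set supports, NOT the train-split counts.
counts = {
    "actinic_keratoses": 57,
    "basal_cell_carcinoma": 101,
    "benign_keratosis-like_lesions": 231,
    "dermatofibroma": 24,
    "melanocytic_Nevi": 1343,
    "melanoma": 226,
    "vascular_lesions": 26,
}
weights = inverse_frequency_weights(counts)
for name, w in weights.items():
    print(f"{name:32s} {w:6.2f}")

# 9305 train images at batch size 64 for 8 epochs (assumes no accumulation).
steps_per_epoch = math.ceil(9305 / 64)
total_steps = steps_per_epoch * 8
warmup_steps = math.ceil(0.05 * total_steps)
print(steps_per_epoch, total_steps, warmup_steps)  # 146 1168 59
print(lr_at(warmup_steps, total_steps))            # peak learning rate, 0.0003
```

In PyTorch, weights like these would typically be passed, in label-id order, as the weight tensor of torch.nn.CrossEntropyLoss.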

Caveats

  • Melanoma recall is only 0.61, the lowest of any class. Do not rely on this model to rule out melanoma.
  • HAM10000 consists of dermatoscopic images, predominantly of light-skinned subjects. Performance on clinical (smartphone) images or on underrepresented skin tones has not been validated here.
  • Not a medical device. Research / educational use only.

Citation

@article{tschandl2018ham10000,
  title = {The {HAM10000} dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions},
  author = {Tschandl, Philipp and Rosendahl, Cliff and Kittler, Harald},
  journal = {Scientific Data},
  volume = {5},
  pages = {180161},
  year = {2018},
  doi = {10.1038/sdata.2018.161},
}