Training Guides

End-to-end recipes for fine-tuning and pretraining LLMs on Alauda AI.

| When you want… | Use | Guide |
| --- | --- | --- |
| Reusable templates, repeatable runs, optional Kueue quotas | Kubeflow Trainer v2 + LlamaFactory | Fine-Tuning with Kubeflow Trainer v2 |
| Mix training with online inference, yield GPU back on demand | Kueue cohort + preemption + checkpoint resume | Preemptible TrainJobs with Kueue |
| A curated set of `TrainingRuntime` images (CUDA / CANN) | Trainer v2 runtime catalog | Training Runtime Images |
| One-shot quick start of distributed PyTorch on Trainer v2 | `ClusterTrainingRuntime` + MNIST | Kubeflow Trainer Quick Start |
| Production SFT / OSFT with automatic memory management | `training_hub` | Fine-tuning LLMs with Training Hub |
| Interactive exploration, custom scripts, VolcanoJob submission | Workbench Notebook | Fine-tuning LLMs using Workbench |
| Full-parameter SFT / pretraining on Ascend NPU | Workbench `PyTorch CANN` / `MindSpore CANN` | Fine-tune and Pretrain on Ascend NPU |
