January 11, 2026

# Training and testing an SDXL LoRA with sd-scripts (Kohya) on RunPod or Vast.ai: real-world experience feedback

## INTRODUCTION
How do you train a LoRA from stable-diffusion-xl-base-1.0 with sd-scripts (Kohya) and your own dataset on RunPod? This isn't really a tutorial, but rather the result of my experience: a compilation of the AI answers that actually worked. It took me many days to get the training running; it's a fragile environment that must be earned.

I first tried training my LoRA on Google Colab. The T4's 15 GB of memory is insufficient, which led to a series of silent crashes. On top of that, storage is not persistent and unexpected disconnections are frequent.

Rent a pod on RunPod instead: on-demand billing costs only a few cents per hour, charged only while the GPU is in use. Training a LoRA takes between 1 and 2.5 hours depending on dataset size.

## RTX A6000 FEATURES

Many CUDA + Tensor cores: very high parallel-computing capacity, ideal for large architectures like SDXL and for LoRA training.

Memory and bandwidth:

* GPU memory: 48 GB GDDR6 with ECC
* Memory bus: 384-bit
* Bandwidth: ~768 GB/s

Practical consequence: 48 GB of VRAM lets you train SDXL at high resolution (1024×1024) with bucket/latent caching and avoid Out-Of-Memory (OOM) errors. The high bandwidth keeps data access fast, which is crucial for diffusion models.

Performance:

* FP32 (single precision): ~38.7 TFLOPS
* Tensor performance: ~309.7 TFLOPS (Tensor Cores)
* TDP (maximum power draw): 300 W
* Interface: PCI Express 4.0 ×16

Practical consequence: very high compute performance, which translates to faster iterations (≈ 1.4-1.6 s/it for an SDXL LoRA), better fp16 tensor efficiency, and support for large batches if needed.

## THE DATASET (a directory of images and text files)

I'll go over some details quickly. Open a terminal in JupyterLab on your pod; this is where you will work. Create the directories (see the directory tree below) either with the `mkdir` command or with the editor; you must be in /workspace. To run the training script, you must be in sd-scripts.
The dataset goes in train, the sd_xl_base_1.0.safetensors model in models, and sd-scripts (Kohya) in sd-scripts.

Quick reminder about the dataset: it consists of image/caption pairs, where each text file shares its image's base name: 1_image.jpg -> 1_image.txt, etc. Each text file (caption) starts with a tag such as my_lora_style, which will be the name of your style to invoke in a text prompt. It is also the name of the directory in train, with a number in front: 10_mon_style_de_lora.

Each caption contains a description of the image in keyword style: short phrases like those used for SEO, with no literary sentences or punctuation. Keep the same keywords across captions, but vary the terms that describe the atmosphere: studio, outdoors, etc. Libraries or AI tools do this very well automatically.

In our case, each image must be 1024 px × 1024 px. A Python batch script can resize them automatically on your local machine. To upload your dataset to the train/10_mon_style_de_lora directory, use the editor, and upload only the images and captions. Each image must have its caption with its tag, otherwise the training script will crash. The same goes for an empty caption or a corrupted image.
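Since a missing or empty caption silently kills a run, it is worth checking the pairs before uploading. Below is a minimal stdlib-only sketch (the demo folder and file names are invented for illustration); a corruption check with Pillow's `Image.verify()` and the 1024 px resize could be bolted onto the same loop:

```python
# Pre-flight check for the dataset layout described above: every image must
# have a same-named, non-empty .txt caption, or the training script crashes.
import tempfile
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def check_dataset(folder):
    """Return a list of problems that would crash sd-scripts."""
    problems = []
    for img in sorted(Path(folder).iterdir()):
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        cap = img.with_suffix(".txt")
        if not cap.exists():
            problems.append(f"missing caption: {cap.name}")
        elif not cap.read_text(encoding="utf-8").strip():
            problems.append(f"empty caption: {cap.name}")
    return problems

# Tiny demo on a throwaway directory:
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "image1.jpg").write_bytes(b"\xff\xd8fake")
    (d / "image1.txt").write_text("my_lora_style, studio portrait")
    (d / "image2.jpg").write_bytes(b"\xff\xd8fake")   # no caption on purpose
    print(check_dataset(d))   # -> ['missing caption: image2.txt']
```

Run it locally on your dataset folder before uploading; an empty list means the pairing at least is safe.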
The accepted image formats are jpg and png (I only tested jpg).

---

# COMPLETE TUTORIAL: TRAINING AN SDXL LoRA ON RUNPOD (RTX A6000)

Image used:

> **`runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04`**
> PyTorch + CUDA **already installed and working**

---

## INTRODUCTION: FUNDAMENTAL PRINCIPLES

### ✅ What the RunPod image guarantees

* PyTorch **2.4.x**
* CUDA **12.4**
* Compatible NVIDIA drivers
* Supports RTX A6000 / 4090
* CUDA compilation works

### ❌ What you should NEVER do

* ❌ Reinstall `torch`
* ❌ Reinstall CUDA
* ❌ Use a `venv`
* ❌ Randomly downgrade/upgrade numpy
* ❌ Let pip "solve on its own"

👉 **All the bugs encountered previously stemmed from a broken Python environment, NOT the GPU.**

---

## GENERAL PROJECT STRUCTURE (REQUIRED)

```
/workspace
├── models/
│   └── sd_xl_base_1.0.safetensors
├── train/
│   └── 10_sensual_lingerie_look/
│       ├── image1.jpg
│       ├── image1.txt
│       ├── image2.jpg
│       ├── image2.txt
│       └── ...
├── output/
├── sd-scripts/
└── logs/ (optional)
```

---

## STEP 0: RUNPOD POD SETUP (IMPORTANT)

* GPU: **RTX A6000**
* VRAM: **48 GB**
* vCPU: ≥ 8
* RAM: ≥ 64 GB
* Container disk: **≥ 100 GB**
* CUDA visible:

```bash
nvidia-smi
```

* ❌ No venv
* ✅ System Python

---

## STEP 1: PYTORCH / CUDA VERIFICATION (ONLY ONCE)

```bash
python - <<'PY'
import torch
print("Torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("GPU:", torch.cuda.get_device_name(0))
x = torch.randn(1, device="cuda")
print("CUDA OK")
PY
```

👉 If it works: **DO NOT TOUCH TORCH / CUDA AGAIN**

---

Next, check the installed dependencies.
This matters especially if you've already tested a LoRA on your pod: pip uninstalls the installed versions and reinstalls different ones, which created conflicts.

```bash
python3 - <<'EOF'
packages = [
    "accelerate", "transformers", "diffusers", "safetensors", "einops",
    "albumentations", "albucore", "imagesize", "toml", "voluptuous",
    "bitsandbytes", "timm", "scipy", "tensorboard",
]
for pkg in packages:
    try:
        module = __import__(pkg)
        version = getattr(module, "__version__", "unknown")
        print(f"[OK] {pkg:<15} -> {version}")
    except Exception as e:
        print(f"[MISSING] {pkg:<15} -> {e}")
EOF
```

## STEP 2: INSTALL THE MISSING DEPENDENCIES (ONLY THOSE NOT ALREADY PRESENT, WITHOUT TORCH)

```bash
pip install \
  accelerate transformers \
  diffusers safetensors einops \
  albumentations albucore imagesize \
  toml \
  voluptuous bitsandbytes \
  timm \
  scipy \
  tensorboard
```

⚠️ **DO NOT install torch**
⚠️ **DO NOT install numpy here**

---

## CRITICAL STEP: PIN NUMPY & OPENCV (REQUIRED)

### Known issue

* numpy **2.x** ❌
* OpenCV **≥ 4.9** ❌

👉 These versions **silently break SDXL / LoRA training**.

---

### Immediate fix

```bash
pip uninstall -y numpy opencv-python opencv-python-headless
```

```bash
pip install numpy==1.26.4 opencv-python-headless==4.8.1.78
```

❌ **DO NOT install `opencv-python`**
✅ **headless only**

---

### Mandatory verification

```bash
python - <<'PY'
import albumentations as A
import albucore
import cv2
import numpy
import torch
print("albumentations:", A.__version__)
print("albucore:", albucore.__version__)
print("opencv:", cv2.__version__)
print("numpy:", numpy.__version__)
print("cuda:", torch.cuda.is_available())
x = torch.randn(1, device="cuda")
print("CUDA OK")
PY
```

### Expected result

```
albumentations: 1.4.8
albucore: 0.0.16
opencv: 4.8.1
numpy: 1.26.4
cuda: True
CUDA OK
```

---

## STEP 3: ACCELERATE CONFIGURATION (ONLY ONCE)

> ⚠️ **Essential**, even if the script "uses accelerate"

```bash
accelerate config
```

Correct answers:

```
Compute
environment: LOCAL_MACHINE
Machine type: No distributed training
Use CPU only: NO
Mixed precision: fp16
Num processes: 1
Use Torch Dynamo: NO
```

👉 The generated file is read automatically by `accelerate launch`.

---

## STEP 4: INSTALLING SD-SCRIPTS (KOHYA)

```bash
cd /workspace
git clone https://github.com/kohya-ss/sd-scripts.git
cd sd-scripts
pip install -r requirements.txt
```

---

## STEP 5: DOWNLOADING THE SDXL MODEL

```bash
cd /workspace/models
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
```

Check:

```bash
ls -lh /workspace/models
```

---

## STEP 6: DATASET (STRICT FORMAT)

```
/workspace/train/10_sensual_lingerie_look/
├── glamour_01.jpg
├── glamour_01.txt
├── glamour_02.jpg
├── glamour_02.txt
└── ...
```

* same base name for image and caption
* `.txt` required
* no subfolders
* clean images (jpg/png)

---

## STEP 7: SDXL LoRA TRAINING COMMAND (SAFE)

```bash
cd /workspace/sd-scripts

accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path=/workspace/models/sd_xl_base_1.0.safetensors \
  --train_data_dir=/workspace/train \
  --output_dir=/workspace/output \
  --resolution=1024,1024 \
  --enable_bucket \
  --min_bucket_reso=512 \
  --max_bucket_reso=1024 \
  --bucket_reso_steps=64 \
  --network_module=networks.lora \
  --network_dim=32 \
  --network_alpha=32 \
  --network_train_unet_only \
  --learning_rate=1e-4 \
  --optimizer_type=AdamW \
  --lr_scheduler=cosine \
  --train_batch_size=1 \
  --max_train_epochs=15 \
  --caption_extension=.txt \
  --gradient_checkpointing \
  --save_every_n_epochs=5 \
  --save_model_as=safetensors \
  --output_name=sensual_lingerie_look_sdxl_lora
```

---

## EXPECTED OUTPUT

```
/workspace/output/
├── sensual_lingerie_look_sdxl_lora-000005.safetensors
├── sensual_lingerie_look_sdxl_lora-000010.safetensors
└── sensual_lingerie_look_sdxl_lora-000015.safetensors
```

👉 Keep **the last one**, or compare them in a test.
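With the flags above, the total number of optimizer steps is easy to estimate before launching. A quick sketch; the 40-image dataset size is an invented example, and `repeats` is the number encoded in the `10_` folder prefix:

```python
# Rough training-length estimate for the command above (a sketch, not
# sd-scripts code): steps per epoch = ceil(images * repeats / batch size).
import math

def estimated_steps(num_images: int, repeats: int, epochs: int, batch_size: int) -> int:
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

steps = estimated_steps(num_images=40, repeats=10, epochs=15, batch_size=1)
print(steps)                                        # 6000
print(f"~{steps * 1.5 / 3600:.1f} h at 1.5 s/it")   # in line with the A6000 figures above
```

At the A6000's ~1.4-1.6 s/it this lands inside the 1 to 2.5 hour range mentioned in the introduction.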
---

## KEY POINTS TO REMEMBER (CRITICAL)

* ✅ The GPU was never the issue
* ❌ Colab is unstable for SDXL LoRA
* ✅ RunPod = professional environment
* ✅ numpy **1.26.4**
* ✅ OpenCV **4.8.1**
* ❌ numpy 2.x = Russian roulette
* ❌ OpenCV ≥ 4.9 = silent crash
* ✅ Accelerate configured = stable CUDA

---

# II. WHAT EACH DEPENDENCY DOES
---

# SDXL/LoRA DEPENDENCIES: FULL EXPLANATION

---

## 1. `torch` (already provided by the RunPod image)

> ⚠️ **DO NOT reinstall it with your RunPod image**

### Role

* Core of deep learning
* Manages:
  * tensors
  * the GPU
  * autograd
  * the backward pass
  * VRAM

### Used for

* UNet training
* LoRA gradient computation
* Backpropagation
* Mixed precision (fp16)

### Why critical

* 100% of the training relies on it

### Common problems

* torch without CUDA → `CUDA available: False`
* torch compiled against the wrong CUDA version → phantom errors

---

## 2. `accelerate`

### Role

* Training orchestrator
* Hardware abstraction (GPU / CPU / multi-GPU)

### Used for

* `accelerate launch`
* Managing:
  * fp16
  * gradient checkpointing
  * device placement
  * DDP (if multi-GPU)

### Why it's essential

* `sd-scripts` **does not launch PyTorch directly**
* It **ALWAYS** goes through accelerate

### Without `accelerate config`

* silent crashes
* "device busy" CUDA errors
* wrong device selected

---

## 3. `transformers`

### Role

* Implementation of the **CLIP text encoders**

### Used for

* CLIP ViT-L / ViT-G (SDXL has **2 text encoders**)
* Tokenizing the captions
* Text encoding → embeddings

### Without it

* Error:

```
ModuleNotFoundError: No module named 'transformers'
```

### Impact on LoRA

* If you train the text encoder → **critical** dependency
* Even UNet-only → still needed for text inference

---

## 4. `diffusers`

### Role

* Official implementation of the Stable Diffusion pipelines

### Used for

* The SDXL UNet
* The VAE
* Schedulers
* Noise prediction
* Forward diffusion

### Why vital

* SDXL = Diffusers architecture
* The VAE encodes/decodes images
* Without it → no training possible

---

## 5. `safetensors`

### Role

* Secure weight format (pickle-free)

### Used for

* Loading the SDXL model
* Saving the LoRA

### Why mandatory

* SDXL Base ships as `.safetensors`
* Kohya **refuses** certain unsafe formats

---

## 6.
`einops`

### Role

* Safe manipulation of tensor dimensions

### Used for

* Rearrangements of:
  * batch
  * channels
  * spatial dimensions

### Example

```python
rearrange(x, 'b c h w -> b (h w) c')
```

### Why necessary

* SDXL manipulates shapes extensively
* Replaces error-prone `view()` calls

---

## 7. `albumentations`

### Role

* Image augmentation

### Used for

* Resize
* Crop
* Flip
* Color augmentations

### Even if disabled

* It is imported when the dataset is loaded

### Critical version

* ❌ ≥ 2.0 breaks compatibility
* ✅ **1.4.8** = stable

---

## 8. `albucore`

### Role

* Low-level backend for albumentations

### Used for

* Fast transformations
* Normalization
* NumPy/cv2 handling

### Why sensitive

* Tightly coupled to NumPy/OpenCV
* Wrong version = data loader crash

---

## 9. `opencv-python-headless`

### Role

* Image reading/processing

### Used for

* Loading the `.jpg / .png` files
* Resizing
* Color conversion

### Why **headless**

* No graphical interface
* Lighter
* More stable on a server

### Critical version

* ❌ ≥ 4.9 → breaks SDXL
* ✅ **4.8.1.78**

---

## 10. `numpy`

### Role

* Mathematical foundation of the whole image pipeline

### Used for

* Datasets
* Augmentations
* Normalization
* Image → tensor conversion

### **EXTREMELY critical** version

* ❌ numpy 2.x = **INCOMPATIBLE**
* ✅ **1.26.4**

### Symptoms if NumPy is wrong

* random crashes
* freezes after a few steps
* incomprehensible CUDA errors

---

## 11. `bitsandbytes`

### Role

* Memory optimizations

### Used for

* 8-bit optimizers
* Lighter model loading

### Even if you don't use 8-bit

* Imported by kohya
* Must be present

---

## 12. `timm`

### Role

* Collection of vision models

### Used for

* Some architectures
* CLIP compatibility / vision backend

---

## 13. `imagesize`

### Role

* Reads image dimensions **without loading the image**

### Used for

* Bucket resolution
* Dataset verification

### Advantage

* Fast
* No wasted RAM

---

## 14.
`toml`

### Role

* Parses the configuration files

### Used for

* Dataset config
* Accelerate config
* Advanced training configuration

---

## 15. `voluptuous`

### Role

* Schema validation

### Used for

* Checking:
  * CLI arguments
  * the dataset config
  * inconsistent values

### Without it

* crash on script startup

---

## 16. `scipy`

### Role

* Advanced mathematical functions

### Used for

* Schedulers
* Secondary numerical computations

---

## 17. `tensorboard` (optional but useful)

### Role

* Loss visualization

### Used for

* Following the training run
* Debugging overfit/underfit

---

# ULTRA-CLEAR SUMMARY

| Dependency      | Used for               |
| --------------- | ---------------------- |
| torch           | GPU computing          |
| accelerate      | training orchestration |
| transformers    | text / CLIP            |
| diffusers       | SDXL core              |
| safetensors     | secure weights         |
| einops          | tensor reshaping       |
| albumentations  | augmentations          |
| albucore        | image backend          |
| opencv-headless | image reading          |
| numpy           | numeric base           |
| bitsandbytes    | memory                 |
| timm            | vision models          |
| imagesize       | buckets                |
| toml            | config                 |
| voluptuous      | validation             |
| scipy           | math                   |
| tensorboard     | monitoring             |

---

* SDXL = a **very fragile stack**
* Python ≠ a reproducible environment
* NumPy 2.x is **recent and brittle**
* The official docs are incomplete
* Few people document the exact versions

# III. BUGS ENCOUNTERED
# CONCLUSION AND ESSENTIAL COMMANDS

> Do you want to run your training on CPU only (even if a GPU / Apple Silicon / Ascend NPU device is available)? [yes/NO]

👉 **Answer: `NO` (simply press Enter)**

### Why?

* You **have an RTX A6000**, detected by `nvidia-smi`
* You want to **train an SDXL LoRA**, which is **impossible / pointless on CPU**
* Answering `yes` would force `accelerate` to **ignore the GPU** (a fatal error in your case)

So do exactly:

```
Do you want to run your training on CPU only? [yes/NO] ⏎ (Enter)
```

---

For the rest (so you have no doubts), here are **the complete correct answers** for `accelerate config`:

```
Compute environment: This machine
Machine type: No distributed training
Do you want to run your training on CPU only? NO
Do you want to use DeepSpeed? NO
How many processes do you want to use? 1
Do you want to use mixed precision? fp16
Do you want to enable Torch Dynamo? NO
```

In the end, you should see:

```
Accelerate configuration saved
```

---

### Right after (very important)

Test it immediately:

```bash
accelerate env
```

You must see:

* `Mixed precision: fp16`
* `Num processes: 1`
* **no mention of CPU-only**

---

> Would you like to enable numa efficiency? (Currently only supported on NVIDIA hardware). [yes/NO]:

👉 **Answer: `NO` (press Enter)**

### Why?

* NUMA efficiency offers **no useful gain** for:
  * a **single-GPU RunPod instance (RTX A6000)**
  * **SDXL LoRA** training
* It can even **introduce unstable behavior** depending on the container/driver
* It is mainly useful on **multi-socket/multi-GPU servers**, which **isn't your case**

So do exactly:

```
Would you like to enable numa efficiency? [yes/NO] ⏎ (Enter)
```

---

### Quick recap of the correct answers

So you have no doubts:

```
CPU only? NO
DeepSpeed? NO
Num processes? 1
Mixed precision?
fp16
Torch Dynamo? NO
NUMA efficiency? NO
```

---

### After that (required)

Run:

```bash
accelerate env
```

# COMMAND RECAP

Here is **a clear, clean and reusable block** with **all the essential commands**, without noise.

---

# ESSENTIAL COMMANDS: RUNPOD SDXL LoRA (REFERENCE)

## 1️⃣ Creating the directory structure

To be run **only once**:

```bash
mkdir -p /workspace/{models,train,output,logs}
```

Expected structure:

```
/workspace
├── models/
├── train/
├── output/
├── logs/
└── sd-scripts/
```

---

## 2️⃣ Download the SDXL model

```bash
cd /workspace/models
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
```

Verification:

```bash
ls -lh /workspace/models
```

---

## 3️⃣ Dataset: expected structure

```
/workspace/train/10_sensual_lingerie_look/
├── image1.jpg
├── image1.txt
├── image2.jpg
├── image2.txt
└── ...
```

10 = the number of repetitions (repeats) applied to each image in this folder at each epoch.
🎯 Practical rules (recommended):

* Small dataset (10-30 images): 10_ or 15_ → reinforces learning
* Medium dataset (30-80 images): 5_
* Large dataset (100+ images): 1_ or 2_

⚠️ **Mandatory**

* same base name for image and caption
* `.txt` in UTF-8
* no odd whitespace

---

## 4️⃣ GPU / CUDA test (FIRST AND FOREMOST)

```bash
python - <<'PY'
import torch
print("Torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("GPU:", torch.cuda.get_device_name(0))
x = torch.randn(1, device="cuda")
print("CUDA OK")
PY
```

Expected result:

```
CUDA available: True
GPU: NVIDIA RTX A6000
CUDA OK
```

---

## 5️⃣ MINIMAL dependency installation (ONE TIME)

⚠️ **DO NOT install torch or NumPy here**

```bash
pip install \
  accelerate transformers \
  diffusers safetensors einops \
  albumentations albucore opencv-python-headless \
  imagesize \
  toml \
  voluptuous bitsandbytes \
  timm \
  scipy \
  tensorboard
```

---

## 6️⃣ Pin numpy + opencv (CRITICAL)

```bash
pip uninstall -y numpy opencv-python opencv-python-headless
pip install numpy==1.26.4 opencv-python-headless==4.8.1.78
```

---

## 7️⃣ COMPLETE verification of the critical dependencies

```bash
python - <<'PY'
import torch, accelerate, transformers, diffusers, einops
import albumentations, albucore, cv2, numpy
print("torch:", torch.__version__)
print("accelerate:", accelerate.__version__)
print("transformers:", transformers.__version__)
print("diffusers:", diffusers.__version__)
print("einops:", einops.__version__)
print("albumentations:", albumentations.__version__)
print("albucore:", albucore.__version__)
print("opencv:", cv2.__version__)
print("numpy:", numpy.__version__)
print("CUDA:", torch.cuda.is_available())
x = torch.randn(1, device="cuda")
print("CUDA OK")
PY
```

**MANDATORY** result:

```
numpy: 1.26.4
opencv: 4.8.1
CUDA OK
```

---

## 8️⃣ Accelerate configuration (REQUIRED)

```bash
accelerate config
```

Correct answers:

```
Compute environment: LOCAL_MACHINE
Machine type: No distributed training
Use CPU only: NO
Mixed precision: fp16
Num processes: 1
Use Torch Dynamo: NO
```

Verification:

```bash
accelerate env
```

---

## 9️⃣ Installing sd-scripts (kohya)

```bash
cd /workspace
git clone https://github.com/kohya-ss/sd-scripts.git
cd sd-scripts
pip install -r requirements.txt
```

---

## 🔟 Launching the training (SDXL LoRA, SAFE)

```bash
cd /workspace/sd-scripts

accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path=/workspace/models/sd_xl_base_1.0.safetensors \
  --train_data_dir=/workspace/train \
  --output_dir=/workspace/output \
  --resolution=1024,1024 \
  --enable_bucket \
  --min_bucket_reso=512 \
  --max_bucket_reso=1024 \
  --bucket_reso_steps=64 \
  --network_module=networks.lora \
  --network_dim=32 \
  --network_alpha=32 \
  --network_train_unet_only \
  --learning_rate=1e-4 \
  --optimizer_type=AdamW \
  --lr_scheduler=cosine \
  --train_batch_size=1 \
  --max_train_epochs=15 \
  --caption_extension=.txt \
  --gradient_checkpointing \
  --save_every_n_epochs=5 \
  --save_model_as=safetensors \
  --output_name=sensual_lingerie_look_sdxl_lora
```

---

## 1️⃣1️⃣ Checking the results

```bash
ls /workspace/output
```

You must see:

```
sensual_lingerie_look_sdxl_lora-000005.safetensors
sensual_lingerie_look_sdxl_lora-000010.safetensors
sensual_lingerie_look_sdxl_lora.safetensors
```

---

## GOLDEN RULES (NEVER FORGET)

* ❌ Never reinstall torch
* ❌ Never let pip upgrade numpy
* ❌ Never install OpenCV >= 4.9
* ✅ Always check CUDA BEFORE training
* ✅ Always run `accelerate config`

---

DON'T GET DISCOURAGED, PERSEVERE. I hope this memo has helped you.

Note: whenever the RunPod pod is shut down, the dependencies need to be reinstalled the next time, which takes about 10 minutes.

Please feel free to share your feedback and suggestions for improvement: laurent.gevaert@hotmail.fr
# LoRA TRAINING ON VAST.AI
INTRODUCTION: unlike RunPod, where a pod is made available to you, Vast.ai is the Airbnb of GPU rental: you are renting a GPU instance. You must first choose the desired software image or configuration in template form, then rent the GPU instance (click "Rent"). There are several options; choose according to the desired GPU and VRAM. You get 30 GB of storage, libraries and dependencies pre-installed, and an hourly price often slightly cheaper than RunPod. Unlike RunPod, you also have the option of backing up your instance with the installed dependencies, for approximately 6 euros per month.

You must adapt the desired configuration to the available images and templates. I couldn't find an RTX A6000, which is a fairly old card (2020). To get an image with a sufficiently powerful GPU (an RTX 5090) plus PyTorch and CUDA, I had to choose an image that shipped with kohya_gui. It was unnecessary, so I deleted it (see further below).

## DELETE kohya_gui

A simple attempt to stop the process with `pkill` was not sufficient.

# 🚨 What's really happening (clear explanation)

You have this, **which comes back immediately**:

```text
/bin/bash /opt/supervisor-scripts/kohya.sh
python kohya_gui.py --server_port 17860 --headless --noverify
```

👉 **Unambiguous conclusion**: `pkill` **is not enough**, because:

> **supervisor AUTOMATICALLY restarts Kohya GUI as soon as it is killed**

So:

* you kill the process
* supervisor detects the shutdown
* it **restarts it immediately**

👉 This is **normal** behavior for this Vast.ai image.

---

# ✅ DEFINITIVE solution (the one that actually works)

We need to **neutralize the supervised script**, not just the process.

---

## STEP 1: Disable the Kohya script (REQUIRED)

Run **exactly this**:

```bash
mv /opt/supervisor-scripts/kohya.sh /opt/supervisor-scripts/kohya.sh.disabled
```

👉 This prevents the supervisor from restarting it.
Check that the file no longer exists:

```bash
ls /opt/supervisor-scripts/
```

You should see `kohya.sh.disabled` and **no more** `kohya.sh`.

---

## STEP 2: Kill Kohya ONE LAST TIME

```bash
pkill -f kohya_gui.py
pkill -f kohya.sh
```

Then:

```bash
ps aux | grep kohya
```

👉 There must be **ONLY**:

```text
grep --color=auto kohya
```

---

## STEP 3: Check that the GPU is FREE

```bash
nvidia-smi
```

Expected result:

* **0 Python processes**
* VRAM almost empty

If yes: victory 🎉

---

# Why this is the RIGHT method

* ✅ You do NOT touch CUDA
* ✅ You do NOT touch PyTorch
* ✅ You do NOT touch sd-scripts (which still needs to be installed)
* ✅ You do NOT break the image
* ✅ You prevent the automatic restart
* ✅ The method is **reversible**

To reactivate Kohya later:

```bash
mv /opt/supervisor-scripts/kohya.sh.disabled /opt/supervisor-scripts/kohya.sh
```

## BUGS SPECIFIC TO MY INSTANCE, ENCOUNTERED WHEN LAUNCHING THE TRAINING

Next, return to the RunPod tutorial: install the dependencies (watching for the NumPy conflicts, etc.), create the sd-scripts, models and other directories, and install sd-scripts, the dataset (images + captions), and the SDXL model in models.

## DEPENDENCY PROBLEMS ON VAST.AI

## ✅ 1️⃣ Remove `invisible-watermark` (recommended)

`invisible-watermark` is **unnecessary for SDXL LoRA training** and causes a false OpenCV conflict.

```bash
pip uninstall -y invisible-watermark
```

👉 After that, **the first warning disappears**.

---

## 🚨 2️⃣ The REAL problem: `albucore` is too recent

Error:

```
albucore 0.0.24 requires opencv-python-headless>=4.9.0.80
```

👉 But **we deliberately want to stay on OpenCV 4.8.1.78** (to avoid the SDXL / numpy 2.x bugs).
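One way to stop pip from silently reintroducing these upgrades is a pip constraints file that holds the pins in one place. A sketch; the filename `constraints.txt` is an arbitrary choice, and the exact pins are the ones this article settles on:

```text
# constraints.txt: the fragile pins from this article, in one place
numpy==1.26.4
opencv-python-headless==4.8.1.78
albumentations==1.4.3
albucore==0.0.13
```

Passing it with `-c` on every install (for example `pip install -c constraints.txt <packages>`) forbids pip from resolving these four packages to any other version, even as transitive dependencies.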
### ❌ What NOT to do

* ❌ Upgrade OpenCV to 4.9+
* ❌ Let pip force numpy 2.x

---

## ✅ 3️⃣ PRO solution: pin compatible `albumentations` / `albucore` versions

### Uninstall the current versions

```bash
pip uninstall -y albumentations albucore
```

### Install the COMPATIBLE (stable) versions

```bash
pip install albumentations==1.4.3 albucore==0.0.13
```

These versions:

* ✅ work with **opencv-python-headless 4.8.1.78**
* ✅ do NOT force OpenCV 4.9+
* ✅ are stable for SDXL preprocessing
* ✅ are compatible with Python 3.12

---

## 4️⃣ Final check (to be done)

```bash
python3 - <<'EOF'
import albumentations, albucore, cv2, numpy
print("albumentations:", albumentations.__version__)
print("albucore:", albucore.__version__)
print("opencv:", cv2.__version__)
print("numpy:", numpy.__version__)
EOF
```

Expected result:

```
albumentations: 1.4.3
albucore: 0.0.13
opencv: 4.8.1
numpy: 1.26.4
```

---

## 🎯 Clear verdict

* ✅ **Yes**, remove `invisible-watermark` without hesitation
* ✅ **Yes**, downgrade `albucore` (this is the real conflict)
* ❌ **NO**, don't touch OpenCV / numpy (you have the right combo)
* ✅ Clean, stable environment, ready for SDXL LoRA

## opencv = bug

CURRENT state of the environment:

```text
albumentations: 1.4.3   OK
albucore: 0.0.13        OK
numpy: 1.26.4           OK
opencv: 4.11.0          NOT OK
```

👉 **The only remaining problem is OpenCV 4.11.0**; this tutorial (and SDXL field experience) requires **OpenCV 4.8.1.78**.

---

## 🚨 Why OpenCV 4.11.0 is a PROBLEM

Even if everything seems to "fall into place":

* OpenCV ≥ 4.9 introduces C++ / numpy changes
* risk of **broken data augmentation**
* silent bugs (no crashes, but a degraded LoRA)
* historical incompatibilities with SDXL preprocessing

👉 **That's exactly the trap this tutorial is trying to avoid.**

---

## ✅ PERMANENT FIX (to be done NOW)

### 1️⃣ Uninstall ALL OpenCV variants (important)

```bash
pip uninstall -y opencv-python opencv-python-headless opencv-contrib-python
```

⚠️ Do this even if some of the packages are not installed.

---

### 2️⃣ Reinstall the correct version (HEADLESS only)

```bash
pip install opencv-python-headless==4.8.1.78
```

## LAST CHECK

```bash
python3 - <<'EOF'
import cv2, numpy
print("opencv:", cv2.__version__)
print("numpy:", numpy.__version__)
EOF
```

Expected result, **exactly**:

```text
opencv: 4.8.1
numpy: 1.26.4
```

---

## 🎯 Final verdict

* ✅ albumentations / albucore: **GOOD versions**
* ✅ NumPy: **PERFECT**
* ❌ OpenCV 4.11.0: **TO BE CORRECTED**

👉 Once OpenCV 4.8.1.78 is reinstalled: **STABLE SDXL ENVIRONMENT**

Versions obtained:

```
numpy: 1.26.4
opencv: 4.8.1
torch: 2.8.0+cu128
```

Also remove Gradio (left over from kohya_gui):

```bash
pip uninstall -y gradio
```

## Launch script

```bash
cd /workspace/sd-scripts

accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path=/workspace/models/sd_xl_base_1.0.safetensors \
  --train_data_dir=/workspace/train \
  --output_dir=/workspace/output \
  --resolution=1024,1024 \
  --enable_bucket \
  --min_bucket_reso=512 \
  --max_bucket_reso=1024 \
  --bucket_reso_steps=64 \
  --network_module=networks.lora \
  --network_dim=32 \
  --network_alpha=32 \
  --network_train_unet_only \
  --learning_rate=1e-4 \
  --optimizer_type=AdamW \
  --lr_scheduler=cosine \
  --train_batch_size=1 \
  --max_train_epochs=15 \
  --caption_extension=.txt \
  --gradient_checkpointing \
  --save_every_n_epochs=5 \
  --save_model_as=safetensors \
  --output_name=my_lora_style
```

## Conclusion

The main difficulty in LoRA training is finding a stable environment with matching dependencies. To avoid conflicts, do not hesitate to install dependencies with explicit version numbers.

---

# HOW TO TEST YOUR LoRA
Once your LoRA safetensors files have been generated in output, they should be tested. Rest assured, it's much easier than the training. The installation is minimal, with four dependencies. I recommend using a RunPod pod or Vast.ai instance separate from the training environment, to avoid conflicts.

Check transformers, diffusers, safetensors and peft. This script lists the four desired dependencies (peft is for testing multiple LoRAs at the same time):

```bash
python3 - <<'EOF'
import importlib

packages = [
    "transformers", "diffusers", "safetensors", "peft"
]

print("=== Checking dependencies (transformers / diffusers / safetensors / peft) ===\n")
for pkg in packages:
    try:
        module = importlib.import_module(pkg)
        version = getattr(module, "__version__", "unknown version")
        print(f"[OK] {pkg:12} -> {version}")
    except Exception as e:
        print(f"[MISSING] {pkg:12} -> {e}")
EOF
```

We then install them:

```bash
pip install transformers diffusers safetensors peft
```

You don't have to install peft right away for a single-LoRA test (I haven't verified this); you can install it later to test multiple LoRAs:

```bash
pip install peft
```

Then check the installation and the exact versions installed (recommended):

```bash
python - <<'EOF'
import transformers, diffusers, safetensors
print("transformers:", transformers.__version__)
print("diffusers:", diffusers.__version__)
print("safetensors:", safetensors.__version__)
EOF
```

* No specific NumPy/OpenCV here
* No kohya
* No UI

# II. MINIMUM STRUCTURE FOR THE TEST (use the RunPod editor)
In `/workspace`:

```
/workspace
├── models/
│   ├── sd_xl_base_1.0.safetensors
│   └── ma_lora_sdxl.safetensors
├── output_test/
└── test_lora.py
```

* **No need for `sd-scripts`**
* **No need for albumentations**
* **No need for accelerate**

---

## 2️⃣ MINIMUM DEPENDENCIES FOR THE TEST (INFERENCE)

⚠️ **ONLY these libraries** (if they are not already there): transformers, diffusers, safetensors.

* No specific NumPy/OpenCV here
* No kohya
* No UI

# IMPORTANT QUESTIONS (CLEAR ANSWERS)

### Do we need the training dependencies for testing?

**NO.**

| Use           | Needs                               |
| ------------- | ----------------------------------- |
| LoRA training | kohya + albumentations + accelerate |
| LoRA testing  | diffusers + transformers            |

---

### Is a single pod enough for testing?

**YES.** Even a **smaller GPU** is sufficient for inference. (But two different pods are recommended: one to train, and one to test and generate images.)

---

### Why not ComfyUI?

Because:

* it installs too many dependencies
* it modifies numpy/opencv
* it **can disrupt a clean environment**
* it is unnecessary for a simple LoRA test

# III. Creating the TEST_LORA.py file: a Python test script created in /workspace
```bash
cat << 'EOF' > /workspace/test_lora.py
import torch
from diffusers import StableDiffusionXLPipeline

BASE_MODEL = "/workspace/models/sd_xl_base_1.0.safetensors"
LORA_PATH = "/workspace/models/ma_lora_sdxl.safetensors"
OUTPUT_DIR = "/workspace/output_test"

prompt = (
    "my_tag_style, studio photography, soft lighting, "
    "high detail, professional fashion photo"
)

pipe = StableDiffusionXLPipeline.from_single_file(
    BASE_MODEL,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

pipe.load_lora_weights(LORA_PATH)
pipe.fuse_lora()

image = pipe(
    prompt=prompt,
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]

image.save(f"{OUTPUT_DIR}/test_lora.png")
print("Generated image:", f"{OUTPUT_DIR}/test_lora.png")
EOF
```

Enter the name of your own LoRA model in `LORA_PATH = "/workspace/models/ma_lora_sdxl.safetensors"`, and add your style's tag (the one written at the top of your captions) to the prompt. You can modify the rest of the prompt freely.

# IV. Launch the script

```bash
python /workspace/test_lora.py
```

## Testing with two LoRA files (two safetensors files)

If peft is not installed:

```
File "/workspace/test_lora_mix.py", line 26, in <module>
    pipe.load_lora_weights(LORA_A, adapter_name="A")
File "/usr/local/lib/python3.11/dist-packages/diffusers/loaders/lora_pipeline.py", line 616, in load_lora_weights
    raise ValueError("PEFT backend is required for this method.")
ValueError: PEFT backend is required for this method.
```
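That ValueError only surfaces once the pipeline is already loaded. A small preflight guard can fail fast instead; this is a sketch, and the package list is simply the four installed earlier in this section:

```python
# Check for required packages before loading any models, so a missing
# peft is reported immediately rather than mid-script by diffusers.
import importlib.util

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

needed = ["transformers", "diffusers", "safetensors", "peft"]
gaps = missing_packages(needed)
print("missing:", ", ".join(gaps) if gaps else "none")
```

Running this at the top of a test script (and exiting when the list is non-empty) gives a one-line diagnosis instead of a traceback deep inside `load_lora_weights`.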
```bash
pip install peft
```

## My LoRA creates artistic nudes: final version, no moralizing, SDXL safety checker disabled

```bash
cat << 'EOF' > /workspace/test_lora_une_image.py
import torch
import argparse
from diffusers import StableDiffusionXLPipeline

# =========================================================
# COMMAND-LINE ARGUMENTS
# =========================================================
parser = argparse.ArgumentParser()
parser.add_argument("--prompt", type=str, required=True)
parser.add_argument("--nude_scale", type=float, default=0.7)
parser.add_argument("--lingerie_scale", type=float, default=0.5)
parser.add_argument("--cfg", type=float, default=6.2)
parser.add_argument("--steps", type=int, default=30)
args = parser.parse_args()

# =========================================================
# PATHS
# =========================================================
BASE_MODEL = "/workspace/models/sd_xl_base_1.0.safetensors"
LORA_NUDE = "/workspace/models/artistic_nude_style_lora.safetensors"
LORA_LINGERIE = "/workspace/models/sensual_lingerie_look_sdxl_lora.safetensors"
OUTPUT_DIR = "/workspace/output_test"

# =========================================================
# PIPELINE (SAFETY CHECKER DISABLED)
# =========================================================
pipe = StableDiffusionXLPipeline.from_single_file(
    BASE_MODEL,
    torch_dtype=torch.float16,
    variant="fp16",
    safety_checker=None,
    requires_safety_checker=False,
).to("cuda")

# =========================================================
# LOAD LORAS (PEFT)
# =========================================================
pipe.load_lora_weights(LORA_NUDE, adapter_name="NUDE")
pipe.load_lora_weights(LORA_LINGERIE, adapter_name="LINGERIE")
pipe.set_adapters(
    ["NUDE", "LINGERIE"],
    adapter_weights=[args.nude_scale, args.lingerie_scale],
)

# =========================================================
# NEGATIVE PROMPT (AUTO ANTI-BRA)
# =========================================================
negative_prompt = (
    "bra, bikini top, chest covered, armor chest, metal bra, "
    "clothing on torso, headband, fabric covering breasts"
)

# =========================================================
# GENERATION
# =========================================================
image = pipe(
    prompt=args.prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=args.steps,
    guidance_scale=args.cfg,
).images[0]

# =========================================================
# SAVE
# =========================================================
safe_prompt = args.prompt.replace(" ", "_").replace(",", "")[:60]
out = (
    f"{OUTPUT_DIR}/"
    f"nude{args.nude_scale}_lingerie{args.lingerie_scale}_"
    f"cfg{args.cfg}_steps{args.steps}_{safe_prompt}.png"
)
image.save(out)
print("Image generated:", out)
EOF
```

### CLI

```bash
python /workspace/test_lora_une_image.py \
  --prompt "artistic_nude_style, sensual_lingerie_look, topless woman, bare breasts, front-facing, long blond hair, green eyes, red string panties with garter belt, cinematic natural lighting, studio portrait, soft lighting, high detail, 85mm photo, realistic skin texture" \
  --nude_scale 0.75 \
  --lingerie_scale 0.60 \
  --cfg 6.2 \
  --steps 30
```

SDXL tends to self-censor nudes, hence the negative prompt plus `safety_checker=None` and `requires_safety_checker=False`.

## TEST WITH 3 LoRAs, AIMING FOR AN ARTISTIC NUDE OF A WOMAN

The goal of testing one or more LoRAs is to craft prompts that produce exactly the image you want. Writing the right prompt is an art in itself. The script below also defines the conditions under which a single dominant LoRA is kept.
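The dominant-LoRA arbitration used in the script below can be sketched in isolation (a standalone sketch mirroring its cap-and-fallback rule; `MAX_LORA_SUM` and the adapter names match the script):

```python
MAX_LORA_SUM = 1.2  # above this combined weight, anatomy tends to break

def arbitrate(lora_weights, max_sum=MAX_LORA_SUM):
    """Keep all adapters if their summed weight is safe; otherwise keep only the dominant one."""
    if sum(lora_weights.values()) <= max_sum:
        return {name: w for name, w in lora_weights.items() if w > 0}
    dominant = max(lora_weights, key=lora_weights.get)
    return {dominant: lora_weights[dominant]}

print(arbitrate({"NUDE": 0.35, "LINGERIE": 0.25, "EROTIC": 0.50}))  # sum 1.10: all three kept
print(arbitrate({"NUDE": 0.70, "LINGERIE": 0.50, "EROTIC": 0.60}))  # sum 1.80: only NUDE kept
```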
Python script in `/workspace` (saved as `text-lora-one-image-three-lora-vastai.py`, the name used in the CLI below):

```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
SDXL test: generating an image from text with 3 LoRAs (Vast.ai / RunPod)
- Artistic Nude
- Sensual Lingerie
- Erotic Nude Style

Includes:
- negative prompt for anatomy + eyes
- automatic LoRA arbitration to prevent anatomical breakage
"""

import torch
import argparse
from diffusers import StableDiffusionXLPipeline

# ============================================================
# COMMAND-LINE ARGUMENTS
# ============================================================
parser = argparse.ArgumentParser(
    description="SDXL generation with 3 LoRAs (NUDE / LINGERIE / EROTIC)"
)
parser.add_argument("--prompt", type=str, required=True)
parser.add_argument("--nude_scale", type=float, default=0.7)
parser.add_argument("--lingerie_scale", type=float, default=0.5)
parser.add_argument("--erotic_scale", type=float, default=0.6)
parser.add_argument("--cfg", type=float, default=6.5)
parser.add_argument("--steps", type=int, default=30)
parser.add_argument("--seed", type=int, default=None)
args = parser.parse_args()

# ============================================================
# NEGATIVE PROMPT (STRUCTURE + EYES)
# ============================================================
negative_prompt = (
    # --- clothing / accessories ---
    "bra, bikini top, chest covered, armor chest, metal bra, "
    "clothing on torso, headband, fabric covering breasts, "
    "panties, underwear, thong, g-string, "
    "cotton panties, high-waisted panties, "
    "shorty, briefs, full-coverage lingerie, "
    # --- anatomy / structure ---
    "extra limbs, extra legs, extra arms, "
    "multiple bodies, duplicate bodies, "
    "deformed anatomy, bad anatomy, "
    "fused limbs, malformed body, "
    # --- eyes / face ---
    "bad eyes, deformed eyes, malformed eyes, "
    "crossed eyes, lazy eye, "
    "asymmetrical eyes, misaligned eyes, "
    "extra pupils, missing pupils, "
    "blurred eyes, melted eyes"
)

# ============================================================
# PATHS
# ============================================================
BASE_MODEL = "/workspace/models/sd_xl_base_1.0.safetensors"
LORA_NUDE = "/workspace/models/artistic_nude_style_lora.safetensors"
LORA_LINGERIE = "/workspace/models/sensual_lingerie_look_sdxl_lora.safetensors"
LORA_EROTIC = "/workspace/models/erotic_nude_style.safetensors"
OUTPUT_DIR = "/workspace/output_test"

# ============================================================
# SEED (OPTIONAL)
# ============================================================
generator = None
if args.seed is not None:
    generator = torch.Generator("cuda").manual_seed(args.seed)

# ============================================================
# SDXL PIPELINE
# ============================================================
pipe = StableDiffusionXLPipeline.from_single_file(
    BASE_MODEL,
    torch_dtype=torch.float16,
    variant="fp16",
    safety_checker=None,
    requires_safety_checker=False,
).to("cuda")

pipe.enable_xformers_memory_efficient_attention()

# ============================================================
# LOAD LORAS
# ============================================================
pipe.load_lora_weights(LORA_NUDE, adapter_name="NUDE", use_safetensors=True)
pipe.load_lora_weights(LORA_LINGERIE, adapter_name="LINGERIE", use_safetensors=True)
pipe.load_lora_weights(LORA_EROTIC, adapter_name="EROTIC", use_safetensors=True)

# ============================================================
# LORA SAFETY ARBITRATOR (PREVENTS ANATOMICAL BREAKAGE)
# ============================================================
MAX_LORA_SUM = 1.2

lora_weights = {
    "NUDE": args.nude_scale,
    "LINGERIE": args.lingerie_scale,
    "EROTIC": args.erotic_scale,
}

total_weight = sum(lora_weights.values())

active_adapters = []
active_weights = []

if total_weight <= MAX_LORA_SUM:
    for name, w in lora_weights.items():
        if w > 0:
            active_adapters.append(name)
            active_weights.append(w)
else:
    dominant = max(lora_weights, key=lora_weights.get)
    active_adapters = [dominant]
    active_weights = [lora_weights[dominant]]
    print("LORA SAFETY TRIGGERED")
    print(f"LoRA retained: {dominant}")
    print("Other LoRAs disabled for anatomical stability")

pipe.set_adapters(active_adapters, adapter_weights=active_weights)

# ============================================================
# GENERATION
# ============================================================
image = pipe(
    prompt=args.prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=args.steps,
    guidance_scale=args.cfg,
    generator=generator,
).images[0]

# ============================================================
# SAVE IMAGE
# ============================================================
safe_prompt = args.prompt.replace(" ", "_")[:60]
out_path = (
    f"{OUTPUT_DIR}/"
    f"nude{args.nude_scale}_"
    f"lingerie{args.lingerie_scale}_"
    f"erotic{args.erotic_scale}_"
    f"cfg{args.cfg}_"
    f"steps{args.steps}.png"
)
image.save(out_path)
print("Image generated successfully:")
print(out_path)
```

### CLI: launching the script

```bash
python text-lora-one-image-three-lora-vastai.py \
  --prompt "artistic_nude_style,sensual_lingerie_look,erotic_nude_style,full-length photo,topless woman, bare breasts,long blonde hair flowing in the wind,beautiful face,green eyes with makeup, visible pussy, red stiletto heels, totally nude, cinematic natural lighting, realistic skin texture, symmetrical face, well-aligned eyes, sharp gaze" \
  --nude_scale 0.35 \
  --lingerie_scale 0.25 \
  --erotic_scale 0.50 \
  --cfg 8.0 \
  --steps 30 \
  --seed 42
```

## Conclusion and lessons learned
My three LoRAs are too similar, which produces conflicting instructions around the eyes because their poses overlap. There are recurring problems with the eyes, which need to be rendered very precisely. As a result some images are inconsistent, even self-contradictory, though overall the photos are beautiful.

The solution for better rendering, avoiding contradictions, and preventing one LoRA from overpowering the others: merge the 3 LoRAs into a single .safetensors file.

EDIT: Remove the Kohya GUI, then install the dependency mix. Follow the Vast.ai tutorial first, then the RunPod one. In the Vast.ai tutorial, don't forget to pin `albumentations` and `albucore`.

## LORA CHARACTER TRAINING
Use varied images: few similar shots, with different poses, different clothing, different colors, front views, side views, and full-length shots, about 40 images in total, otherwise the model overfits. Do not resize them; prioritize sharp images and respect the proportions (image size > 800 px).

Same installation as for the style LoRA (identical dependencies: kohya sd-scripts, SDXL base model, etc.). New script with a new configuration for the character LoRA:

```bash
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path=/workspace/models/sd_xl_base_1.0.safetensors \
  --train_data_dir=/workspace/train \
  --output_dir=/workspace/output \
  --resolution=1024,1024 \
  --enable_bucket \
  --min_bucket_reso=512 \
  --max_bucket_reso=1024 \
  --bucket_reso_steps=64 \
  --network_module=networks.lora \
  --network_dim=64 \
  --network_alpha=64 \
  --network_train_unet_only \
  --learning_rate=1e-4 \
  --optimizer_type=AdamW \
  --lr_scheduler=cosine \
  --train_batch_size=1 \
  --max_train_epochs=10 \
  --caption_extension=.txt \
  --gradient_checkpointing \
  --save_every_n_epochs=5 \
  --save_model_as=safetensors \
  --output_name=pop_star_caractere_sdxl_lora
```

Feel free to send me your feedback and improvements: [laurent.gevaert@hotmail.fr](mailto:laurent.gevaert@hotmail.fr)