# Description

## Expected Behavior
I expected `finetune` to produce a usable LoRA adapter for all supported models.

## Current Behavior
For Mistral models (I tried both Mistral and Zephyr, in Q8_0, Q5_K_M, and Q5_0 quantizations), the model outputs gibberish with the LoRA applied after a single finetune iteration.
On the same PC, finetuning produces a usable LoRA adapter for TinyLlama (I tried Q8_0, Q5_K_M, and Q5_0).

First few tokens generated for the prompt "Building a website can be done in 10 simple steps:":
Base Mistral model:

```
Building a website can be done in 10 simple steps:
1. Come up with an idea for your site.
2. Do some research on the web to see what’s out there.
```

Mistral with LoRA (a single finetune iteration on shakespeare.txt from the finetune example):

```
Building a website can be done in 10 simple steps: (3 . in.
A,
! (
P! A, PAM,IT A) MER W W 0
```
# Environment and Context
- Physical (or virtual) hardware you are using, e.g. for Linux:
Core i7-4770 CPU
```
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
CPU family: 6
Model: 60
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 3
CPU(s) scaling MHz: 100%
CPU max MHz: 3900.0000
CPU min MHz: 800.0000
BogoMIPS: 6784.88
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm abm cpuid_fault invpcid_single pti tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt dtherm ida arat pln pts
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 1 MiB (4 instances)
L3: 8 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerabilities:
Itlb multihit: KVM: Mitigation: VMX disabled
L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
Meltdown: Mitigation; PTI
Mmio stale data: Unknown: No mitigations
Retbleed: Not affected
Spec store bypass: Vulnerable
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Srbds: Vulnerable: No microcode
Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
```
- Operating System, e.g. for Linux:
```
$ uname -a
Linux maxxk-pc 6.1.29 #1-NixOS SMP PREEMPT_DYNAMIC Wed May 17 09:54:00 UTC 2023 x86_64 GNU/Linux
```
# Failure Information (for bugs)

For Mistral models (both Mistral and Zephyr, in Q8_0, Q5_K_M, and Q5_0 quantizations), the model outputs gibberish with the LoRA applied after a single finetune iteration; the same procedure produces a working adapter for TinyLlama.
# Steps to Reproduce
I used pre-converted models from TheBloke:
- Mistral: https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/tree/main
- Zephyr: https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/tree/main
The issue can be reproduced with shakespeare.txt from the finetune example, but I got the same results with a different dataset.
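For completeness, the training file can be fetched the same way the finetune example's README suggests (URL copied from the example at the time of writing; adjust if it has moved):

```sh
# Shakespeare dataset referenced by the finetune example
wget https://raw.githubusercontent.com/brunoklein99/deep-learning-notes/master/shakespeare.txt
```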
Finetuning command:
```sh
../llama.cpp/bin/finetune \
  --model-base mistral-7b-v0.1.Q8_0.gguf \
  --train-data shakespeare.txt \
  --lora-out lora-Q8_0.gguf \
  --save-every 1 \
  --threads 4 \
  --ctx 64 \
  --batch 1 \
  --grad-acc 1 \
  --lora-r 64 \
  --lora-alpha 64 \
  --adam-iter 1 \
  --use-checkpointing \
  --use-flash \
  --escape \
  --seed 1
```
For Zephyr (which also produces an invalid LoRA) and TinyLlama (which produces a valid one) I changed only the `--model-base` parameter. Between experiments I removed all finetune checkpoints and LoRA files; a sketch of the full loop follows.
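In case it helps anyone reproduce all three cases, the loop below is roughly what I ran; the Zephyr and TinyLlama file names are placeholders for whichever quantization you downloaded, and the `checkpoint-*.gguf` pattern assumes the default `--checkpoint-out` naming:

```sh
# Sketch: one finetune iteration per base model, adapters kept separate.
# Replace the file names with your local model copies.
for base in mistral-7b-v0.1.Q8_0.gguf \
            zephyr-7b-beta.Q8_0.gguf \
            tinyllama-1.1b-chat-v0.3.Q8_0.gguf; do
  rm -f checkpoint-*.gguf   # clean state between experiments
  ../llama.cpp/bin/finetune \
    --model-base "$base" \
    --train-data shakespeare.txt \
    --lora-out "lora-${base%.gguf}.gguf" \
    --save-every 1 --threads 4 --ctx 64 --batch 1 --grad-acc 1 \
    --lora-r 64 --lora-alpha 64 --adam-iter 1 \
    --use-checkpointing --use-flash --escape --seed 1
done
```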
Testing without LoRA:

```sh
../llama.cpp/bin/main -m ./mistral-7b-v0.1.Q8_0.gguf -p "Building a website can be done in 10 simple steps:"
```

Testing with LoRA:

```sh
../llama.cpp/bin/main -m ./mistral-7b-v0.1.Q8_0.gguf -p "Building a website can be done in 10 simple steps:" --lora ./lora-Q8_0.gguf
```
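Since `main` samples stochastically by default, pinning the seed makes the with/without-LoRA comparison easier to eyeball; a minimal sketch (output file names are mine):

```sh
# Run the same prompt and seed with and without the adapter, then compare.
PROMPT="Building a website can be done in 10 simple steps:"
../llama.cpp/bin/main -m ./mistral-7b-v0.1.Q8_0.gguf -p "$PROMPT" \
  --seed 1 -n 64 > out-base.txt
../llama.cpp/bin/main -m ./mistral-7b-v0.1.Q8_0.gguf -p "$PROMPT" \
  --seed 1 -n 64 --lora ./lora-Q8_0.gguf > out-lora.txt
diff out-base.txt out-lora.txt
```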
P.S. To close this bug report, I would like to thank all contributors for this amazing piece of software. It is a pleasure to use, and it makes experimenting with LLMs possible even for those of us without top-end GPUs.