Important: The pre-training code in this repo has been refactored to be more user-friendly and is now available at 🦐 minpeter/krill.
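# Install dependencies in two passes so flash-attn builds last, without build isolation.
# Workaround from https://github.com/astral-sh/uv/issues/6437#issuecomment-2535324784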
uv sync --no-install-package flash-attn
uv sync --no-build-isolation
uv run 00-tknz.py
uv run 01-preprocess.py
uv run accelerate launch 02-train.py --hf_model_id your-hf/model-id
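
The three `uv run` steps above can also be driven from one script. The following is only a minimal sketch, not part of this repo: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes the scripts are run from the repo root in the same order as listed. It defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def build_commands(hf_model_id: str) -> list[list[str]]:
    """Return the three pipeline commands from the README, in order."""
    return [
        ["uv", "run", "00-tknz.py"],
        ["uv", "run", "01-preprocess.py"],
        ["uv", "run", "accelerate", "launch", "02-train.py",
         "--hf_model_id", hf_model_id],
    ]


def run_pipeline(hf_model_id: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands(hf_model_id):
        print("$", shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline as soon as one stage fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder model id taken from the README; replace it with your own.
    run_pipeline("your-hf/model-id", dry_run=True)
```

Passing `dry_run=False` runs each stage for real; a failure in an earlier stage prevents the later ones from starting.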
https://huggingface.co/minpeter/tiny-ko-187m-base-250718
https://huggingface.co/minpeter/tiny-ko-124m-base-muon
https://huggingface.co/minpeter/tiny-ko-20m-base-en
....
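
To try one of the checkpoints listed above, something like the following sketch may work. It assumes the models load through the standard `transformers` `AutoTokenizer` and `AutoModelForCausalLM` classes, which this README does not confirm; `repo_id_from_url` and `generate` are hypothetical helper names introduced for illustration.

```python
from urllib.parse import urlparse


def repo_id_from_url(url: str) -> str:
    """Turn a Hugging Face model page URL into a 'user/model' repo id."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"not a model page URL: {url}")
    return "/".join(parts[:2])


def generate(url: str, prompt: str, max_new_tokens: int = 32) -> str:
    """Download the checkpoint behind `url` and continue `prompt`.

    Assumption: the checkpoint is compatible with AutoModelForCausalLM.
    """
    # Imported lazily so repo_id_from_url works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = repo_id_from_url(url)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Only resolves the repo id here; call generate() to download the model.
    print(repo_id_from_url("https://huggingface.co/minpeter/tiny-ko-20m-base-en"))
```

These are base models, so `generate` continues the prompt as plain text rather than following instructions.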
https://wandb.ai/kasfiekfs-e/lm-eval-harness-integration/workspace
https://github.com/huggingface/smollm/tree/main/text/pretraining
https://github.com/jzhang38/TinyLlama
https://github.com/SmallDoges/small-doge
https://github.com/keeeeenw/MicroLlama
https://github.com/karpathy/nanoGPT
[Model] Very, very small things Collections