Finetune LLMs on your laptop’s GPU—no code, no PhD, no hassle.
- GPU-Powered Finetuning: Optimized for NVIDIA GPUs (even 4GB VRAM).
- One-Click Workflow: Upload data → Pick task → Train → Test.
- Hardware-Aware: Auto-detects your GPU/CPU and recommends models.
- React UI: No CLI or notebooks—just a friendly interface.
- Text Generation: Generates free-form text answers from the model's pretrained and fine-tuned knowledge. Ideal for use cases like customer support chatbots, story generators, social media script writers, code generators, and general-purpose chatbots.
- Summarization: Generates summaries for long articles and texts. Ideal for use cases like news article summarization, law document summarization, and medical article summarization.
- Extractive Question Answering: Finds the span of a given context that answers a query. Best for use cases like Retrieval-Augmented Generation (RAG) and enterprise document search (for example, searching for information in internal documentation).
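Each task boils down to training examples with an input side and an output side. The records below are a hedged sketch of what data for each task can look like; the field names (including `question`/`context`/`answer` for extractive QA) are illustrative, not modelforge's required schema.

```python
# Illustrative training records for each task type (field names are assumptions).
text_generation = {
    "input": "Write a short greeting for a support chatbot.",
    "output": "Hi! How can I help you today?",
}

summarization = {
    "input": "Long article text...",
    "output": "One-sentence summary.",
}

extractive_qa = {
    "question": "Where is the config file stored?",
    "context": "The config file is stored in the user's home directory.",
    "answer": "in the user's home directory",
}

# The defining property of extractive QA: the answer is a literal span of the context.
assert extractive_qa["answer"] in extractive_qa["context"]
```

Note that text generation and summarization share the same input/output shape; only extractive QA needs the answer to appear verbatim in the context.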
- Python 3.11.x: Ensure Python 3.11 is installed.
- NVIDIA GPU: Recommended VRAM >= 6GB.
- CUDA: Ensure CUDA is installed and configured for your GPU.
- HuggingFace Account: Create an account on Hugging Face and generate a fine-grained access token.
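You can sanity-check these prerequisites from Python before installing. This is a minimal sketch: it only reports what it finds, and the CUDA check simply notes if PyTorch is not installed yet.

```python
import os
import sys

# modelforge expects Python 3.11.x
ok_python = sys.version_info[:2] == (3, 11)
print(f"Python {sys.version_info.major}.{sys.version_info.minor}:",
      "OK" if ok_python else "expected 3.11.x")

# The HuggingFace token must be available in the environment
print("HUGGINGFACE_TOKEN set:", "HUGGINGFACE_TOKEN" in os.environ)

# The CUDA check needs PyTorch, which is installed in a later step
try:
    import torch
    print("CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch not installed yet")
```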
- Install the package:

  ```shell
  pip install modelforge-finetuning
  ```
- Set the HuggingFace API key in your environment variables.

  Linux:

  ```shell
  export HUGGINGFACE_TOKEN=your_huggingface_token
  ```

  Windows PowerShell:

  ```shell
  $env:HUGGINGFACE_TOKEN="your_huggingface_token"
  ```

  Windows CMD:

  ```shell
  set HUGGINGFACE_TOKEN=your_huggingface_token
  ```

  Or use a `.env` file:

  ```shell
  echo "HUGGINGFACE_TOKEN=your_huggingface_token" > .env
  ```
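A `.env` file is just plain `KEY=value` text. As an illustration of how such a file is read (many apps use the python-dotenv package for this; the loader below is a hypothetical minimal stand-in, not modelforge's own code):

```python
import os

def load_env_file(path=".env"):
    # Minimal .env reader: skips blanks and comments, keeps existing env values.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip().strip('"'))

if os.path.exists(".env"):
    load_env_file()
print("HUGGINGFACE_TOKEN set:", "HUGGINGFACE_TOKEN" in os.environ)
```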
- Install PyTorch with the appropriate CUDA version:
  - Navigate to the PyTorch installation page and select the CUDA version that matches your system.
  - Install PyTorch with that CUDA version. For example, for CUDA 12.6 on Windows:

    ```shell
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
    ```
- Run the application:

  ```shell
  modelforge run
  ```
- Done! Navigate to http://localhost:8000 in your browser and get started.
- Start the application:

  ```shell
  modelforge run
  ```

- Navigate to the app: open your browser and go to http://localhost:8000.
- To stop the application and free up resources, press Ctrl+C in the terminal running the app.
```json
[
  {"input": "Enter a really long article here...", "output": "Short summary."},
  {"input": "Enter the poem topic here...", "output": "Roses are red..."}
]
```
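A quick way to produce and sanity-check such a file before uploading it through the UI (the filename `dataset.json` is arbitrary; the `input`/`output` fields follow the examples above):

```python
import json

records = [
    {"input": "Enter a really long article here...", "output": "Short summary."},
    {"input": "Enter the poem topic here...", "output": "Roses are red..."},
]

with open("dataset.json", "w") as f:
    json.dump(records, f, indent=2)

# Verify every record has exactly the expected fields
with open("dataset.json") as f:
    loaded = json.load(f)
assert all(set(r) == {"input", "output"} for r in loaded)
print(f"{len(loaded)} records OK")
```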
- transformers + peft (LoRA finetuning)
- bitsandbytes (4-bit quantization)
- React (UI)
- FastAPI (Backend)
- Python (Backend)
- React.JS (Frontend)
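For reference, LoRA finetuning with 4-bit quantization is typically wired together through `peft` and `bitsandbytes` configuration objects like the sketch below. This shows how these libraries are commonly combined, not modelforge's actual internal configuration; the hyperparameter values and target modules are assumptions.

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization (bitsandbytes) so the base model fits in small VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# LoRA adapter (peft): train small low-rank matrices instead of full weights
lora_config = LoraConfig(
    r=16,                                # rank of the update matrices (assumed value)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], # common choice for attention projections
    task_type="CAUSAL_LM",
)
```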