
A no-code toolkit to finetune LLMs on your local GPU—just upload data, pick a task, and deploy later. Perfect for hackathons or prototyping, with automatic hardware detection and a guided React interface.


ModelForge 🔧⚡

Finetune LLMs on your laptop’s GPU—no code, no PhD, no hassle.


🚀 Features

  • GPU-Powered Finetuning: Optimized for NVIDIA GPUs (even 4GB VRAM).
  • One-Click Workflow: Upload data → Pick task → Train → Test.
  • Hardware-Aware: Auto-detects your GPU/CPU and recommends models.
  • React UI: No CLI or notebooks—just a friendly interface.
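The hardware-aware recommendation step can be sketched roughly as follows. This is an illustrative stand-in, not ModelForge's actual detection code: it assumes nvidia-smi is on the PATH when an NVIDIA GPU is present, and falls back to CPU otherwise.

```python
import shutil
import subprocess

def detect_hardware():
    """Return a rough hardware profile: device type and total VRAM in MiB.

    Illustrative sketch only -- ModelForge's real detection logic may differ.
    """
    if shutil.which("nvidia-smi") is None:
        return {"device": "cpu", "vram_mib": 0}
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        vram = int(out.strip().splitlines()[0])
        return {"device": "cuda", "vram_mib": vram}
    except (subprocess.CalledProcessError, ValueError, IndexError):
        return {"device": "cpu", "vram_mib": 0}

profile = detect_hardware()
print(profile)
```

A profile like this is enough to choose between, say, a 4-bit quantized small model on low-VRAM GPUs and a larger model on better hardware.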

📖 Supported Tasks

  • Text-Generation: Generates answers in the form of text based on prior and fine-tuned knowledge. Ideal for use cases like customer support chatbots, story generators, social media script writers, code generators, and general-purpose chatbots.
  • Summarization: Generates summaries for long articles and texts. Ideal for use cases like news article summarization, law document summarization, and medical article summarization.
  • Extractive Question Answering: Extracts the span of text that answers a query from a given context. Best for use cases like Retrieval-Augmented Generation (RAG) and enterprise document search (for example, finding information in internal documentation).

Installation

Prerequisites

  • Python 3.11.x: Ensure Python 3.11 is installed.
  • NVIDIA GPU: 6 GB or more of VRAM recommended (smaller models may run on 4 GB).
  • CUDA: Ensure CUDA is installed and configured for your GPU.
  • HuggingFace Account: Create an account on Hugging Face and generate a fine-grained access token.

Steps

  1. Install the Package:

    pip install modelforge-finetuning
  2. Set HuggingFace API Key in environment variables:
    Linux:

    export HUGGINGFACE_TOKEN=your_huggingface_token

    Windows Powershell:

    $env:HUGGINGFACE_TOKEN="your_huggingface_token"

    Windows CMD:

    set HUGGINGFACE_TOKEN=your_huggingface_token

    Or use a .env file:

    echo "HUGGINGFACE_TOKEN=your_huggingface_token" > .env
  3. Install Appropriate CUDA version for PyTorch:

    • Navigate to the PyTorch installation page and select the appropriate CUDA version for your system.
    • Install PyTorch with the correct CUDA version. For example, for CUDA 12.6 on Windows, you can use:
     pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
  4. Run the Application:

    modelforge run
  5. Done! Navigate to http://localhost:8000 in your browser and get started.
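
After installation, you can sanity-check the environment from Python. The helper below is a hypothetical check, not part of the ModelForge package: it reports whether the HUGGINGFACE_TOKEN variable is set and whether a CUDA-enabled PyTorch build can see your GPU.

```python
import os

def installation_status():
    """Report token and CUDA availability as a list of status strings.

    Hypothetical sanity check -- not shipped with ModelForge.
    """
    report = []
    token = os.environ.get("HUGGINGFACE_TOKEN")
    report.append("token: set" if token else "token: missing")
    try:
        import torch  # noqa: import inside function so the check degrades gracefully
        if torch.cuda.is_available():
            report.append(f"cuda: {torch.cuda.get_device_name(0)}")
        else:
            report.append("cuda: unavailable")
    except ImportError:
        report.append("torch: not installed")
    return report

print(installation_status())
```

If this prints "cuda: unavailable", revisit step 3 and make sure the PyTorch build matches your installed CUDA version.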

Running the Application Again in the Future

  1. Start the Application:
    modelforge run
  2. Navigate to the App:
    Open your browser and go to http://localhost:8000.

Stopping the Application

To stop the application and free up resources, press Ctrl+C in the terminal running the app.

📂 Dataset Format

Each training example is a JSON object with an "input" and an "output" field, for example:

{"input": "Enter a really long article here...", "output": "Short summary."}
{"input": "Enter the poem topic here...", "output": "Roses are red..."}
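
Before uploading, it can help to check that every record parses and carries both fields. Here is a minimal validator, assuming one JSON object per line as in the example above (it also tolerates trailing commas):

```python
import json

def validate_dataset(lines):
    """Yield (line_number, error) for records that fail basic checks."""
    for i, line in enumerate(lines, start=1):
        line = line.strip().rstrip(",")  # tolerate trailing commas
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            yield i, f"invalid JSON: {exc}"
            continue
        for field in ("input", "output"):
            if not isinstance(record.get(field), str):
                yield i, f"missing or non-string field: {field}"

sample = [
    '{"input": "Enter a really long article here...", "output": "Short summary."}',
    '{"input": "Enter the poem topic here..."}',
]
errors = list(validate_dataset(sample))
print(errors)  # the second record is missing "output"
```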

🛠 Tech Stack

  • transformers + peft (LoRA finetuning)
  • bitsandbytes (4-bit quantization)
  • FastAPI + Python (backend)
  • React (frontend)
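
To give a sense of how the finetuning pieces fit together, here is a configuration sketch using the public transformers and peft APIs. The quantization type, LoRA rank, and target modules are illustrative assumptions, not ModelForge's actual settings:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization via bitsandbytes (NF4 is a common choice for LLM weights)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter configuration; rank and target modules are illustrative only
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```

Combining 4-bit base weights with low-rank adapters is what makes finetuning feasible on consumer GPUs with only a few GB of VRAM.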
