Setup & Settings
Hefty works out of the box with sensible defaults. Install the desktop app, pick your AI provider in the setup dialog, and you're ready to go.
Quick Start
- Install Hefty - download the desktop app for your platform (allow it to run if your OS shows a warning)
- Launch Hefty - on first launch, the Cognition Config dialog appears automatically
- Pick a provider - choose one of the four supported AI providers (see below) and select a model
- Test & save - test the connection, save, and start chatting
The recommended default is Ollama: install it, pull the models (ollama pull gemma3:12b and ollama pull nomic-embed-text), and select Ollama in the setup dialog. Hefty connects automatically.
First Launch - Allow Hefty to Run
Hefty's binaries are not signed with vendor certificates. This is a deliberate choice - we refuse to pay platform gatekeepers for the privilege of distributing software. The app is safe, but your operating system may show a warning the first time you run it. Here's how to get past it on each platform.
Linux (AppImage)
Make the downloaded file executable and run it. That's all there is to it.
```shell
chmod +x Hefty-*.AppImage
./Hefty-*.AppImage
```
Or right-click the file in your file manager, go to Properties → Permissions, check "Allow executing as program", then double-click to launch.
macOS (DMG)
- Open the .dmg file and drag Hefty into your Applications folder
- Double-click to launch - macOS will block it with "Hefty can't be opened because Apple cannot check it for malicious software"
- Open System Settings → Privacy & Security
- Scroll down - you'll see "Hefty was blocked from use because it is not from an identified developer"
- Click "Open Anyway" and confirm
If "Open Anyway" doesn't appear or still fails, open Terminal and run:

```shell
xattr -cr /Applications/Hefty.app
```
This strips the quarantine flag. Then launch normally.
Windows (Installer)
- Run the Hefty Setup installer
- Windows will show a SmartScreen warning: "Windows protected your PC"
- Click "More info" (the text link, not a button)
- Click "Run anyway"
- Proceed with installation as normal
This only happens once - after the first launch, your OS will remember that you allowed it.
AI Providers
Hefty supports four AI providers out of the box. You choose your provider in the Cognition Config dialog - accessible on first launch or any time via the admin menu.
Ollama
Free, local inference. Supports model listing, embeddings, and multimodal models. Recommended for privacy-first setups.
LM Studio
Local inference with a GUI model manager. Uses the OpenAI-compatible API. Models are loaded dynamically from your LM Studio server.
OpenAI
GPT-4o, GPT-4.1, o3-mini, and more. Supports model listing and embeddings. Requires an API key.
Anthropic
Claude Sonnet 4, Claude 3.5 Haiku, and more. Requires an API key. Does not support embeddings - use a separate embedding provider.
Any OpenAI-compatible API also works - OpenRouter, Together AI, Groq, vLLM, or any other service that speaks the OpenAI chat completions protocol. Just select OpenAI as the provider and enter the service's URL.
Embedding Provider
Hefty uses a separate embedding model for its knowledge system (vector search). By default, embeddings use the same provider as your text model. If your text provider doesn't support embeddings (e.g. Anthropic), you can configure a separate embedding backend - for example, a local Ollama instance running nomic-embed-text.
Cognition Config Dialog
The Cognition Config dialog has two tabs:
- Provider & Models - select your AI provider, enter the server URL, choose a text model and embedding model, test the connection, and enter an API key if needed
- Advanced - tune reasoning parameters: max reflection loops, model keep-alive time, reasoning timeout, context token limit, and stream max tokens
Changes take effect immediately - Hefty hot-reloads the cognition subsystem when you save, with no restart needed. The config is persisted to cognition-config.json in your data directory.
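For reference, the persisted file is plain JSON. The exact schema isn't documented here, so the field names below are illustrative assumptions based on the options listed above, not the actual format:

```json
{
  "provider": "ollama",
  "serverUrl": "http://localhost:11434",
  "textModel": "gemma3:12b",
  "embeddingModel": "nomic-embed-text",
  "maxReflectionLoops": 10,
  "modelKeepAliveMinutes": 15
}
```

If you edit the file by hand while Hefty is running, re-saving from the dialog is still the reliable way to trigger a hot reload.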
Customizing Hefty
In the App
Within the Hefty UI, you can configure:
- Display name - how Hefty addresses you
- Custom instructions - persistent instructions Hefty follows in every conversation (e.g., "Always explain your reasoning", "Prefer Python over JavaScript")
These settings persist across sessions and conversations.
Installation
The desktop app is a single executable that bundles everything:
- Linux - AppImage (download and run, no installation needed)
- macOS - DMG installer
- Windows - NSIS installer
Knowledge & Learning
Hefty's learning system is enabled by default. You can control it with:
- HEFTY_LEARNING_ENABLED - set to false to disable automatic learning
- HEFTY_MAINTENANCE_ENABLED - set to false to disable automatic knowledge maintenance (pruning, merging)
Data Location
All Hefty data (conversations, knowledge, settings) is stored in ~/.hefty by default. Change it with HEFTY_DATA_DIR. You can back up, move, or delete this folder at any time.
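For example, relocating the data directory and backing it up might look like this (the paths are illustrative, not required):

```shell
# Point Hefty at a custom data directory before launching it (illustrative path)
export HEFTY_DATA_DIR="$HOME/hefty-data"
mkdir -p "$HEFTY_DATA_DIR"

# Back up the data directory to a tarball you can restore anywhere
tar czf hefty-backup.tar.gz -C "$(dirname "$HEFTY_DATA_DIR")" "$(basename "$HEFTY_DATA_DIR")"
echo "Backed up $HEFTY_DATA_DIR"
```

Restoring is the reverse: extract the tarball and point HEFTY_DATA_DIR at the result.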
Advanced
Most configuration is done via the in-app Cognition Config dialog. The environment variables below set launch-time defaults and can override the persisted config when needed.
Environment Variables Reference
Server
- HEFTY_PORT - HTTP port for the unified server (GUI + API). Default: 41890
- HEFTY_HOST - bind address. Default: 0.0.0.0
- HEFTY_DATA_DIR - data storage location. Default: ~/.hefty
AI Models
- HEFTY_LLM_BACKEND - wire protocol: ollama, openai, anthropic, or llamacpp. Default: ollama
- HEFTY_LLM_URL - LLM API URL. Default: http://localhost:11434
- HEFTY_TEXT_MODEL - primary reasoning model. Default: gemma3:12b
- HEFTY_EMBEDDING_MODEL - model for vector embeddings. Default: nomic-embed-text
Knowledge
- HEFTY_LEARNING_ENABLED - enable/disable learning. Default: true
- HEFTY_MAINTENANCE_ENABLED - enable/disable auto-maintenance. Default: true
- HEFTY_VEC_EXTENSION_PATH - path to the sqlite-vec extension. Auto-detected in the desktop app.
Reflection
- HEFTY_MAX_REFLECTION_LOOPS - max reasoning loops per task. Default: 10
- HEFTY_MODEL_KEEP_ALIVE_MINUTES - how long to keep the model loaded. Default: 15
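Putting a few of these together, a locked-down local setup might export overrides like this before launching the app (values are the documented defaults except where noted):

```shell
# Bind to loopback only and disable background learning; everything
# else keeps its documented default.
export HEFTY_HOST=127.0.0.1        # default is 0.0.0.0 (all interfaces)
export HEFTY_PORT=41890            # the default unified-server port
export HEFTY_LEARNING_ENABLED=false

# Confirm what Hefty will see at startup
env | grep '^HEFTY_' | sort
```

Launch Hefty from the same shell so it inherits these variables.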
Provider Setup Details
Ollama (Default)
Ollama runs on port 11434 by default, which matches Hefty's default configuration. No extra setup needed beyond pulling the models.
LM Studio
LM Studio exposes an OpenAI-compatible API on port 1234. Select "LM Studio" in the provider dialog - Hefty auto-fills the correct URL and lists your loaded models.
OpenAI
Enter your OpenAI API key in the config dialog. Hefty will list available models and auto-select gpt-4o as the default text model and text-embedding-3-small for embeddings.
Anthropic
Enter your Anthropic API key. Note that Anthropic does not support embeddings or model listing, so you'll need to type the model name (e.g. claude-sonnet-4-20250514) and configure a separate embedding backend.
Other OpenAI-Compatible APIs
Select the OpenAI provider and change the URL to your service's endpoint. Works with OpenRouter, Together AI, Groq, or self-hosted vLLM / text-generation-inference.
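As a sketch, the base URLs for a few popular services look like this - verify them against each service's current docs before use, and note the vLLM entry is just its common default port:

```shell
# Common OpenAI-compatible base URLs (subject to change - check each service's docs)
OPENROUTER_URL="https://openrouter.ai/api/v1"
TOGETHER_URL="https://api.together.xyz/v1"
GROQ_URL="https://api.groq.com/openai/v1"
VLLM_URL="http://localhost:8000/v1"   # self-hosted vLLM's usual default

echo "In the config dialog: provider = OpenAI, URL = $OPENROUTER_URL"
```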
Protocol & Integration
Hefty runs a unified server that serves both the web UI and the API on a single port (default 41890).
The API uses JSON-RPC over WebSocket, implementing the Model Context Protocol (MCP). Any MCP-compatible client can connect - not just the built-in web UI.
WebSocket endpoint: ws://localhost:41890/mcp/ws
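For a sense of the wire format, this prints the kind of JSON-RPC envelope an MCP client sends right after connecting. The initialize method is part of the standard MCP handshake, but the exact params depend on the protocol revision, so treat this as a shape rather than a spec:

```shell
# Shape of the first message a client sends to ws://localhost:41890/mcp/ws
MSG='{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "example-client", "version": "0.1"}}}'
echo "$MSG"
```

Any WebSocket client library that can send and receive JSON text frames is enough to talk to this endpoint.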