Setup & Settings
Hefty works out of the box with sensible defaults. Install the desktop app, pick your AI provider in the setup dialog, and you're ready to go.
Quick Start
- Install Hefty - download the desktop app for your platform (allow it to run if your OS shows a warning)
- Launch Hefty - on first launch, the Cognition Config dialog appears automatically
- Pick a provider - choose one of the four supported AI providers (see below) and select a model
- Test & save - test the connection, save, and start chatting
The recommended default is Ollama: install it, pull the models (ollama pull gemma3:12b and ollama pull nomic-embed-text), and select Ollama in the setup dialog. Hefty connects automatically.
First Launch - Allow Hefty to Run
Hefty's binaries are not signed with vendor certificates. This is a deliberate choice - we refuse to pay platform gatekeepers for the privilege of distributing software. The app is safe, but your operating system may show a warning the first time you run it. Here's how to get past it on each platform.
Linux (AppImage)
Make the downloaded file executable and run it. That's all there is to it.
```shell
chmod +x Hefty-*.AppImage
./Hefty-*.AppImage
```
Or right-click the file in your file manager, go to Properties → Permissions, check "Allow executing as program", then double-click to launch.
macOS (DMG)
- Open the .dmg file and drag Hefty into your Applications folder
- Double-click to launch - macOS will block it with "Hefty can't be opened because Apple cannot check it for malicious software"
- Open System Settings → Privacy & Security
- Scroll down - you'll see "Hefty was blocked from use because it is not from an identified developer"
- Click "Open Anyway" and confirm
If "Open Anyway" doesn't appear or still fails, open Terminal and run:

```shell
xattr -cr /Applications/Hefty.app
```
This strips the quarantine flag. Then launch normally.
Windows (Installer)
- Run the Hefty Setup installer
- Windows will show a SmartScreen warning: "Windows protected your PC"
- Click "More info" (the text link, not a button)
- Click "Run anyway"
- Proceed with installation as normal
This only happens once - after the first launch, your OS will remember that you allowed it.
AI Providers
Hefty supports four AI providers out of the box. You choose your provider in the Cognition Config dialog - accessible on first launch or any time via the admin menu.
Ollama
Free, local inference. Supports model listing, embeddings, and multimodal models. Recommended for privacy-first setups.
LM Studio
Local inference with a GUI model manager. Uses the OpenAI-compatible API. Models are loaded dynamically from your LM Studio server.
OpenAI
GPT-4o, GPT-4.1, o3-mini, and more. Supports model listing and embeddings. Requires an API key.
Anthropic
Claude Sonnet 4, Claude 3.5 Haiku, and more. Requires an API key. Does not support embeddings - use a separate embedding provider.
Any OpenAI-compatible API also works - OpenRouter, Together AI, Groq, vLLM, or any other service that speaks the OpenAI chat completions protocol. Just select OpenAI as the provider and enter the service's URL.
Embedding Provider
Hefty uses a separate embedding model for its knowledge system (vector search). By default, embeddings use the same provider as your text model. If your text provider doesn't support embeddings (e.g. Anthropic), you can configure a separate embedding backend - for example, a local Ollama instance running nomic-embed-text.
Cognition Config Dialog
The Cognition Config dialog has two tabs:
- Provider & Models - select your AI provider, enter the server URL, choose a text model and embedding model, test the connection, and enter an API key if needed
- Advanced - tune reasoning parameters: max reflection loops, model keep-alive time, reasoning timeout, context token limit, and stream max tokens
Changes take effect immediately - Hefty hot-reloads the cognition subsystem when you save, with no restart needed. The config is persisted to cognition-config.json in your data directory.
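For reference, the persisted file is plain JSON. The exact schema isn't documented here, so the field names below are illustrative assumptions based on the options listed above, not the actual format:

```json
{
  "provider": "ollama",
  "serverUrl": "http://localhost:11434",
  "textModel": "gemma3:12b",
  "embeddingModel": "nomic-embed-text",
  "maxReflectionLoops": 10,
  "modelKeepAliveMinutes": 15
}
```

If you edit the file by hand while Hefty is running, re-saving from the dialog is still the reliable way to trigger a hot reload.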
Customizing Hefty
In the App
Within the Hefty UI, you can configure:
- Display name - how Hefty addresses you
- Custom instructions - persistent instructions Hefty follows in every conversation (e.g., "Always explain your reasoning", "Prefer Python over JavaScript")
These settings persist across sessions and conversations.
Installation
The desktop app is a single executable that bundles everything:
- Linux - AppImage (download and run, no installation needed)
- macOS - DMG installer
- Windows - NSIS installer
Knowledge & Learning
Hefty's learning system is enabled by default. You can control it with:
- HEFTY_LEARNING_ENABLED - set to false to disable automatic learning
- HEFTY_MAINTENANCE_ENABLED - set to false to disable automatic knowledge maintenance (pruning, merging)
Data Location
All Hefty data (conversations, knowledge, settings) is stored in ~/.hefty by default. Change it with HEFTY_DATA_DIR. You can back up, move, or delete this folder at any time.
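For example, relocating the data directory and backing it up might look like this (the paths are illustrative, not required):

```shell
# Point Hefty at a custom data directory before launching it (illustrative path)
export HEFTY_DATA_DIR="$HOME/hefty-data"
mkdir -p "$HEFTY_DATA_DIR"

# Back up the data directory to a tarball you can restore anywhere
tar czf hefty-backup.tar.gz -C "$(dirname "$HEFTY_DATA_DIR")" "$(basename "$HEFTY_DATA_DIR")"
echo "Backed up $HEFTY_DATA_DIR"
```

Restoring is the reverse: extract the tarball and point HEFTY_DATA_DIR at the result.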
Advanced
Most configuration is done via the in-app Cognition Config dialog. The environment variables below set launch-time defaults and can override the persisted config when needed.
Environment Variables Reference
Server
- HEFTY_PORT - HTTP port for the unified server (GUI + API). Default: 41890
- HEFTY_HOST - bind address. Default: 0.0.0.0
- HEFTY_DATA_DIR - data storage location. Default: ~/.hefty
AI Models
- HEFTY_LLM_BACKEND - wire protocol: ollama, openai, anthropic, or llamacpp. Default: ollama
- HEFTY_LLM_URL - LLM API URL. Default: http://localhost:11434
- HEFTY_TEXT_MODEL - primary reasoning model. Default: gemma3:12b
- HEFTY_EMBEDDING_MODEL - model for vector embeddings. Default: nomic-embed-text
Knowledge
- HEFTY_LEARNING_ENABLED - enable/disable learning. Default: true
- HEFTY_MAINTENANCE_ENABLED - enable/disable auto-maintenance. Default: true
- HEFTY_VEC_EXTENSION_PATH - path to the sqlite-vec extension. Auto-detected in the desktop app.
Reflection
- HEFTY_MAX_REFLECTION_LOOPS - max reasoning loops per task. Default: 10
- HEFTY_MODEL_KEEP_ALIVE_MINUTES - how long to keep the model loaded. Default: 15
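Putting a few of these together, a locked-down local setup might export overrides like this before launching the app (values are the documented defaults except where noted):

```shell
# Bind to loopback only and disable background learning; everything
# else keeps its documented default.
export HEFTY_HOST=127.0.0.1        # default is 0.0.0.0 (all interfaces)
export HEFTY_PORT=41890            # the default unified-server port
export HEFTY_LEARNING_ENABLED=false

# Confirm what Hefty will see at startup
env | grep '^HEFTY_' | sort
```

Launch Hefty from the same shell so it inherits these variables.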
Provider Setup Details
Ollama (Default)
Ollama runs on port 11434 by default, which matches Hefty's default configuration. No extra setup needed beyond pulling the models.
LM Studio
LM Studio exposes an OpenAI-compatible API on port 1234. Select "LM Studio" in the provider dialog - Hefty auto-fills the correct URL and lists your loaded models.
OpenAI
Enter your OpenAI API key in the config dialog. Hefty will list available models and auto-select gpt-4o as the default text model and text-embedding-3-small for embeddings.
Anthropic
Enter your Anthropic API key. Note that Anthropic does not support embeddings or model listing, so you'll need to type the model name (e.g. claude-sonnet-4-20250514) and configure a separate embedding backend.
Other OpenAI-Compatible APIs
Select the OpenAI provider and change the URL to your service's endpoint. Works with OpenRouter, Together AI, Groq, or self-hosted vLLM / text-generation-inference.
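As a sketch, the base URLs for a few popular services look like this - verify them against each service's current docs before use, and note the vLLM entry is just its common default port:

```shell
# Common OpenAI-compatible base URLs (subject to change - check each service's docs)
OPENROUTER_URL="https://openrouter.ai/api/v1"
TOGETHER_URL="https://api.together.xyz/v1"
GROQ_URL="https://api.groq.com/openai/v1"
VLLM_URL="http://localhost:8000/v1"   # self-hosted vLLM's usual default

echo "In the config dialog: provider = OpenAI, URL = $OPENROUTER_URL"
```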
Protocol & Integration
Hefty runs a unified server that serves both the web UI and the API on a single port (default 41890).
The API uses JSON-RPC over WebSocket, implementing the Model Context Protocol (MCP). Any MCP-compatible client can connect - not just the built-in web UI.
WebSocket endpoint: ws://localhost:41890/mcp/ws
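For a sense of the wire format, this prints the kind of JSON-RPC envelope an MCP client sends right after connecting. The initialize method is part of the standard MCP handshake, but the exact params depend on the protocol revision, so treat this as a shape rather than a spec:

```shell
# Shape of the first message a client sends to ws://localhost:41890/mcp/ws
MSG='{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "example-client", "version": "0.1"}}}'
echo "$MSG"
```

Any WebSocket client library that can send and receive JSON text frames is enough to talk to this endpoint.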