Setup & Settings

Hefty works out of the box with sensible defaults. Install the desktop app, pick your AI provider in the setup dialog, and you're ready to go.

Quick Start

  1. Install Hefty - download the desktop app for your platform (allow it to run if your OS shows a warning)
  2. Launch Hefty - on first launch, the Cognition Config dialog appears automatically
  3. Pick a provider - choose one of the four supported AI providers (see below) and select a model
  4. Test & save - test the connection, save, and start chatting

Fastest Local Start

Install Ollama, pull the default models, and select Ollama in the setup dialog. Hefty connects automatically.

ollama pull gemma3:12b
ollama pull nomic-embed-text

First Launch - Allow Hefty to Run

Hefty's binaries are not signed with vendor certificates. This is a deliberate choice - we refuse to pay platform gatekeepers for the privilege of distributing software. The app is safe, but your operating system may show a warning the first time you run it. Here's how to get past it on each platform.

Linux (AppImage)

Make the downloaded file executable and run it. That's all there is to it.

chmod +x Hefty-*.AppImage
./Hefty-*.AppImage

Or right-click the file in your file manager, go to Properties → Permissions, and check "Allow executing as program", then double-click to launch.

macOS (DMG)

  1. Open the .dmg file and drag Hefty into your Applications folder
  2. Double-click to launch - macOS will block it with "Hefty can't be opened because Apple cannot check it for malicious software"
  3. Open System Settings → Privacy & Security
  4. Scroll down - you'll see "Hefty was blocked from use because it is not from an identified developer"
  5. Click "Open Anyway" and confirm

macOS Sequoia (15+)

If "Open Anyway" doesn't appear or still fails, open Terminal and run:

xattr -cr /Applications/Hefty.app

This strips the quarantine flag. Then launch normally.

Windows (Installer)

  1. Run the Hefty Setup installer - Windows will show a SmartScreen warning: "Windows protected your PC"
  2. Click "More info" (the text link, not a button)
  3. Click "Run anyway"
  4. Proceed with installation as normal

This only happens once - after the first launch, your OS will remember that you allowed it.

AI Providers

Hefty supports four AI providers out of the box. You choose your provider in the Cognition Config dialog - accessible on first launch or any time via the admin menu.

Ollama Local

Free, local inference. Supports model listing, embeddings, and multimodal models. Recommended for privacy-first setups.

Default URL: localhost:11434

LM Studio Local

Local inference with a GUI model manager. Uses the OpenAI-compatible API. Models are loaded dynamically from your LM Studio server.

Default URL: localhost:1234/v1

OpenAI Cloud

GPT-4o, GPT-4.1, o3-mini, and more. Supports model listing and embeddings. Requires an API key.

URL: api.openai.com/v1

Anthropic Cloud

Claude Sonnet 4, Claude 3.5 Haiku, and more. Requires an API key. Does not support embeddings - use a separate embedding provider.

URL: api.anthropic.com

Any OpenAI-compatible API also works - OpenRouter, Together AI, Groq, vLLM, or any other service that speaks the OpenAI chat completions protocol. Just select OpenAI as the provider and enter the service's URL.
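
The same thing can be done at startup with the environment variables documented under Advanced below. A sketch using OpenRouter - the URL and model ID are illustrative, and you still need that service's API key:

```shell
# Speak the OpenAI wire protocol to a third-party endpoint (example values).
export HEFTY_LLM_BACKEND=openai
export HEFTY_LLM_URL=https://openrouter.ai/api/v1
export HEFTY_TEXT_MODEL=meta-llama/llama-3.3-70b-instruct
```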

Embedding Provider

Hefty uses a separate embedding model for its knowledge system (vector search). By default, embeddings use the same provider as your text model. If your text provider doesn't support embeddings (e.g. Anthropic), you can configure a separate embedding backend - for example, a local Ollama instance running nomic-embed-text.

Cognition Config Dialog

The Cognition Config dialog has two tabs:

  • Provider & Models - select your AI provider, enter the server URL, choose a text model and embedding model, test the connection, and enter an API key if needed
  • Advanced - tune reasoning parameters: max reflection loops, model keep-alive time, reasoning timeout, context token limit, and stream max tokens

Changes take effect immediately - Hefty hot-reloads the cognition subsystem when you save, with no restart needed. The config is persisted to cognition-config.json in your data directory.
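
Conceptually, the persisted file mirrors the dialog's fields. A hypothetical illustration only - the key names here are guesses, not the actual schema:

```json
{
  "provider": "ollama",
  "url": "http://localhost:11434",
  "textModel": "gemma3:12b",
  "embeddingModel": "nomic-embed-text",
  "maxReflectionLoops": 10
}
```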

Customizing Hefty

In the App

Within the Hefty UI, you can configure:

  • Display name - how Hefty addresses you
  • Custom instructions - persistent instructions Hefty follows in every conversation (e.g., "Always explain your reasoning", "Prefer Python over JavaScript")

These settings persist across sessions and conversations.

Installation

The desktop app is a single executable that bundles everything:

  • Linux - AppImage (download and run, no installation needed)
  • macOS - DMG installer
  • Windows - NSIS installer

Knowledge & Learning

Hefty's learning system is enabled by default. You can control it with:

  • HEFTY_LEARNING_ENABLED - set to false to disable automatic learning
  • HEFTY_MAINTENANCE_ENABLED - set to false to disable automatic knowledge maintenance (pruning, merging)

Data Location

All Hefty data (conversations, knowledge, settings) is stored in ~/.hefty by default. Change it with HEFTY_DATA_DIR. You can back up, move, or delete this folder at any time.
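
For example, a backup can be as simple as archiving the data directory - a sketch, with the archive name chosen arbitrarily:

```shell
# Back up the Hefty data directory (default ~/.hefty; HEFTY_DATA_DIR overrides it).
DATA_DIR="${HEFTY_DATA_DIR:-$HOME/.hefty}"
mkdir -p "$DATA_DIR"   # no-op if Hefty has already created it
tar -czf hefty-backup.tar.gz -C "$(dirname "$DATA_DIR")" "$(basename "$DATA_DIR")"
ls -lh hefty-backup.tar.gz
```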


Advanced

Most configuration is done via the in-app Cognition Config dialog. Environment variables below serve as compile-time defaults and can override persisted config when needed.

Environment Variables Reference

Server

  • HEFTY_PORT - HTTP port for the unified server (GUI + API). Default: 41890
  • HEFTY_HOST - bind address. Default: 0.0.0.0
  • HEFTY_DATA_DIR - data storage location. Default: ~/.hefty

AI Models

  • HEFTY_LLM_BACKEND - wire protocol: ollama, openai, anthropic, or llamacpp. Default: ollama
  • HEFTY_LLM_URL - LLM API URL. Default: http://localhost:11434
  • HEFTY_TEXT_MODEL - primary reasoning model. Default: gemma3:12b
  • HEFTY_EMBEDDING_MODEL - model for vector embeddings. Default: nomic-embed-text

Knowledge

  • HEFTY_LEARNING_ENABLED - enable/disable learning. Default: true
  • HEFTY_MAINTENANCE_ENABLED - enable/disable auto-maintenance. Default: true
  • HEFTY_VEC_EXTENSION_PATH - path to sqlite-vec extension. Auto-detected in desktop app.

Reflection

  • HEFTY_MAX_REFLECTION_LOOPS - max reasoning loops per task. Default: 10
  • HEFTY_MODEL_KEEP_ALIVE_MINUTES - how long to keep the model loaded. Default: 15
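
Putting a few of these together - a sketch of overriding the port and pointing at a remote Ollama host (the hostname is hypothetical):

```shell
# Override defaults before launching Hefty.
export HEFTY_PORT=8080
export HEFTY_LLM_URL=http://gpu-box.local:11434
# Then launch as usual, e.g.:
#   ./Hefty-*.AppImage
```
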

Provider Setup Details

Ollama (Default)

Ollama runs on port 11434 by default, which matches Hefty's default configuration. No extra setup needed beyond pulling the models.

LM Studio

LM Studio exposes an OpenAI-compatible API on port 1234. Select "LM Studio" in the provider dialog - Hefty auto-fills the correct URL and lists your loaded models.

OpenAI

Enter your OpenAI API key in the config dialog. Hefty will list available models and auto-select gpt-4o as the default text model and text-embedding-3-small for embeddings.

Anthropic

Enter your Anthropic API key. Note that Anthropic does not support embeddings or model listing, so you'll need to type the model name (e.g. claude-sonnet-4-20250514) and configure a separate embedding backend.

Other OpenAI-Compatible APIs

Select the OpenAI provider and change the URL to your service's endpoint. Works with OpenRouter, Together AI, Groq, or self-hosted vLLM / text-generation-inference.

Protocol & Integration

Hefty runs a unified server that serves both the web UI and the API on a single port (default 41890).

The API uses JSON-RPC over WebSocket, implementing the Model Context Protocol (MCP). Any MCP-compatible client can connect - not just the built-in web UI.

WebSocket endpoint: ws://localhost:41890/mcp/ws
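
As a sketch, a minimal MCP initialize request (JSON-RPC 2.0) can be written to a file and sent with any WebSocket client. The clientInfo values are placeholders, and the protocol version shown is one published MCP revision - check what the server accepts:

```shell
# Write a minimal JSON-RPC 2.0 "initialize" request for the MCP endpoint.
cat > mcp-init.json <<'EOF'
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"example-client","version":"0.1"}}}
EOF

# Send it with any WebSocket client while Hefty is running, e.g.:
#   websocat ws://localhost:41890/mcp/ws < mcp-init.json
cat mcp-init.json
```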