For developers & vibe coders
GenTrellis exposes an OpenAI-compatible API with protection levels, local RAG, and multi-agent workflows built in. Vibe-code against local endpoints that keep sensitive data private — then ship production apps your compliance team can actually approve.
OpenAI-compatible API
Use the OpenAI SDK your team already knows. Change the base URL to your local GenTrellis box and add the X-Protection-Level header to enforce governance per request.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-gentrellis/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="gentrellis-default",
    messages=[{
        "role": "user",
        "content": "Summarize our FERPA compliance policy"
    }],
    extra_headers={
        "X-Protection-Level": "protected"
    }
)

GenTrellis vs. Claude Code
Claude Code is powerful for individual developers. GenTrellis gives your whole team the same vibe-coding power — locally, with your data, and with compliance built into every request.
| Feature | Claude Code | GenTrellis |
|---|---|---|
| Where code runs | Cloud — your prompts go to Anthropic | Local — prompts stay on your hardware |
| What you build against | General-purpose API | OpenAI-compatible API with protection levels, RAG, and guardrails built in |
| Data grounding | No access to your internal documents | Local RAG over your policies, contracts, client files, and handbooks |
| Protection & compliance | None — output goes directly to the user | 4-level protection: PII detection, content policy, approval gates, audit trails |
| Multi-agent workflows | Single agent per session | smol-agent harness with chained agents, local tool use, and protection-aware routing |
| Team enablement | Individual tool — each person figures it out | Hands-on workshops that turn one champion into a team capability |
What's pre-loaded
Every GenTrellis box ships with the full stack installed and configured. No Kubernetes. No Docker Compose files. No “just deploy this Helm chart.”
High-throughput local inference with continuous batching. Runs 70B–1T class models on your hardware with 25–80+ tok/s per user.
Drop-in replacement for the OpenAI API. Use the same SDK your team already knows — just change the base URL and add protection headers.
Multi-agent framework for building workflows that chain local AI calls. Agents can search your documents, draft outputs, and route through protection levels automatically.
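To make the chaining idea concrete, here is a minimal sketch of a two-step workflow in plain Python. The function names (search_agent, draft_agent, call_model) are illustrative, not the GenTrellis smol-agent API, and the model call is stubbed; a real version would issue chat completions against the local endpoint via the OpenAI SDK.

```python
# Illustrative sketch only: search_agent, draft_agent, and call_model
# are hypothetical names, not the GenTrellis smol-agent harness API.

def call_model(prompt: str, protection_level: str = "standard") -> str:
    # Stub for a chat completion against the local GenTrellis endpoint;
    # in practice this would use the OpenAI SDK with X-Protection-Level.
    return f"[{protection_level}] response to: {prompt}"

def search_agent(question: str) -> str:
    # First agent: retrieve context from local documents (stubbed).
    return call_model(f"Find passages relevant to: {question}")

def draft_agent(question: str, context: str) -> str:
    # Second agent: draft an answer grounded in the retrieved context,
    # routed through a stricter protection level.
    return call_model(f"Using {context!r}, answer: {question}",
                      protection_level="protected")

def run_workflow(question: str) -> str:
    context = search_agent(question)
    return draft_agent(question, context)
```

The key design point is the last hop: the drafting step runs at a higher protection level than the retrieval step, so the output that reaches a user always passes the stricter gate.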
Retrieval-augmented generation over your own documents. Upload policies, contracts, handbooks, and project files — the AI answers from your data, not the public internet.
NVIDIA NeMo Guardrails for runtime content policy enforcement. PII detection, prompt leak prevention, content filtering, and custom rules — all running locally.
Sensitivity-based routing that keeps controlled workloads on-prem and sends low-sensitivity requests to AWS Bedrock. Fully auditable, policy-driven, automatic.
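The routing decision above can be sketched as a small policy function. This is an illustrative example, not GenTrellis configuration: the threshold (controlled and above stays on-prem) is an assumption chosen to show the shape of the policy.

```python
# Hypothetical policy sketch: the level names match the four protection
# levels, but the routing threshold here is illustrative, not real config.

SENSITIVITY = {"standard": 0, "protected": 1, "controlled": 2, "locked": 3}

def route_request(protection_level: str) -> str:
    """Return the backend a request should be routed to."""
    level = SENSITIVITY[protection_level.lower()]
    # Controlled and locked workloads never leave the box;
    # lower-sensitivity requests may burst to AWS Bedrock.
    return "on-prem" if level >= SENSITIVITY["controlled"] else "aws-bedrock"
```

Because the decision is a pure function of the declared protection level, every routing choice is deterministic and can be replayed during an audit.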
Four protection levels (Standard, Protected, Controlled, Locked) that control what the AI can see, say, retrieve, and do. Set per-workflow, enforced on every request.
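In client code, selecting a level is just a per-request header. The helper below is an illustrative convenience, not part of any GenTrellis SDK: it validates a level name and builds the extra_headers dict for the OpenAI SDK call shown earlier.

```python
# Illustrative helper (not GenTrellis SDK code): validate a protection
# level and build the extra_headers dict for an OpenAI SDK request.

VALID_LEVELS = ("standard", "protected", "controlled", "locked")

def protection_headers(level: str) -> dict:
    level = level.lower()
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown protection level: {level!r}")
    return {"X-Protection-Level": level}
```

Pass the result as `extra_headers=protection_headers("controlled")` on `client.chat.completions.create(...)` so each workflow pins its own level.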
Every prompt, every response, every routing decision — logged locally with immutable audit trails. Export for compliance reviews. SOC 2 readiness underway.
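One common way to make a log immutable in the tamper-evident sense is hash chaining, where each entry commits to the one before it. The sketch below shows that technique in general terms; GenTrellis's actual log format is not documented here, so treat this as a conceptual illustration.

```python
import hashlib
import json

# Conceptual sketch of a hash-chained (tamper-evident) audit log;
# not the GenTrellis on-disk format.

def append_entry(log: list, event: dict) -> None:
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    log.append({"event": event, "prev": prev_hash, "hash": entry_hash})

def verify(log: list) -> bool:
    prev_hash = "0" * 64
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting any earlier entry breaks every hash after it, so a compliance reviewer can verify the whole chain before trusting an export.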
Workshops & consulting
Every pilot includes hands-on training. Your team leaves the workshop with working production apps built on your own data.
1-Day Workshop
Included with Entry Box
2-Day Workshop + App Sprint
Included with Mid Box
Ongoing Consulting
Available as add-on
Reserve your pilot box and we'll get your team building production apps in weeks, not months.