For developers & vibe coders
GenTrellis exposes an OpenAI-compatible API with protection levels, local RAG, and multi-agent workflows built in. Vibe-code against local endpoints that keep sensitive data private — then ship production apps your compliance team can actually approve.
OpenAI-compatible API
Use the OpenAI SDK your team already knows. Change the base URL to your local GenTrellis box and add the X-Protection-Level header to enforce governance per request.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-gentrellis/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="gentrellis-default",
    messages=[{
        "role": "user",
        "content": "Summarize our FERPA compliance policy"
    }],
    extra_headers={
        "X-Protection-Level": "protected"
    }
)

GenTrellis vs. Claude Code
Claude Code is powerful for individual developers. GenTrellis gives your whole team the same vibe-coding power — locally, with your data, and with compliance built into every request.
| Feature | Claude Code | GenTrellis |
|---|---|---|
| Where code runs | Cloud — your prompts go to Anthropic | Local — prompts stay on your hardware |
| What you build against | General-purpose API | OpenAI-compatible API with protection levels, RAG, and guardrails built in |
| Data grounding | No access to your internal documents | Local RAG over your policies, contracts, client files, and handbooks |
| Protection & compliance | None — output goes directly to the user | 4-level protection: PII detection, content policy, approval gates, audit trails |
| Multi-agent workflows | Single agent per session | smol-agent harness with chained agents, local tool use, and protection-aware routing |
| Team enablement | Individual tool — each person figures it out | Hands-on workshops that turn one champion into a team capability |
What's pre-loaded
Every GenTrellis box ships with the full stack installed and configured. No Kubernetes. No Docker Compose files. No “just deploy this Helm chart.”
High-throughput local inference with continuous batching. Runs 70B–1T class models on your hardware with 25–80+ tok/s per user.
Drop-in replacement for the OpenAI API. Use the same SDK your team already knows — just change the base URL and add protection headers.
Multi-agent framework for building workflows that chain local AI calls. Agents can search your documents, draft outputs, and route through protection levels automatically.
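To make the chaining idea concrete, here is a minimal sketch of a two-step workflow in plain Python. The function names (search_agent, draft_agent, call_model) are illustrative, not the GenTrellis smol-agent API, and the model call is stubbed; a real version would issue chat completions against the local endpoint via the OpenAI SDK.

```python
# Illustrative sketch only: search_agent, draft_agent, and call_model
# are hypothetical names, not the GenTrellis smol-agent harness API.

def call_model(prompt: str, protection_level: str = "standard") -> str:
    # Stub for a chat completion against the local GenTrellis endpoint;
    # in practice this would use the OpenAI SDK with X-Protection-Level.
    return f"[{protection_level}] response to: {prompt}"

def search_agent(question: str) -> str:
    # First agent: retrieve context from local documents (stubbed).
    return call_model(f"Find passages relevant to: {question}")

def draft_agent(question: str, context: str) -> str:
    # Second agent: draft an answer grounded in the retrieved context,
    # routed through a stricter protection level.
    return call_model(f"Using {context!r}, answer: {question}",
                      protection_level="protected")

def run_workflow(question: str) -> str:
    context = search_agent(question)
    return draft_agent(question, context)
```

The key design point is the last hop: the drafting step runs at a higher protection level than the retrieval step, so the output that reaches a user always passes the stricter gate.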
Retrieval-augmented generation over your own documents. Upload policies, contracts, handbooks, and project files — the AI answers from your data, not the public internet.
NVIDIA NeMo Guardrails for runtime content policy enforcement. PII detection, prompt leak prevention, content filtering, and custom rules — all running locally.
Sensitivity-based routing that keeps controlled workloads on-prem and sends low-sensitivity requests to AWS Bedrock. Fully auditable, policy-driven, automatic.
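The routing decision above can be sketched as a small policy function. This is an illustrative example, not GenTrellis configuration: the threshold (controlled and above stays on-prem) is an assumption chosen to show the shape of the policy.

```python
# Hypothetical policy sketch: the level names match the four protection
# levels, but the routing threshold here is illustrative, not real config.

SENSITIVITY = {"standard": 0, "protected": 1, "controlled": 2, "locked": 3}

def route_request(protection_level: str) -> str:
    """Return the backend a request should be routed to."""
    level = SENSITIVITY[protection_level.lower()]
    # Controlled and locked workloads never leave the box;
    # lower-sensitivity requests may burst to AWS Bedrock.
    return "on-prem" if level >= SENSITIVITY["controlled"] else "aws-bedrock"
```

Because the decision is a pure function of the declared protection level, every routing choice is deterministic and can be replayed during an audit.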
Four protection levels (Standard, Protected, Controlled, Locked) that control what the AI can see, say, retrieve, and do. Set per-workflow, enforced on every request.
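In client code, selecting a level is just a per-request header. The helper below is an illustrative convenience, not part of any GenTrellis SDK: it validates a level name and builds the extra_headers dict for the OpenAI SDK call shown earlier.

```python
# Illustrative helper (not GenTrellis SDK code): validate a protection
# level and build the extra_headers dict for an OpenAI SDK request.

VALID_LEVELS = ("standard", "protected", "controlled", "locked")

def protection_headers(level: str) -> dict:
    level = level.lower()
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown protection level: {level!r}")
    return {"X-Protection-Level": level}
```

Pass the result as `extra_headers=protection_headers("controlled")` on `client.chat.completions.create(...)` so each workflow pins its own level.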
Every prompt, every response, every routing decision — logged locally with immutable audit trails. Export for compliance reviews. SOC 2 readiness underway.
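One common way to make a log immutable in the tamper-evident sense is hash chaining, where each entry commits to the one before it. The sketch below shows that technique in general terms; GenTrellis's actual log format is not documented here, so treat this as a conceptual illustration.

```python
import hashlib
import json

# Conceptual sketch of a hash-chained (tamper-evident) audit log;
# not the GenTrellis on-disk format.

def append_entry(log: list, event: dict) -> None:
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    log.append({"event": event, "prev": prev_hash, "hash": entry_hash})

def verify(log: list) -> bool:
    prev_hash = "0" * 64
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting any earlier entry breaks every hash after it, so a compliance reviewer can verify the whole chain before trusting an export.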
Workshops & consulting
Every pilot includes hands-on training. Your team leaves the workshop with working production apps built on your own data.
1-Day Workshop
Included with Entry Box
2-Day Workshop + App Sprint
Included with Mid Box
Ongoing Consulting
Available as add-on
Reserve your pilot box and we'll get your team building production apps in weeks, not months.