Control every LLM call in-process — no proxy, no latency, no surprises
Enforce cost, safety, and reliability at runtime — entirely inside your app. No proxy. No sidecar. No infrastructure.
<span class="text-purple-400">import</span><span class="text-white/80"> { Loret } </span><span class="text-purple-400">from</span><span class="text-green-400"> "@loret/sdk"</span><span class="text-white/80">;</span>
<span class="text-purple-400">const</span><span class="text-blue-300"> client</span><span class="text-white/80"> = </span><span class="text-purple-400">new</span><span class="text-yellow-300"> Loret</span><span class="text-white/80">({</span>
<span class="text-white/80"> </span><span class="text-blue-300">projectId</span><span class="text-white/80">: </span><span class="text-green-400">"my-app"</span><span class="text-white/80">,</span>
<span class="text-white/80"> </span><span class="text-blue-300">providers</span><span class="text-white/80">: [{ provider: </span><span class="text-green-400">"openai"</span><span class="text-white/80">, model: </span><span class="text-green-400">"gpt-4o"</span><span class="text-white/80">, priority: 1 },</span>
<span class="text-white/80"> { provider: </span><span class="text-green-400">"anthropic"</span><span class="text-white/80">, model: </span><span class="text-green-400">"claude-sonnet-4-6"</span><span class="text-white/80">, priority: 2 }],</span>
<span class="text-white/80"> </span><span class="text-blue-300">mode</span><span class="text-white/80">: </span><span class="text-green-400">"enforce"</span><span class="text-white/80">,</span>
<span class="text-white/80"> </span><span class="text-blue-300">budgetLimits</span><span class="text-white/80">: [{ scope: </span><span class="text-green-400">"per_call"</span><span class="text-white/80">, maxCostUsd: 0.05 }],</span>
<span class="text-white/80">});</span>
<span class="text-purple-400">const</span><span class="text-blue-300"> result</span><span class="text-white/80"> = </span><span class="text-purple-400">await</span><span class="text-blue-300"> client</span><span class="text-white/80">.</span><span class="text-yellow-300">run</span><span class="text-white/80">({</span>
<span class="text-white/80"> </span><span class="text-blue-300">messages</span><span class="text-white/80">: [{ role: </span><span class="text-green-400">"user"</span><span class="text-white/80">, content: </span><span class="text-green-400">"Hello"</span><span class="text-white/80"> }],</span>
<span class="text-white/80">});</span>

The problem
You're shipping AI features without real control
LLM APIs are expensive, unreliable, and opaque. Most teams discover problems only after a cost spike or a production incident, not before.
Features
Runtime control for every LLM call
All enforcement happens before the request is sent — inside your application. Unlike proxy-based solutions, there is no extra hop, no infrastructure, no latency tradeoff.
Budget enforcement
Block expensive requests before they happen. Enforce token and cost limits per call, per trace, or over time — violations throw typed errors before the request is ever sent.
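The idea can be sketched in a few lines. This is an illustrative example, not the Loret API: a per-call guard that estimates spend from a rough token count and throws a typed error before any network dispatch (the ~4-characters-per-token heuristic and per-million-token pricing are assumptions).

```typescript
// Sketch of pre-dispatch budget enforcement (not the actual Loret internals).
class BudgetExceededError extends Error {
  constructor(public estimatedUsd: number, public maxUsd: number) {
    super(`Estimated cost $${estimatedUsd.toFixed(4)} exceeds per-call limit $${maxUsd}`);
    this.name = "BudgetExceededError";
  }
}

// Rough estimate: ~4 characters per token for English text.
function estimateCostUsd(prompt: string, pricePerMTokUsd: number): number {
  const estimatedTokens = Math.ceil(prompt.length / 4);
  return (estimatedTokens / 1_000_000) * pricePerMTokUsd;
}

// Runs before the request is sent; a violation never reaches the provider.
function enforcePerCallBudget(prompt: string, maxCostUsd: number, pricePerMTokUsd: number): void {
  const estimated = estimateCostUsd(prompt, pricePerMTokUsd);
  if (estimated > maxCostUsd) throw new BudgetExceededError(estimated, maxCostUsd);
}
```

Because the check is a pure in-process computation, a blocked request costs nothing: no tokens are consumed and no call is billed.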
Provider routing and fallback
Retry failures and fall back across providers automatically — no orchestration layer required. Circuit breaking handles sustained outages without manual intervention.
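In-process fallback with circuit breaking can be sketched as follows. This is an illustrative example, not the Loret implementation; the `Provider` shape, the failure counter, and the trip threshold of 3 are all assumptions for the sketch.

```typescript
// Sketch of priority-ordered fallback with a simple circuit breaker
// (not the actual Loret internals).
type Provider = { name: string; call: (prompt: string) => Promise<string> };

const failures = new Map<string, number>();
const TRIP_THRESHOLD = 3; // skip a provider after 3 consecutive failures

async function runWithFallback(providers: Provider[], prompt: string): Promise<string> {
  let lastError: unknown;
  for (const p of providers) {
    if ((failures.get(p.name) ?? 0) >= TRIP_THRESHOLD) continue; // circuit open: skip
    try {
      const result = await p.call(prompt);
      failures.set(p.name, 0); // a success closes the circuit again
      return result;
    } catch (err) {
      failures.set(p.name, (failures.get(p.name) ?? 0) + 1);
      lastError = err; // try the next provider in priority order
    }
  }
  throw lastError ?? new Error("no provider available");
}
```

The priority list in the configuration snippet above maps directly onto this loop: the first provider is tried first, and the second is only reached on failure.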
PII protection
Detect and optionally redact or block sensitive data before it leaves your system. Emails, phone numbers, SSNs, credit cards, secrets, and IPs — caught in-process.
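In-process detection of this kind can be sketched with pattern matching. This is an illustrative example, not the Loret detector; the specific regular expressions cover only a subset of the categories above and are assumptions for the sketch.

```typescript
// Sketch of in-process PII detection and redaction (not the actual Loret
// internals). Runs before any text leaves the application.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  ipv4: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
};

function redactPII(text: string): { redacted: string; found: string[] } {
  const found: string[] = [];
  let redacted = text;
  for (const [kind, pattern] of Object.entries(PII_PATTERNS)) {
    if (pattern.test(redacted)) {
      found.push(kind);
      pattern.lastIndex = 0; // .test() on a /g/ regex advances lastIndex; reset it
      redacted = redacted.replace(pattern, `[REDACTED:${kind}]`);
    }
  }
  return { redacted, found };
}
```

A "block" policy would throw when `found` is non-empty instead of substituting placeholders; either way, the sensitive value never reaches the provider.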
Trace and workflow guards
Limit calls, cost, and execution time across multi-step agent runs. Stop waste before it accumulates — guards fire before provider dispatch, not after.
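A trace-level guard of this shape can be sketched as a small stateful class. This is an illustrative example, not the Loret API; the class name, limits, and error messages are assumptions.

```typescript
// Sketch of a trace-level guard capping calls, spend, and wall-clock time
// across a multi-step agent run (not the actual Loret internals).
class TraceGuard {
  private calls = 0;
  private costUsd = 0;
  private readonly startedAt = Date.now();

  constructor(
    private readonly maxCalls: number,
    private readonly maxCostUsd: number,
    private readonly maxMs: number,
  ) {}

  // Called before each provider dispatch; throws if any limit would be exceeded.
  check(nextCallCostUsd: number): void {
    if (this.calls + 1 > this.maxCalls) throw new Error("trace call limit exceeded");
    if (this.costUsd + nextCallCostUsd > this.maxCostUsd) throw new Error("trace cost limit exceeded");
    if (Date.now() - this.startedAt > this.maxMs) throw new Error("trace time limit exceeded");
    this.calls += 1;
    this.costUsd += nextCallCostUsd;
  }
}
```

Because the guard fires before dispatch, a runaway agent loop is stopped at the limit rather than one expensive call past it.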
Full observability
Structured events for every request: start, completion, failure, retry, fallback, and guardrail triggers. Buffered and flushed asynchronously — no latency impact.
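The buffering pattern can be sketched briefly. This is an illustrative example, not the Loret event pipeline; the `LoretEvent` shape, batch size, and sink callback are assumptions.

```typescript
// Sketch of buffered, asynchronously flushed telemetry (not the actual
// Loret internals). Emission on the hot path is a synchronous array push.
type LoretEvent = { type: string; at: number; [k: string]: unknown };

class EventBuffer {
  private buffer: LoretEvent[] = [];

  constructor(
    private readonly flushSize: number,
    private readonly sink: (events: LoretEvent[]) => Promise<void>,
  ) {}

  emit(event: LoretEvent): void {
    this.buffer.push(event); // O(1); never blocks the request
    if (this.buffer.length >= this.flushSize) {
      const batch = this.buffer;
      this.buffer = [];
      void this.sink(batch); // fire-and-forget; the sink handles its own errors
    }
  }
}
```

The request path only ever pays for the push; network I/O to wherever events are shipped happens off the critical path.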
Zero added latency
Runs entirely in-process. No network hop, no proxy, no added infrastructure. Policy is read from a local snapshot — enforcement overhead is under 1ms per request.
How it works
From integration to production in three steps
Install and configure
npm install @loret/sdk. Define your providers, budgets, and guardrails locally. No external service required — enforcement starts immediately in your process.
Replace your provider calls
Wrap your OpenAI or Anthropic calls with client.run(). Every request is now enforced and observed — before it leaves your application.
Connect the control plane (coming soon)
When ready, add centralized policy management, cost attribution by feature, and team-level governance across every service instance.
Early access
Be first when we launch
The hosted control plane and team dashboard are in development. Join the waitlist and get early access plus a 3-month discount on any paid plan.
No spam. Unsubscribe any time.