Guardrails Package
The @hazeljs/guardrails package provides content safety, PII handling, and output validation for HazelJS AI applications. It integrates with HTTP routes, @hazeljs/ai, and @hazeljs/agent to protect against prompt injection, toxic content, and PII leakage.
Why Guardrails?
AI applications face unique security and compliance challenges. Without guardrails:
- Prompt injection — Malicious users can inject instructions like "ignore previous instructions" or "you are now in developer mode" to bypass system prompts and extract data or change behavior.
- PII leakage — LLMs may echo or infer sensitive data (emails, phone numbers, SSNs) from user input or training data, violating privacy regulations (GDPR, CCPA).
- Toxic output — Models can generate harmful, biased, or inappropriate content when given adversarial inputs.
- Compliance — Enterprise deployments often require documented controls for AI safety (EU AI Act, SOC 2, etc.).
Unlike LangChain or Vercel AI SDK, HazelJS provides built-in guardrails that plug into your existing HTTP, AI, and agent layers—no separate middleware or glue code.
Purpose
The @hazeljs/guardrails package provides:
- PII Detection & Redaction — Email, phone, SSN, credit card (configurable entities)
- Prompt Injection Detection — Heuristic-based patterns (e.g. "ignore previous instructions", "system:", "jailbreak")
- Toxicity Check — Keyword blocklist for harmful content
- Output Validation — Schema validation and PII redaction on LLM responses
- Multiple Integration Points — HTTP pipes/interceptors, AI decorators, agent tools
Architecture
flowchart TB
subgraph inputs [Input Sources]
HTTP[HTTP Request]
AITask[AITask Method]
AgentTool[Agent Tool Call]
end
subgraph integrations [Integration Layer]
Pipe[GuardrailPipe]
Interceptor[GuardrailInterceptor]
Decorators[Input Output Decorators]
ToolExec[ToolExecutor]
end
subgraph guardrails [GuardrailsService]
PII[PII Redaction]
Injection[Prompt Injection Check]
Toxicity[Toxicity Check]
Schema[Output Schema]
end
HTTP --> Pipe
HTTP --> Interceptor
AITask --> Decorators
AgentTool --> ToolExec
Pipe --> guardrails
Interceptor --> guardrails
Decorators --> guardrails
ToolExec --> guardrails
guardrails --> LLM[LLM Response]
guardrails --> ToolOut[Tool Output]
Key Components
- GuardrailsService — Core service with checkInput(), checkOutput(), and redactPII()
- GuardrailPipe — Validates request body/query before route handlers
- GuardrailInterceptor — Validates input and output for an entire controller
- @GuardrailInput / @GuardrailOutput — Method-level decorators for AI tasks
- Agent integration — Automatic validation when GuardrailsModule and AgentModule are both imported
Installation
npm install @hazeljs/guardrails
Or with the CLI:
hazel add guardrails
Quick Start
1. Import the Module
import { HazelModule } from '@hazeljs/core';
import { GuardrailsModule } from '@hazeljs/guardrails';
@HazelModule({
imports: [
GuardrailsModule.forRoot({
redactPIIByDefault: true,
blockInjectionByDefault: true,
blockToxicityByDefault: true,
}),
],
})
export class AppModule {}
Integration Options
Choose the integration that fits your use case:
| Integration | When to Use |
|---|---|
| GuardrailPipe | Single route needs input validation (e.g. one chat endpoint) |
| GuardrailInterceptor | All routes in a controller need input + output validation |
| @GuardrailInput / @GuardrailOutput | AI tasks with @AITask that need per-method control |
| Agent integration | Agent tools should validate input/output automatically |
HTTP: GuardrailPipe
Validate request body before the handler runs. Use when you want guardrails on specific routes only.
import { Controller, Post, Body, UsePipes } from '@hazeljs/core';
import { GuardrailPipe } from '@hazeljs/guardrails';
@Controller({ path: '/chat' })
export class ChatController {
@Post()
@UsePipes(GuardrailPipe)
async chat(@Body() body: { message: string }) {
return { reply: '...' };
}
}
If input validation fails (e.g. prompt injection is detected), the request is rejected before your handler runs.
HTTP: GuardrailInterceptor
Validate both input and output for all routes in a controller. Use when the entire controller handles AI or user-generated content.
import { Controller, Post, Body, UseInterceptors } from '@hazeljs/core';
import { GuardrailInterceptor } from '@hazeljs/guardrails';
@Controller({ path: '/chat' })
@UseInterceptors(GuardrailInterceptor)
export class ChatController {
@Post()
async chat(@Body() body: { message: string }) {
return { reply: '...' };
}
}
The interceptor runs input guardrails on context.body before next(), and output guardrails on the response after the handler returns.
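Conceptually, the flow looks like this (an illustrative sketch in plain TypeScript, not the actual interceptor implementation):

```typescript
// Sketch of the interceptor's flow: check input before the handler,
// check output after, and surface the modified (e.g. redacted) content.
type Check = (text: string) => { allowed: boolean; modified?: string; blockedReason?: string };

async function guardedHandler(
  body: string,
  handler: (safeBody: string) => Promise<string>,
  checkInput: Check,
  checkOutput: Check
): Promise<string> {
  const inResult = checkInput(body);
  if (!inResult.allowed) throw new Error(inResult.blockedReason); // rejected before handler
  const response = await handler(inResult.modified ?? body);
  const outResult = checkOutput(response);
  if (!outResult.allowed) throw new Error(outResult.blockedReason); // rejected after handler
  return outResult.modified ?? response;
}
```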
AI Tasks: @GuardrailInput and @GuardrailOutput
For methods using @AITask, add @GuardrailInput and @GuardrailOutput to validate user input before the LLM call and the model response after. Inject GuardrailsService in the controller constructor.
import { Controller, Post, Body } from '@hazeljs/core';
import { AIService, AITask } from '@hazeljs/ai';
import { GuardrailsService, GuardrailInput, GuardrailOutput } from '@hazeljs/guardrails';
@Controller({ path: '/chat' })
export class ChatController {
constructor(
private aiService: AIService,
private guardrailsService: GuardrailsService
) {}
@GuardrailInput()
@GuardrailOutput()
@AITask({ provider: 'openai', model: 'gpt-4' })
@Post()
async chat(@Body() body: { message: string }) {
return body.message;
}
}
Decorator order matters: @GuardrailInput runs before the method, @GuardrailOutput runs after. Both require GuardrailsService to be injected.
Agent Tools: Automatic Integration
When both GuardrailsModule and AgentModule are imported, tool input and output are validated automatically. No decorators or pipes needed.
@HazelModule({
imports: [
GuardrailsModule.forRoot(),
AgentModule.forRoot(),
],
})
export class AppModule {}
Every tool call runs through checkInput before execution and checkOutput after. Blocked inputs return { success: false, error }; blocked outputs throw.
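The behavior described above can be sketched as follows (illustrative plain TypeScript, not the actual ToolExecutor code):

```typescript
// Blocked input returns { success: false, error }; blocked output throws.
interface ToolResult { success: boolean; output?: string; error?: string }

async function runToolWithGuardrails(
  input: string,
  tool: (input: string) => Promise<string>,
  check: (text: string) => { allowed: boolean; blockedReason?: string }
): Promise<ToolResult> {
  const inRes = check(input);
  if (!inRes.allowed) return { success: false, error: inRes.blockedReason };
  const output = await tool(input);
  const outRes = check(output);
  if (!outRes.allowed) throw new Error(outRes.blockedReason);
  return { success: true, output };
}
```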
Use Cases
- Customer support chatbot — Block prompt injection, redact PII from logs, validate responses before sending to users
- Internal AI tools — Ensure agent tools don't leak sensitive data or execute on malicious input
- Public-facing chat API — GuardrailPipe on /chat to reject toxic or injection attempts before they hit the LLM
- Compliance-heavy apps — PII redaction for GDPR/CCPA, documented guardrails for audits
Using GuardrailsService Directly
For custom logic (e.g. background jobs, non-HTTP flows), inject and use GuardrailsService:
import { Injectable } from '@hazeljs/core';
import { GuardrailsService } from '@hazeljs/guardrails';
@Injectable()
export class CustomProcessor {
constructor(private guardrails: GuardrailsService) {}
async processUserMessage(text: string) {
const result = this.guardrails.checkInput(text, {
redactPII: true,
blockInjection: true,
blockToxicity: true,
});
if (!result.allowed) {
throw new Error(result.blockedReason);
}
const safeInput = (result.modified ?? text) as string;
// Process safeInput...
}
redactSensitiveData(text: string) {
return this.guardrails.redactPII(text, ['email', 'phone', 'ssn']);
}
}
Error Handling
When guardrails block content, they throw GuardrailViolationError:
import { GuardrailViolationError } from '@hazeljs/guardrails';
try {
await chatController.chat({ message: userInput });
} catch (err) {
if (err instanceof GuardrailViolationError) {
console.error('Blocked:', err.violations, err.blockedReason);
// Return 400 with user-friendly message
}
}
Use an Exception Filter to map GuardrailViolationError to a consistent HTTP response (e.g. 400 Bad Request with { error: 'Content blocked', reason: err.blockedReason }).
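A minimal sketch of such a mapping, with a local stand-in class so the example is self-contained (in an app you would import GuardrailViolationError from @hazeljs/guardrails and wire this into an Exception Filter):

```typescript
// Local stand-in for GuardrailViolationError, for illustration only.
class GuardrailViolationError extends Error {
  constructor(
    message: string,
    public violations: string[],
    public blockedReason: string
  ) {
    super(message);
  }
}

// Map a caught error to a consistent HTTP response shape.
function toHttpResponse(err: unknown): { status: number; body: object } {
  if (err instanceof GuardrailViolationError) {
    return { status: 400, body: { error: 'Content blocked', reason: err.blockedReason } };
  }
  return { status: 500, body: { error: 'Internal server error' } };
}
```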
Configuration
| Option | Type | Default | Description |
|---|---|---|---|
| piiEntities | PIIEntityType[] | ['email','phone','ssn','credit_card'] | Entities to detect/redact |
| redactPIIByDefault | boolean | false | Redact PII in input by default |
| blockInjectionByDefault | boolean | true | Block prompt injection by default |
| blockToxicityByDefault | boolean | true | Block toxic content by default |
| injectionBlocklist | string[] | — | Custom injection patterns (regex source strings) |
| toxicityBlocklist | string[] | — | Custom toxicity keywords |
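A full forRoot configuration using these options might look like this; the blocklist entries here are placeholders, not built-in patterns:

```typescript
GuardrailsModule.forRoot({
  piiEntities: ['email', 'phone', 'ssn', 'credit_card'],
  redactPIIByDefault: true,
  blockInjectionByDefault: true,
  blockToxicityByDefault: true,
  injectionBlocklist: ['forget\\s+your\\s+rules'], // regex source strings
  toxicityBlocklist: ['example-banned-word'],      // plain keywords
});
```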
PII Entity Types
- email — Standard email format
- phone — E.164 and common US formats
- ssn — US Social Security Number (XXX-XX-XXXX or 9 digits)
- credit_card — Card numbers (4 groups of 4 digits or 13–19 digits)
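A simplified sketch of how regex-based redaction for these entities can work; the service's actual patterns and placeholder format may differ:

```typescript
// Illustrative per-entity patterns (email, ssn, common US phone formats).
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  phone: /\b(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b/g,
};

// Replace each matched entity with a [ENTITY_REDACTED] placeholder.
function redactPII(text: string, entities: string[]): string {
  let out = text;
  for (const entity of entities) {
    const pattern = PII_PATTERNS[entity];
    if (pattern) out = out.replace(pattern, `[${entity.toUpperCase()}_REDACTED]`);
  }
  return out;
}
```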
Prompt Injection Patterns (Built-in)
The package detects common injection phrases such as: "ignore previous instructions", "disregard all prior", "system:", "### instruction:", "jailbreak", "DAN mode", "developer mode", "pretend you are", "act as if", and similar variants.
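A heuristic check along these lines can be sketched as follows; the package's real pattern list is more extensive, and this is illustrative only:

```typescript
// A few of the phrase patterns named above, as case-insensitive regexes.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore\s+previous\s+instructions/i,
  /disregard\s+all\s+prior/i,
  /\bjailbreak\b/i,
  /\bDAN\s+mode\b/i,
  /developer\s+mode/i,
  /pretend\s+you\s+are/i,
];

// Flag input if any known injection phrase matches.
function looksLikeInjection(input: string): boolean {
  return INJECTION_PATTERNS.some((p) => p.test(input));
}
```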
API Reference
GuardrailsService
| Method | Description |
|---|---|
| checkInput(input, options?) | Validate input. Returns { allowed, modified?, violations?, blockedReason? } |
| checkOutput(output, options?) | Validate output. Returns { allowed, modified?, violations?, blockedReason? } |
| redactPII(text, entities?) | Redact PII from text. Returns sanitized string |
GuardrailResult
interface GuardrailResult {
allowed: boolean;
modified?: string | object; // Redacted/modified content when allowed
violations?: string[]; // e.g. ['pii_redacted', 'prompt_injection']
blockedReason?: string; // Human-readable reason when blocked
}
GuardrailViolationError
Thrown when content is blocked. Properties: message, violations, blockedReason. Use toJSON() for serialization.
Best Practices
- Enable guardrails early — Add GuardrailsModule when you first integrate AI; it's easier than retrofitting.
- Redact PII in logs — Use redactPII() before logging user input or LLM responses.
- Handle errors gracefully — Catch GuardrailViolationError and return clear feedback (e.g. "Your message could not be processed").
- Custom blocklists — Add domain-specific keywords via toxicityBlocklist or injectionBlocklist.
- Combine with Auth — Use Auth for authentication; guardrails for content safety.
What's Next?
- Learn about AI for LLM integration
- Explore Agent for tool-using agents
- Check out Auth for route protection
- Read Exception Filters for error handling