Guardrails Package

The @hazeljs/guardrails package provides content safety, PII handling, and output validation for HazelJS AI applications. It integrates with HTTP routes, @hazeljs/ai, and @hazeljs/agent to protect against prompt injection, toxic content, and PII leakage.

Why Guardrails?

AI applications face unique security and compliance challenges. Without guardrails:

  • Prompt injection — Malicious users can inject instructions like "ignore previous instructions" or "you are now in developer mode" to bypass system prompts and extract data or change behavior.
  • PII leakage — LLMs may echo or infer sensitive data (emails, phone numbers, SSNs) from user input or training data, violating privacy regulations (GDPR, CCPA).
  • Toxic output — Models can generate harmful, biased, or inappropriate content when given adversarial inputs.
  • Compliance — Enterprise deployments often require documented controls for AI safety (EU AI Act, SOC 2, etc.).

Unlike LangChain or Vercel AI SDK, HazelJS provides built-in guardrails that plug into your existing HTTP, AI, and agent layers—no separate middleware or glue code.

Purpose

The @hazeljs/guardrails package provides:

  • PII Detection & Redaction — Email, phone, SSN, credit card (configurable entities)
  • Prompt Injection Detection — Heuristic-based patterns (e.g. "ignore previous instructions", "system:", "jailbreak")
  • Toxicity Check — Keyword blocklist for harmful content
  • Output Validation — Schema validation and PII redaction on LLM responses
  • Multiple Integration Points — HTTP pipes/interceptors, AI decorators, agent tools

Architecture

flowchart TB
  subgraph inputs [Input Sources]
      HTTP[HTTP Request]
      AITask[AITask Method]
      AgentTool[Agent Tool Call]
  end

  subgraph integrations [Integration Layer]
      Pipe[GuardrailPipe]
      Interceptor[GuardrailInterceptor]
      Decorators[Input Output Decorators]
      ToolExec[ToolExecutor]
  end

  subgraph guardrails [GuardrailsService]
      PII[PII Redaction]
      Injection[Prompt Injection Check]
      Toxicity[Toxicity Check]
      Schema[Output Schema]
  end

  HTTP --> Pipe
  HTTP --> Interceptor
  AITask --> Decorators
  AgentTool --> ToolExec
  Pipe --> guardrails
  Interceptor --> guardrails
  Decorators --> guardrails
  ToolExec --> guardrails
  guardrails --> LLM[LLM Response]
  guardrails --> ToolOut[Tool Output]

Key Components

  1. GuardrailsService — Core service with checkInput(), checkOutput(), and redactPII()
  2. GuardrailPipe — Validates request body/query before route handlers
  3. GuardrailInterceptor — Validates input and output for entire controller
  4. @GuardrailInput / @GuardrailOutput — Method-level decorators for AI tasks
  5. Agent integration — Automatic validation when GuardrailsModule and AgentModule are both imported

Installation

npm install @hazeljs/guardrails

Or with the CLI:

hazel add guardrails

Quick Start

1. Import the Module

import { HazelModule } from '@hazeljs/core';
import { GuardrailsModule } from '@hazeljs/guardrails';

@HazelModule({
  imports: [
    GuardrailsModule.forRoot({
      redactPIIByDefault: true,
      blockInjectionByDefault: true,
      blockToxicityByDefault: true,
    }),
  ],
})
export class AppModule {}

Integration Options

Choose the integration that fits your use case:

| Integration | When to Use |
| --- | --- |
| GuardrailPipe | Single route needs input validation (e.g. one chat endpoint) |
| GuardrailInterceptor | All routes in a controller need input + output validation |
| @GuardrailInput / @GuardrailOutput | AI tasks with @AITask that need per-method control |
| Agent integration | Agent tools should validate input/output automatically |

HTTP: GuardrailPipe

Validate request body before the handler runs. Use when you want guardrails on specific routes only.

import { Controller, Post, Body, UsePipes } from '@hazeljs/core';
import { GuardrailPipe } from '@hazeljs/guardrails';

@Controller({ path: '/chat' })
export class ChatController {
  @Post()
  @UsePipes(GuardrailPipe)
  async chat(@Body() body: { message: string }) {
    return { reply: '...' };
  }
}

If input fails (e.g. prompt injection detected), the request is rejected before your handler runs.

HTTP: GuardrailInterceptor

Validate both input and output for all routes in a controller. Use when the entire controller handles AI or user-generated content.

import { Controller, Post, Body, UseInterceptors } from '@hazeljs/core';
import { GuardrailInterceptor } from '@hazeljs/guardrails';

@Controller({ path: '/chat' })
@UseInterceptors(GuardrailInterceptor)
export class ChatController {
  @Post()
  async chat(@Body() body: { message: string }) {
    return { reply: '...' };
  }
}

The interceptor runs input guardrails on context.body before next(), and output guardrails on the response after the handler returns.
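
Conceptually, the interceptor wraps your handler roughly like this. This is a simplified sketch, not the actual implementation; the intercept(context, next) signature, the context.body shape, and the JSON.stringify step are assumptions for illustration.

import { Injectable } from '@hazeljs/core';
import { GuardrailsService } from '@hazeljs/guardrails';

@Injectable()
export class GuardrailInterceptorSketch {
  constructor(private guardrails: GuardrailsService) {}

  async intercept(context: { body?: unknown }, next: () => Promise<unknown>) {
    // 1. Input guardrails on the request body before the handler runs
    const input = this.guardrails.checkInput(JSON.stringify(context.body ?? ''));
    if (!input.allowed) {
      // The real interceptor reports this as a GuardrailViolationError
      throw new Error(input.blockedReason);
    }

    // 2. Run the route handler
    const response = await next();

    // 3. Output guardrails on the response before it is returned to the client
    const output = this.guardrails.checkOutput(JSON.stringify(response));
    if (!output.allowed) {
      throw new Error(output.blockedReason);
    }

    return response;
  }
}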

AI Tasks: @GuardrailInput and @GuardrailOutput

For methods using @AITask, add @GuardrailInput and @GuardrailOutput to validate user input before the LLM call and the model response after. Inject GuardrailsService in the controller constructor.

import { Controller, Post, Body } from '@hazeljs/core';
import { AIService, AITask } from '@hazeljs/ai';
import { GuardrailsService, GuardrailInput, GuardrailOutput } from '@hazeljs/guardrails';

@Controller({ path: '/chat' })
export class ChatController {
  constructor(
    private aiService: AIService,
    private guardrailsService: GuardrailsService
  ) {}

  @GuardrailInput()
  @GuardrailOutput()
  @AITask({ provider: 'openai', model: 'gpt-4' })
  @Post()
  async chat(@Body() body: { message: string }) {
    return body.message;
  }
}

Decorator order matters: @GuardrailInput runs before the method, @GuardrailOutput runs after. Both require GuardrailsService to be injected.

Agent Tools: Automatic Integration

When both GuardrailsModule and AgentModule are imported, tool input and output are validated automatically. No decorators or pipes needed.

@HazelModule({
  imports: [
    GuardrailsModule.forRoot(),
    AgentModule.forRoot(),
  ],
})
export class AppModule {}

Every tool call runs through checkInput before execution and checkOutput after. Blocked inputs return { success: false, error }; blocked outputs throw.

Use Cases

  • Customer support chatbot — Block prompt injection, redact PII from logs, validate responses before sending to users
  • Internal AI tools — Ensure agent tools don't leak sensitive data or execute on malicious input
  • Public-facing chat API — GuardrailPipe on /chat to reject toxic or injection attempts before hitting the LLM
  • Compliance-heavy apps — PII redaction for GDPR/CCPA, documented guardrails for audits

Using GuardrailsService Directly

For custom logic (e.g. background jobs, non-HTTP flows), inject and use GuardrailsService:

import { Injectable } from '@hazeljs/core';
import { GuardrailsService } from '@hazeljs/guardrails';

@Injectable()
export class CustomProcessor {
  constructor(private guardrails: GuardrailsService) {}

  async processUserMessage(text: string) {
    const result = this.guardrails.checkInput(text, {
      redactPII: true,
      blockInjection: true,
      blockToxicity: true,
    });

    if (!result.allowed) {
      throw new Error(result.blockedReason);
    }

    const safeInput = (result.modified ?? text) as string;
    // Process safeInput...
  }

  redactSensitiveData(text: string) {
    return this.guardrails.redactPII(text, ['email', 'phone', 'ssn']);
  }
}

Error Handling

When guardrails block content, they throw GuardrailViolationError:

import { GuardrailViolationError } from '@hazeljs/guardrails';

try {
  await chatController.chat({ message: userInput });
} catch (err) {
  if (err instanceof GuardrailViolationError) {
    console.error('Blocked:', err.violations, err.blockedReason);
    // Return 400 with user-friendly message
  }
}

Use an Exception Filter to map GuardrailViolationError to a consistent HTTP response (e.g. 400 Bad Request with { error: 'Content blocked', reason: err.blockedReason }).
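
One way to keep that mapping consistent is a small helper your filter can delegate to. This is a minimal sketch; the toHttpError helper and the response shape are illustrative, not part of the package.

import { GuardrailViolationError } from '@hazeljs/guardrails';

// Map an error to an HTTP status and body (illustrative shape).
export function toHttpError(err: unknown): { status: number; body: object } {
  if (err instanceof GuardrailViolationError) {
    return {
      status: 400,
      body: {
        error: 'Content blocked',
        reason: err.blockedReason,
        violations: err.violations,
      },
    };
  }
  return { status: 500, body: { error: 'Internal server error' } };
}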

Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| piiEntities | PIIEntityType[] | ['email', 'phone', 'ssn', 'credit_card'] | Entities to detect/redact |
| redactPIIByDefault | boolean | false | Redact PII in input by default |
| blockInjectionByDefault | boolean | true | Block prompt injection by default |
| blockToxicityByDefault | boolean | true | Block toxic content by default |
| injectionBlocklist | string[] | | Custom injection patterns (regex source strings) |
| toxicityBlocklist | string[] | | Custom toxicity keywords |
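
For example, to narrow PII detection to emails and phone numbers and add domain-specific patterns (the pattern and keyword values below are illustrative):

import { HazelModule } from '@hazeljs/core';
import { GuardrailsModule } from '@hazeljs/guardrails';

@HazelModule({
  imports: [
    GuardrailsModule.forRoot({
      piiEntities: ['email', 'phone'],
      redactPIIByDefault: true,
      // Regex source strings (example pattern, adjust for your domain)
      injectionBlocklist: ['reveal\\s+the\\s+system\\s+prompt'],
      // Domain-specific keywords to block (example value)
      toxicityBlocklist: ['internal-codename'],
    }),
  ],
})
export class AppModule {}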

PII Entity Types

  • email — Standard email format
  • phone — E.164 and common US formats
  • ssn — US Social Security Number (XXX-XX-XXXX or 9 digits)
  • credit_card — Card numbers (4 groups of 4 digits or 13–19 digits)

Prompt Injection Patterns (Built-in)

The package detects common injection phrases such as: "ignore previous instructions", "disregard all prior", "system:", "### instruction:", "jailbreak", "DAN mode", "developer mode", "pretend you are", "act as if", and similar variants.

API Reference

GuardrailsService

| Method | Description |
| --- | --- |
| checkInput(input, options?) | Validate input. Returns { allowed, modified?, violations?, blockedReason? } |
| checkOutput(output, options?) | Validate output. Returns { allowed, modified?, violations?, blockedReason? } |
| redactPII(text, entities?) | Redact PII from text. Returns sanitized string |

GuardrailResult

interface GuardrailResult {
  allowed: boolean;
  modified?: string | object;  // Redacted/modified content when allowed
  violations?: string[];      // e.g. ['pii_redacted', 'prompt_injection']
  blockedReason?: string;     // Human-readable reason when blocked
}
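
The same shape is returned by checkOutput. A minimal sketch of consuming it (the ResponseChecker class is illustrative):

import { Injectable } from '@hazeljs/core';
import { GuardrailsService } from '@hazeljs/guardrails';

@Injectable()
export class ResponseChecker {
  constructor(private guardrails: GuardrailsService) {}

  checkResponse(llmResponse: string): string {
    const result = this.guardrails.checkOutput(llmResponse);

    if (!result.allowed) {
      // Blocked: surface the human-readable reason
      throw new Error(result.blockedReason);
    }

    // Prefer the redacted/modified content when present
    return (result.modified ?? llmResponse) as string;
  }
}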

GuardrailViolationError

Thrown when content is blocked. Properties: message, violations, blockedReason. Use toJSON() for serialization.

Best Practices

  1. Enable guardrails early — Add GuardrailsModule when you first integrate AI; it's easier than retrofitting.
  2. Redact PII in logs — Use redactPII() before logging user input or LLM responses (see the sketch after this list).
  3. Handle errors gracefully — Catch GuardrailViolationError and return clear feedback (e.g. "Your message could not be processed").
  4. Custom blocklists — Add domain-specific keywords via toxicityBlocklist or injectionBlocklist.
  5. Combine with Auth — Use Auth for authentication; guardrails for content safety.
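
A minimal sketch of practice 2, redacting before logging (the SafeLogger class and console logger are illustrative):

import { Injectable } from '@hazeljs/core';
import { GuardrailsService } from '@hazeljs/guardrails';

@Injectable()
export class SafeLogger {
  constructor(private guardrails: GuardrailsService) {}

  logUserMessage(message: string) {
    // Strip emails, phone numbers, and SSNs before the text reaches the logs
    const sanitized = this.guardrails.redactPII(message, ['email', 'phone', 'ssn']);
    console.log('[chat]', sanitized);
  }
}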

What's Next?

  • Learn about AI for LLM integration
  • Explore Agent for tool-using agents
  • Check out Auth for route protection
  • Read Exception Filters for error handling