Building Context-Aware AI with the HazelJS Memory System
Discover how to build intelligent, context-aware AI applications with conversation tracking, entity memory, semantic search, and RAG integration using the new Memory System in @hazeljs/rag.
🧠 The Challenge of Context in AI Applications
Building truly intelligent AI applications requires more than just connecting to an LLM API. Your AI needs to remember conversations, understand user preferences, track entities, and maintain context across sessions. This is where most AI applications fall short.
Today, we're excited to introduce the Memory System in @hazeljs/rag - a comprehensive solution for building context-aware AI applications with persistent memory, semantic search, and seamless RAG integration.
What is the Memory System?
The Memory System provides five types of persistent memory that work together to give your AI applications true contextual awareness:
- Conversation Memory - Track multi-turn conversations with automatic summarization
- Entity Memory - Remember people, companies, and relationships
- Semantic Memory - Store and recall facts using semantic search
- Episodic Memory - Remember specific events with temporal context
- Working Memory - Temporary scratchpad for current tasks
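The five types above can be pictured as variants of a single record shape. This is a conceptual sketch only — the names `MemoryKind` and `MemoryRecord` are illustrative, not the package's actual types:

```typescript
// Hypothetical sketch of the five memory types as one record shape.
type MemoryKind =
  | 'conversation' // multi-turn messages, summarized over time
  | 'entity'       // people, companies, relationships
  | 'semantic'     // facts recalled by similarity search
  | 'episodic'     // events with temporal context
  | 'working';     // short-lived scratchpad state

interface MemoryRecord {
  kind: MemoryKind;
  content: string;
  sessionId?: string;
  createdAt: Date;
}

const fact: MemoryRecord = {
  kind: 'semantic',
  content: 'User prefers dark mode',
  createdAt: new Date(),
};
```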
Quick Start: Your First Memory-Enabled Chatbot
Let's build a simple chatbot that remembers conversations:
```typescript
import { MemoryManager, BufferMemory } from '@hazeljs/rag';

// Setup memory
const memoryStore = new BufferMemory({ maxSize: 100 });
const memoryManager = new MemoryManager(memoryStore, {
  maxConversationLength: 20,
  entityExtraction: true,
});
await memoryManager.initialize();

// Add conversation messages
await memoryManager.addMessage(
  { role: 'user', content: 'My name is Alice' },
  'session-123'
);
await memoryManager.addMessage(
  { role: 'assistant', content: 'Nice to meet you, Alice!' },
  'session-123'
);

// Later in the conversation...
const history = await memoryManager.getConversationHistory('session-123');
// The AI remembers Alice's name!
```

Three Storage Strategies for Every Use Case
The Memory System provides three storage strategies, each optimized for different scenarios:
1. BufferMemory - Fast & Simple
Perfect for development and recent conversation history. Stores memories in a FIFO buffer with optional TTL.
```typescript
const buffer = new BufferMemory({
  maxSize: 100,
  ttl: 3600000, // 1 hour
});
```

Best for: Development, testing, recent messages, low-latency requirements
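To make the FIFO-plus-TTL behavior concrete, here is a minimal standalone sketch of the idea — an illustration of the eviction rules, not the package's implementation:

```typescript
// Minimal sketch of FIFO eviction with an optional TTL, as in BufferMemory.
interface Entry { content: string; storedAt: number }

class SimpleBuffer {
  private entries: Entry[] = [];
  constructor(private maxSize: number, private ttlMs?: number) {}

  add(content: string, now = Date.now()): void {
    this.entries.push({ content, storedAt: now });
    if (this.entries.length > this.maxSize) this.entries.shift(); // FIFO eviction
  }

  getAll(now = Date.now()): string[] {
    // Lazily drop expired entries when a TTL is configured
    if (this.ttlMs !== undefined) {
      this.entries = this.entries.filter(e => now - e.storedAt <= this.ttlMs!);
    }
    return this.entries.map(e => e.content);
  }
}

const buf = new SimpleBuffer(2, 1000);
buf.add('a', 0);
buf.add('b', 10);
buf.add('c', 20);              // evicts 'a' (FIFO, maxSize = 2)
console.log(buf.getAll(500));  // ['b', 'c']
console.log(buf.getAll(1500)); // [] — both entries past the 1000ms TTL
```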
2. VectorMemory - Semantic Search
Stores memories as embeddings for powerful semantic search. Works with any vector store (Pinecone, Qdrant, etc.).
```typescript
const vectorMemory = new VectorMemory(vectorStore, embeddings, {
  collectionName: 'memories',
});
```

Best for: Long-term storage, semantic search, production deployments
3. HybridMemory - Best of Both Worlds
Combines fast buffer storage with persistent vector storage. Recent memories stay in the buffer, old ones automatically archive to the vector store.
```typescript
const hybrid = new HybridMemory(buffer, vectorMemory, {
  archiveThreshold: 15, // Archive after 15 messages
});
```

Best for: Production applications, balancing speed and persistence
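The archive-on-threshold mechanic can be sketched in a few lines. This toy version uses a plain array in place of the vector store, purely to show the data flow — the real class wires a BufferMemory to a VectorMemory:

```typescript
// Sketch: once the buffer exceeds archiveThreshold, the oldest
// entries move to long-term storage, as HybridMemory does.
class SimpleHybrid {
  private buffer: string[] = [];
  readonly archive: string[] = []; // stands in for the vector store

  constructor(private archiveThreshold: number) {}

  add(content: string): void {
    this.buffer.push(content);
    while (this.buffer.length > this.archiveThreshold) {
      this.archive.push(this.buffer.shift()!); // oldest entries archive first
    }
  }

  recent(): string[] { return [...this.buffer]; }
}

const hybrid = new SimpleHybrid(3);
['m1', 'm2', 'm3', 'm4', 'm5'].forEach(m => hybrid.add(m));
console.log(hybrid.recent());  // ['m3', 'm4', 'm5'] — fast buffer reads
console.log(hybrid.archive);   // ['m1', 'm2'] — persisted for search
```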
Entity Memory: Remember People & Relationships
Track entities mentioned in conversations and their relationships:
```typescript
// Track an entity
await memoryManager.trackEntity({
  name: 'Alice',
  type: 'person',
  attributes: {
    role: 'engineer',
    company: 'TechCorp',
  },
  relationships: [
    { type: 'works_at', target: 'TechCorp' },
  ],
  firstSeen: new Date(),
  lastSeen: new Date(),
  mentions: 1,
});

// Retrieve entity
const alice = await memoryManager.getEntity('Alice');
console.log(alice.attributes.company); // 'TechCorp'

// Update entity
await memoryManager.updateEntity('Alice', {
  attributes: { ...alice.attributes, status: 'premium' },
});
```

Semantic Memory: Store & Recall Facts
Store facts and recall them semantically - no exact keyword matching needed:
```typescript
// Store facts
await memoryManager.storeFact(
  'User prefers dark mode',
  { userId: 'user-123', category: 'preference' }
);
await memoryManager.storeFact(
  'HazelJS supports TypeScript decorators',
  { category: 'framework-feature' }
);

// Recall facts semantically
const facts = await memoryManager.recallFacts('user preferences', {
  topK: 5,
  minScore: 0.7,
});
// Returns: ['User prefers dark mode']
```

RAG + Memory: The Ultimate Combination
Combine document retrieval with conversation memory for truly context-aware responses:
```typescript
import {
  RAGPipelineWithMemory,
  MemoryManager,
  BufferMemory,
  VectorMemory,
  HybridMemory,
  MemoryVectorStore,
  OpenAIEmbeddings,
} from '@hazeljs/rag';

// Setup embeddings
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

// Setup hybrid memory
const buffer = new BufferMemory({ maxSize: 20 });
const memoryVectorStore = new MemoryVectorStore(embeddings);
const vectorMemory = new VectorMemory(memoryVectorStore, embeddings);
const hybridMemory = new HybridMemory(buffer, vectorMemory);

const memoryManager = new MemoryManager(hybridMemory, {
  maxConversationLength: 20,
  summarizeAfter: 50,
  entityExtraction: true,
});

// Setup RAG with memory (documentVectorStore and llmFunction are
// assumed to be defined elsewhere in your application)
const rag = new RAGPipelineWithMemory(
  {
    vectorStore: documentVectorStore,
    embeddingProvider: embeddings,
    topK: 5,
  },
  memoryManager,
  llmFunction
);
await rag.initialize();

// Add documents
await rag.addDocuments([
  {
    content: 'HazelJS is a modern TypeScript framework...',
    metadata: { source: 'docs' },
  },
]);

// Query with memory context
const response = await rag.queryWithMemory(
  'What did we discuss about pricing?',
  'session-123',
  'user-456'
);

console.log(response.answer);
console.log('Sources:', response.sources);
console.log('Memories:', response.memories);
console.log('History:', response.conversationHistory);
```

Enhanced Context: Three Sources Combined
When you query with RAGPipelineWithMemory, it combines three sources of context:
- Document Retrieval - Relevant documents from your knowledge base
- Conversation History - Recent messages in the conversation
- Relevant Memories - Semantically similar past interactions
This creates incredibly rich context for your LLM, resulting in more accurate and personalized responses.
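One way to picture the combination is a prompt assembled from the three sources. The section labels and the `buildPrompt` helper below are assumptions for illustration, not the pipeline's actual prompt format:

```typescript
// Sketch: combining documents, history, and memories into one LLM prompt.
interface QueryContext {
  documents: string[];            // retrieved from the knowledge base
  conversationHistory: string[];  // recent messages in this session
  memories: string[];             // semantically similar past interactions
}

function buildPrompt(question: string, ctx: QueryContext): string {
  return [
    'Relevant documents:',
    ...ctx.documents,
    'Conversation so far:',
    ...ctx.conversationHistory,
    'Relevant memories:',
    ...ctx.memories,
    `Question: ${question}`,
  ].join('\n');
}

const prompt = buildPrompt('What did we discuss about pricing?', {
  documents: ['Pricing page: Pro plan is $20/month.'],
  conversationHistory: ['user: Do you offer discounts?'],
  memories: ['User asked about annual billing last week.'],
});
console.log(prompt); // 7 lines: three labeled sections plus the question
```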
Advanced Features
Automatic Fact Extraction
Let the system automatically extract and store facts from AI responses:
```typescript
const response = await rag.queryWithLearning(
  'Tell me about HazelJS features',
  'session-123',
  'user-456'
);
// Facts from the response are automatically stored!
```

Memory Search
Search across all memories semantically:
```typescript
import { MemoryType } from '@hazeljs/rag';

const relevantMemories = await memoryManager.relevantMemories(
  'pricing and discounts',
  {
    sessionId: 'session-123',
    types: [MemoryType.CONVERSATION, MemoryType.FACT],
    topK: 5,
    minScore: 0.7,
  }
);
```

Importance Scoring
Automatically calculate importance scores for memories:
```typescript
const memoryManager = new MemoryManager(memoryStore, {
  importanceScoring: true, // Enable automatic scoring
});
// Questions and longer content get higher scores
// Important memories are retained longer
```

Memory Statistics
Monitor memory usage and performance:
```typescript
const stats = await memoryManager.getStats('session-123');
console.log(`Total memories: ${stats.totalMemories}`);
console.log('By type:', stats.byType);
console.log(`Average importance: ${stats.averageImportance}`);
```

Real-World Use Cases
Customer Support Bot
```typescript
// Remember customer information
await memoryManager.trackEntity({
  name: 'Jane Smith',
  type: 'customer',
  attributes: { tier: 'premium', accountId: 'ACC-123' },
  // ...
});

// Store support history
await memoryManager.storeFact(
  'Customer reported login issues on 2024-01-15',
  { customerId: 'ACC-123', category: 'support' }
);

// Context-aware responses
const response = await rag.queryWithMemory(
  'What was my previous issue?',
  'session-123',
  'ACC-123'
);
// AI remembers the login issue!
```

Personal AI Assistant
```typescript
// Remember preferences
await memoryManager.storeFact('User prefers concise responses');
await memoryManager.storeFact('User timezone is PST');

// Track tasks
await memoryManager.setContext(
  'active_tasks',
  ['email', 'meeting'],
  'session-123'
);

// Personalized responses
const response = await rag.queryWithMemory(
  'What should I focus on today?',
  'session-123'
);
```

Educational Tutor
```typescript
// Track learning progress
await memoryManager.trackEntity({
  name: 'Student-123',
  type: 'student',
  attributes: {
    level: 'intermediate',
    completedLessons: ['intro', 'basics'],
  },
  // ...
});

// Remember misconceptions
await memoryManager.storeFact(
  'Student confused about async/await',
  { studentId: 'Student-123', topic: 'javascript' }
);
```

Best Practices
- Choose the Right Store - Use BufferMemory for dev, HybridMemory for production
- Set Appropriate Limits - Configure maxConversationLength based on token limits
- Enable Features Selectively - Turn on entityExtraction and importanceScoring as needed
- Monitor Memory Usage - Use getStats() regularly and prune old memories
- Session Management - Use consistent session IDs and clear when done
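The session-management practice can be wrapped in a small helper: one stable ID per conversation, always cleared at the end. `withSession` and its `clear` callback are illustrative assumptions — adapt them to however your MemoryManager exposes cleanup:

```typescript
// Sketch: a stable session ID for the whole conversation, cleared when done.
import { randomUUID } from 'node:crypto';

async function withSession<T>(
  run: (sessionId: string) => Promise<T>,
  clear: (sessionId: string) => Promise<void>,
): Promise<T> {
  const sessionId = `session-${randomUUID()}`; // consistent for every call in this conversation
  try {
    return await run(sessionId);
  } finally {
    await clear(sessionId); // release session memory even if run() throws
  }
}
```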
Performance & Scalability
The Memory System is designed for production use:
- Fast Retrieval - BufferMemory provides sub-millisecond access
- Semantic Search - VectorMemory enables powerful similarity search
- Automatic Archiving - HybridMemory balances speed and persistence
- Scalable - Works with any vector store backend
- Memory Efficient - Automatic pruning and TTL support
Try It Yourself
The Memory System is available now in @hazeljs/rag:
```bash
npm install @hazeljs/rag

# Run the examples
cd hazeljs/example
npm run memory:basic    # Basic memory features
npm run memory:rag      # RAG with memory
npm run memory:chatbot  # Complete chatbot
```

Documentation & Examples
Check out the comprehensive documentation and working examples:
- Memory System Guide - Complete documentation
- RAG Package Docs - API reference
- Example Code - Working examples
What's Next?
We're continuously improving the Memory System. Upcoming features include:
- Memory Decorators - `@WithMemory` and `@TrackEntity` decorators
- Advanced Entity Extraction - Automatic entity detection from conversations
- Memory Consolidation - Smart merging of similar memories
- Time-Based Decay - Automatic relevance scoring based on age
- Multi-User Support - Shared and private memory spaces
Conclusion
The Memory System transforms how you build AI applications. Instead of stateless request-response cycles, you can now create truly intelligent applications that remember, learn, and adapt to users over time.
Whether you're building a customer support bot, personal assistant, or educational tutor, the Memory System provides the foundation for context-aware AI that feels natural and intelligent.
Start building smarter AI today! 🧠✨