Persistent memory for AI applications. Smart routing, automatic memory extraction, and infinite context recall. One SDK, three methods, unlimited conversations.
Built on PostgreSQL and Pinecone, and designed to be the memory layer for your AI applications.
Bring any OpenAI or Gemini chat model with your own API key. We validate providers/models, never store your key, and route calls through secure BullMQ workers.
The .say() method auto-detects intent: questions route to retrieval and statements route to conversation, with no manual branching.
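To make the routing idea concrete, here is a minimal sketch of intent-based dispatch. How .say() actually classifies a message is not described here, so the `looksLikeQuestion` heuristic and the `routeMessage` helper are hypothetical names used only for illustration; they assume a client that exposes async `ask` and `chat` methods.

```javascript
// Hypothetical heuristic (an assumption, not the SDK's classifier): a message
// counts as a question if it ends with "?" or opens with an interrogative word.
const QUESTION_WORDS =
  /^(who|what|when|where|why|how|which|do|does|did|is|are|can|could|should|would)\b/i;

function looksLikeQuestion(message) {
  const text = message.trim();
  return text.endsWith('?') || QUESTION_WORDS.test(text);
}

// Sends questions to retrieval (`ask`) and everything else to conversation
// (`chat`). `client` is any object with those two async methods.
async function routeMessage(client, message) {
  return looksLikeQuestion(message) ? client.ask(message) : client.chat(message);
}
```

With a helper like this, calling code never branches on message type itself, which is the convenience .say() is described as providing.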
Memories are extracted asynchronously. Facts, preferences, and entities flow into PostgreSQL + Pinecone without manual tagging.
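Asynchronous extraction means the caller gets a reply without waiting for memories to be stored. The sketch below shows that pattern with an in-process queue standing in for BullMQ workers. The `ExtractionQueue` class, the `extractPreferences` function, and the `memoryStore` array are all hypothetical, and the real extraction logic is certainly richer than one regular expression.

```javascript
// In-process stand-in for a job queue (assumption: a real deployment would
// hand these jobs to BullMQ workers backed by Redis).
class ExtractionQueue {
  constructor(extractFn) {
    this.extractFn = extractFn;
    this.pending = Promise.resolve();
  }

  // Schedules extraction and returns immediately; jobs run one after another.
  enqueue(message) {
    this.pending = this.pending
      .then(() => this.extractFn(message))
      .catch((err) => console.error('extraction failed:', err.message));
  }

  // Resolves once every job queued so far has finished.
  drain() {
    return this.pending;
  }
}

// Hypothetical store and extractor: records "I like/love/prefer X" statements.
const memoryStore = [];

async function extractPreferences(message) {
  const match = /\bI (?:like|love|prefer) ([^.!?]+)/i.exec(message);
  if (match) {
    memoryStore.push({ type: 'preference', value: match[1].trim() });
  }
}
```

A request handler would call `enqueue(message)` and respond right away; `drain()` exists mainly so tests or shutdown hooks can wait for outstanding jobs.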
The .ask() method retrieves the top-matching memories from Pinecone, giving the model context from far beyond what fits in its token limit.
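Retrieving "top memories" from a vector store comes down to ranking stored vectors by similarity to the query and keeping the best few. The sketch below does this in memory with cosine similarity. It only illustrates the idea and is not how Pinecone or the SDK implements it; the `topMemories` function and the `{ id, vector }` memory shape are assumptions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns 0 when either vector has zero length to avoid dividing by zero.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every memory against the query vector and returns the k best,
// highest score first. Each memory is assumed to look like { id, vector }.
function topMemories(queryVector, memories, k = 3) {
  return memories
    .map((m) => ({ ...m, score: cosineSimilarity(queryVector, m.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Only the k selected memories need to be placed in the prompt, which is why the total amount of stored history can exceed the model's token limit.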
BullMQ workers handle extraction, summaries, and request logging. Track API usage via api_requests and retrieval_logs.
Ship fast with the project dashboard, developer portal, and three SDK methods: .say(), .chat(), .ask().
Add persistent memory to your AI applications in minutes. Works with any OpenAI or Gemini chat model, in any framework.
npm install normal-memory - one package, zero config.
Pass your API key and conversation ID. That's it.
Smart routing handles everything. Questions → memory, statements → chat.
import express from 'express';
import { NormalMemory } from 'normal-memory';

const app = express();
app.use(express.json());

// The client is bound to one conversation ID; all settings come from the environment.
const memory = new NormalMemory({
  apiKey: process.env.NORMAL_MEMORY_KEY,
  conversationId: process.env.CONVERSATION_ID,
  baseUrl: process.env.NORMAL_MEMORY_URL,
  llmProvider: process.env.LLM_PROVIDER,
  llmApiKey: process.env.LLM_API_KEY,
  llmModel: process.env.LLM_MODEL,
});

// Conversation endpoint: forwards the message to .chat().
app.post('/chat', async (req, res) => {
  try {
    const reply = await memory.chat(req.body.message);
    res.json({ reply });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

// Retrieval endpoint: answers from stored memories via .ask().
app.post('/ask', async (req, res) => {
  try {
    const answer = await memory.ask(req.body.question);
    res.json({ answer });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(4000, () => console.log('SDK test server running on port 4000'));
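The server above reads six environment variables, and a missing one would otherwise surface only when the first request fails. A minimal sketch of a fail-fast check follows. The variable names come from the example; treating all six as required is an assumption (baseUrl, for instance, may well be optional), and `missingEnv` is a hypothetical helper, not part of the SDK.

```javascript
// Environment variables read by the example server.
// Assumption: all six are required; trim this list if some are optional.
const REQUIRED_ENV = [
  'NORMAL_MEMORY_KEY',
  'CONVERSATION_ID',
  'NORMAL_MEMORY_URL',
  'LLM_PROVIDER',
  'LLM_API_KEY',
  'LLM_MODEL',
];

// Returns the names of variables that are unset or blank, in list order.
function missingEnv(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === '');
}
```

Calling `missingEnv()` before constructing the client and exiting when the result is non-empty turns a confusing runtime failure into a clear startup message.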