# Agentary JS
Run quantized small language models directly in the browser with WebGPU and WebAssembly, featuring built-in support for agentic workflows.
## Features
- 🚀 Browser-Native - Run LLMs directly in the browser without server dependencies
- ⚡ WebGPU Acceleration - Leverage WebGPU for high-performance inference
- 🤖 Agentic Workflows - Create and execute complex multi-step agent workflows
- 🧠 Memory Management - Smart context compression and pruning for long conversations
- 🛠️ Function Calling - Built-in support for tool/function calling (see the sketch after this list)
- 📊 Multi-Model Support - Use different models for chat, tool use, and reasoning
- 📡 Lifecycle Events - Built-in event system for monitoring and debugging
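Several of these pieces can combine in a single session. The sketch below pairs multi-model support with function calling and lifecycle events; note that the `tool_use` model slot, the `tools` option, and `session.on` are illustrative assumptions based on the feature list, not the confirmed API. The Quick Start below covers the documented basics.

```js
import { createSession } from 'agentary-js';

// Hedged sketch: the 'tool_use' slot, 'tools' option, and 'on' method
// are assumed names for illustration, not the confirmed API.
const session = await createSession({
  models: {
    chat: { name: 'onnx-community/gemma-3-270m-it-ONNX', quantization: 'q4' },
    tool_use: { name: 'onnx-community/gemma-3-270m-it-ONNX', quantization: 'q4' }
  },
  engine: 'webgpu'
});

// Lifecycle events for monitoring and debugging (event name is illustrative).
session.on('model:loaded', (info) => console.log('model ready', info));

for await (const chunk of session.createResponse({
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
  tools: [{
    name: 'get_weather',
    description: 'Fetch current weather for a city',
    parameters: { city: 'string' },
    handler: async ({ city }) => ({ city, tempC: 18 }) // stubbed result
  }]
})) {
  console.log(chunk.token);
}

await session.dispose();
```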
## Quick Start
Install via npm:
```bash
npm install agentary-js
```

Basic example:

```js
import { createSession } from 'agentary-js';

const session = await createSession({
  models: {
    chat: {
      name: 'onnx-community/gemma-3-270m-it-ONNX',
      quantization: 'q4'
    }
  },
  engine: 'webgpu'
});

// Stream the response token by token.
let output = '';
for await (const chunk of session.createResponse({
  messages: [{ role: 'user', content: 'Hello!' }]
})) {
  output += chunk.token;
}
console.log(output);

await session.dispose();
```

## Why Agentary JS?
### Browser-Native AI
Run powerful language models directly in users’ browsers: no server costs, maximum privacy, and low-latency responses with no network round-trip.
### Agentic Workflows
Build sophisticated multi-step AI agents that can think, plan, use tools, and adapt, all running client-side; a minimal plan-then-execute loop is sketched below.
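Such a loop needs nothing beyond the documented `createResponse` call: stream a plan from the model, then feed it back as context for the final answer. The prompts here are illustrative.

```js
import { createSession } from 'agentary-js';

const session = await createSession({
  models: { chat: { name: 'onnx-community/gemma-3-270m-it-ONNX', quantization: 'q4' } },
  engine: 'webgpu'
});

// Collect a streamed response into a single string.
async function complete(messages) {
  let text = '';
  for await (const chunk of session.createResponse({ messages })) {
    text += chunk.token;
  }
  return text;
}

// Step 1: ask the model to plan.
const plan = await complete([
  { role: 'user', content: 'List three steps for summarizing an article.' }
]);

// Step 2: execute against the plan produced in step 1.
const summary = await complete([
  { role: 'user', content: `Follow this plan:\n${plan}\n\nSummarize: "WebGPU exposes modern GPU features to the web platform."` }
]);

console.log(summary);
await session.dispose();
```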
### Production Ready
Built with observability, error handling, memory management, and performance optimization from the ground up.
## Browser Support
- WebGPU: Chrome 113+, Edge 113+, Firefox with WebGPU enabled
- WebAssembly: All modern browsers
- Memory: at least 4 GB of RAM recommended for small models
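Because WebGPU availability varies across browsers, the engine can be picked at runtime with the standard `navigator.gpu` check. The `'wasm'` engine value below is an assumption; confirm the accepted identifiers in the session options.

```js
import { createSession } from 'agentary-js';

// Prefer WebGPU when the browser exposes it; otherwise fall back to
// WebAssembly ('wasm' as an engine value is assumed, not confirmed API).
const engine = 'gpu' in navigator ? 'webgpu' : 'wasm';

const session = await createSession({
  models: { chat: { name: 'onnx-community/gemma-3-270m-it-ONNX', quantization: 'q4' } },
  engine
});
```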
## Get Started
Ready to build? Check out the Getting Started guide.