Introduction

Agentary JS

A lightweight JavaScript SDK for building agentic workflows with tool calling, memory, and multi-step reasoning.

Features

  • 🚀 Flexible Inference - Run models on-device or via cloud providers
  • ⚡ WebGPU Acceleration - Leverage WebGPU for high-performance on-device inference via transformers.js
  • ☁️ Cloud Provider Support - Integrate with cloud LLMs via secure proxy pattern
  • 🤖 Agentic Workflows - Create and execute complex multi-step agent workflows
  • 🧠 Memory Management - Smart context compression and pruning for long conversations
  • 🛠️ Function Calling - Built-in support for tool/function calling
  • 📊 Multi-Provider Support - Mix device and cloud models in the same application
  • 📡 Lifecycle Events - Built-in event system for monitoring and debugging

Quick Start

Installation

For cloud-only usage:

npm install agentary-js

For on-device inference, install the peer dependency:

npm install agentary-js @huggingface/transformers

Note: @huggingface/transformers is only required for on-device inference. Cloud-only users can skip this dependency.

Basic Example with On-Device Inference

import { createSession } from 'agentary-js';
 
const session = await createSession({
  models: [{
    runtime: 'transformers-js',
    model: 'onnx-community/Qwen3-0.6B-ONNX',
    quantization: 'q4',
    engine: 'webgpu'
  }]
});
 
const response = await session.createResponse('onnx-community/Qwen3-0.6B-ONNX', {
  messages: [{ role: 'user', content: 'Hello!' }]
});
 
if (response.type === 'streaming') {
  for await (const chunk of response.stream) {
    process.stdout.write(chunk.token);
  }
}
 
await session.dispose();

Cloud Provider Example

const session = await createSession({
  models: [{
    runtime: 'openai',
    model: 'gpt-5-nano',
    proxyUrl: 'https://your-backend.com/api/openai',
    modelProvider: 'openai'
  }]
});

Why Agentary JS?

Flexible Deployment

Choose between on-device inference (WebGPU/WASM) for privacy and zero server costs, or cloud providers for maximum performance. Mix and match based on your needs.
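For instance, a single session can register a device model and a cloud model side by side. A minimal sketch, reusing the configuration fields from the examples above (the proxy URL is a placeholder for your own backend):

```javascript
// Model list mixing on-device and cloud inference; pass this to createSession().
// Field values follow the Quick Start examples; the proxy URL is a placeholder.
const models = [
  {
    runtime: 'transformers-js',          // on-device via transformers.js
    model: 'onnx-community/Qwen3-0.6B-ONNX',
    quantization: 'q4',
    engine: 'webgpu'
  },
  {
    runtime: 'openai',                   // cloud, behind your own proxy
    model: 'gpt-5-nano',
    proxyUrl: 'https://your-backend.com/api/openai',
    modelProvider: 'openai'
  }
];
```

Create the session with `await createSession({ models })`, then select a model per request by its model id, as in the Quick Start example.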

Secure Cloud Integration

Cloud providers use a secure proxy pattern - your API keys stay on your backend, never exposed to the browser.
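A proxy can be as small as one backend route that injects the key server-side. The sketch below is illustrative, not part of Agentary JS: the route path, upstream URL, and `OPENAI_API_KEY` variable name are assumptions you would adapt to your deployment.

```javascript
import http from 'node:http';

// Build the upstream request, attaching the secret key server-side.
// The browser only ever talks to this proxy and never sees the key.
function buildUpstreamRequest(apiKey, body) {
  return {
    url: 'https://api.openai.com/v1/chat/completions', // assumed upstream
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`
    },
    body
  };
}

const server = http.createServer(async (req, res) => {
  if (req.method !== 'POST' || req.url !== '/api/openai') {
    res.writeHead(404).end();
    return;
  }
  let raw = '';
  for await (const chunk of req) raw += chunk; // collect the request body
  const upstream = buildUpstreamRequest(process.env.OPENAI_API_KEY, raw);
  const reply = await fetch(upstream.url, {
    method: 'POST',
    headers: upstream.headers,
    body: upstream.body
  });
  res.writeHead(reply.status, { 'Content-Type': 'application/json' });
  res.end(await reply.text());
});

// server.listen(3000); // start only in your deployment entry point
```

The `proxyUrl` in your model config then points at this route instead of the provider's API.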

Agentic Workflows

Build sophisticated multi-step AI agents that can think, plan, use tools, and adapt - with device or cloud models.

Production Ready

Built with observability, error handling, memory management, and performance optimization from the ground up.

Browser Support

  • WebGPU: Chrome 113+, Edge 113+, Firefox with WebGPU enabled
  • WebAssembly: All modern browsers
  • Memory: 4GB RAM or more recommended for small models
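You can pick the engine at runtime with a feature check on `navigator.gpu`, which is defined only in WebGPU-capable browsers. The `pickEngine` helper below is illustrative, not part of the SDK:

```javascript
// Choose the inference engine from a WebGPU capability check.
// Takes the browser's `navigator` object (or undefined outside a browser).
function pickEngine(nav) {
  return nav && 'gpu' in nav ? 'webgpu' : 'wasm';
}

// In the browser, feed the result into the model config:
// const engine = pickEngine(navigator);
// createSession({ models: [{ runtime: 'transformers-js', model: '...', engine }] });
```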

Get Started

Ready to build? Check out the Getting Started guide.

Community