SLM360
The First AI That Thinks, Learns, and Remembers On-Device
SLM360 is a complete, privacy-first AI system comprising two purpose-built foundation models (SLM360 Nano, a 6.4M-parameter encoder, and SLM360 Base, a 125M-parameter decoder), a hybrid NLU pipeline, and an on-device continual learning framework, all implemented in pure Rust with zero external ML dependencies.
Specifications
| Metric | Value |
|---|---|
| Footprint | 50MB |
| Classification latency | 39ms |
| Banking77 accuracy | 100% |
| SNIPS accuracy | 98% |
| Multi-step reasoning | <100ms |
| Memory per user | ~64KB |
Architecture
1. Tier 1: Pattern matching (<1ms) - regex, exact match, contains, fuzzy
2. Tier 2: MicroTransformer, 85K params (2-5ms) - BPE tokenizer, 1-layer transformer
3. Tier 3: SLM360 Nano, 6.4M params (<5ms) - full encoder with GQA + SwiGLU
4. Tier 4: SLM360 Base, 125M params (<50ms/tok) - causal decoder for generation
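The tiers above form an escalation cascade: each stage runs only when the cheaper one below it is not confident enough. A minimal Rust sketch of that routing logic, with stubbed tier implementations and an illustrative threshold (the real tier internals and threshold values are not documented here):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Classification {
    intent: &'static str,
    confidence: f32,
}

/// Illustrative escalation threshold, not the shipped value.
const TIER_CONFIDENCE: f32 = 0.85;

/// Tier 1: regex / exact / contains / fuzzy matching (<1ms). Stubbed.
fn tier1_patterns(query: &str) -> Option<Classification> {
    if query.contains("balance") {
        return Some(Classification { intent: "check_balance", confidence: 1.0 });
    }
    None
}

/// Tier 2: 85K-param MicroTransformer (stubbed with a fixed score).
fn tier2_micro(_query: &str) -> Classification {
    Classification { intent: "unknown", confidence: 0.40 }
}

/// Tier 3: SLM360 Nano, 6.4M-param encoder (stubbed).
fn tier3_nano(_query: &str) -> Classification {
    Classification { intent: "transfer_money", confidence: 0.93 }
}

/// Tier 4: SLM360 Base, 125M-param decoder (stubbed fallback).
fn tier4_base(_query: &str) -> Classification {
    Classification { intent: "generated_fallback", confidence: 0.50 }
}

/// Try each tier in order; stop at the first sufficiently confident answer.
fn classify(query: &str) -> Classification {
    if let Some(hit) = tier1_patterns(query) {
        return hit;
    }
    let c2 = tier2_micro(query);
    if c2.confidence >= TIER_CONFIDENCE {
        return c2;
    }
    let c3 = tier3_nano(query);
    if c3.confidence >= TIER_CONFIDENCE {
        return c3;
    }
    tier4_base(query)
}
```

A query with an exact pattern hit never touches a model, which is how the common case stays under a millisecond while hard queries still reach the 125M decoder.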
Features
- Hybrid classification: pattern matching (<1ms) + semantic embeddings (39ms) with confidence arbitration
- Multi-step reasoning engine with conditional execution, sequences, and automatic rollback in <100ms
- 5-tier SmartMemory: short-term, episodic, semantic, procedural, and meta-learning
- Predictive context engine: anticipates user needs from topic transitions, time patterns, and entity preferences
- Cross-platform deployment: Linux, macOS, Windows, WebAssembly, with iOS and Android planned
- 100% on-device processing. Privacy-first by architecture, not policy
- 50MB total footprint: ONNX model (32MB), pattern engine (8MB), reasoning (4MB), SmartMemory (2MB), cache (3MB), runtime (1MB)
- 196 tests passing with comprehensive coverage across NLU, reasoning, memory, WASM, and async
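The confidence arbitration between the pattern path and the embedding path could look like the following sketch (names and thresholds are illustrative assumptions, not the shipped API): a near-certain pattern hit wins outright, otherwise the higher-scoring route decides, so the 39ms semantic path only settles cases the sub-millisecond pattern path is unsure about.

```rust
/// A candidate intent label with its confidence score.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    label: String,
    score: f32,
}

/// Hypothetical arbitration rule between the two classification routes.
fn arbitrate(pattern: Option<Candidate>, semantic: Candidate) -> Candidate {
    match pattern {
        // An exact/regex match is treated as authoritative.
        Some(p) if p.score >= 0.99 => p,
        // Otherwise, whichever route is more confident wins.
        Some(p) if p.score >= semantic.score => p,
        _ => semantic,
    }
}
```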
Benchmarks
| Dataset | Score | Comparison |
|---|---|---|
| Banking77 | 100% | BERT-base: 93.1% |
| SNIPS | 98% | BERT-base: 98.0% |
| Forgetting Rate (with EWC) | <2% | Without EWC: 23% |
| Correction Success Rate | 87% | - |
| Energy per Query | 0.001 Wh | Cloud LLM: 0.42-29 Wh |
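The EWC row refers to Elastic Weight Consolidation, a standard continual-learning regularizer. Its penalty term is `(lambda / 2) * sum_i F_i * (w_i - w*_i)^2`, where `F_i` is the Fisher-information importance of weight `i` and `w*_i` its value after the previous task: important weights are pulled back toward their old values, unimportant ones stay free to learn. A self-contained sketch of that term (the shipped framework's actual API is not documented here):

```rust
/// Elastic Weight Consolidation penalty: (lambda / 2) * sum_i F_i * (w_i - a_i)^2.
/// `anchors` holds the weight values frozen after the previous task and
/// `fisher` their importance estimates; adding this to the new task's loss
/// is what keeps the forgetting rate low.
fn ewc_penalty(weights: &[f32], anchors: &[f32], fisher: &[f32], lambda: f32) -> f32 {
    weights
        .iter()
        .zip(anchors.iter())
        .zip(fisher.iter())
        .map(|((w, a), f)| f * (w - a) * (w - a))
        .sum::<f32>()
        * lambda
        / 2.0
}
```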
Deployment Targets
- Native (ARM/x86) with SIMD auto-detection
- WebAssembly (~300KB gzipped) for browser deployment
- Android via JNI bindings
- iOS via FFI bindings
- Minimal mode (~50KB) for MCU deployment
Models in this Family
SLM360 Nano
6.4M-parameter bidirectional encoder for intent classification and NLU. Sub-5ms latency with INT4 quantization.
6.4M parameters · 4MB INT4 · <5ms latency · 6 layers
SLM360 LensNano
869KB on-device voice + vision AI engine for smart glasses. 1.67M params, sub-10ms latency, 100% offline.
1.67M parameters · 869KB INT4 · <10ms latency · 4 layers
SLM360 Base
125M-parameter causal decoder for autoregressive text generation. Sub-50ms per token with INT4.
125M parameters · 63MB INT4 · <50ms/tok latency · 16 layers
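As a sanity check on the sizes above, INT4 stores 4 bits (half a byte) per parameter, so the raw weight footprint is simply `params / 2` bytes: 125M parameters gives 62.5MB against the listed 63MB, and 6.4M gives 3.2MB against the listed 4MB. The listed files sit at or just above the raw estimate, plausibly due to quantization scale factors, higher-precision layers, and file metadata (an assumption, not a documented breakdown).

```rust
/// Back-of-envelope INT4 weight footprint: 0.5 bytes per parameter,
/// ignoring scale factors, unquantized tensors, and file metadata.
fn int4_bytes(params: u64) -> u64 {
    params / 2
}
```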