Architecture
System Overview
Quila uses a neuron-based dynamic graph architecture with 32,768 neurons operating in parallel.
Neuron Structure
Each neuron maintains 8 channels of state:
- S1: Semantic (long-term knowledge)
- S2: Episodic (session memory)
- S3: Working (active computation)
- S4: Plan (goal representation)
- S5: Tool (external interaction)
- S6: Output (generation buffer)
- S7: Conflict (contradiction detection)
- S8: Meta (self-monitoring)
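As a rough illustration of the per-neuron layout above, the eight channels can be modelled as indices into one flat state buffer. This is only a sketch: `CHANNEL_DIM`, `make_state`, and `channel_span` are hypothetical names, and the channel width of 4 is an assumption, since this document does not state the size of a channel.

```python
from array import array
from enum import IntEnum

NUM_NEURONS = 32_768  # neuron count stated in the System Overview
CHANNEL_DIM = 4       # hypothetical channel width; not specified in this document


class Channel(IntEnum):
    """The eight per-neuron state channels (S1..S8) as zero-based indices."""
    SEMANTIC = 0   # S1: long-term knowledge
    EPISODIC = 1   # S2: session memory
    WORKING = 2    # S3: active computation
    PLAN = 3       # S4: goal representation
    TOOL = 4       # S5: external interaction
    OUTPUT = 5     # S6: generation buffer
    CONFLICT = 6   # S7: contradiction detection
    META = 7       # S8: self-monitoring


def make_state(num_neurons=NUM_NEURONS, channel_dim=CHANNEL_DIM):
    """Allocate one flat, zero-filled float32 buffer for all neurons and channels."""
    return array("f", [0.0]) * (num_neurons * len(Channel) * channel_dim)


def channel_span(neuron, channel, channel_dim=CHANNEL_DIM):
    """Return the slice of the flat buffer holding one channel of one neuron."""
    start = (neuron * len(Channel) + channel) * channel_dim
    return slice(start, start + channel_dim)
```

For example, `state[channel_span(2, Channel.PLAN)]` reads the Plan channel of neuron 2 from a buffer created with `make_state()`.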
Computation Streams
4 parallel streams process each neuron:
- Stream A: Micro-NSA (sparse attention)
- Stream B: SSM (state space model)
- Stream C: Linear Attention (RWKV-style)
- Stream D: DRC (residual correction)
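The fan-out across the four streams could be sketched as follows. The stream kernels are caller-supplied placeholders, `run_streams` is a hypothetical helper, and Python threads merely stand in for whatever hardware-level parallelism Quila actually uses. Because this section does not say how the stream outputs are combined, the sketch returns them separately.

```python
from concurrent.futures import ThreadPoolExecutor

# Stream labels and descriptions as listed above.
STREAMS = {
    "A": "Micro-NSA (sparse attention)",
    "B": "SSM (state space model)",
    "C": "Linear Attention (RWKV-style)",
    "D": "DRC (residual correction)",
}


def run_streams(kernels, neuron_state):
    """Apply every stream kernel to the same neuron state concurrently.

    `kernels` maps a stream label to a callable (hypothetical placeholders);
    the result maps each label to that kernel's output.
    """
    missing = set(STREAMS) - set(kernels)
    if missing:
        raise KeyError(f"no kernel for stream(s): {sorted(missing)}")
    with ThreadPoolExecutor(max_workers=len(STREAMS)) as pool:
        futures = {
            label: pool.submit(kernels[label], neuron_state) for label in STREAMS
        }
        return {label: fut.result() for label, fut in futures.items()}
```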
Inference Pipeline
7 phases (Phase 0 through Phase 6) execute sequentially:
- Phase 0: VQ-GAN encoding
- Phase 1: Replay (adaptive skip)
- Phase 2: LastInput (demand identification)
- Phase 3: Context re-read (HOT-NSA grading)
- Phase 4: Plan generation
- Phase 5: Think + Tool + Generate
- Phase 6: Output (Attention Merge)
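A minimal sketch of the sequential phase ordering listed above, with phase handlers left as caller-supplied placeholders. `run_pipeline` and its `skip` argument are hypothetical; `skip` only stands in for the adaptive skip noted on Phase 1, whose actual criterion is not described here.

```python
# Phase indices and names as listed above.
PHASES = [
    (0, "VQ-GAN encoding"),
    (1, "Replay"),
    (2, "LastInput"),
    (3, "Context re-read"),
    (4, "Plan generation"),
    (5, "Think + Tool + Generate"),
    (6, "Output"),
]


def run_pipeline(handlers, context, skip=frozenset()):
    """Run the phases in index order.

    `handlers` maps a phase index to a callable that takes the context and
    returns the updated context. Phases with no handler, or listed in `skip`,
    are passed over. Returns (executed_phase_indices, final_context).
    """
    executed = []
    for index, _name in PHASES:
        handler = handlers.get(index)
        if handler is None or index in skip:
            continue
        context = handler(context)
        executed.append(index)
    return executed, context
```

Calling `run_pipeline(handlers, ctx, skip={1})` would run every phase except Replay, in order.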
Memory Hierarchy
- L1 (GPU SRAM): Neuron states
- L2 (GPU HBM): WMQ, active KFE
- L3 (System RAM): SCT, cold KFE
- L4 (NVRAM): Persona vector
- L5 (NVMe): NMDB, Engram
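The tier assignments above can be captured in a small lookup table, which may help when reasoning about where a given structure lives. This is a sketch only: `MEMORY_TIERS` and `tier_of` are hypothetical names, and the table simply restates the list without adding capacities or latencies.

```python
# Tier -> (storage medium, resident structures), restating the list above.
MEMORY_TIERS = {
    "L1": ("GPU SRAM", ["Neuron states"]),
    "L2": ("GPU HBM", ["WMQ", "active KFE"]),
    "L3": ("System RAM", ["SCT", "cold KFE"]),
    "L4": ("NVRAM", ["Persona vector"]),
    "L5": ("NVMe", ["NMDB", "Engram"]),
}


def tier_of(item):
    """Return the tier label (e.g. "L2") whose resident list contains `item`."""
    for tier, (_medium, residents) in MEMORY_TIERS.items():
        if item in residents:
            return tier
    raise KeyError(item)
```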