Architecture Overview

Memoir implements a layered architecture with clear separation of concerns and dependency injection. Storage, classification, and search are independent components wired together by an orchestration layer, which keeps the system performant, maintainable, and easy to extend.

Core Principles

  1. Git-like Versioning: Every memory change is tracked with cryptographic integrity
  2. Semantic Paths: Replace UUID keys with meaningful hierarchical paths
  3. Memory Aggregation: Group related memories at semantic locations
  4. Dependency Injection: Clean separation between storage, classification, and search
  5. Performance First: O(log n) lookups instead of expensive vector operations

System Architecture

flowchart TB
    MM["Memory Manager<br/>(Orchestration Layer)"]
    S[Storage Layer]
    C[Classification Layer]
    SE[Search Engine]
    PT[ProllyTree]
    TX[Taxonomy System]
    PS[Path Selection]
    MM --> S
    MM --> C
    MM --> SE
    S --> PT
    C --> TX
    SE --> PS

Layer Details

1. Storage Layer (memoir.store)

The storage layer provides pure data persistence without business logic:

  • ProllyTreeStore: Git-like versioned key-value storage
  • Memory Aggregation: Groups memories at semantic paths
  • Cryptographic Integrity: SHA-256 hashing for all operations
  • Efficient Queries: O(log n) prefix searches
from memoir.store.prolly_adapter import ProllyTreeStore

store = ProllyTreeStore(
    path="./memory_store",      # on-disk location of the tree
    enable_versioning=True,     # record every change as a commit
    cache_size=10000            # entries kept in an in-memory cache
)
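
A usage sketch for the store. The put, get, and prefix_search method names below are assumptions for illustration, not confirmed API:

# Hypothetical method names, shown to illustrate the storage contract
store.put("profile.professional.occupation", {"content": "I work at TechCorp"})
record = store.get("profile.professional.occupation")

# Prefix queries exploit the tree's sorted keys for O(log n) lookups
matches = store.prefix_search("profile.professional.")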

2. Classification Layer (memoir.classifier)

Handles semantic classification of memories into hierarchical paths:

  • SemanticClassifier: Fast pattern-based classification (1-5ms)
  • IntelligentClassifier: LLM-powered with dynamic taxonomy expansion
  • Confidence Thresholds: Configurable acceptance criteria
  • Multi-stage Pipeline: Pattern matching → LLM → Expansion
from memoir.classifier.intelligent import IntelligentClassifier

classifier = IntelligentClassifier(
    llm=llm,
    confidence_thresholds={
        "high": 0.8,    # Auto-store
        "medium": 0.5,  # Review
        "low": 0.0      # Reject threshold
    }
)
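
A sketch of invoking the classifier and applying the thresholds above; the classify call and its result fields are assumptions, not confirmed API:

# Hypothetical call and result shape, shown for illustration
result = classifier.classify("I work at TechCorp")
if result.confidence >= 0.8:    # "high": store without review
    print(result.path)          # e.g. profile.professional.occupation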

3. Search Engine Layer (memoir.search)

Provides intelligent memory retrieval capabilities:

  • IntelligentSearchEngine: LLM-powered path selection
  • Multi-strategy: Breadth-first, depth-first, best-match
  • Relevance Scoring: Combined semantic and structural scoring
# Intelligent LLM-powered search
from memoir.search.intelligent import IntelligentSearchEngine
search_engine = IntelligentSearchEngine(llm=llm, store=store)
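
A query sketch; the search signature below is an assumption for illustration:

# Hypothetical query call, shown for illustration
results = search_engine.search("user job", max_results=5)
for hit in results:
    print(hit.path, hit.score)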

4. Memory Manager (memoir.core)

Orchestrates all components with proper dependency injection:

  • Dependency Injection: Clean separation of concerns
  • Transaction Management: Atomic operations
  • Version Control: Branching, merging, rollback
  • Performance Monitoring: Built-in metrics
from memoir.core.memory import ProllyTreeMemoryStoreManager

memory_manager = ProllyTreeMemoryStoreManager(
    prolly_store=store,          # Injected dependency
    classifier=classifier,       # Injected dependency
    search_engine=search_engine  # Injected dependency
)
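
Because dependencies are injected rather than constructed internally, any layer can be swapped without touching the manager. A sketch, where the memoir.classifier.semantic module path is an assumption based on the layer descriptions above:

# Swap the LLM classifier for the fast pattern-based one (module path assumed)
from memoir.classifier.semantic import SemanticClassifier

fast_manager = ProllyTreeMemoryStoreManager(
    prolly_store=store,
    classifier=SemanticClassifier(),   # 1-5ms pattern matching, no LLM calls
    search_engine=search_engine
)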

Data Flow

Storage Flow:

flowchart LR
    A["Memory Input<br/>I work at X"] --> B["Classification<br/>Classifier analysis"]
    B --> C["Path Selection<br/>profile.professional.occupation"]
    C --> D["Aggregation<br/>with similar memories"]
    D --> E["Storage<br/>ProllyTree"]

Retrieval Flow:

flowchart LR
    A["Query<br/>user job"] --> B["Path Selection<br/>profile.* paths"]
    B --> C["Storage Lookup<br/>Tree Search"]
    C --> D["Aggregation<br/>Collect memories"]
    D --> E["Results<br/>Ranked"]
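
In code, the two flows reduce to a store/search round trip through the manager. A minimal sketch, assuming store_memory and search method names (not confirmed API):

# Storage flow: classify -> select path -> aggregate -> persist
memory_manager.store_memory("I work at TechCorp")

# Retrieval flow: select candidate paths -> tree lookup -> collect -> rank
for result in memory_manager.search("user job"):
    print(result.path, result.content)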

Memory Aggregation

Memories are aggregated at semantic paths rather than stored individually:

Traditional Approach:

uuid-1234-5678 → "I work at TechCorp"
uuid-9876-5432 → "I'm a software engineer"
uuid-1111-2222 → "I've been coding for 5 years"

Memoir Approach:

profile.professional.occupation → {
  "memories": [
    {"content": "I work at TechCorp", "confidence": 0.95},
    {"content": "I'm a software engineer", "confidence": 0.87},
    {"content": "I've been coding for 5 years", "confidence": 0.82}
  ],
  "count": 3,
  "last_updated": "2024-01-15"
}
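
The aggregate record above can be maintained with a simple append-and-update step. A self-contained sketch of the idea in plain Python (not Memoir's internal code):

from datetime import date

def aggregate(records: dict, path: str, content: str, confidence: float) -> None:
    # All memories classified to the same path share one aggregate record
    record = records.setdefault(path, {"memories": [], "count": 0})
    record["memories"].append({"content": content, "confidence": confidence})
    record["count"] += 1
    record["last_updated"] = date.today().isoformat()

records: dict = {}
aggregate(records, "profile.professional.occupation", "I work at TechCorp", 0.95)
aggregate(records, "profile.professional.occupation", "I'm a software engineer", 0.87)
print(records["profile.professional.occupation"]["count"])  # 2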

Taxonomy System

The taxonomy system provides hierarchical organization:

Fixed Taxonomy (memoir.taxonomy.semantic):

  • ~200 predefined paths
  • Fast pattern matching (see the sketch after the example paths)
  • Consistent organization

Dynamic Taxonomy (memoir.taxonomy.iterative):

  • LLM-driven expansion
  • Automatic growth
  • Context-aware paths

Example Paths:

profile.
├── identity.
│   ├── name.{first,last,full}
│   └── demographics.{age,location}
├── professional.
│   ├── occupation.{role,company}
│   └── skills.{technical,soft}
└── personal.
    ├── interests.{hobbies,sports}
    └── relationships.{family,friends}
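
To make the fixed-taxonomy stage concrete, here is a self-contained toy version of pattern-based classification; the patterns and paths are invented for illustration:

import re

# Toy subset of a fixed taxonomy: regex pattern -> semantic path
PATTERNS = {
    r"\bwork(s|ed|ing)? (at|for)\b": "profile.professional.occupation.company",
    r"\b(engineer|developer|designer)\b": "profile.professional.occupation.role",
    r"\bmy name is\b": "profile.identity.name.full",
}

def classify(text: str) -> str | None:
    for pattern, path in PATTERNS.items():
        if re.search(pattern, text, re.IGNORECASE):
            return path
    return None  # no pattern matched: fall through to the LLM stage

print(classify("I work at TechCorp"))  # profile.professional.occupation.company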

Performance Characteristics

Search Performance:

  • Semantic Search: 0.1-1ms average latency
  • Intelligent Search: 100-500ms (includes LLM calls)
  • Traditional Vector Search: 150-750ms

Storage Performance:

  • Memory Classification: 1-5ms (pattern) / 100-500ms (LLM)
  • Storage Operations: 20-30ms
  • Version Control Ops: 50-100ms

Scalability:

  • Memory Count: Tested up to 1M memories
  • Path Depth: Up to 8 levels deep
  • Concurrent Users: Ready for horizontal scaling

Version Control

Git-like operations for memory management:

main branch
├─ commit: "Initial user profile"
│  └─ memories: profile.identity.*
├─ commit: "Added work info"
│  └─ memories: profile.professional.*
└─ branch: experiment
   ├─ commit: "Testing new classifier"
   └─ merge → main
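
Driven through the manager, the history above might look like this; every method name here is an assumption for illustration, not confirmed API:

# Hypothetical git-like calls, shown for illustration
memory_manager.commit("Initial user profile")

memory_manager.branch("experiment")    # fork the current memory state
memory_manager.commit("Testing new classifier")

memory_manager.checkout("main")
memory_manager.merge("experiment")     # fold the experiment back into main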

Extensions and Plugins

Memento Collections (memoir.memento):

  • LocationMemento: Spatial/geographic memories
  • TimelineMemento: Temporal/chronological memories
  • ProfileMemento: Identity/personal memories

Custom Extensions:

  • Custom classifiers (sketched below)
  • Custom search engines
  • Custom taxonomy systems
  • Custom storage backends
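
As an example of the classifier extension point, here is a self-contained sketch of a drop-in classifier, assuming the manager accepts any object exposing a classify method (the real interface may differ):

from dataclasses import dataclass

@dataclass
class Classification:
    path: str
    confidence: float

class KeywordClassifier:
    """Toy drop-in classifier: routes food mentions to a single path."""

    def classify(self, text: str) -> Classification:
        if "pizza" in text.lower():
            return Classification("personal.interests.food", 0.9)
        return Classification("unclassified", 0.0)

# Injected exactly like the built-in classifiers:
# ProllyTreeMemoryStoreManager(prolly_store=store,
#                              classifier=KeywordClassifier(),
#                              search_engine=search_engine)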

This architecture enables Memoir to provide fast, reliable, and scalable semantic memory management while maintaining clean code organization and extensibility.