Memory Classification Engine

Don't remember everything. Remember what matters.
_{Real-time memory classification for AI Agents. 60%+ zero LLM cost.}

中文 · 日本語 · Roadmap · Issues

The Problem: Your Agent Forgets, or Remembers Too Much

Every AI agent has the same memory problem.

Option A: Save nothing. Each new session starts from zero. Users repeat themselves. The agent makes mistakes it already made.

Option B: Save everything. Dump every conversation into a vector DB as a summary. Works at first. After 50 sessions, retrieval returns vague noise because signal is drowned out by noise. Cost scales linearly with message count (every message = one LLM call).

The root cause: most systems don't classify before they store. They treat a user's preference ("use double quotes"), a decision ("we chose PostgreSQL"), and small talk ("nice weather") identically. One big blob.

How MCE Is Different

MCE classifies every message in real time before storing:

Message: "That last approach was too complex, let's go simpler"

Traditional system:
  → Stores as summary fragment: "discussed approach complexity"
  → Lost context: this was a REJECTION of a past decision
  → Search result: buried in 50 other summaries, low relevance

MCE:
  → [correction] "Rejected previous complex approach, prefers simplicity"
  → Auto-linked to decision_001 (the original complex plan)
  → Confidence: 0.89 | Source: pattern analysis | Tier: episodic
  → LLM cost: $0 (matched at Layer 2)

One message. The traditional system stores noise. MCE stores an actionable, typed, cross-linked memory with zero LLM cost.

Cost per 1,000 messages:

Approach	LLM Calls	Cost
Summarize everything	1,000	$0.50 - $2.00
MCE	<100	$0.05 - $0.20

Three-Layer Pipeline: Cheap First, Expensive Last

Incoming Message
       │
       ▼
┌─────────────────────┐   60%+ of messages   │  Zero cost
│ Layer 1: Rule Match  │   handled here      │  Regex + keywords
│   "remember", "always..."│                   │  Deterministic
└──────────┬──────────┘
           │ Unmatched
           ▼
┌─────────────────────┐   30%+ of messages   │  Still zero LLM
│ Layer 2: Pattern    │   handled here       │  Conversation structure
│   Analysis          │                     │  "3rd rejection = preference"
└──────────┬──────────┘
           │ Unmatched
           ▼
┌─────────────────────┐   <10% of messages   │  LLM fallback
│ Layer 3: Semantic   │   reach here         │  Ambiguous edge cases
│   Inference         │                     │
└─────────────────────┘

Most solutions start at Layer 3. MCE starts at Layer 1 and escalates only when needed. That's why 60%+ of classification costs nothing.

Quick Start

pip install memory-classification-engine

No database. No API key. No configuration. Works out of the box.

Try It in 30 Seconds

from memory_classification_engine import MemoryClassificationEngine

engine = MemoryClassificationEngine()

# Scenario 1: User rejects a previous approach (implicit correction)
engine.process_message(
    "That last approach was too complex, let's go simpler"
)
# → [correction] Rejected previous complex approach
#   confidence: 0.89, source: pattern, tier: episodic
#   linked to: decision_001 (auto)

# Scenario 2: Frustration reveals a recurring pain point
engine.process_message(
    "We always have to test before deploying, this process is so tedious"
)
# → [sentiment_marker] Frustration with deployment process
#   implied_pattern: test-before-deploy (auto-extracted)

# Scenario 3: Team roles in one sentence
engine.process_message(
    "Alice owns the backend, Bob does frontend, I oversee architecture"
)
# → [relationship] Alice→backend, Bob→frontend, User→arch lead
#   confidence: 0.95, tier: semantic

Recall Across Sessions

This is what users actually feel: opening a new conversation and having their agent remember what matters.

from memory_classification_engine import MemoryOrchestrator

memory = MemoryOrchestrator()

# ... after using for a week ...

# New session starts — load relevant memories
memories = memory.recall(context="coding", limit=5)
for m in memories:
    print(f"[{m['type']}] {m['content']} (conf: {m['confidence']}, src: {m['source']})")
# Output:
# [user_preference] Use double quotes, not single (conf: 0.95, src: rule)
# [decision] Project uses Python, not Go (conf: 0.91, src: rule)
# [relationship] Alice handles backend API (conf: 0.88, src: semantic)
# [correction] No over-engineering — keep it simple (conf: 0.89, src: pattern)
# [fact_declaration] Prod runs on Ubuntu 22.04 (conf: 0.92, src: rule)
#
# Stats: 5 loaded | 12 noise filtered | 0 LLM calls

MCP Server: Claude Code Integration in 2 Minutes

MCE ships with a built-in MCP server. This is the fastest way to use it with Claude Code, Cursor, or any MCP-compatible tool.

cd mce-mcp
python3 server.py
# MCP server running on http://localhost:9001

Add to your Claude Code config (~/.claude/settings.json):

{
  "mcpServers": {
    "mce": {
      "command": "python3",
      "args": ["/path/to/mce-mcp/server.py"]
    }
  }
}

Available tools: classify_message, retrieve_memories, search_memories, get_memory_timeline, get_memory_details, store_memory, get_memory_stats, delete_memory, update_memory, export_memories, import_memories.

Every message you send in Claude Code can be classified and stored automatically. Every new session starts with a structured recall of your memories.

See BETA_TESTING_GUIDE_EN.md for full setup instructions.

What Gets Classified: 7 Memory Types

Type	Example	Stored Where
user_preference	"I prefer spaces over tabs"	Tier 2: Procedural (active)
correction	"No, do it like this instead"	Tier 3: Episodic (linked)
fact_declaration	"We have 100 employees"	Tier 3: Episodic (verified)
decision	"Let's go with Redis for caching"	Tier 3: Episodic (high priority)
relationship	"Alice handles backend"	Tier 4: Semantic (graph)
task_pattern	"Always test before deploy"	Tier 2: Procedural (auto)
sentiment_marker	"This workflow is frustrating"	Tier 3: Episodic (low priority)

Not every message produces a memory. Chit-chat, acknowledgments ("OK", "thanks"), and low-signal content are filtered out before storage.

Self-Evolving: Patterns Become Rules Over Time

The engine gets cheaper and more accurate the longer it runs.

Time	Layer 1 (Rules)	Layer 2 (Patterns)	Layer 3 (LLM)	Cost/1k msgs
Week 1	30% hit rate	40%	30%	$0.15
Week 4	50% (+20 auto-rules)	35%	15%	$0.08 (-47%)
Month 3	65% (+50 auto-rules)	25%	10%	$0.05 (-67%)

Auto-generated rules look like this:

# System seed (day one):
- pattern: "remember.*prefer"
  type: user_preference

# Learned after 1 month of use:
- pattern: "too complex.*simpler"
  type: correction
  source: learned_from_user_behavior

- pattern: "always have to.*tedious"
  type: sentiment_marker
  source: learned_from_user_behavior

Your usage patterns become free classification rules. No manual tuning required.

Comparison

Feature	Mem0	MemGPT	LangChain Memory	claude-mem	MCE
When to write	Post-conversation	Context window	Manual/Hooks	Full recording	Real-time, per-message
Classification	Basic tags	None	None	None (all observations)	7 types + 3-layer pipeline
Storage tiers	1 (vector)	2 (mem + disk)	1 (session)	1 (SQLite + Chroma)	4 tiers (working / procedural / episodic / semantic)
Forgetting	None	Passive overflow	None	AI compression	Active decay + Nudge review
Learning	Static	None	None	None	Patterns auto-promote to rules
LLM cost	Per-message	Medium	Low	High (compression)	60%+ classified at zero cost
Cross-session	Export only	None	None	Yes	Structured migration standard
MCP support	No	No	No	No	Built-in MCP Server
High-level API	No	No	Basic	No	MemoryOrchestrator (learn/recall/export/import)
Retrieval model	Full content	Full content	Full content	Progressive disclosure	Progressive disclosure + typed memories

Four-Tier Storage

Tier	Name	Storage	Lifecycle
T1	Working Memory	Context window (LLM-native)	Current session only
T2	Procedural	Config files / system prompts	Long-term, always loaded
T3	Episodic	Vector store (ChromaDB / SQLite)	Weighted decay over time
T4	Semantic	Knowledge graph (Neo4j / in-memory)	Long-term, cross-linked

Core dependency: only PyYAML. Vector DBs, graph DBs, and LLM are all optional extensions.

Performance

Metric	Result
Message processing (Layer 1/2)	~10ms
Message processing (Layer 3)	<500ms
Retrieval latency	~15ms
Concurrent throughput	626 msg/s
Memory compression	87-90% noise reduction
Memory footprint	<100MB (basic mode)
LLM call ratio	<10%

Tech Stack

Component	Default	Alternative
Rule engine	YAML + Regex	JSON Schema
Vector store (T3)	ChromaDB	Qdrant, Milvus
Knowledge graph (T4)	In-memory	Neo4j
Semantic classifier (L3)	Small model API	Ollama local model
Agent adapters	Standalone SDK	Plugin extension

Project Structure

memory-classification-engine/
├── mce-mcp/                    # MCP Server (Claude Code / Cursor integration)
│   ├── server.py               #   Server entry point
│   ├── tools/                  #   MCP tool implementations
│   └── config.yaml             #   Server configuration
│
├── src/memory_classification_engine/
│   ├── engine.py               # Core coordinator
│   ├── layers/                 # 3-layer pipeline
│   │   ├── rule_matcher.py     #   Layer 1: Rule matching
│   │   ├── pattern_analyzer.py #   Layer 2: Structure analysis
│   │   └── semantic_classifier.py # Layer 3: LLM fallback
│   ├── storage/                # Tiered storage (T2-T4)
│   ├── orchestrator.py         # MemoryOrchestrator (high-level API)
│   ├── privacy/
│   └── utils/
│
├── examples/                   # Ready-to-run examples
├── tests/                      # Test suite
├── config/rules.yaml           # Classification rules
├── setup.py                    # PyPI package config
└── README.md

Installation Options

# Core (classification engine only)
pip install memory-classification-engine

# With RESTful API server
pip install -e ".[api]"

# With LLM-based semantic classification (Layer 3)
pip install -e ".[llm]"
export MCE_LLM_API_KEY="your-key"
export MCE_LLM_ENABLED=true

# Run tests
pip install -e ".[testing]"
pytest

License

MIT

Links

Repository: github.com/lulin70/memory-classification-engine
Roadmap: ROADMAP.md
Beta Testing Guide: BETA_TESTING_GUIDE_EN.md
MCP Setup for Claude Code: docs/claude_code_mcp_config.md
Issues / Discussions

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
.github/workflows		.github/workflows
.trae/documents		.trae/documents
config		config
dashboard		dashboard
demo		demo
docs		docs
examples		examples
mce-mcp		mce-mcp
scripts		scripts
src		src
tests		tests
vscode-extension		vscode-extension
.flake8		.flake8
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MCE_MODEL_STATUS.md		MCE_MODEL_STATUS.md
Makefile		Makefile
README-JP.md		README-JP.md
README-ZH.md		README-ZH.md
README.md		README.md
ROADMAP-JP.md		ROADMAP-JP.md
ROADMAP-ZH.md		ROADMAP-ZH.md
ROADMAP.md		ROADMAP.md
requirements.txt		requirements.txt
setup.py		setup.py
start_mcp.sh		start_mcp.sh
start_mcp_offline.py		start_mcp_offline.py
start_mcp_server.sh		start_mcp_server.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Memory Classification Engine

The Problem: Your Agent Forgets, or Remembers Too Much

How MCE Is Different

Three-Layer Pipeline: Cheap First, Expensive Last

Quick Start

Try It in 30 Seconds

Recall Across Sessions

MCP Server: Claude Code Integration in 2 Minutes

What Gets Classified: 7 Memory Types

Self-Evolving: Patterns Become Rules Over Time

Comparison

Four-Tier Storage

Performance

Tech Stack

Project Structure

Installation Options

License

Links

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Memory Classification Engine

The Problem: Your Agent Forgets, or Remembers Too Much

How MCE Is Different

Three-Layer Pipeline: Cheap First, Expensive Last

Quick Start

Try It in 30 Seconds

Recall Across Sessions

MCP Server: Claude Code Integration in 2 Minutes

What Gets Classified: 7 Memory Types

Self-Evolving: Patterns Become Rules Over Time

Comparison

Four-Tier Storage

Performance

Tech Stack

Project Structure

Installation Options

License

Links

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages