Unified Knowledge Library System - Proposal
Problem: Knowledge fragmented across brain/, docs/, architecture/, with inconsistent filing and risk of loss
Goal: Single source of truth with flat hierarchy, meaningful topics, cross-referencing, and vector search
Current State Analysis
Scattered Knowledge Locations
Within repository (singular-dream/):
- brain/ - Ephemeral session artifacts (risk of loss)
- docs/ - Mixed purpose (user docs, research, guides) - 111 files
- architecture/ - Architecture decisions, thinking, standards - 203 files
- architecture/THINKING/ - Research, proposals
- architecture/GOLDEN/ - Canonical architecture
- architecture/STANDARDS/ - Standards documents
- architecture/TRASH/ - Archived/deprecated
- Various README.md files throughout codebase
Beyond repository boundary (SD-HOA/):
- SD-HOA/README.md - Marketing website documentation (9,957 bytes)
- SD-HOA/*.html - Marketing website files (index, visualizations)
- SD-HOA/*.css - Styling documentation
- SD-HOA/*.js - Interactive features documentation
- Other project-level documentation
Total scope:
- Repository: 118+ markdown files found at depth 2
- Parent directory: Additional documentation and web assets
- Multi-repository knowledge spanning different contexts
Problems Identified
- Misplaced artifacts - Brain directory used for permanent research
- Inconsistent filing - Same type of content in different places
- Hard to discover - No central index or search
- Risk of loss - Ephemeral directories for permanent knowledge
- Cognitive overload - Too many places to look
- No cross-referencing - Documents exist in isolation
- Multi-repository fragmentation - Knowledge split between parent and repo
- Boundary confusion - Unclear what belongs where
Proposed Solution: The Knowledge Library
Core Principle
"Library Floor 1, Shelf 16" Model:
- One central location for all knowledge
- Flat but meaningful topic organization
- Consistent cross-referencing
- README navigation at every level
- Vector search for discovery
Directory Structure
knowledge/
├── README.md # Master index & search guide
├── architecture/ # System architecture
│ ├── README.md
│ ├── golden/ # Canonical architecture (current state)
│ ├── proposed/ # Future architecture (planning)
│ ├── retired/ # Deprecated architecture (history)
│ └── decisions/ # ADRs (Architecture Decision Records)
├── standards/ # Standards & practices
│ ├── README.md
│ ├── coding/ # Code standards
│ ├── processes/ # Development processes
│ ├── procedures/ # Step-by-step procedures
│ └── conventions/ # Naming, formatting, etc.
├── research/ # Research & thinking
│ ├── README.md
│ ├── ai-development/ # AI & development research
│ ├── technical/ # Technical research
│ └── business/ # Business research
├── guides/ # How-to guides
│ ├── README.md
│ ├── development/ # Developer guides
│ ├── deployment/ # Deployment guides
│ ├── operations/ # Operations guides
│ └── user/ # User documentation
├── processes/ # Business processes
│ ├── README.md
│ ├── governance/ # Governance processes
│ ├── finance/ # Financial processes
│ ├── operations/ # Operational processes
│ └── property/ # Property management
├── reference/ # Reference materials
│ ├── README.md
│ ├── apis/ # API documentation
│ ├── tools/ # Tool documentation
│ ├── integrations/ # Integration docs
│ └── glossary.md # Terms & definitions
└── sessions/ # Session artifacts
    ├── README.md
    ├── 2026-01/ # Organized by month
    │ ├── m2-ai-implementation/ # Session topic
    │ └── devops-architecture/ # Session topic
    └── index.md # Searchable session index
Multi-Repository Strategy
The Boundary Question
Current structure:
SD-HOA/ # Parent directory (Antigravity boundary)
├── README.md # Marketing website docs
├── index.html, *.css, *.js # Marketing website
└── singular-dream/ # Monorepo
    ├── knowledge/ # NEW: Knowledge library
    ├── docs/ # Current docs
    └── architecture/ # Current architecture
Key decision: Where should the knowledge library live?
Option 1: Repository-Scoped (Recommended)
Location: singular-dream/knowledge/
Scope:
- All monorepo-related knowledge
- Architecture, standards, guides
- Development documentation
- Session artifacts
Benefits:
- ✅ Version controlled with code
- ✅ Part of monorepo structure
- ✅ Clear ownership
- ✅ Portable with repository
Limitations:
- ❌ Doesn't include parent-level docs
- ❌ Marketing website docs separate
Option 2: Parent-Scoped
Location: SD-HOA/knowledge/
Scope:
- All project knowledge (monorepo + marketing)
- Cross-repository documentation
- Project-level decisions
Benefits:
- ✅ Single source of truth for all knowledge
- ✅ Includes marketing website docs
- ✅ Project-level view
Limitations:
- ❌ Not version controlled with monorepo
- ❌ Harder to port repository
- ❌ Unclear ownership
Recommendation: Hybrid Approach
Primary library: singular-dream/knowledge/ (repository-scoped)
Cross-reference: Link to parent-level docs when needed
Structure:
SD-HOA/
├── README.md # Marketing website (stays here)
├── marketing/ # NEW: Organize marketing assets
│ ├── index.html
│ ├── styles.css
│ └── script.js
└── singular-dream/
    └── knowledge/
        ├── README.md # Master index
        ├── architecture/
        ├── standards/
        ├── research/
        ├── guides/
        ├── processes/
        ├── reference/
        │ └── marketing-website.md # Links to ../../../marketing/
        └── sessions/
Benefits:
- ✅ Repository knowledge version-controlled
- ✅ Marketing website organized but separate
- ✅ Cross-references maintain connections
- ✅ Clear boundaries and ownership
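Cross-references that leave the repository depend on getting the relative path right. The sketch below shows one way the `../../../marketing/` link in the tree above can be derived. `relativeLink` is a hypothetical helper written for illustration; Node's built-in `path.posix.relative` would do the same job.

```typescript
// Hypothetical helper: relative path from one directory to another,
// both given as forward-slash paths from the same root.
function relativeLink(fromDir: string, toDir: string): string {
  const from = fromDir.split('/').filter(Boolean);
  const target = toDir.split('/').filter(Boolean);

  // Skip the leading segments the two paths have in common.
  let common = 0;
  while (
    common < from.length &&
    common < target.length &&
    from[common] === target[common]
  ) {
    common++;
  }

  // Climb out of what remains of `from`, then descend into `target`.
  const ups = from.slice(common).map(() => '..');
  return ups.concat(target.slice(common)).join('/') || '.';
}

const marketingLink = relativeLink(
  'SD-HOA/singular-dream/knowledge/reference',
  'SD-HOA/marketing',
);
console.log(marketingLink);
```

For the layout above this yields `../../../marketing`, matching the comment next to `marketing-website.md`.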
Design Principles
1. Flat But Meaningful
Avoid deep nesting:
- Maximum 3 levels deep
- Clear topic separation at top level
- Subtopics only when necessary
Example:
✅ Good: knowledge/architecture/golden/monorepo-structure.md
❌ Bad: knowledge/architecture/systems/backend/structure/monorepo/design.md
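The depth limit can be checked mechanically. This is a minimal sketch, assuming "levels" means directories below `knowledge/` (the file itself is not counted), which is the reading that keeps the proposed `sessions/2026-01/{topic}/` layout within the limit. `directoryDepth` and `isTooDeep` are hypothetical names.

```typescript
// Assumed limit, taken from the "Maximum 3 levels deep" rule.
const MAX_DEPTH = 3;

// Counts directory levels beneath knowledge/; the file name is not counted.
function directoryDepth(docPath: string): number {
  const segments = docPath.split('/').filter(Boolean);
  const root = segments.indexOf('knowledge');
  if (root === -1) throw new Error(`Not inside knowledge/: ${docPath}`);
  // Drop the knowledge/ segment itself and the trailing file name.
  return segments.length - root - 2;
}

function isTooDeep(docPath: string): boolean {
  return directoryDepth(docPath) > MAX_DEPTH;
}

console.log(isTooDeep('knowledge/architecture/golden/monorepo-structure.md'));
console.log(isTooDeep('knowledge/architecture/systems/backend/structure/monorepo/design.md'));
```

With these two inputs the first path passes and the second is flagged, mirroring the Good and Bad examples.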
2. State-Based Organization
Architecture example:
- golden/ - Current canonical state
- proposed/ - Future planned state
- retired/ - Historical deprecated state
Benefits:
- Clear lifecycle management
- Easy to find current vs. future
- Historical context preserved
3. Topic-Based, Not Type-Based
Organize by topic, not document type:
✅ Good: knowledge/architecture/golden/
knowledge/architecture/proposed/
knowledge/architecture/retired/
❌ Bad: knowledge/markdown-files/
knowledge/diagrams/
knowledge/spreadsheets/
4. README Navigation
Every directory has README.md:
# Architecture
**Purpose**: System architecture documentation
**Contents**:
- `golden/` - Current canonical architecture
- `proposed/` - Planned future architecture
- `retired/` - Deprecated architecture
- `decisions/` - Architecture Decision Records (ADRs)
**Related**:
- [Standards](../standards/) - Coding standards
- [Guides](../guides/development/) - Developer guides
- [Master Index](../README.md) - Full library index
**Search**: Use vector search for "architecture" topics
Cross-Referencing System
Document Frontmatter
Every document includes:
---
title: Monorepo Structure
topic: architecture
state: golden
tags: [monorepo, turborepo, structure]
related:
- knowledge/standards/coding/monorepo-conventions.md
- knowledge/guides/development/monorepo-setup.md
created: 2026-01-15
updated: 2026-01-18
---
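To show how tooling might read this block, here is a deliberately small parser. It is a sketch only: it handles just the flat shape shown above (`key: value`, inline `[a, b]` lists, and `- item` lists), and a real implementation would more likely rely on a YAML library. `parseFrontmatter` is a hypothetical name.

```typescript
type FrontmatterValue = string | string[];

// Minimal parser for the flat frontmatter shape used in this proposal.
function parseFrontmatter(markdown: string): Record<string, FrontmatterValue> {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) throw new Error('Missing frontmatter block');

  const result: Record<string, FrontmatterValue> = {};
  let currentKey: string | null = null;

  for (const rawLine of match[1].split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue;

    if (line.startsWith('- ')) {
      // List item: append it to the most recent key.
      if (currentKey === null) throw new Error(`List item without a key: ${line}`);
      const existing = result[currentKey];
      const items = Array.isArray(existing) ? existing : [];
      items.push(line.slice(2).trim());
      result[currentKey] = items;
      continue;
    }

    const colon = line.indexOf(':');
    if (colon === -1) throw new Error(`Unparseable line: ${line}`);
    currentKey = line.slice(0, colon).trim();
    const value = line.slice(colon + 1).trim();

    if (value.startsWith('[') && value.endsWith(']')) {
      // Inline list such as [monorepo, turborepo, structure]
      result[currentKey] = value.slice(1, -1).split(',').map((s) => s.trim()).filter(Boolean);
    } else if (value === '') {
      // Empty value: treat as a list whose items follow on later lines.
      result[currentKey] = [];
    } else {
      result[currentKey] = value;
    }
  }
  return result;
}
```

Dates are kept as plain strings here; converting them is left to whichever step consumes the result.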
Automatic Cross-Reference Generation
Script: scripts/generate-knowledge-index.ts
// Scans all knowledge/ documents
// Extracts frontmatter
// Generates:
// - knowledge/INDEX.md (master index)
// - knowledge/TAGS.md (tag index)
// - knowledge/TOPICS.md (topic index)
// - Per-directory README.md updates
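As an illustration of one of these outputs, the sketch below renders a tag index from already-parsed documents. It assumes each document's `path` is relative to `knowledge/` so the links resolve from `knowledge/TAGS.md`; `IndexedDoc` and `renderTagIndex` are hypothetical names, and the heading layout is a guess rather than a fixed format.

```typescript
interface IndexedDoc {
  path: string;  // assumed to be relative to knowledge/
  title: string;
  tags: string[];
}

// Groups documents by tag and renders them as a Markdown index.
function renderTagIndex(docs: IndexedDoc[]): string {
  const byTag = new Map<string, IndexedDoc[]>();
  for (const doc of docs) {
    for (const tag of doc.tags) {
      const bucket = byTag.get(tag) ?? [];
      bucket.push(doc);
      byTag.set(tag, bucket);
    }
  }

  const lines: string[] = ['# Tag Index', ''];
  // Sort tags, and titles within each tag, so regenerated output is stable.
  for (const tag of Array.from(byTag.keys()).sort()) {
    lines.push(`## ${tag}`);
    const entries = (byTag.get(tag) ?? []).sort((a, b) => a.title.localeCompare(b.title));
    for (const doc of entries) {
      lines.push(`- [${doc.title}](${doc.path})`);
    }
    lines.push('');
  }
  return lines.join('\n');
}
```

Sorting matters mainly for version control: a stable order keeps regenerated index files from producing noisy diffs.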
Bidirectional Links
Documents reference each other:
## Related Documents
- [Monorepo Conventions](../standards/coding/monorepo-conventions.md)
- [Monorepo Setup Guide](../guides/development/monorepo-setup.md)
- [ADR: Turborepo Selection](../architecture/decisions/adr-001-turborepo.md)
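Links only stay bidirectional if something checks them. This sketch assumes each document's `related` list has already been read into a map keyed by document path; `RelatedMap` and `findMissingBacklinks` are hypothetical names.

```typescript
// Maps each document path to the paths listed in its `related` field.
type RelatedMap = Record<string, string[]>;

interface MissingBacklink {
  from: string; // document that declares the link
  to: string;   // document that does not link back
}

// Reports every link A -> B for which B does not list A in return.
function findMissingBacklinks(related: RelatedMap): MissingBacklink[] {
  const missing: MissingBacklink[] = [];
  for (const from of Object.keys(related)) {
    for (const to of related[from]) {
      const back = related[to] ?? [];
      if (!back.includes(from)) missing.push({ from, to });
    }
  }
  return missing;
}

const report = findMissingBacklinks({
  'architecture/golden/monorepo-structure.md': ['guides/development/monorepo-setup.md'],
  'guides/development/monorepo-setup.md': [],
});
console.log(report);
```

A target that does not exist in the map is reported the same way as one that exists but lacks the back-link, so the same check also surfaces dangling references.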
Vector Search Integration
Why Vector Search?
Problems with grep/find:
- Requires exact keywords
- Misses semantic matches
- No relevance ranking
- Hard to discover related content
Vector search benefits:
- Semantic understanding
- "Find documents about batch processing" (not just keyword "batch")
- Relevance ranking
- Discover related topics
Implementation Options
Option 1: Local Embeddings (Recommended)
Tool: @xenova/transformers (runs locally)
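// scripts/lib/knowledge-search.ts
import { promises as fs } from 'node:fs';
import { pipeline } from '@xenova/transformers';
// Load the embedding model once (fetched and cached on first use)
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
// Mean-pool and normalize so each text maps to a single unit-length vector
async function embed(text: string): Promise<number[]> {
  const output = await embedder(text, { pooling: 'mean', normalize: true });
  return Array.from(output.data as Float32Array);
}
// Index all documents (scanKnowledgeDirectory is a project helper, not shown here)
const documents = await scanKnowledgeDirectory();
const embeddings: number[][] = [];
for (const doc of documents) {
  embeddings.push(await embed(doc.content));
}
// Store in simple JSON
await fs.mkdir('knowledge/.index', { recursive: true });
await fs.writeFile('knowledge/.index/embeddings.json', JSON.stringify({
  documents,
  embeddings,
}));
// Search: for unit-length vectors, cosine similarity equals the dot product
async function search(query: string, limit = 10) {
  const queryEmbedding = await embed(query);
  return embeddings
    .map((embedding, i) => ({
      document: documents[i],
      score: embedding.reduce((sum, value, j) => sum + value * queryEmbedding[j], 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}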
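The ranking step does not depend on the embedding model, so it can be written and tested as a pure function. The sketch below computes cosine similarity for arbitrary (not necessarily normalized) vectors and returns the best matches; `rank` and `Scored` are hypothetical names.

```typescript
// Cosine similarity of two equal-length vectors; 0 if either has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface Scored {
  index: number; // position of the document in the embeddings array
  score: number; // cosine similarity to the query
}

// Scores every embedding against the query and keeps the top `limit` results.
function rank(query: number[], embeddings: number[][], limit = 10): Scored[] {
  return embeddings
    .map((embedding, index) => ({ index, score: cosineSimilarity(query, embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

console.log(rank([1, 0], [[0, 1], [1, 0], [1, 1]], 2));
```

A linear scan like this should be adequate for a library of a few hundred documents; a dedicated vector index would only become relevant at a much larger scale.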
CLI:
pnpm devops knowledge search "batch processing architecture"
pnpm devops knowledge search "AI development best practices"
Option 2: Gemini Embeddings API
Use Google's embedding API:
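import { GoogleGenerativeAI } from '@google/generative-ai';
const apiKey = process.env.GEMINI_API_KEY;
if (!apiKey) throw new Error('GEMINI_API_KEY is not set');
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'embedding-001' });
// Returns the embedding vector for a single piece of text
async function embed(text: string): Promise<number[]> {
  const result = await model.embedContent(text);
  return result.embedding.values;
}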
Pros: Potentially higher-quality embeddings, no local model to download
Cons: An API call per request, requires internet access and an API key
Option 3: Hybrid Approach
Local for fast search, API for quality:
// Quick local search
const quickResults = await localSearch(query);
// Call the API only when the user asks for more (e.g. --detailed)
const results = detailed ? await apiSearch(query) : quickResults;
Search Interface
CLI:
# Quick search
pnpm devops knowledge search "monorepo"
# Detailed search
pnpm devops knowledge search "monorepo" --detailed
# Search specific topic
pnpm devops knowledge search "batch" --topic architecture
# Search by tag
pnpm devops knowledge search --tag ai-development
MCP Tool:
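// Antigravity can search via MCP
// (inputSchema is a JSON Schema object, as MCP tool definitions expect)
{
  name: 'knowledge_search',
  description: 'Search knowledge library',
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string' },
      topic: { type: 'string' },
      limit: { type: 'number' },
    },
    required: ['query'],
  },
}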
Migration Strategy
Phase 1: Create Structure (Week 1)
Tasks:
1. Create knowledge/ directory structure
2. Create all README.md files
3. Setup frontmatter template
4. Create migration script
Phase 2: Migrate Content (Week 2-3)
Priority order:
1. High value, high risk - Research notes in brain/
2. Canonical architecture - architecture/GOLDEN/ → knowledge/architecture/golden/
3. Active thinking - architecture/THINKING/ → knowledge/architecture/proposed/ or knowledge/research/
4. Standards - architecture/STANDARDS/ → knowledge/standards/
5. Documentation - docs/ → knowledge/guides/ or knowledge/reference/
Migration script:
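One possible shape for the planning step of such a script, offered as a sketch only. The rules are transcribed from the priority list above; where the list names two possible destinations, the first is used as a default and the move is flagged for manual review. `MigrationRule`, `RULES`, and `planMove` are hypothetical names, and `brain/` notes are deliberately left unmapped because the list gives them no fixed destination.

```typescript
interface MigrationRule {
  from: string;         // source directory prefix
  to: string;           // default destination prefix
  needsReview: boolean; // true when more than one destination is possible
}

// Assumed rules, taken from the priority list in this phase.
const RULES: MigrationRule[] = [
  { from: 'architecture/GOLDEN/', to: 'knowledge/architecture/golden/', needsReview: false },
  { from: 'architecture/THINKING/', to: 'knowledge/architecture/proposed/', needsReview: true },
  { from: 'architecture/STANDARDS/', to: 'knowledge/standards/', needsReview: false },
  { from: 'docs/', to: 'knowledge/guides/', needsReview: true },
];

interface PlannedMove {
  source: string;
  destination: string;
  needsReview: boolean;
}

// Returns the planned move for a file, or null if no rule covers it.
function planMove(source: string): PlannedMove | null {
  const rule = RULES.find((r) => source.startsWith(r.from));
  if (!rule) return null;
  return {
    source,
    destination: rule.to + source.slice(rule.from.length),
    needsReview: rule.needsReview,
  };
}

console.log(planMove('architecture/GOLDEN/monorepo.md'));
```

Separating planning from the actual file moves makes a dry run possible: the planned moves can be printed and reviewed before anything is touched.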
Phase 3: Setup Search (Week 3)
Tasks:
1. Implement local embeddings
2. Index all documents
3. Create CLI search command
4. Add MCP search tool
5. Test and refine
Phase 4: Establish Practices (Week 4)
Tasks:
1. Update workflows to use knowledge/
2. Add pre-commit hook to validate frontmatter
3. Auto-generate cross-reference indexes
4. Train team on new structure
File Naming Conventions
Consistent Naming
Format: {topic}-{description}.md
Examples:
monorepo-structure.md
batch-processing-architecture.md
ai-development-best-practices.md
deployment-procedures-marketing.md
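A naming rule like this is easy to enforce with a pattern. The sketch below assumes lower-case kebab-case, inferred from the examples rather than stated in the format, and exempts `README.md` and the generated index files. `isValidDocName` and the exemption list are hypothetical.

```typescript
// Lower-case words or numbers separated by single hyphens, ending in .md
const KEBAB_CASE_MD = /^[a-z0-9]+(?:-[a-z0-9]+)*\.md$/;

// Files that follow their own convention (assumed list).
const EXEMPT = new Set(['README.md', 'INDEX.md', 'TAGS.md', 'TOPICS.md']);

function isValidDocName(fileName: string): boolean {
  return EXEMPT.has(fileName) || KEBAB_CASE_MD.test(fileName);
}

for (const name of ['monorepo-structure.md', 'Monorepo_Structure.md', 'README.md']) {
  console.log(name, isValidDocName(name));
}
```

The pattern also accepts single-word names such as `glossary.md` and date-prefixed session files, both of which appear elsewhere in the proposed structure.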
Avoid
Date-Based (Only for Sessions)
knowledge/sessions/2026-01/m2-ai-implementation/
├── 2026-01-15-planning.md
├── 2026-01-16-implementation.md
└── 2026-01-18-completion.md
README Template
Master README (knowledge/README.md)
# Knowledge Library
**Purpose**: Central repository for all project knowledge
**Quick Start**:
- Browse by [Topic](#topics)
- Search by [Tag](#tags)
- [Vector Search](#search) for semantic discovery
## Topics
- [Architecture](./architecture/) - System architecture
- [Standards](./standards/) - Standards & practices
- [Research](./research/) - Research & thinking
- [Guides](./guides/) - How-to guides
- [Processes](./processes/) - Business processes
- [Reference](./reference/) - Reference materials
- [Sessions](./sessions/) - Session artifacts
## Search
**CLI:**
```bash
pnpm devops knowledge search "your query"
```
**MCP (Antigravity):** use the `knowledge_search` tool
## Contributing
- Use frontmatter template
- Follow naming conventions
- Cross-reference related docs
- Update indexes: `pnpm devops knowledge index`
## Indexes
- [Master Index](./INDEX.md) - All documents
- [Tag Index](./TAGS.md) - By tag
- [Topic Index](./TOPICS.md) - By topic
Topic README Template
```markdown
# {Topic Name}
**Purpose**: {One-line description}
**Contents**:
- `{subdirectory}/` - {Description}
- ...
**Related**:
- [{Related Topic}](../path/to/topic/)
- ...
**Search**: Use vector search for "{topic}" topics
## Documents
{Auto-generated list of documents in this directory}
## Quick Links
- Most recent: [{Document}](./path/to/doc.md)
- Most referenced: [{Document}](./path/to/doc.md)
```
Automation & Tooling
Knowledge CLI Commands
# Search
pnpm devops knowledge search "query"
# Index
pnpm devops knowledge index # Regenerate all indexes
pnpm devops knowledge index --validate # Validate frontmatter
# Migrate
pnpm devops knowledge migrate --from docs/ --to knowledge/guides/
# Stats
pnpm devops knowledge stats # Show statistics
pnpm devops knowledge stats --topic architecture
# Validate
pnpm devops knowledge validate # Check all documents
pnpm devops knowledge validate --fix # Auto-fix issues
Pre-commit Hook
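The check a pre-commit hook would run can be sketched as a function over already-parsed frontmatter. The required fields and the topic list are assumptions drawn from the frontmatter example and the top-level directories in this proposal; `validateFrontmatter` is a hypothetical name, and how the hook invokes it (for instance through `pnpm devops knowledge validate`) is left open.

```typescript
// Assumed set of valid topics: the top-level directories of knowledge/.
const TOPICS = ['architecture', 'standards', 'research', 'guides', 'processes', 'reference', 'sessions'];
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Returns a list of problems; an empty list means the frontmatter is valid.
function validateFrontmatter(fm: Record<string, unknown>): string[] {
  const { title, topic, tags, created, updated } = fm;
  const errors: string[] = [];

  if (typeof title !== 'string' || title.trim() === '') {
    errors.push('title is required');
  }
  if (typeof topic !== 'string' || !TOPICS.includes(topic)) {
    errors.push(`topic must be one of: ${TOPICS.join(', ')}`);
  }
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === 'string')) {
    errors.push('tags must be a list of strings');
  }

  const isDate = (d: unknown): d is string => typeof d === 'string' && ISO_DATE.test(d);
  if (!isDate(created) || !isDate(updated)) {
    errors.push('created and updated must be YYYY-MM-DD dates');
  } else if (updated < created) {
    // ISO dates sort correctly as plain strings.
    errors.push('updated must not be earlier than created');
  }
  return errors;
}
```

A hook built on this would typically run the function over each staged file under `knowledge/` and abort the commit if any file returns a non-empty list.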
Auto-Index Generation
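// scripts/lib/knowledge-indexer.ts
// Shape of the frontmatter block every document carries
interface Frontmatter {
  title: string;
  topic: string;
  state?: string; // e.g. golden, proposed, retired
  tags: string[];
  related: string[];
  created: string; // YYYY-MM-DD
  updated: string; // YYYY-MM-DD
}
interface Document {
  path: string;
  frontmatter: Frontmatter;
  content: string;
}
// scanKnowledge and the generate*/update* helpers are not shown here
async function generateIndexes() {
  const docs: Document[] = await scanKnowledge();
  // Generate master index
  await generateMasterIndex(docs);
  // Generate tag index
  await generateTagIndex(docs);
  // Generate topic index
  await generateTopicIndex(docs);
  // Update README files
  await updateReadmes(docs);
}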
Benefits
For Developers
- Single source of truth - Always know where to look
- Fast discovery - Vector search finds relevant docs
- Clear organization - Flat hierarchy, meaningful topics
- Cross-referencing - Related docs linked
- No cognitive overload - Simple navigation
For AI (Antigravity)
- Persistent memory - Knowledge survives sessions
- Searchable context - Vector search for relevant docs
- Clear structure - Predictable file locations
- Cross-references - Understand relationships
- MCP integration - Direct search access
For Organization
- Institutional memory - Knowledge preserved
- Onboarding - New team members find docs easily
- Consistency - Standard practices documented
- Discoverability - Vector search reveals hidden knowledge
- Maintainability - Clear lifecycle (golden/proposed/retired)
Success Metrics
Quantitative
- Search success rate: >90% of searches find relevant docs
- Time to find: <30 seconds average
- Document coverage: >95% of knowledge in library
- Cross-reference density: >3 links per document average
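Of these metrics, cross-reference density is the one that can be computed directly from frontmatter. A small sketch, assuming the input is the number of `related` links per document and reading ">3" as a strict threshold; `crossReferenceStats` is a hypothetical name.

```typescript
interface LibraryStats {
  documents: number;
  averageLinks: number;
  meetsTarget: boolean;
}

// Average `related` links per document, compared against a strict target.
function crossReferenceStats(linksPerDocument: number[], target = 3): LibraryStats {
  const documents = linksPerDocument.length;
  const total = linksPerDocument.reduce((sum, n) => sum + n, 0);
  const averageLinks = documents === 0 ? 0 : total / documents;
  return { documents, averageLinks, meetsTarget: averageLinks > target };
}

console.log(crossReferenceStats([4, 4, 2]));
```

The other quantitative metrics, such as search success rate and time to find, would need usage data that the library itself does not record.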
Qualitative
- Developers can find docs without asking
- AI can discover relevant context
- New team members onboard faster
- Knowledge doesn't get lost
Risks & Mitigation
Risk 1: Migration Disruption
Mitigation:
- Gradual migration (high-value first)
- Keep old structure during transition
- Clear migration guide
- Automated migration scripts
Risk 2: Maintenance Burden
Mitigation:
- Automated index generation
- Pre-commit validation
- Simple flat structure
- Clear ownership
Risk 3: Over-Organization
Mitigation:
- Maximum 3 levels deep
- Flat topic hierarchy
- Avoid premature categorization
- Regular pruning
Risk 4: Search Quality
Mitigation:
- Start with local embeddings (fast, free)
- Upgrade to API if needed
- Hybrid approach (local + API)
- Regular search quality reviews
Recommendation
✅ Proceed with Knowledge Library
Why:
1. Solves real problem (fragmented knowledge)
2. Balances organization with simplicity
3. Enables vector search (strategic advantage)
4. Supports AI persistent memory
5. Scalable and maintainable
Next Steps:
1. Review and approve this proposal
2. Create knowledge/ structure
3. Migrate high-value content
4. Implement vector search
5. Establish practices
Questions for Review
- Structure: Is the flat hierarchy appropriate?
- Topics: Are the top-level topics comprehensive?
- Search: Local embeddings vs. API vs. hybrid?
- Migration: Phased approach acceptable?
- Naming: File naming conventions clear?