Unified Knowledge Library System - Proposal

Problem: Knowledge is fragmented across brain/, docs/, and architecture/, filed inconsistently, and at risk of loss
Goal: Single source of truth with flat hierarchy, meaningful topics, cross-referencing, and vector search

Current State Analysis

Scattered Knowledge Locations

Within repository (singular-dream/):

brain/ - Ephemeral session artifacts (risk of loss)
docs/ - Mixed purpose (user docs, research, guides) - 111 files
architecture/ - Architecture decisions, thinking, standards - 203 files
architecture/THINKING/ - Research, proposals
architecture/GOLDEN/ - Canonical architecture
architecture/STANDARDS/ - Standards documents
architecture/TRASH/ - Archived/deprecated
Various README.md files throughout codebase

Beyond repository boundary (SD-HOA/):

SD-HOA/README.md - Marketing website documentation (9,957 bytes)
SD-HOA/*.html - Marketing website files (index, visualizations)
SD-HOA/*.css - Styling documentation
SD-HOA/*.js - Interactive features documentation
Other project-level documentation

Total scope:

Repository: 118+ markdown files found at depth 2
Parent directory: Additional documentation and web assets
Multi-repository knowledge spanning different contexts

Problems Identified

Misplaced artifacts - Brain directory used for permanent research
Inconsistent filing - Same type of content in different places
Hard to discover - No central index or search
Risk of loss - Ephemeral directories for permanent knowledge
Cognitive overload - Too many places to look
No cross-referencing - Documents exist in isolation
Multi-repository fragmentation - Knowledge split between parent and repo
Boundary confusion - Unclear what belongs where

Proposed Solution: The Knowledge Library

Core Principle

"Library Floor 1, Shelf 16" Model:

One central location for all knowledge
Flat but meaningful topic organization
Consistent cross-referencing
README navigation at every level
Vector search for discovery

Directory Structure

knowledge/
├── README.md                          # Master index & search guide
├── architecture/                      # System architecture
│   ├── README.md
│   ├── golden/                        # Canonical architecture (current state)
│   ├── proposed/                      # Future architecture (planning)
│   ├── retired/                       # Deprecated architecture (history)
│   └── decisions/                     # ADRs (Architecture Decision Records)
├── standards/                         # Standards & practices
│   ├── README.md
│   ├── coding/                        # Code standards
│   ├── processes/                     # Development processes
│   ├── procedures/                    # Step-by-step procedures
│   └── conventions/                   # Naming, formatting, etc.
├── research/                          # Research & thinking
│   ├── README.md
│   ├── ai-development/                # AI & development research
│   ├── technical/                     # Technical research
│   └── business/                      # Business research
├── guides/                            # How-to guides
│   ├── README.md
│   ├── development/                   # Developer guides
│   ├── deployment/                    # Deployment guides
│   ├── operations/                    # Operations guides
│   └── user/                          # User documentation
├── processes/                         # Business processes
│   ├── README.md
│   ├── governance/                    # Governance processes
│   ├── finance/                       # Financial processes
│   ├── operations/                    # Operational processes
│   └── property/                      # Property management
├── reference/                         # Reference materials
│   ├── README.md
│   ├── apis/                          # API documentation
│   ├── tools/                         # Tool documentation
│   ├── integrations/                  # Integration docs
│   └── glossary.md                    # Terms & definitions
└── sessions/                          # Session artifacts
    ├── README.md
    ├── 2026-01/                       # Organized by month
    │   ├── m2-ai-implementation/      # Session topic
    │   └── devops-architecture/       # Session topic
    └── index.md                       # Searchable session index

Multi-Repository Strategy

The Boundary Question

Current structure:

SD-HOA/                          # Parent directory (Antigravity boundary)
├── README.md                    # Marketing website docs
├── index.html, *.css, *.js     # Marketing website
└── singular-dream/              # Monorepo
    ├── knowledge/               # NEW: Knowledge library
    ├── docs/                    # Current docs
    └── architecture/            # Current architecture

Key decision: Where should the knowledge library live?

Option 1: Repository-Scoped (Recommended)

Location: singular-dream/knowledge/

Scope:

All monorepo-related knowledge
Architecture, standards, guides
Development documentation
Session artifacts

Benefits:

✅ Version controlled with code
✅ Part of monorepo structure
✅ Clear ownership
✅ Portable with repository

Limitations:

❌ Doesn't include parent-level docs
❌ Marketing website docs separate

Option 2: Parent-Scoped

Location: SD-HOA/knowledge/

Scope:

All project knowledge (monorepo + marketing)
Cross-repository documentation
Project-level decisions

Benefits:

✅ Single source of truth for all knowledge
✅ Includes marketing website docs
✅ Project-level view

Limitations:

❌ Not version controlled with monorepo
❌ Harder to port repository
❌ Unclear ownership

Recommendation: Hybrid Approach

Primary library: singular-dream/knowledge/ (repository-scoped)

Cross-reference: Link to parent-level docs when needed

Structure:

SD-HOA/
├── README.md                    # Marketing website (stays here)
├── marketing/                   # NEW: Organize marketing assets
│   ├── index.html
│   ├── styles.css
│   └── script.js
└── singular-dream/
    └── knowledge/
        ├── README.md            # Master index
        ├── architecture/
        ├── standards/
        ├── research/
        ├── guides/
        ├── processes/
        ├── reference/
        │   └── marketing-website.md  # Links to ../../../marketing/
        └── sessions/

Benefits:

✅ Repository knowledge version-controlled
✅ Marketing website organized but separate
✅ Cross-references maintain connections
✅ Clear boundaries and ownership

Design Principles

1. Flat But Meaningful

Avoid deep nesting:

Maximum 3 levels deep
Clear topic separation at top level
Subtopics only when necessary

Example:

✅ Good: knowledge/architecture/golden/monorepo-structure.md
❌ Bad:  knowledge/architecture/systems/backend/structure/monorepo/design.md

2. State-Based Organization

Architecture example:

golden/ - Current canonical state
proposed/ - Future planned state
retired/ - Historical deprecated state

Benefits:

Clear lifecycle management
Easy to find current vs. future
Historical context preserved

3. Topic-Based, Not Type-Based

Organize by topic, not document type:

✅ Good: knowledge/architecture/golden/
         knowledge/architecture/proposed/
         knowledge/architecture/retired/

❌ Bad:  knowledge/markdown-files/
         knowledge/diagrams/
         knowledge/spreadsheets/

4. README Navigation

Every directory has a README.md:

# Architecture

**Purpose**: System architecture documentation

**Contents**:

- `golden/` - Current canonical architecture
- `proposed/` - Planned future architecture
- `retired/` - Deprecated architecture
- `decisions/` - Architecture Decision Records (ADRs)

**Related**:

- [Standards](../standards/) - Coding standards
- [Guides](../guides/development/) - Developer guides
- [Master Index](../README.md) - Full library index

**Search**: Use vector search for "architecture" topics

Cross-Referencing System

Document Frontmatter

Every document includes:

---
title: Monorepo Structure
topic: architecture
state: golden
tags: [monorepo, turborepo, structure]
related:
  - knowledge/standards/coding/monorepo-conventions.md
  - knowledge/guides/development/monorepo-setup.md
created: 2026-01-15
updated: 2026-01-18
---
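
A frontmatter check could look roughly like the sketch below. It assumes the YAML has already been parsed into a plain object by whatever YAML library the project chooses, and that dates are kept as strings. The required-field list is an assumption drawn from the example above; `extractFrontmatterBlock` and `validateFrontmatter` are hypothetical names.

```typescript
// Hypothetical required fields, taken from the frontmatter example above.
const REQUIRED_FIELDS = ["title", "topic", "tags", "created", "updated"] as const;
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Returns the raw text between the opening and closing '---' lines, or null if absent.
function extractFrontmatterBlock(markdown: string): string | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---(?:\r?\n|$)/.exec(markdown);
  return match?.[1] ?? null;
}

// Validates an already-parsed frontmatter object and returns a list of problems.
function validateFrontmatter(data: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    if (data[field] === undefined) problems.push(`missing field: ${field}`);
  }
  if (data["tags"] !== undefined && !Array.isArray(data["tags"])) {
    problems.push("tags must be a list");
  }
  for (const field of ["created", "updated"] as const) {
    const value = data[field];
    if (value !== undefined && !(typeof value === "string" && ISO_DATE.test(value))) {
      problems.push(`${field} must be YYYY-MM-DD`);
    }
  }
  return problems;
}
```

An empty result means the document passes; a validation command or pre-commit hook could print the returned problems and exit non-zero otherwise.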

Automatic Cross-Reference Generation

Script: scripts/generate-knowledge-index.ts

// Scans all knowledge/ documents
// Extracts frontmatter
// Generates:
// - knowledge/INDEX.md (master index)
// - knowledge/TAGS.md (tag index)
// - knowledge/TOPICS.md (topic index)
// - Per-directory README.md updates
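
As an illustration of one of these outputs, the sketch below renders a tag index from frontmatter that has already been extracted. It is not the actual script: `IndexedDoc` and `renderTagIndex` are hypothetical names, and the Markdown layout of TAGS.md is an assumption.

```typescript
interface IndexedDoc {
  path: string; // path relative to knowledge/
  title: string;
  tags: string[];
}

// Groups documents by tag and renders a Markdown tag index.
function renderTagIndex(docs: IndexedDoc[]): string {
  const byTag = new Map<string, IndexedDoc[]>();
  for (const doc of docs) {
    for (const tag of doc.tags) {
      const bucket = byTag.get(tag) ?? [];
      bucket.push(doc);
      byTag.set(tag, bucket);
    }
  }

  const lines: string[] = ["# Tag Index", ""];
  // Sort tags and titles so regenerated output is stable across runs.
  for (const tag of Array.from(byTag.keys()).sort()) {
    lines.push(`## ${tag}`, "");
    const sorted = (byTag.get(tag) ?? []).slice().sort((a, b) => a.title.localeCompare(b.title));
    for (const doc of sorted) {
      lines.push(`- [${doc.title}](./${doc.path})`);
    }
    lines.push("");
  }
  return lines.join("\n");
}
```

Stable ordering keeps diffs small when the index is regenerated and committed alongside the documents.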

Bidirectional Links

Documents reference each other. For example, knowledge/architecture/golden/monorepo-structure.md ends with links that resolve relative to its own directory:

## Related Documents

- [Monorepo Conventions](../../standards/coding/monorepo-conventions.md)
- [Monorepo Setup Guide](../../guides/development/monorepo-setup.md)
- [ADR: Turborepo Selection](../decisions/adr-001-turborepo.md)
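
Links are only bidirectional if the target links back. A minimal sketch of a check for one-way links, assuming each document's `related` list has been collected into a map keyed by document path (`RelatedMap` and `findMissingBacklinks` are hypothetical names):

```typescript
// Maps each document path to the paths listed under `related` in its frontmatter.
type RelatedMap = Record<string, string[]>;

interface MissingBacklink {
  from: string; // document that declares the link
  to: string; // document that does not link back
}

// Finds links A -> B where B does not list A among its related documents.
function findMissingBacklinks(related: RelatedMap): MissingBacklink[] {
  const missing: MissingBacklink[] = [];
  for (const [from, targets] of Object.entries(related)) {
    for (const to of targets) {
      const back = related[to] ?? [];
      if (back.indexOf(from) === -1) {
        missing.push({ from, to });
      }
    }
  }
  return missing;
}
```

A target that is absent from the map is reported too, which also surfaces links to documents that do not exist.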

Vector Search Integration

Why Vector Search?

Problems with grep/find:

Requires exact keywords
Misses semantic matches
No relevance ranking
Hard to discover related content

Vector search benefits:

Semantic understanding
"Find documents about batch processing" (not just keyword "batch")
Relevance ranking
Discover related topics

Implementation Options

Option 1: Local Embeddings (Recommended)

Tool: @xenova/transformers (runs locally)

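// scripts/lib/knowledge-search.ts
import { promises as fs } from "node:fs";
import { pipeline } from "@xenova/transformers";

const embedder = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2",
);

// Mean-pool and L2-normalize so each text becomes one unit-length vector
async function embed(text: string): Promise<number[]> {
  const output = await embedder(text, { pooling: "mean", normalize: true });
  return Array.from(output.data as Float32Array);
}

// Index all documents (scanKnowledgeDirectory is a project helper, not shown here)
const documents = await scanKnowledgeDirectory();
const embeddings = await Promise.all(
  documents.map((doc) => embed(doc.content)),
);

// Store in simple JSON
await fs.mkdir("knowledge/.index", { recursive: true });
await fs.writeFile(
  "knowledge/.index/embeddings.json",
  JSON.stringify({
    documents,
    embeddings,
  }),
);

// Search: for unit-length vectors, cosine similarity is just the dot product
async function search(query: string, limit = 10) {
  const queryEmbedding = await embed(query);
  return embeddings
    .map((vector, i) => ({
      document: documents[i],
      score: vector.reduce((sum, v, j) => sum + v * queryEmbedding[j], 0),
    }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, limit);
}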

CLI:

pnpm devops knowledge search "batch processing architecture"
pnpm devops knowledge search "AI development best practices"

Option 2: Gemini Embeddings API

Use Google's embedding API:

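import { GoogleGenerativeAI } from "@google/generative-ai";

// Fail fast if the key is missing instead of sending unauthenticated requests
const apiKey = process.env.GEMINI_API_KEY;
if (!apiKey) throw new Error("GEMINI_API_KEY is not set");

const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: "embedding-001" });

// Returns the embedding vector for a single piece of text
async function embed(text: string): Promise<number[]> {
  const result = await model.embedContent(text);
  return result.embedding.values;
}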

Pros: Better quality, no local model
Cons: API calls, requires internet

Option 3: Hybrid Approach

Local for fast search, API for quality:

// Quick local search
const quickResults = await localSearch(query);

// If user wants more, use API
const detailedResults = await apiSearch(query);
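
One way to combine the two paths is sketched below. It assumes both searches share a result shape and that falling back to local results is acceptable when the API is unreachable. `SearchHit`, `SearchFn`, and `hybridSearch` are hypothetical names, and the two search functions are passed in rather than implemented here.

```typescript
interface SearchHit {
  path: string;
  score: number;
}

type SearchFn = (query: string) => Promise<SearchHit[]>;

interface HybridResult {
  source: "local" | "api";
  hits: SearchHit[];
}

// Runs the local search first and only calls the API when detailed results are requested.
async function hybridSearch(
  query: string,
  localSearch: SearchFn,
  apiSearch: SearchFn,
  options: { detailed?: boolean } = {},
): Promise<HybridResult> {
  const localHits = await localSearch(query);
  if (!options.detailed) {
    return { source: "local", hits: localHits };
  }
  try {
    return { source: "api", hits: await apiSearch(query) };
  } catch {
    // Assumption: an offline or failing API should degrade to local results, not fail the search.
    return { source: "local", hits: localHits };
  }
}
```

Returning the `source` alongside the hits lets the caller tell the user which backend answered.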

Search Interface

CLI:

# Quick search
pnpm devops knowledge search "monorepo"

# Detailed search
pnpm devops knowledge search "monorepo" --detailed

# Search specific topic
pnpm devops knowledge search "batch" --topic architecture

# Search by tag
pnpm devops knowledge search --tag ai-development
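
The flags above could be parsed along these lines. This is a dependency-free sketch that assumes the arguments after `knowledge search` arrive as a string array; `SearchArgs` and `parseSearchArgs` are hypothetical names, and the real CLI may well use an argument-parsing library instead.

```typescript
interface SearchArgs {
  query?: string;
  detailed: boolean;
  topic?: string;
  tag?: string;
}

// Parses: [query] [--detailed] [--topic <name>] [--tag <name>]
function parseSearchArgs(argv: string[]): SearchArgs {
  const args: SearchArgs = { detailed: false };
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    if (token === "--detailed") {
      args.detailed = true;
    } else if (token === "--topic" || token === "--tag") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`${token} requires a value`);
      }
      if (token === "--topic") args.topic = value;
      else args.tag = value;
      i++; // skip the value we just consumed
    } else if (token.startsWith("--")) {
      throw new Error(`unknown option: ${token}`);
    } else if (args.query === undefined) {
      args.query = token;
    } else {
      throw new Error(`unexpected argument: ${token}`);
    }
  }
  if (args.query === undefined && args.tag === undefined) {
    throw new Error("provide a query or --tag");
  }
  return args;
}
```

For example, `parseSearchArgs(["batch", "--topic", "architecture"])` yields a query of "batch" scoped to the architecture topic.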

MCP Tool:

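// Antigravity can search via MCP
// inputSchema is JSON Schema; only `query` is required
{
  name: "knowledge_search",
  description: "Search knowledge library",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string" },
      topic: { type: "string" },
      limit: { type: "number" },
    },
    required: ["query"],
  },
}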

Migration Strategy

Phase 1: Create Structure (Week 1)

Tasks:

Create knowledge/ directory structure
Create all README.md files
Setup frontmatter template
Create migration script
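
The first two tasks can be planned from a small layout table. The sketch below only computes the paths and touches no files. The layout mirrors the directory tree proposed earlier, with READMEs at the root and topic level as that tree shows; `LIBRARY_LAYOUT` and `planScaffold` are hypothetical names.

```typescript
// Topic -> subtopic directories, copied from the proposed directory structure.
const LIBRARY_LAYOUT: Record<string, string[]> = {
  architecture: ["golden", "proposed", "retired", "decisions"],
  standards: ["coding", "processes", "procedures", "conventions"],
  research: ["ai-development", "technical", "business"],
  guides: ["development", "deployment", "operations", "user"],
  processes: ["governance", "finance", "operations", "property"],
  reference: ["apis", "tools", "integrations"],
  sessions: [], // month directories are created as sessions happen
};

interface ScaffoldPlan {
  directories: string[];
  readmes: string[];
}

// Lists every directory and README.md to create under the given root.
function planScaffold(root: string = "knowledge"): ScaffoldPlan {
  const directories: string[] = [root];
  const readmes: string[] = [`${root}/README.md`];
  for (const [topic, subtopics] of Object.entries(LIBRARY_LAYOUT)) {
    directories.push(`${root}/${topic}`);
    readmes.push(`${root}/${topic}/README.md`);
    for (const sub of subtopics) {
      directories.push(`${root}/${topic}/${sub}`);
    }
  }
  return { directories, readmes };
}
```

A script could then create each directory with `fs.mkdir(dir, { recursive: true })` from node:fs/promises and write the README templates into place.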

Phase 2: Migrate Content (Week 2-3)

Priority order:

High value, high risk - Research notes in brain/
Canonical architecture - architecture/GOLDEN/ → knowledge/architecture/golden/
Active thinking - architecture/THINKING/ → knowledge/architecture/proposed/ or knowledge/research/
Standards - architecture/STANDARDS/ → knowledge/standards/
Documentation - docs/ → knowledge/guides/ or knowledge/reference/

Migration script:

pnpm devops knowledge migrate --dry-run  # Preview
pnpm devops knowledge migrate --execute  # Execute
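
What a dry run might compute is sketched below, under two assumptions. Where the priority list offers two destinations, the first is used as the default. brain/ has no stated destination, so it is left for manual triage. `MIGRATION_RULES` and `planMigration` are hypothetical names, not the actual migrate command.

```typescript
// Default destinations derived from the priority list above (assumption: first option wins).
const MIGRATION_RULES: Array<{ from: string; to: string }> = [
  { from: "architecture/GOLDEN/", to: "knowledge/architecture/golden/" },
  { from: "architecture/THINKING/", to: "knowledge/architecture/proposed/" },
  { from: "architecture/STANDARDS/", to: "knowledge/standards/" },
  { from: "docs/", to: "knowledge/guides/" },
];

interface Move {
  source: string;
  destination: string | null; // null means "needs a manual decision"
}

// Maps each source path to its proposed destination without touching the file system.
function planMigration(paths: string[]): Move[] {
  return paths.map((source) => {
    const rule = MIGRATION_RULES.find((r) => source.startsWith(r.from));
    const destination = rule ? rule.to + source.slice(rule.from.length) : null;
    return { source, destination };
  });
}

for (const move of planMigration(["architecture/GOLDEN/monorepo.md", "brain/research-notes.md"])) {
  console.log(`${move.source} -> ${move.destination ?? "(manual triage)"}`);
}
```

Separating planning from execution keeps the preview and the real run on the same code path, so the preview shows exactly what would be moved.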

Phase 3: Setup Search (Week 3)

Tasks:

Implement local embeddings
Index all documents
Create CLI search command
Add MCP search tool
Test and refine

Phase 4: Establish Practices (Week 4)

Tasks:

Update workflows to use knowledge/
Add pre-commit hook to validate frontmatter
Auto-generate cross-reference indexes
Train team on new structure

File Naming Conventions

Consistent Naming

Format: {topic}-{description}.md

Examples:

monorepo-structure.md
batch-processing-architecture.md
ai-development-best-practices.md
deployment-procedures-marketing.md

Avoid

❌ doc1.md
❌ notes.md
❌ temp-2026-01-18.md
❌ FINAL_FINAL_v3.md

Date-Based (Only for Sessions)

knowledge/sessions/2026-01/m2-ai-implementation/
  ├── 2026-01-15-planning.md
  ├── 2026-01-16-implementation.md
  └── 2026-01-18-completion.md
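
These conventions can be expressed as a small validator. The sketch below assumes lowercase kebab-case with at least two segments for topic documents, a leading ISO date for session files, and a short allow-list for the single-word files that appear in the proposed structure. All names here are hypothetical.

```typescript
// {topic}-{description}.md: lowercase kebab-case with at least two segments.
const TOPIC_NAME = /^[a-z0-9]+(?:-[a-z0-9]+)+\.md$/;
// Session files: YYYY-MM-DD-{description}.md
const SESSION_NAME = /^\d{4}-\d{2}-\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*\.md$/;
// Dates are reserved for session files.
const CONTAINS_DATE = /\d{4}-\d{2}-\d{2}/;
// Assumption: structural files from the proposed tree are exempt from the pattern.
const ALLOWED_NAMES = ["README.md", "index.md", "glossary.md"];

function isValidDocumentName(fileName: string, inSessions: boolean = false): boolean {
  if (ALLOWED_NAMES.indexOf(fileName) !== -1) return true;
  if (inSessions) return SESSION_NAME.test(fileName);
  return TOPIC_NAME.test(fileName) && !CONTAINS_DATE.test(fileName);
}

console.log(isValidDocumentName("monorepo-structure.md")); // true
console.log(isValidDocumentName("FINAL_FINAL_v3.md")); // false
console.log(isValidDocumentName("2026-01-15-planning.md", true)); // true
```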

README Template

Master README (knowledge/README.md)

# Knowledge Library

**Purpose**: Central repository for all project knowledge

**Quick Start**:

- Browse by [Topic](#topics)
- Search by [Tag](#tags)
- [Vector Search](#search) for semantic discovery

## Topics

- [Architecture](./architecture/) - System architecture
- [Standards](./standards/) - Standards & practices
- [Research](./research/) - Research & thinking
- [Guides](./guides/) - How-to guides
- [Processes](./processes/) - Business processes
- [Reference](./reference/) - Reference materials
- [Sessions](./sessions/) - Session artifacts

## Search

**CLI:**

```bash
pnpm devops knowledge search "your query"
```

**MCP (Antigravity):**

Use the `knowledge_search` tool

## Contributing

1. Use frontmatter template
2. Follow naming conventions
3. Cross-reference related docs
4. Update indexes: `pnpm devops knowledge index`

## Indexes

- [Master Index](./INDEX.md) - All documents
- [Tag Index](./TAGS.md) - By tag
- [Topic Index](./TOPICS.md) - By topic

Topic README Template

```markdown
# {Topic Name}

**Purpose**: {One-line description}

**Contents**:
- `{subdirectory}/` - {Description}
- ...

**Related**:
- [{Related Topic}](../path/to/topic/)
- ...

**Search**: Use vector search for "{topic}" topics

## Documents

{Auto-generated list of documents in this directory}

## Quick Links

- Most recent: [{Document}](./path/to/doc.md)
- Most referenced: [{Document}](./path/to/doc.md)
```

Automation & Tooling

Knowledge CLI Commands

# Search
pnpm devops knowledge search "query"

# Index
pnpm devops knowledge index              # Regenerate all indexes
pnpm devops knowledge index --validate   # Validate frontmatter

# Migrate
pnpm devops knowledge migrate --from docs/ --to knowledge/guides/

# Stats
pnpm devops knowledge stats              # Show statistics
pnpm devops knowledge stats --topic architecture

# Validate
pnpm devops knowledge validate           # Check all documents
pnpm devops knowledge validate --fix     # Auto-fix issues

Pre-commit Hook

# .husky/pre-commit
pnpm devops knowledge validate --staged

Auto-Index Generation

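// scripts/lib/knowledge-indexer.ts

// Mirrors the document frontmatter fields
interface Frontmatter {
  title: string;
  topic: string;
  state?: string; // e.g. golden, proposed, retired
  tags: string[];
  related: string[];
  created: string;
  updated: string;
}

interface Document {
  path: string;
  frontmatter: Frontmatter;
  content: string;
}

// scanKnowledge and the generate*/update* helpers are assumed to live elsewhere in this module
async function generateIndexes(): Promise<void> {
  const docs: Document[] = await scanKnowledge();

  // Generate master index
  await generateMasterIndex(docs);

  // Generate tag index
  await generateTagIndex(docs);

  // Generate topic index
  await generateTopicIndex(docs);

  // Update README files
  await updateReadmes(docs);
}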

Benefits

For Developers

Single source of truth - Always know where to look
Fast discovery - Vector search finds relevant docs
Clear organization - Flat hierarchy, meaningful topics
Cross-referencing - Related docs linked
No cognitive overload - Simple navigation

For AI (Antigravity)

Persistent memory - Knowledge survives sessions
Searchable context - Vector search for relevant docs
Clear structure - Predictable file locations
Cross-references - Understand relationships
MCP integration - Direct search access

For Organization

Institutional memory - Knowledge preserved
Onboarding - New team members find docs easily
Consistency - Standard practices documented
Discoverability - Vector search reveals hidden knowledge
Maintainability - Clear lifecycle (golden/proposed/retired)

Success Metrics

Quantitative

Search success rate: >90% of searches find relevant docs
Time to find: <30 seconds average
Document coverage: >95% of knowledge in library
Cross-reference density: >3 links per document average
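
Cross-reference density is straightforward to measure. The sketch below assumes a cross-reference is any Markdown link whose target ends in .md (optionally with an anchor), so links to directories are not counted. `countDocLinks` and `crossReferenceDensity` are hypothetical names.

```typescript
// Counts Markdown links that point at other .md documents.
function countDocLinks(markdown: string): number {
  const matches = markdown.match(/\]\([^)\s]+\.md(?:#[^)\s]*)?\)/g);
  return matches ? matches.length : 0;
}

// Average number of document links per document body.
function crossReferenceDensity(bodies: string[]): number {
  if (bodies.length === 0) return 0;
  const total = bodies.reduce((sum, body) => sum + countDocLinks(body), 0);
  return total / bodies.length;
}

const example = [
  "- [Monorepo Conventions](../standards/coding/monorepo-conventions.md)",
  "- [Monorepo Setup Guide](../guides/development/monorepo-setup.md)",
  "- [ADR: Turborepo Selection](../architecture/decisions/adr-001-turborepo.md)",
].join("\n");

console.log(countDocLinks(example)); // 3
```

A stats command could report this figure per topic to show which areas fall below the target of 3 links per document.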

Qualitative

Developers can find docs without asking
AI can discover relevant context
New team members onboard faster
Knowledge doesn't get lost

Risks & Mitigation

Risk 1: Migration Disruption

Mitigation:

Gradual migration (high-value first)
Keep old structure during transition
Clear migration guide
Automated migration scripts

Risk 2: Maintenance Burden

Mitigation:

Automated index generation
Pre-commit validation
Simple flat structure
Clear ownership

Risk 3: Over-Organization

Mitigation:

Maximum 3 levels deep
Flat topic hierarchy
Avoid premature categorization
Regular pruning

Risk 4: Search Quality

Mitigation:

Start with local embeddings (fast, free)
Upgrade to API if needed
Hybrid approach (local + API)
Regular search quality reviews

Recommendation

✅ Proceed with Knowledge Library

Why:

Solves real problem (fragmented knowledge)
Balances organization with simplicity
Enables vector search (strategic advantage)
Supports AI persistent memory
Scalable and maintainable

Next Steps:

Review and approve this proposal
Create knowledge/ structure
Migrate high-value content
Implement vector search
Establish practices

Questions for Review

Structure: Is the flat hierarchy appropriate?
Topics: Are the top-level topics comprehensive?
Search: Local embeddings vs. API vs. hybrid?
Migration: Phased approach acceptable?
Naming: File naming conventions clear?

Version History

| Version | Date | Author | Change |
| --- | --- | --- | --- |
| 0.1.0 | 2026-01-26 | Antigravity | Initial Audit & Metadata Injection |
