
Architecture Deep Dive

Programmable Markdown Document Model

System Overview

Spry transforms Markdown into executable workflows through a multi-stage pipeline. Understanding this architecture helps you leverage Spry's full capabilities and extend it for your needs.

Pipeline Philosophy

Spry's design separates concerns into four distinct phases: Parse, Analyze, Project, and Execute. Each phase builds upon the previous one, creating a flexible and extensible system.
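One way to read the four phases is as a chain of typed functions, each consuming the output of the one before it. The sketch below is a toy illustration only: every type and function name in it (`Mdast`, `PhaseGraph`, `parse`, `analyze`, `project`, `execute`) is a hypothetical stand-in, not Spry's actual API, and the "parser" simply splits lines.

```typescript
// Hypothetical stand-ins for the data each phase hands to the next.
type Mdast = { type: "root"; children: { type: string; value?: string }[] };
type PhaseEdge = { rel: string; from: number; to: number };
type PhaseGraph = { root: Mdast; edges: PhaseEdge[] };
type DomainModel = { tasks: string[] };
type Artifacts = { outputs: string[] };

// Phase 1: Parse. A toy parser that treats each non-empty line as a node.
const parse = (markdown: string): Mdast => ({
  type: "root",
  children: markdown
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => ({
      type: line.startsWith("#") ? "heading" : "paragraph",
      value: line,
    })),
});

// Phase 2: Analyze. Link every non-heading node to the closest preceding heading.
const analyze = (root: Mdast): PhaseGraph => {
  const edges: PhaseEdge[] = [];
  let lastHeading = -1;
  root.children.forEach((node, index) => {
    if (node.type === "heading") lastHeading = index;
    else if (lastHeading >= 0) {
      edges.push({ rel: "containedInSection", from: index, to: lastHeading });
    }
  });
  return { root, edges };
};

// Phase 3: Project. Turn the contained nodes into task names.
const project = (graph: PhaseGraph): DomainModel => ({
  tasks: graph.edges.map((e) => graph.root.children[e.from]?.value ?? ""),
});

// Phase 4: Execute / Emit. "Run" each task by echoing it.
const execute = (model: DomainModel): Artifacts => ({
  outputs: model.tasks.map((t) => `ran: ${t}`),
});

const artifacts = execute(project(analyze(parse("# Build\ncompile\ntest"))));
console.log(artifacts.outputs);
```

Because each phase only depends on the shape of its input, a phase can be swapped or extended without touching the others, which is the separation of concerns the pipeline aims for.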

SPRY EXECUTION PIPELINE

1. PARSE: Markdown (.md) → remark + plugins → mdast (AST)
2. ANALYZE (Axiom): mdast (AST) → Edge Rules Pipeline → Graph (Nodes + Edges)
3. PROJECT: Graph → Projection (Flexible, Playbook) → Domain Model
4. EXECUTE / EMIT: Domain Model → Task DAG Executor → Artifacts (SQL, HTML, Output)

Phase 1: Parse

The parsing phase converts Markdown text into an Abstract Syntax Tree (AST) using the unified/remark ecosystem.

Location

lib/axiom/io/mod.ts

Components

import { mardownParserPipeline } from "./lib/axiom/io/mod.ts";

const pipeline = mardownParserPipeline();
// Uses unified() with a series of plugins

Plugin Pipeline

The parser runs a carefully ordered sequence of plugins, each building on the previous:

| Order | Plugin | Purpose |
|-------|--------|---------|
| 1 | remarkParse | Parse Markdown to mdast |
| 2 | remarkFrontmatter | Extract YAML frontmatter |
| 3 | remarkDirective | Parse :x, ::x, :::x directives |
| 4 | docFrontmatter | Parse and store document YAML |
| 5 | remarkGfm | GitHub Flavored Markdown |
| 6 | resolveImportSpecs | Find importable code blocks |
| 7 | insertImportPlaceholders | Generate imported cells |
| 8 | nodeDecorator | Transform @id to decorators |
| 9 | codeDirectiveCandidates | Identify code directives |
| 10 | actionableCodeCandidates | Mark executable blocks |

Plugin Order Matters

Plugins execute in sequence, with each building on the AST transformations of previous plugins. For example, nodeDecorator must run before actionableCodeCandidates to properly identify executable cells.
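The ordering constraint can be illustrated with a dependency-free toy. The two functions below are hypothetical stand-ins for `nodeDecorator` and `actionableCodeCandidates`; the assumption that the second reads an `id` written by the first is made up for the demonstration and may not match how Spry's plugins actually communicate.

```typescript
// Toy node and plugin types; Spry's real plugins operate on mdast via unified.
type ToyNode = { lang: string; meta: string; data: Record<string, unknown> };
type ToyPlugin = (nodes: ToyNode[]) => void;

// Stand-in for nodeDecorator: records an "@id" token found in the meta string.
const decorate: ToyPlugin = (nodes) => {
  for (const n of nodes) {
    const match = n.meta.match(/@(\w+)/);
    if (match) n.data["id"] = match[1];
  }
};

// Stand-in for actionableCodeCandidates: only nodes that already carry an
// id (set by the earlier plugin) are marked as actionable.
const markActionable: ToyPlugin = (nodes) => {
  for (const n of nodes) {
    n.data["actionable"] = typeof n.data["id"] === "string";
  }
};

// Apply plugins strictly in array order, on a copy of the input nodes.
function runPlugins(plugins: ToyPlugin[], nodes: ToyNode[]): ToyNode[] {
  const copy = nodes.map((n) => ({ ...n, data: { ...n.data } }));
  for (const plugin of plugins) plugin(copy);
  return copy;
}

const cells: ToyNode[] = [{ lang: "bash", meta: "@build", data: {} }];

const rightOrder = runPlugins([decorate, markActionable], cells);
const wrongOrder = runPlugins([markActionable, decorate], cells);

console.log(rightOrder[0]?.data["actionable"]); // true
console.log(wrongOrder[0]?.data["actionable"]); // false
```

In the wrong order the cell still receives its id, but too late: the marking step has already run and seen nothing to act on.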

Output

An mdast Root node with enhanced data properties on relevant nodes, ready for semantic analysis.

Phase 2: Analyze (Axiom)

Axiom applies edge rules to build a semantic graph representing the relationships between elements in your Spryfile.

Location

lib/axiom/edge/

Edge Rules

Rules are functions that process the AST and yield relationship edges:

type GraphEdgesRule<Rel, Ctx, Edge> = (
  ctx: Ctx,
  prevEdges: Iterable<Edge>
) => Iterable<Edge> | false;

Core Rules

| Rule | Relationship | Purpose |
|------|--------------|---------|
| containedInSection | containedInSection | Section hierarchy |
| parentHeading | parentHeading | Heading relationships |
| sectionSemanticId | sectionHasSemanticId | @id decorators |
| frontmatterClassification | hasFrontmatter | Document frontmatter |
| nodeDependency | dependsOn | Task dependencies |

Rule Composition

Rules can be composed and extended without modifying existing code. This makes it easy to add custom semantic relationships for domain-specific use cases.

Rule Execution

function* astGraphEdges(root, { prepareContext, rules }) {
  const ctx = prepareContext(root);

  let current = [];
  for (const rule of rules(ctx)) {
    const produced = rule(ctx, current);
    if (produced !== false) {
      current = produced;
    }
  }

  for (const edge of current) {
    yield edge;
  }
}
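To see the rule loop in action, here is a self-contained toy version with two rules. The context shape (`ToyCtx`), the edge shape (`ToyEdge`), and both rule bodies are assumptions made for illustration; only the control flow mirrors `astGraphEdges` above, including the convention that a rule returning `false` leaves the previous edges in place.

```typescript
// Toy types; Spry's real context and edge types are richer.
type ToyEdge = { rel: string; from: string; to: string };
type ToyCtx = { tasks: { id: string; deps: string[] }[]; sections: string[] };
type ToyRule = (ctx: ToyCtx, prev: Iterable<ToyEdge>) => Iterable<ToyEdge> | false;

// Passes earlier edges through, then emits one "dependsOn" edge per dependency.
function* dependsOnRule(ctx: ToyCtx, prev: Iterable<ToyEdge>): Generator<ToyEdge> {
  yield* prev;
  for (const task of ctx.tasks) {
    for (const dep of task.deps) {
      yield { rel: "dependsOn", from: task.id, to: dep };
    }
  }
}

// Returns false when it has nothing to add; the loop then keeps the previous edges.
function sectionRule(ctx: ToyCtx, prev: Iterable<ToyEdge>): Iterable<ToyEdge> | false {
  if (ctx.sections.length === 0) return false;
  const added = ctx.sections.map((s) => ({ rel: "containedInSection", from: s, to: "root" }));
  return [...Array.from(prev), ...added];
}

// Same control flow as astGraphEdges, with the context passed in directly.
function* runRules(ctx: ToyCtx, rules: ToyRule[]): Generator<ToyEdge> {
  let current: Iterable<ToyEdge> = [];
  for (const rule of rules) {
    const produced = rule(ctx, current);
    if (produced !== false) current = produced;
  }
  yield* current;
}

const toyCtx: ToyCtx = {
  tasks: [
    { id: "build", deps: [] },
    { id: "test", deps: ["build"] },
    { id: "deploy", deps: ["test"] },
  ],
  sections: [],
};

const toyEdges = Array.from(runRules(toyCtx, [dependsOnRule, sectionRule]));
console.log(toyEdges.map((e) => `${e.from} -${e.rel}-> ${e.to}`));
```

Note that each rule's output replaces `current`, so a rule that wants to keep earlier edges has to pass them through itself, as both toy rules do here.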

Output

A Graph object containing:

  • root - The mdast Root node
  • edges - Array of relationship edges
  • rels - Set of relationship types
  • relCounts - Count per relationship type
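As a rough illustration of how `rels` and `relCounts` relate to `edges`, the sketch below derives both from a plain edge array. The edge shape and the use of a `Map` for the counts are assumptions; Spry's actual Graph type may store these differently.

```typescript
// Hypothetical edge shape for the example.
type RelEdge = { rel: string; from: string; to: string };

// Derive the set of relationship types and the per-type counts from the edges.
function summarizeEdges(edges: RelEdge[]) {
  const relCounts = new Map<string, number>();
  for (const edge of edges) {
    relCounts.set(edge.rel, (relCounts.get(edge.rel) ?? 0) + 1);
  }
  return { edges, rels: new Set(relCounts.keys()), relCounts };
}

const summary = summarizeEdges([
  { rel: "dependsOn", from: "test", to: "build" },
  { rel: "dependsOn", from: "deploy", to: "test" },
  { rel: "containedInSection", from: "build", to: "setup" },
]);

console.log(Array.from(summary.rels)); // [ "dependsOn", "containedInSection" ]
console.log(summary.relCounts.get("dependsOn")); // 2
```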

Phase 3: Project

Projections transform the semantic graph into domain-specific models optimized for different use cases.

Location

lib/axiom/projection/

Available Projections

FlexibleProjection

Provides a relational view of the document structure:

const model = await flexibleProjectionFromFiles(paths);

// Returns:
{
  documents: [...],      // Document metadata
  nodes: {...},          // Node lookup by ID
  hierarchies: {...},    // Section tree structures
  mdastStore: [...],     // AST node storage
}

Use Cases:

  • Document analysis and querying
  • Building custom tooling
  • Exploring document structure
  • Creating visualizations

PlaybookProjection

Creates an executable task model for runbook operations:

const { tasks, directives, issues } = await playbooksFromFiles(paths);

// Returns:
{
  sources: [...],           // Source documents
  executables: [...],       // Executable cells
  materializables: [...],   // Materializable cells
  directives: [...],        // Directive cells
  tasks: [...],             // ExecutableTask objects
  issues: [...],            // Validation issues
}

Use Cases:

  • Running automated workflows
  • Executing multi-step processes
  • Building CI/CD pipelines
  • Creating interactive runbooks

Task Classification

Tasks are categorized by their nature and intended use:

| Nature | Purpose | Examples |
|--------|---------|----------|
| EXECUTABLE | Run and capture output | bash, deno, python |
| MATERIALIZABLE | Emit as files | sql, html, json |
| DIRECTIVE | Control behavior | PARTIAL, HEAD, TAIL |

Extensible Classification

The task classification system is extensible. You can define custom natures and executors for domain-specific languages or tools.
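A lookup-table approach is one simple way to picture the classification. The sketch below is hypothetical: the `classifyCell` function, the idea of detecting directives from a leading keyword, and the `Map`-based table are all assumptions for illustration, with only the natures and example languages taken from the table above.

```typescript
type TaskNature = "EXECUTABLE" | "MATERIALIZABLE" | "DIRECTIVE";

// Language-to-nature table seeded with the examples from the documentation.
const natureByLang = new Map<string, TaskNature>([
  ["bash", "EXECUTABLE"],
  ["deno", "EXECUTABLE"],
  ["python", "EXECUTABLE"],
  ["sql", "MATERIALIZABLE"],
  ["html", "MATERIALIZABLE"],
  ["json", "MATERIALIZABLE"],
]);

// Assumed convention: a directive is recognized by its leading keyword.
const directiveKeywords = new Set(["PARTIAL", "HEAD", "TAIL"]);

function classifyCell(lang: string, firstToken?: string): TaskNature | undefined {
  if (firstToken !== undefined && directiveKeywords.has(firstToken)) return "DIRECTIVE";
  return natureByLang.get(lang.toLowerCase());
}

console.log(classifyCell("bash")); // EXECUTABLE
console.log(classifyCell("sql", "PARTIAL")); // DIRECTIVE
console.log(classifyCell("mylang")); // undefined
```

With a table like this, extending the classification amounts to adding an entry, for example `natureByLang.set("mylang", "EXECUTABLE")`.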

Phase 4: Execute / Emit

The final phase executes tasks or emits artifacts using a DAG-based execution engine.

Location

lib/universal/task.ts, lib/axiom/orchestrate/task.ts

Execution Plan

const plan = executionPlan(tasks);

// Plan includes:
{
  ids: [...],         // Task identifiers
  byId: {...},        // Quick lookup map
  layers: [...],      // Parallel execution waves
  dag: [...],         // Topological order
  edges: [...],       // Dependency edges
  unresolved: [...],  // Cycle detection
}

Dependency Resolution

The execution planner uses Kahn's algorithm for topological sorting. Circular dependencies are detected and reported as errors before execution begins.
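A minimal sketch of layered Kahn's algorithm follows. It is not Spry's `executionPlan` implementation: the `PlanTask` and `ToyPlan` types and the `planTasks` function are hypothetical, and the plan carries only a subset of the fields listed above. Ties within a layer are broken by definition order, and any task that never reaches indegree 0 ends up in `unresolved`.

```typescript
type PlanTask = { id: string; deps: string[] };

type ToyPlan = {
  layers: string[][]; // tasks that may run together, wave by wave
  dag: string[]; // full topological order
  unresolved: string[]; // tasks left over, e.g. because of a cycle
};

function planTasks(tasks: PlanTask[]): ToyPlan {
  // Indegrees live in a separate map, so the task list itself is never modified.
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const task of tasks) {
    indegree.set(task.id, task.deps.length);
    for (const dep of task.deps) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), task.id]);
    }
  }

  // Definition order is used to break ties inside a layer.
  const order = new Map<string, number>();
  tasks.forEach((task, index) => order.set(task.id, index));
  const byDefinition = (a: string, b: string) =>
    (order.get(a) ?? 0) - (order.get(b) ?? 0);

  const layers: string[][] = [];
  // Root tasks: no dependencies, so indegree is 0.
  let frontier = tasks.filter((t) => t.deps.length === 0).map((t) => t.id);
  while (frontier.length > 0) {
    layers.push(frontier);
    const next: string[] = [];
    for (const id of frontier) {
      for (const child of dependents.get(id) ?? []) {
        const remaining = (indegree.get(child) ?? 0) - 1;
        indegree.set(child, remaining);
        if (remaining === 0) next.push(child);
      }
    }
    frontier = next.sort(byDefinition);
  }

  const dag = ([] as string[]).concat(...layers);
  const done = new Set(dag);
  const unresolved = tasks.filter((t) => !done.has(t.id)).map((t) => t.id);
  return { layers, dag, unresolved };
}

const okPlan = planTasks([
  { id: "build", deps: [] },
  { id: "lint", deps: [] },
  { id: "test", deps: ["build"] },
  { id: "deploy", deps: ["test", "lint"] },
]);
console.log(okPlan.layers); // [ [ "build", "lint" ], [ "test" ], [ "deploy" ] ]

const cyclicPlan = planTasks([
  { id: "a", deps: ["b"] },
  { id: "b", deps: ["a"] },
]);
console.log(cyclicPlan.unresolved); // [ "a", "b" ]
```

Only the frontier is sorted, so the overall cost stays close to linear in tasks plus dependencies. In this sketch a dependency on an unknown task id also leaves the dependent task in `unresolved`.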

DAG Execution

The task executor follows the steps of Kahn's algorithm:

1. Find Root Tasks: Identify all tasks with no unmet dependencies (indegree = 0).
2. Execute Available Tasks: Run all tasks in the current layer (optionally in parallel).
3. Update Dependencies: Mark completed tasks as done and reduce the indegree of dependent tasks.
4. Repeat or Complete: Continue until all tasks are complete or a cycle is detected.

Task Runner

const runbook = tasksRunbook({ directives, shellBus, tasksBus });
const results = await runbook.execute(plan);

Execution Features:

  • Event-driven progress reporting
  • Streaming output capture
  • Error handling and recovery
  • Parallel execution support (via layers)
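The features listed above can be pictured with a small layer runner. This is a sketch under stated assumptions, not the `tasksRunbook` implementation: `LayerTask`, `TaskEvent`, and `runLayers` are hypothetical names, events are delivered through a plain callback rather than an event bus, and a failing task simply aborts the run.

```typescript
type LayerTask = { id: string; run: () => Promise<string> };
type TaskEvent = { type: "start" | "done" | "error"; id: string; detail?: string };

// Run layers one after another; tasks inside a layer run concurrently.
async function runLayers(
  layers: LayerTask[][],
  onEvent: (event: TaskEvent) => void,
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  for (const layer of layers) {
    // Tasks in one layer do not depend on each other, so Promise.all is safe here.
    await Promise.all(
      layer.map(async (task) => {
        onEvent({ type: "start", id: task.id });
        try {
          const output = await task.run();
          results.set(task.id, output);
          onEvent({ type: "done", id: task.id, detail: output });
        } catch (err) {
          onEvent({ type: "error", id: task.id, detail: String(err) });
          throw err; // abort before any later layer starts
        }
      }),
    );
  }
  return results;
}

const demoLayers: LayerTask[][] = [
  [
    { id: "build", run: async () => "built" },
    { id: "lint", run: async () => "clean" },
  ],
  [{ id: "test", run: async () => "passed" }],
];

runLayers(demoLayers, (e) => console.log(e.type, e.id)).then((results) =>
  console.log(Array.from(results.keys()))
);
```

Running serially instead only requires replacing `Promise.all` with a plain `for` loop over the layer, which trades throughput for easier debugging.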

Library Structure

Understanding Spry's module organization helps you navigate the codebase and extend functionality.

lib/axiom

The semantic graph engine and core transformation pipeline.

lib/axiom/
├── mod.ts              # Public exports
├── graph.ts            # Graph building
├── edge/               # Edge rules
│   ├── mod.ts
│   ├── orchestrate.ts  # Rule pipeline
│   ├── rule/           # Individual rules
│   └── pipeline/       # Rule compositions
├── io/                 # I/O and parsing
│   ├── mod.ts
│   └── resource.ts
├── mdast/              # AST utilities
│   ├── data-bag.ts
│   ├── node-issues.ts
│   └── ...
├── projection/         # Graph projections
│   ├── flexible.ts
│   ├── playbook.ts
│   └── tree.ts
├── remark/             # Remark plugins
│   ├── actionable-code-candidates.ts
│   ├── code-directive-candidates.ts
│   └── ...
├── text-ui/            # Terminal interfaces
└── web-ui/             # Web interface

lib/universal

Shared utilities used across the system.

lib/universal/
├── task.ts             # DAG execution
├── shell.ts            # Shell commands
├── resource.ts         # Resource loading
├── code.ts             # Code parsing
├── directive.ts        # Directive parsing
├── event-bus.ts        # Event system
├── watcher.ts          # File watching
└── ...

lib/courier

Data movement protocol implementations.

lib/courier/
├── protocol.ts         # DataMP protocol
├── singer.ts           # Singer adapter
└── airbyte.ts          # Airbyte adapter

lib/playbook

Domain-specific playbook implementations.

lib/playbook/
├── README.md           # Architecture docs
└── sqlpage/
    ├── cli.ts          # SQLPage CLI
    ├── content.ts      # Content generation
    ├── interpolate.ts  # Template interpolation
    └── orchestrate.ts  # Orchestration

Data Flow Examples

Understanding data flow through the pipeline helps you debug issues and optimize workflows.

Markdown to Tasks

Spryfile.md
  → markdownASTs() → mdast Root + VFile
  → graph() → Graph {edges, nodes}
  → playbooksFromFiles() → PlaybookProjection {tasks, directives}
  → executionPlan() → TaskExecutionPlan {layers, dag}
  → tasksRunbook().execute() → ExecutionResults

Markdown to SQLPage

Spryfile.md
  → sqlPagePlaybook() → SqlPagePlaybookProjection
  → sqlPageFiles() → SqlPageContent[]
  → materializeFs() → dev-src.auto/*.sql
  → sqlPageFilesUpsertDML() → SQL INSERT statements

Multiple Output Paths

The SQLPage flow demonstrates how Spry can emit artifacts in multiple formats from a single source - both as filesystem files for development and as database records for production deployment.
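The idea of two output paths from one source can be sketched as two small functions over the same content array. Everything here is an assumption for illustration: the `PageContent` shape, the function names, the `sqlpage_files` table with `path` and `contents` columns, and the use of plain `INSERT` statements (the real `sqlPageFilesUpsertDML()` is named for upserts and may generate different SQL).

```typescript
type PageContent = { path: string; contents: string };

// Escape single quotes for use inside a SQL string literal.
const sqlLiteral = (text: string): string => `'${text.replace(/'/g, "''")}'`;

// Output path 1: a map of file path to contents, ready to be written to disk.
function toFileMap(pages: PageContent[], rootDir: string): Map<string, string> {
  const files = new Map<string, string>();
  for (const page of pages) files.set(`${rootDir}/${page.path}`, page.contents);
  return files;
}

// Output path 2: one INSERT statement per page, for loading into a database.
function toInsertDML(pages: PageContent[], table = "sqlpage_files"): string[] {
  return pages.map(
    (page) =>
      `INSERT INTO ${table} (path, contents) VALUES (${sqlLiteral(page.path)}, ${sqlLiteral(page.contents)});`,
  );
}

const pages: PageContent[] = [
  { path: "index.sql", contents: "SELECT 'Hello' AS title;" },
];

console.log(Array.from(toFileMap(pages, "dev-src.auto").keys()));
console.log(toInsertDML(pages)[0]);
```

Both functions read the same `pages` array, so the filesystem view and the database view cannot drift apart as long as they are generated together.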

Design Principles

Spry's architecture embodies several key design principles that guide development and usage.

  • Determinism - Same input always produces the same output through stable sorting and reproducible graph generation.
  • Composability - Components mix freely: projections stack, rules extend, and playbooks share infrastructure.
  • Type Safety - TypeScript throughout, with Zod schemas for validation and runtime type checking.
  • Extensibility - Plugin architecture enables custom remark plugins, edge rules, projections, and executors.

Determinism

The same input always produces the same output:

  • Stable topological ordering - Tasks execute in a predictable sequence
  • Definition-order tie-breaking - When multiple orders are valid, uses document order
  • Reproducible graph generation - AST transformations are pure functions

Composability

Components mix freely without tight coupling:

  • Projections stack - Multiple views of the same graph
  • Rules extend - Add new relationships without modifying existing ones
  • Playbooks share infrastructure - Reuse executors and utilities

Type Safety

TypeScript provides strong guarantees:

  • Zod schemas - Runtime validation of configurations and data
  • Generic types - Flexible while maintaining type safety
  • Runtime checking - Catch errors before they cause problems

Extensibility

Plugin architecture throughout:

  • Custom remark plugins - Transform Markdown in new ways
  • Custom edge rules - Define new semantic relationships
  • Custom projections - Create domain-specific models
  • Custom executors - Support new languages and tools

Performance Considerations

Understanding performance characteristics helps you build efficient workflows.

Planning

  • O(V + E) complexity - Linear in vertices and edges
  • Stable sorting - Limited to frontier nodes only
  • Non-destructive indegree - Preserves original graph structure

Execution

  • Serial by default - Predictable and debuggable
  • Parallelization available - Via execution layers when safe
  • Event-driven reporting - Non-blocking progress updates
  • Streaming output - Memory-efficient capture of large outputs

Memory

  • AST nodes shared - Not copied, reducing memory footprint
  • Lazy iteration - Process data on-demand where possible
  • VFile-based resources - Efficient file handling

Large Workflows

For workflows with hundreds of tasks, consider breaking them into smaller Spryfiles that can be composed together. This improves both performance and maintainability.

Extending Spry

Spry's architecture makes it straightforward to add new capabilities.

Adding a Custom Language

Define the Language Nature

Determine if it's EXECUTABLE, MATERIALIZABLE, or DIRECTIVE:

const myLang = {
  nature: "EXECUTABLE" as const,
  lang: "mylang",
  extensions: [".mylang"]
};

Create an Executor

Implement the execution logic:

async function executeMyLang(cell: CodeCell, context: ExecutionContext) {
  // Your execution logic here
  const result = await runMyLanguage(cell.code);
  return { stdout: result, stderr: "", exitCode: 0 };
}

Register the Language

Add it to the language registry:

languageRegistry.register(myLang, executeMyLang);
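The shape of `languageRegistry` is not shown here, so the following is only a guess at what a minimal registry could look like. `ToyLanguageRegistry`, `LangSpec`, and `LangExecutor` are hypothetical names, and the executor takes a plain code string rather than the `CodeCell` and `ExecutionContext` used above, purely to keep the example self-contained.

```typescript
type LangSpec = {
  nature: "EXECUTABLE" | "MATERIALIZABLE" | "DIRECTIVE";
  lang: string;
  extensions: string[];
};
type ExecResult = { stdout: string; stderr: string; exitCode: number };
type LangExecutor = (code: string) => Promise<ExecResult>;

// A Map-backed registry keyed by language name.
class ToyLanguageRegistry {
  private readonly entries = new Map<string, { spec: LangSpec; run: LangExecutor }>();

  register(spec: LangSpec, run: LangExecutor): void {
    if (this.entries.has(spec.lang)) {
      throw new Error(`language already registered: ${spec.lang}`);
    }
    this.entries.set(spec.lang, { spec, run });
  }

  // Unknown languages produce an error result instead of throwing.
  async execute(lang: string, code: string): Promise<ExecResult> {
    const entry = this.entries.get(lang);
    if (!entry) {
      return { stdout: "", stderr: `no executor for ${lang}`, exitCode: 127 };
    }
    return entry.run(code);
  }
}

const registry = new ToyLanguageRegistry();
registry.register(
  { nature: "EXECUTABLE", lang: "shout", extensions: [".shout"] },
  async (code) => ({ stdout: code.toUpperCase(), stderr: "", exitCode: 0 }),
);

registry.execute("shout", "hello").then((r) => console.log(r.stdout)); // HELLO
```

Rejecting duplicate registrations is a design choice in this sketch; a registry that allows overrides would replace the `throw` with a plain `set`.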

Adding a Custom Projection

export function myCustomProjection(graph: Graph): MyDomainModel {
  // Extract relevant nodes and edges
  const relevantNodes = Array.from(graph.nodes())
    .filter((node) => true /* replace with your criteria */);

  // Build your domain model from relevantNodes
  return {
    // Your custom structure
  };
}

Adding a Custom Edge Rule

export function* myCustomRule(
  ctx: GraphContext,
  prevEdges: Iterable<Edge>
): Iterable<Edge> {
  // Analyze the AST
  const nodes = selectNodes(ctx.root);

  // Yield new edges
  for (const node of nodes) {
    // findRelatedNode is a placeholder for your own lookup logic
    const relatedNode = findRelatedNode(ctx, node);
    if (!relatedNode) continue;

    yield {
      source: node.id,
      target: relatedNode.id,
      relationship: "myRelationship"
    };
  }

  // Pass through previous edges
  yield* prevEdges;
}

Understanding Enables Mastery

With this architectural knowledge, you can leverage Spry's full power, debug complex issues, and extend it for your unique needs.
