shelving/extract module

Turn a folder of source files into a navigable tree of documentation. extract is the engine behind the Shelving documentation site itself.

API references and hand-written guides usually live in separate tools. extract unifies them: it reads files and directories from disk and produces a single TreeElement tree describing every directory, file, and exported code symbol. A directory's README.md becomes that directory's intro page. A TypeScript file's exports become documented symbols. A name.md file sitting next to name.ts is merged onto the same page. One tree, one site, prose and source together.

Pair the tree with shelving/ui and you have a complete static documentation site, which is exactly how this site is built.

Concepts

Extractors

An Extractor converts an input into a TreeElement. Extractors are composable — an outer extractor delegates to inner ones.

| Extractor | Input | Output |
| --- | --- | --- |
| DirectoryExtractor | a directory path | a tree-element node, recursing into subdirectories |
| FileExtractor | a file | a tree-element node holding the raw text |
| MarkupExtractor | a .md file | a tree-element node with title taken from the first # heading |
| TypescriptExtractor | a .ts / .tsx file | a tree-element node whose children are the exported symbols |

DirectoryExtractor is the entry point. It walks a directory, dispatches each file to a FileExtractor by extension, and recurses into subdirectories.
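
The delegation described above can be sketched in a few lines. This is an illustration only: `SketchNode`, `SketchExtractor`, `FileNameExtractor`, and `ListingExtractor` are hypothetical stand-ins invented for this example, not the classes shelving exports, and the input is a list of file names rather than a real directory.

```typescript
// Hypothetical, simplified node shape; the real TreeElement carries more fields.
interface SketchNode {
  type: "tree-element";
  key: string;
  children: SketchNode[];
}

// An extractor converts an input of type I into an output of type O.
interface SketchExtractor<I, O> {
  extract(input: I): O;
}

// Inner extractor: turns one file name into a leaf node.
class FileNameExtractor implements SketchExtractor<string, SketchNode> {
  extract(name: string): SketchNode {
    const key = name.replace(/\.[^.]+$/, "").toLowerCase();
    return { type: "tree-element", key, children: [] };
  }
}

// Outer extractor: picks an inner extractor for each entry by file extension.
class ListingExtractor implements SketchExtractor<readonly string[], SketchNode> {
  private readonly byExtension: ReadonlyMap<string, SketchExtractor<string, SketchNode>>;

  constructor(byExtension: ReadonlyMap<string, SketchExtractor<string, SketchNode>>) {
    this.byExtension = byExtension;
  }

  extract(names: readonly string[]): SketchNode {
    const children: SketchNode[] = [];
    for (const name of names) {
      const extension = (name.split(".").pop() ?? "").toLowerCase();
      const inner = this.byExtension.get(extension);
      if (inner) children.push(inner.extract(name)); // Entries with no registered extractor are skipped.
    }
    return { type: "tree-element", key: "root", children };
  }
}

const files = new FileNameExtractor();
const listing = new ListingExtractor(new Map([["md", files], ["ts", files]]));
const sketchRoot = listing.extract(["README.md", "template.ts", "logo.png"]);
```

Here `logo.png` is dropped because no extractor is registered for `png`, which mirrors how the outer extractor decides what ends up in the tree.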

The tree

Every extractor produces a TreeElement (see shelving/util/tree). There are two element types:

  • tree-element — a directory or a file. A directory's content is absorbed from an index file; a file's children are its exported symbols (for TypeScript). Its source records the absolute path it came from.
  • tree-documentation — one documented symbol (function, class, type, constant), carrying the signatures, params, returns, throws, and examples parsed from its JSDoc.

The tree is plain, JSON-serialisable data. The docs build writes it to tree.json so the browser can fetch it and hydrate.
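
To make the two element types concrete, here is a small hand-built tree and a JSON round trip. The interfaces are trimmed-down assumptions for illustration (the real types live in shelving/util/tree and have more fields), and `renderTemplate` is a made-up symbol.

```typescript
// Hypothetical, trimmed-down shapes; not the library's actual type definitions.
interface DocNode {
  type: "tree-documentation";
  key: string;
  kind: "function" | "class" | "type" | "constant";
  signatures: string[];
}

interface ElementNode {
  type: "tree-element";
  key: string;
  title?: string;
  source?: string; // Absolute path the element came from.
  children: (ElementNode | DocNode)[];
}

const tree: ElementNode = {
  type: "tree-element",
  key: "util",
  title: "Utilities",
  source: "/abs/path/util",
  children: [
    {
      type: "tree-element",
      key: "template",
      source: "/abs/path/util/template.ts",
      children: [
        {
          type: "tree-documentation",
          key: "renderTemplate", // Made-up symbol name.
          kind: "function",
          signatures: ["renderTemplate(template: string): string"],
        },
      ],
    },
  ],
};

// Plain data survives a JSON round trip unchanged, which is what makes a tree.json file workable.
const json = JSON.stringify(tree);
const revived = JSON.parse(json) as ElementNode;
```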

Index files

A directory's index file — README.md by default — is absorbed into the directory element. Its title, description, and content become the directory's own. This is why writing a README.md is all it takes to give a directory an intro page.
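
A rough sketch of that absorption step, under stated assumptions: `DirNode` and `absorbIndex` are hypothetical names, keys are assumed to be lower-cased file names without an extension, and the real IndexExtractor may match and merge differently.

```typescript
// Hypothetical node shape for this sketch only.
interface DirNode {
  key: string;
  title?: string;
  description?: string;
  content?: string;
  children: DirNode[];
}

// Assumption: the index child is identified by its key.
const INDEX_KEY = /^readme$/i;

// Returns a copy of the directory with its index child folded into it.
function absorbIndex(dir: DirNode): DirNode {
  const index = dir.children.find(child => INDEX_KEY.test(child.key));
  const rest = dir.children.filter(child => child !== index).map(child => absorbIndex(child));
  if (!index) return { ...dir, children: rest };
  // Copy the index child's metadata, but not its key or children.
  const { key: _key, children: _children, ...meta } = index;
  return { ...dir, ...meta, children: rest };
}

const utilDir: DirNode = {
  key: "util",
  children: [
    { key: "readme", title: "Utilities", content: "Intro text.", children: [] },
    { key: "template", title: "template", children: [] },
  ],
};
const absorbed = absorbIndex(utilDir);
```

After the call, `absorbed` carries the README's title and content, and the README no longer appears as a separate child.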

Same-key merging

Files whose names slugify to the same key are merged into one element. template.md and template.ts both have the key template, so they become a single page: the Markdown supplies the prose, the TypeScript supplies the documented symbols. Markdown has the higher priority, so it wins on title.

This is the pattern behind every util/*.ts file paired with a name.md guide — the hand-written page and the extracted API land together.
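
The grouping half of this behaviour might look like the sketch below. Everything here is an assumption made for illustration: `toKey` is a simplistic slug (lower-cased name without extension), and the priority numbers are invented so that Markdown sorts first.

```typescript
// Assumption: the key is the lower-cased file name without its extension.
function toKey(fileName: string): string {
  return fileName.replace(/\.[^.]+$/, "").toLowerCase();
}

function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot < 0 ? "" : fileName.slice(dot + 1).toLowerCase();
}

// Assumption: a lower number means higher priority, so Markdown comes first.
const PRIORITY = new Map<string, number>([["md", 0], ["ts", 1], ["tsx", 1]]);

// Groups file names by key and orders each group from highest to lowest priority.
function groupByKey(fileNames: readonly string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const name of fileNames) {
    const key = toKey(name);
    const group = groups.get(key) ?? [];
    group.push(name);
    groups.set(key, group);
  }
  for (const group of groups.values()) {
    group.sort((a, b) => (PRIORITY.get(extensionOf(a)) ?? 99) - (PRIORITY.get(extensionOf(b)) ?? 99));
  }
  return groups;
}

const groups = groupByKey(["template.ts", "string.ts", "template.md"]);
```

Each group with more than one entry corresponds to one merged page, with the first entry supplying the title.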

Building a documentation site

The pipeline, end to end. docs/build.tsx and docs/render.tsx in this repository are the working reference; renderApp in docs/render.tsx is the best-annotated example.

1. Extract the tree

```ts
import { DirectoryExtractor } from "shelving/extract";

const root = await new DirectoryExtractor().extract("/path/to/modules");
```

root is a tree-element node — the whole project as one tree.

2. Render with <TreeApp>

<TreeApp> is the shell. Give it the tree and it produces a complete site — a sidebar menu, routing, and one page per element.

```tsx
import { TreeApp } from "shelving/ui";

<TreeApp tree={root} />
```

Internally <TreeApp> wires together:

  • <Navigation> — owns URL state and intercepts link clicks.
  • <Router> maps routes: / renders the root; /{...path} catches every deeper path.
  • <TreePage> — resolves the URL path to a tree element and renders it.

<TreePage> dispatches on element type: tree-element renders as <TreePage> and tree-documentation as <DocumentationPage>.
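
Resolving a URL path to a tree element amounts to walking the tree one key at a time. The sketch below shows that idea only; `PageNode` and `resolvePath` are hypothetical names, and the real component may resolve paths differently.

```typescript
// Hypothetical node shape for this sketch only.
interface PageNode {
  key: string;
  title: string;
  children: PageNode[];
}

// Walks from the root, matching each URL segment against a child key.
function resolvePath(root: PageNode, path: string): PageNode | undefined {
  const keys = path.split("/").filter(segment => segment.length > 0);
  let current: PageNode = root;
  for (const key of keys) {
    const next = current.children.find(child => child.key === key);
    if (!next) return undefined; // No such page.
    current = next;
  }
  return current;
}

const site: PageNode = {
  key: "",
  title: "Home",
  children: [
    { key: "util", title: "Utilities", children: [{ key: "template", title: "Template", children: [] }] },
  ],
};

const homePage = resolvePath(site, "/");
const templatePage = resolvePath(site, "/util/template");
const missingPage = resolvePath(site, "/util/missing");
```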

3. Build static pages

docs/build.tsx shows the production build: extract the tree, write tree.json, bundle the browser and server scripts, then render every page to static HTML. docs/render.tsx's renderApp enumerates every page from the canonical path keys of flattenTree() and writes one index.html per page. The browser later fetches tree.json and hydrates the same React tree the server rendered.
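
The page enumeration step can be pictured as follows. This is a sketch, not the code in docs/render.tsx: `listPagePaths` and `toOutputFile` are hypothetical helpers standing in for whatever flattenTree() and renderApp actually do, whose signatures are not shown here.

```typescript
// Hypothetical node shape for this sketch only.
interface SiteNode {
  key: string;
  children: SiteNode[];
}

// Lists the URL path of a node and of everything beneath it, depth first.
function listPagePaths(node: SiteNode, prefix = ""): string[] {
  const paths = [prefix === "" ? "/" : prefix];
  for (const child of node.children) {
    paths.push(...listPagePaths(child, `${prefix}/${child.key}`));
  }
  return paths;
}

// Maps a URL path to the static file that would serve it.
function toOutputFile(pagePath: string): string {
  return pagePath === "/" ? "index.html" : `${pagePath.slice(1)}/index.html`;
}

const siteTree: SiteNode = {
  key: "",
  children: [
    { key: "util", children: [{ key: "template", children: [] }] },
    { key: "extract", children: [] },
  ],
};

const pagePaths = listPagePaths(siteTree);
const outputFiles = pagePaths.map(toOutputFile);
```

Writing one index.html per path means any static file host can serve the site without rewrite rules.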

Customising

Controlling what gets indexed

DirectoryExtractor accepts a DirectoryExtractorOptions object:

```ts
import { DirectoryExtractor, MarkupExtractor } from "shelving/extract";

new DirectoryExtractor({
  index: [/^readme\.md$/i],                   // filenames treated as the directory index
  extractors: { md: new MarkupExtractor() },  // file extension → extractor
  ignore: [/\.test\.tsx?$/i],                 // entries to skip
  base: "/abs/path",                          // base for resolving relative paths
});
```

By default, test files, hidden files, underscore-prefixed files, and node_modules are skipped, and .md, .ts, .tsx, and .txt files are extracted.
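
As a rough illustration of those defaults, the filter below encodes the same rules with plain regular expressions. The patterns are assumptions written for this example; the library's actual DEFAULT_IGNORE and DEFAULT_EXTRACTORS values may be expressed differently.

```typescript
// Assumed patterns, one per rule named in the text above.
const SKETCH_IGNORE: readonly RegExp[] = [
  /\.test\.tsx?$/i, // test files
  /^\./,            // hidden files
  /^_/,             // underscore-prefixed files
  /^node_modules$/, // installed dependencies
];

// File extensions that are extracted by default, per the text above.
const SKETCH_EXTENSIONS = new Set(["md", "ts", "tsx", "txt"]);

// True if a file with this name would be extracted under the assumed defaults.
function isExtractedFile(entryName: string): boolean {
  if (SKETCH_IGNORE.some(pattern => pattern.test(entryName))) return false;
  const dot = entryName.lastIndexOf(".");
  return dot > 0 && SKETCH_EXTENSIONS.has(entryName.slice(dot + 1).toLowerCase());
}

const candidates = ["template.ts", "template.test.ts", ".env", "_draft.md", "node_modules", "logo.png", "notes.txt"];
const kept = candidates.filter(isExtractedFile);
```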

Overriding page renderers

The tree components render through mappings — wrap a subtree to swap a renderer for one element type:

  • <TreePageMapping> — the full-page component for an element type.
  • <TreeMenuMapping> — the sidebar menu renderer.
  • <TreeCardMapping> — the card renderer used in directory listings.

```tsx
<TreePageMapping mapping={{ "tree-element": MyTreePage }}>
  <TreeApp tree={root} />
</TreePageMapping>
```

Functions

mergeTreeElements() function

Merge two tree elements — primary keeps its identity (type, key, source); secondary contributes any metadata primary does not already have.

mergeTreeElements(primary: T, secondary: TreeElement): T
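
The merge rule can be sketched as below. This is not the library's implementation: `MergeNode` and `mergeSketch` are hypothetical, and the set of metadata fields (title, description, children) is an assumption chosen to keep the example short.

```typescript
// Hypothetical node shape for this sketch only.
interface MergeNode {
  type: string;
  key: string;
  source?: string;
  title?: string;
  description?: string;
  children?: MergeNode[];
}

// Primary keeps its identity (type, key, source); secondary only fills in missing metadata.
function mergeSketch<T extends MergeNode>(primary: T, secondary: MergeNode): T {
  return {
    ...primary,
    title: primary.title ?? secondary.title,
    description: primary.description ?? secondary.description,
    children: primary.children ?? secondary.children,
  };
}

const markdownPage: MergeNode = {
  type: "tree-element",
  key: "template",
  source: "/abs/path/template.md",
  title: "Templates",
};
const typescriptPage: MergeNode = {
  type: "tree-element",
  key: "template",
  source: "/abs/path/template.ts",
  title: "template",
  description: "String templating helpers.",
  children: [{ type: "tree-documentation", key: "renderTemplate" }], // Made-up symbol.
};

const mergedPage = mergeSketch(markdownPage, typescriptPage);
```

With the Markdown page as primary, the merged page keeps the hand-written title and source while gaining the description and symbols from the TypeScript page.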

extractMarkdownProps() function

Parse a markdown source string once and derive its title and description in a single pass.

extractMarkdownProps(text: string): { title: string | undefined; description: string | undefined }
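
A line-based approximation of this behaviour is sketched below. It is an assumption-laden stand-in, not the real parser: `sketchMarkdownProps` is a hypothetical name, the description is assumed to be the first paragraph of body text, and fenced code blocks are not handled.

```typescript
// Derives a title (first "# " heading) and a description (first paragraph) in one pass over the lines.
function sketchMarkdownProps(text: string): { title: string | undefined; description: string | undefined } {
  let title: string | undefined;
  let description: string | undefined;
  const paragraph: string[] = [];

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    const isHeading = /^#{1,6}\s/.test(line);
    if (isHeading || line === "") {
      // A heading or blank line closes the paragraph being collected.
      if (description === undefined && paragraph.length > 0) description = paragraph.join(" ");
      if (title === undefined && /^#\s+/.test(line)) title = line.replace(/^#\s+/, "").trim();
    } else if (description === undefined) {
      paragraph.push(line);
    }
    if (title !== undefined && description !== undefined) break; // Both found, stop early.
  }

  // The text may end while a paragraph is still open.
  if (description === undefined && paragraph.length > 0) description = paragraph.join(" ");
  return { title, description };
}

const props = sketchMarkdownProps("# Templates\n\nRender strings with placeholders.\nSecond line.\n\n## Usage\n\nMore text.");
const emptyProps = sketchMarkdownProps("");
```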

Classes

Extractor class

Base class for an extractor that converts an input of type I into a TreeElement output.

new Extractor<I, O>()

PackageExtractor class

Extractor that reads a package.json and produces a flat tree of modules — one kind: "module" DocumentationElement per export entry, in declaration order.

new PackageExtractor({ tree, extensions = DEFAULT_EXTENSIONS, module = new ModuleExtractor(), base }: PackageExtractorOptions)
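
The "one module per export entry, in declaration order" idea can be illustrated with a plain object standing in for a parsed package.json. `PackageJsonSketch` and `listModuleNames` are hypothetical, and the naming scheme (package name plus export subpath) is an assumption rather than confirmed PackageExtractor behaviour.

```typescript
// Minimal stand-in for the parts of package.json this sketch reads.
interface PackageJsonSketch {
  name: string;
  exports?: Record<string, unknown>;
}

// Lists one module name per export entry; string keys keep their declaration order.
function listModuleNames(pkg: PackageJsonSketch): string[] {
  return Object.keys(pkg.exports ?? {}).map(subpath =>
    subpath === "." ? pkg.name : `${pkg.name}/${subpath.replace(/^\.\//, "")}`,
  );
}

const moduleNames = listModuleNames({
  name: "shelving",
  exports: {
    "./extract": "./extract/index.js", // Entry paths here are made up.
    "./ui": "./ui/index.js",
    "./util/tree": "./util/tree.js",
  },
});
```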

IndexExtractor class

Through extractor that absorbs each element's index child into the element itself.

new IndexExtractor<I>(source: Extractor<I, TreeElement>, { index = DEFAULT_INDEX }: IndexExtractorOptions = {})

ThroughExtractor class

Base class for an extractor that wraps another extractor.

new ThroughExtractor<I, O>(source: Extractor<I, O>)

TypescriptExtractor class

File extractor that parses a TypeScript source file into a tree element.

new TypescriptExtractor()

MergingExtractor class

Through extractor that walks a tree of tree-element nodes and merges sibling tree elements whose keys match a merges template pair.

new MergingExtractor<I>(source: Extractor<I, TreeElement>, { merges = DEFAULT_MERGES }: MergingExtractorOptions = {})

DirectoryExtractor class

Extractor that walks a directory on disk and produces a tree of tree-element nodes.

new DirectoryExtractor({ extractors = DEFAULT_EXTRACTORS, base, ignore = DEFAULT_IGNORE }: DirectoryExtractorOptions = {})

Interfaces

PackageExtractorOptions interface

Options for a PackageExtractor.

{
	readonly tree: TreeElement;
	readonly extensions?: ImmutableDictionary<readonly string[]>;
	readonly module?: ModuleExtractor;
	readonly base?: AbsolutePath;
}

IndexExtractorOptions interface

Options for an IndexExtractor.

{
	readonly index?: Matchables;
}

MergingExtractorOptions interface

Options for a MergingExtractor.

{
	readonly merges?: ImmutableDictionary<readonly string[]>;
}

DirectoryExtractorOptions interface

Options for a DirectoryExtractor.

{
	readonly extractors?: ImmutableDictionary<FileExtractor>;
	readonly base?: AbsolutePath;
	readonly ignore?: Matchables;
}

ModuleExtractorInput interface

Input for a ModuleExtractor.

{
	readonly name: string;
	readonly source: TreeElement;
}