Warning
This is an internal project, and is not intended for public use. No support or stability guarantees are provided.
The parseSource utility parses source code into HAST (Hypertext Abstract Syntax Tree) nodes with syntax highlighting using Starry Night. It converts code into highlighted HTML structures for display in documentation and demos.
import { createParseSource } from '@mui/internal-docs-infra/pipeline/parseSource';
// Initialize the parser (do this once, typically at app startup)
const parseSource = await createParseSource();
// Parse and highlight JavaScript code
const highlighted = parseSource('const x = 42;', 'example.js');
// Use the HAST tree for rendering (e.g., with hastToReact)
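The parser automatically:
- Detects the language from the file extension
- Applies syntax highlighting using Starry Night
- Adds line gutters to the resulting tree
- Returns plain text for unsupported file extensions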
import { createParseSource } from '@mui/internal-docs-infra/pipeline/parseSource';
const parseSource = await createParseSource();
// JavaScript
const jsCode = parseSource('const x = 42;', 'example.js');
// TypeScript
const tsCode = parseSource('interface User { name: string; }', 'types.ts');
// CSS
const cssCode = parseSource('.button { color: blue; }', 'styles.css');
// HTML
const htmlCode = parseSource('<div>Hello</div>', 'index.html');
import { createParseSource } from '@mui/internal-docs-infra/pipeline/parseSource';
const parseSource = await createParseSource();
const files = [
  { name: 'App.tsx', content: 'export default function App() {}' },
  { name: 'styles.css', content: '.app { margin: 0; }' },
  { name: 'utils.ts', content: 'export const helper = () => {};' },
];
const highlighted = files.map((file) => ({
  name: file.name,
  hast: parseSource(file.content, file.name),
}));
For unsupported file extensions, the parser returns plain text:
const parseSource = await createParseSource();
// Unsupported extension - returns plain text node
const result = parseSource('Some content', 'file.xyz');
// {
// type: 'root',
// children: [{ type: 'text', value: 'Some content' }]
// }
// File without extension - also returns plain text
const readme = parseSource('# README', 'README');
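The fallback above can be pictured as an extension lookup that returns a single text node when no grammar matches. The following is a minimal sketch, not the package's implementation: `EXTENSION_TO_SCOPE`, `getExtension`, and `plainTextFallback` are hypothetical names introduced for illustration, and the table holds only a few sample entries.

```typescript
// Minimal HAST shapes for this sketch (the real types come from the `hast` package).
type SketchText = { type: 'text'; value: string };
type SketchRoot = { type: 'root'; children: SketchText[] };

// Hypothetical extension-to-scope table; the package's real map is larger.
const EXTENSION_TO_SCOPE: Record<string, string> = {
  '.js': 'source.js',
  '.ts': 'source.ts',
  '.css': 'source.css',
};

// Returns the extension including the dot, or undefined when there is none.
function getExtension(fileName: string): string | undefined {
  const dot = fileName.lastIndexOf('.');
  return dot > 0 ? fileName.slice(dot) : undefined;
}

// Returns a plain-text root when no grammar is known, otherwise null to
// signal that the caller should run the real highlighter.
function plainTextFallback(source: string, fileName: string): SketchRoot | null {
  const extension = getExtension(fileName);
  const scope = extension ? EXTENSION_TO_SCOPE[extension] : undefined;
  if (scope) {
    return null;
  }
  return { type: 'root', children: [{ type: 'text', value: source }] };
}
```

Under these assumptions, `plainTextFallback('Some content', 'file.xyz')` yields the same root-with-one-text-node shape shown in the comment above, and a file name without an extension takes the same path.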
After initialization, parseSource can be used directly from anywhere:
import { createParseSource, parseSource } from '@mui/internal-docs-infra/pipeline/parseSource';
// Initialize once (e.g., in app startup)
await createParseSource();
// Use the global instance anywhere in your code
function highlightCode(code: string, fileName: string) {
  return parseSource(code, fileName); // Works without re-initialization
}
import { createParseSource } from '@mui/internal-docs-infra/pipeline/parseSource';
import { hastToReact } from '@mui/internal-docs-infra/hastUtils';
const parseSource = await createParseSource();
function HighlightedCode({ code, fileName }: { code: string; fileName: string }) {
  const hast = parseSource(code, fileName);
  const jsx = hastToReact(hast);
  return <pre>{jsx}</pre>;
}
- createParseSource() creates a Starry Night instance with grammar definitions
- Starry Night highlights the source and tags tokens with pl-* classes
- parseSource adds di-* classes (numbers, booleans, nullish, attributes) without removing existing classes
- starryNightGutter() adds line number elements to the tree
The Starry Night instance is cached globally using __docs_infra_starry_night_instance__ to avoid re-initialization.
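The global caching described above can be sketched as a create-once lookup on `globalThis`. This is an illustration under stated assumptions, not the package's code: `SketchEngine`, `getOrCreateEngine`, and `requireEngine` are hypothetical names, the engine is a stand-in object rather than a Starry Night instance, and the real initialization is asynchronous.

```typescript
// Stand-in for the highlighter engine; the real instance comes from Starry Night.
type SketchEngine = { id: number };

// Key named in the text above; the typed view of globalThis is an assumption of this sketch.
const CACHE_KEY = '__docs_infra_starry_night_instance__';
const globalCache = globalThis as unknown as Record<string, SketchEngine | undefined>;

// Counts how many engines were actually constructed.
let created = 0;

// Hypothetical factory: returns the cached engine, creating it on the first call only.
function getOrCreateEngine(): SketchEngine {
  const cached = globalCache[CACHE_KEY];
  if (cached) {
    return cached;
  }
  created += 1;
  const engine: SketchEngine = { id: created };
  globalCache[CACHE_KEY] = engine;
  return engine;
}

// Hypothetical guard: fails loudly when nothing has been initialized yet.
function requireEngine(): SketchEngine {
  const cached = globalCache[CACHE_KEY];
  if (!cached) {
    throw new Error('Starry Night not initialized. Use createParseSource to create an initialized parseSource function.');
  }
  return cached;
}
```

Storing the instance on `globalThis` rather than in a module-level variable lets separately bundled copies of the module share one engine, which is one plausible reason for the design.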
Starry Night maps TextMate scopes to a small set of pl-* CSS classes (the GitHub Prettylights theme). Many token types share the same class — pl-c1 covers numbers, booleans, null, component names, and more.
During parsing, parseSource extends the highlighted tree with additive di-* classes that distinguish these cases, giving CSS full control over each token type:
| Class | Meaning | Example tokens |
|---|---|---|
| di-num | Numeric constant | 42, 3.14, 0xFF, div { width: 100px } |
| di-bool | Boolean | true, false |
| di-n | Nullish value | null, undefined, "", '' |
| di-this | Self-reference keyword | this, super |
| di-bt | Built-in type keyword | string, number, boolean, void, never |
| di-jsx | JSX component element | Button in <Button />, </Button> |
| di-jv | JSX expression variable | age in <Select value={age} /> |
| di-ak | Attribute key | className in <div className="x"> |
| di-ae | Attribute equals | = in <div className="x"> |
| di-av | Attribute value | "x" in <div className="x"> |
| di-pu | Symbolic punctuation | =, =>, &&, \|\|, ..., + |
| di-op | Object property name | height in { height: 400 } |
| di-ps | Object property string | 'aria-label' in { 'aria-label': 'x' } |
| di-te | Template interpolation region | `Hi ${user.name}` (the ${user.name} slice) |
| di-td | Template interpolation delimiter | ${ and } in `${value}` |
| di-da | CSS data attribute selector | data-active in [data-active] |
| di-cp | CSS property name | color in div { color: red } |
| di-cv | CSS property value | flex in div { display: flex } |
| di-ht | HTML tag wrapper (inline only) | <div>, </span> (wraps brackets + tag name) |
| di-jt | JSX component tag wrapper (inline only) | <Box>, </Stack> (wraps brackets + tag name) |
Each class applies only to certain grammars:
- di-this applies to JS/TS family grammars (.js, .ts, .tsx, .jsx, .mdx).
- di-bt applies to TS family grammars only (.ts, .tsx, .jsx, .mdx). Plain JS is excluded because string, number, etc. are valid variable names.
- di-jsx and di-jv only apply to JSX grammars (.tsx, .jsx, .mdx). di-jv is added to identifier spans (pl-smi, pl-v) appearing inside JSX expression braces ({...}), including spread arguments, arrow-function parameters, and pl-c1 member-access property names (e.g. name in {row.name}). Object property keys inside JSX expressions also receive di-jv in addition to di-op.
- di-ak, di-ae, and di-av only apply to HTML/JSX/MDX grammars.
- di-pu applies to all grammars and tags pl-k operator tokens whose text is purely symbolic (it is not added to word keywords like const or if).
- di-op applies to JS-family grammars and tags both bare-identifier object keys (e.g. height in { height: 400 }) and string keys (e.g. 'aria-label'); string keys additionally receive di-ps.
- di-da, di-cp, and di-cv only apply to CSS.
- di-ht and di-jt are added by enhanceCodeInline (not parseSource) as wrapper spans around tag brackets and the inner pl-ent or pl-c1 span.
di-te and di-td apply to JS-family grammars and handle template-literal interpolations (`...${expr}...`). Starry Night tokenizes the whole backtick string as a single pl-s span, so without these classes the ${/} delimiters and any bare punctuation inside the expression (e.g. . in ${a.b}) inherit the string color. parseSource wraps each ${ ... } slice in a di-te span so the interpolated expression resets from the string color, and wraps the ${ and } glyphs themselves in di-td spans. Inner pl-* tokens (and additive di-* classes such as di-num on ${42}) still apply on top of the reset. Only backtick template literals are processed — a ${...} sequence inside a regular "/' string is inert text and is left untouched. Because a multi-line template literal is tokenized as one pl-s span per line, an interpolation that spans several lines is wrapped with a separate di-te slice on each line (mirroring how the string itself continues across lines).
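To make the slicing concrete, the sketch below locates each `${ ... }` region in the text between two backticks. It is a simplified illustration on a plain string, not the package's algorithm (which works on the highlighted tree); `findInterpolations` and `InterpolationSlice` are hypothetical names, and strings or nested template literals inside the expression are deliberately not handled.

```typescript
// One interpolation slice: [start, end) offsets into the template body,
// covering the opening `${` through the matching `}`.
type InterpolationSlice = { start: number; end: number; expression: string };

// Hypothetical helper: scans the text between backticks and returns each
// `${...}` slice, tracking brace depth so an object literal inside the
// expression does not end the slice early.
function findInterpolations(body: string): InterpolationSlice[] {
  const slices: InterpolationSlice[] = [];
  let i = 0;
  while (i < body.length) {
    if (body[i] === '\\') {
      i += 2; // skip an escaped character such as \$ or \`
      continue;
    }
    if (body[i] === '$' && body[i + 1] === '{') {
      const start = i;
      let depth = 1;
      i += 2;
      while (i < body.length && depth > 0) {
        if (body[i] === '{') depth += 1;
        else if (body[i] === '}') depth -= 1;
        i += 1;
      }
      // Only record slices whose closing brace was actually found.
      if (depth === 0) {
        slices.push({ start, end: i, expression: body.slice(start + 2, i - 1) });
      }
      continue;
    }
    i += 1;
  }
  return slices;
}
```

In terms of the classes above, each `[start, end)` range corresponds to a region that would be wrapped in di-te, while its first two characters (`${`) and last character (`}`) correspond to the di-td delimiters.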
Existing pl-* classes are never removed — <span class="pl-c1">42</span> becomes <span class="pl-c1 di-num">42</span>.
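The additive behavior can be sketched on a single span as follows. This is a simplified illustration, assuming a minimal span shape and a text-only classifier; `SketchSpan`, `classifyConstant`, and `addDiClass` are hypothetical names, and the real rules are grammar-aware rather than based on token text alone.

```typescript
// Minimal element shape for this sketch; HAST stores class names as a string array.
type SketchSpan = {
  type: 'element';
  tagName: 'span';
  properties: { className: string[] };
  children: { type: 'text'; value: string }[];
};

// Hypothetical, simplified classifier based only on the token text.
function classifyConstant(text: string): string | undefined {
  if (/^(true|false)$/.test(text)) return 'di-bool';
  if (/^(null|undefined)$/.test(text)) return 'di-n';
  if (/^(0[xX][0-9a-fA-F]+|\d+(\.\d+)?)$/.test(text)) return 'di-num';
  return undefined;
}

// Returns a copy of the span with one extra di-* class appended.
// Existing classes are kept, and spans without pl-c1 are returned unchanged.
function addDiClass(span: SketchSpan): SketchSpan {
  const className = span.properties.className;
  if (className.indexOf('pl-c1') === -1) return span;
  const text = span.children.map((child) => child.value).join('');
  const extra = classifyConstant(text);
  if (!extra || className.indexOf(extra) !== -1) return span;
  return { ...span, properties: { className: [...className, extra] } };
}
```

Because the class is appended rather than substituted, stylesheets that only know the pl-* classes keep working, and di-* rules can override them selectively.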
The parser supports languages based on file extensions:
| Language | Extensions |
|---|---|
| JavaScript | .js, .mjs, .cjs, .jsx |
| TypeScript | .ts, .tsx |
| CSS | .css |
| HTML | .html, .htm |
| JSON | .json |
| And more | Via Starry Night grammars |
See grammars.ts for the complete list of supported languages.
import { parseSource } from '@mui/internal-docs-infra/pipeline/parseSource';
// ❌ This will throw an error
parseSource('code', 'file.js');
// Error: Starry Night not initialized. Use createParseSource to create an initialized parseSource function.
// ✓ Initialize first
import { createParseSource } from '@mui/internal-docs-infra/pipeline/parseSource';
await createParseSource();
parseSource('code', 'file.js'); // Now works
const parseSource = await createParseSource();
// Handles empty content gracefully
const result = parseSource('', 'empty.js');
// Returns valid HAST tree with empty content
When NOT to use:
- Use plain <pre> tags instead if highlighting isn't needed
createParseSource
Initializes Starry Night and returns a configured parseSource function. Only needs to be called once per application; the instance is stored globally for reuse across calls.
With no initialScopes, loads ALL grammars via the (lazy) ./grammars barrel. This is the eager CodeProvider / Node / build-time behavior: the heavy TextMate JSON is split into its own chunk but is fully available. Pass initialScopes (possibly []) to create a lean instance that registers grammars on demand via registerGrammars; this is the CodeProviderLazy per-language path.
| Parameter | Type | Description |
|---|---|---|
| initialScopes | | |
Returns: Promise<ParseSource>
A Promise that resolves to the initialized parseSource function
parseSource
Parses source code into a HAST tree with syntax highlighting.
| Parameter | Type | Description |
|---|---|---|
| source | | |
| fileName | | |
| language | | |
Returns: HastRoot
HAST Root node containing highlighted code structure with line gutters
getGrammarFromLanguage
Gets the grammar scope from a language name.
| Parameter | Type | Description |
|---|---|---|
| language | | The language name (e.g., 'tsx', 'css', 'typescript') |
Returns: string | undefined
The grammar scope or undefined if not recognized
registerGrammars
Registers the grammars for the given scopes (and their dependencies) on the global Starry Night instance, loading the per-scope chunks on demand. Idempotent and deduped. Fails open: a chunk that fails to load leaves its scope as plain text rather than rejecting the batch.
This is the heavy implementation (it can create the engine instance). Client code should call the light facade ensureGrammars from ./grammarCache instead, so the engine stays out of the client bundle until a block needs it.
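The "idempotent, deduped, fails open" behavior can be sketched with a small loader wrapper. This is an illustration under stated assumptions rather than the package's code: `registerScopes`, `SketchLoader`, and `SketchGrammar` are hypothetical names, the loader is injected by the caller, and grammar dependencies are not modeled.

```typescript
// Hypothetical per-scope loader: resolves with a grammar object or rejects.
type SketchGrammar = { scopeName: string };
type SketchLoader = (scope: string) => Promise<SketchGrammar>;

// Scopes already registered, plus loads still in flight (used for deduping).
const registered: Record<string, SketchGrammar> = {};
const inFlight: Record<string, Promise<void> | undefined> = {};

// Registers each scope at most once. A failed load is swallowed so the
// scope stays unregistered (plain text) and the batch still resolves.
function registerScopes(scopes: string[], load: SketchLoader): Promise<void> {
  const pending = scopes.map((scope) => {
    if (registered[scope]) {
      return Promise.resolve(); // idempotent: nothing to do
    }
    const existing = inFlight[scope];
    if (existing) {
      return existing; // deduped: reuse the load that is already running
    }
    const task = load(scope)
      .then((grammar) => {
        registered[scope] = grammar;
      })
      .catch(() => {
        // Fail open: leave the scope unregistered instead of rejecting.
      })
      .then(() => {
        delete inFlight[scope];
      });
    inFlight[scope] = task;
    return task;
  });
  return Promise.all(pending).then(() => undefined);
}
```

Catching per scope, rather than around the whole batch, is what keeps one broken chunk from blocking the grammars that did load.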
| Parameter | Type | Description |
|---|---|---|
| scopes | | |
Returns: Promise<void>
Lightweight grammar metadata maps. These can be statically imported without pulling in the heavy TextMate grammar JSON payloads (which live in ./grammars.ts and should be loaded via dynamic import('./grammars') so the bundler can code-split them into their own chunk).
type extensionMap = Record<string, string>
Maps simplified language names back to grammar scope names. Used when the language prop is provided instead of fileName.
type languageToGrammarMap = Record<string, string>
- hastUtils - For converting HAST to React JSX
- CodeHighlighter - Uses parseSource for syntax highlighting
- loadPrecomputedCodeHighlighter - Build-time optimization using parseSource