semantics: add CstReader-based operation dispatch#613
Conversation
…-reader # Conflicts: # packages/runtime/src/cstReader.ts # packages/runtime/src/unstableDebug.ts
commit: |
There was a problem hiding this comment.
Pull request overview
This PR extends the v18 semantics package with a new CstReader-based operation API (aimed at lower-allocation / faster semantics execution), while also refining runtime CST-reader utilities and fixing a global action-stack regression.
Changes:
- Add
createReaderOperation(+ tests) for semantics execution overohm-js/cstReader. - Improve runtime CST-reader ergonomics (
withChildren, tuple helpers, edge-flag constant sharing) and harden MatchResult failure-info behavior across subsequent matches. - Update package entrypoints/imports (export
Operationtype publicly; update lang-python to import from@ohm-js/semantics) and add a small benchmark.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| pnpm-lock.yaml | Adds tinybench to the lockfile for benchmarking support. |
| packages/semantics/src/reader.ts | Introduces createReaderOperation for CstReader-based semantic actions with dispatch-table caching. |
| packages/semantics/src/reader.test.ts | Adds AVA tests covering arithmetic + list/opt behavior for the reader-based operation. |
| packages/semantics/src/index.ts | Exports Operation type and fixes global action-stack cleanup by restoring stack length. |
| packages/semantics/src/index.test.ts | Adds regression test ensuring missing-action errors don’t corrupt the global action stack. |
| packages/semantics/package.json | Adds package exports, adds tinybench, and runs a benchmark as part of test. |
| packages/semantics/bench.ts | Adds a benchmark comparing CstNode vs CstReader operation implementations. |
| packages/runtime/src/unstableDebug.ts | Uses the shared CST_HAS_LEADING_SPACES_FLAG constant instead of a magic bitmask. |
| packages/runtime/src/miniohm.ts | Stabilizes MatchResult.input per-result and makes stale failure-info access throw; updates reader integration points. |
| packages/runtime/src/cstReaderShared.ts | Extracts packed-handle utilities/constants into a shared module. |
| packages/runtime/src/cstReaderFactory.ts | Moves createReaderFromCtx into a dedicated factory module with bounds checks. |
| packages/runtime/src/cstReader.ts | Adds reader helpers (ruleNames, ruleId, tupleArity, isPresent, withChildren, forEachTuple) and updates child iteration signature. |
| packages/lang-python/convertToOhm.ts | Switches imports to the published @ohm-js/semantics entrypoint. |
| packages/compiler/test/test-wasm.js | Adds regressions for stable MatchResult.input and guarding stale failure-info reads. |
| packages/compiler/test/test-cstReader.js | Updates forEachChild signature usage; adds tests for new reader helper APIs and createHandle relocation. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "main": "dist/index.js", | ||
| "exports": { | ||
| ".": { | ||
| "types": "./dist/src/index.d.ts", | ||
| "default": "./dist/src/index.js" | ||
| }, | ||
| "./reader": { | ||
| "types": "./dist/src/reader.d.ts", | ||
| "default": "./dist/src/reader.js" | ||
| } |
There was a problem hiding this comment.
main points to dist/index.js, but this package’s exports entrypoints point at ./dist/src/index.js and ./dist/src/index.d.ts. With the current tsconfig (outDir=dist and sources under src/), dist/index.js likely won’t exist, and some tooling still consults main even when exports is present. Align main (and optionally add a types field) to the same files as exports or add a root-level build entry that actually emits dist/index.js.
| "scripts": { | ||
| "build": "tsc", | ||
| "test": "ava" | ||
| "test": "ava && node --experimental-strip-types bench.ts --small-size" |
There was a problem hiding this comment.
The test script runs a benchmark (bench.ts) in addition to AVA. This makes pnpm -r test slower/noisier and couples correctness tests to an experimental Node flag (--experimental-strip-types). Consider moving the benchmark into a separate script (e.g. bench) and keeping test focused on deterministic unit tests.
| "test": "ava && node --experimental-strip-types bench.ts --small-size" | |
| "test": "ava", | |
| "bench": "node --experimental-strip-types bench.ts --small-size" |
| childAt(handle: number, index: number, edgeStartIdx: number): number { | ||
| const raw = rawHandle(handle); | ||
| const slot = this._ctx.view.getUint32(raw + CST_CHILDREN_OFFSET + index * 4, true); | ||
| const hasLeadingSpaces = (slot & CST_HAS_LEADING_SPACES_FLAG) !== 0; | ||
| const rawChild = slot & ~CST_HAS_LEADING_SPACES_FLAG; | ||
|
|
||
| const {getSpacesLenAt} = this._ctx; | ||
| const leadingSpacesLen = | ||
| hasLeadingSpaces && getSpacesLenAt && this._hasParentSpaces(rawChild) | ||
| ? Math.max(0, getSpacesLenAt(edgeStartIdx)) | ||
| : 0; | ||
|
|
||
| return createHandle(rawChild, edgeStartIdx + leadingSpacesLen); | ||
| } |
There was a problem hiding this comment.
childAt is marked @internal but is still a public method on an exported class, and it performs unchecked reads (no terminal/type guard and no bounds check for index). If it’s called with a terminal handle or an out-of-range index, it can read arbitrary offsets from the wasm DataView. Consider making it private/#childAt, or at least asserting !isTaggedTerminal(raw) and index < childCount(handle) to fail fast with a clear error.
createReaderOperationin packages/semantics/src/reader.ts — a zero-allocation operation dispatch that works directly with CstReader handles instead of materialized CstNode objects. Uses a ruleId-indexed dispatch table for fast action lookup.CstReaderwith new methods and extract handle packing into cstReaderShared.ts so it can be reused.forEachChildcallback signaturecreateOperation: thefinallyblock always poppedglobalActionStack, even on paths that never pushed (missing-action errors). This corrupted the stack trace for subsequent errors in the same tree walk.MatchResult.input: was reading fromgrammar._input(mutable), so a stale result would report the wrong input after a subsequentmatch(). Now reads from the per-resultctx.input.FailedMatchResult.getRightmostFailures()against use when wasm state has been overwritten by a later match.