RFC-005 Segment Addressing Scheme
RFC 005: Segment Addressing Scheme
Status: Implemented
Date: 2026-01-30
Topics: URN, Linking, Interlinear Alignment
1. Summary
This RFC defines the standard syntax for addressing sub-segments within a Vyasa Marker. It extends the Base URN scheme (e.g., urn:vyasa:bg:1:1) with a path component to allow granular linking to specific commands and text segments, enabling robust interlinear alignment and precise citations.
2. Motivation
Currently, we can address a “Whole Marker” (like a Verse), but we cannot address:
- Specific Layers: e.g., “The Sanskrit transliteration of Verse 1”.
- Specific Segments: e.g., “The second distinct phrase in the verse”.
Granular addressing is required for:
- Interlinear Alignment: Mapping specific words in the source to the translation.
- Fine-grained Commentary: Commenting on a specific word or phrase.
- Audio Sync: Mapping timecodes to text segments.
3. Design Specification
3.1 Syntax
The Segment URN is constructed by appending a path to the Base Marker URN.
{BaseURN}/{ComponentPath}
- Separator: / (Forward Slash)
- BaseURN: The URN defined by the marker command (e.g., urn:vyasa:bg:6:5).
- ComponentPath: One or more path segments describing the target node.
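The syntax above can be illustrated with a small parsing sketch. The helper name `split_segment_urn` is hypothetical and not part of this RFC. The sketch also assumes that a Base URN never contains a forward slash, so the first `/` marks the start of the ComponentPath.

```rust
/// Splits a Segment URN into its Base URN and its path components.
/// Hypothetical helper for illustration only.
/// Assumption: the Base URN itself contains no '/'.
pub fn split_segment_urn(urn: &str) -> (&str, Vec<&str>) {
    match urn.split_once('/') {
        // Everything before the first '/' is the Base URN; the rest is the path.
        Some((base, path)) => (
            base,
            // Drop empty components, e.g. those produced by a trailing slash.
            path.split('/').filter(|c| !c.is_empty()).collect(),
        ),
        // No separator: the URN addresses the whole marker.
        None => (urn, Vec::new()),
    }
}
```

With this sketch, `urn:vyasa:bg:1:1/d/s:1` splits into the base `urn:vyasa:bg:1:1` and the components `d` and `s:1`, while a plain Base URN yields an empty path.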
3.2 Resolution Logic: Descendant Search
To ensure stability against layout refactoring (e.g., wrapping content in <div> or textstream), the resolution algorithm uses a Descendant Search strategy (similar to the CSS descendant combinator or XPath //).
Algorithm:
- Start at the Marker Node.
- Parse the first path component (e.g., d).
- Perform a Breadth-First Search (BFS) of the marker’s subtree.
- Return the first descendant command that matches the alias/name.
  - Note: Structural wrappers (textstream, interlinear-streams) are implicitly skipped unless explicitly targeted.
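The steps above can be sketched as a breadth-first search over a simplified tree. The `Node` type here is a hypothetical stand-in; the real `Node` in `vyasac::models` may look different. The sketch matches on the command name only and leaves alias handling out.

```rust
use std::collections::VecDeque;

/// Hypothetical, simplified node type used only for this sketch.
#[derive(Debug)]
pub struct Node {
    pub name: String,
    pub children: Vec<Node>,
}

impl Node {
    pub fn new(name: &str, children: Vec<Node>) -> Self {
        Node { name: name.to_string(), children }
    }
}

/// Returns the first descendant of `root` named `name`, in breadth-first order.
pub fn find_descendant<'a>(root: &'a Node, name: &str) -> Option<&'a Node> {
    // Seed the queue with the marker's children so the marker itself never matches.
    let mut queue: VecDeque<&Node> = root.children.iter().collect();
    while let Some(node) = queue.pop_front() {
        if node.name == name {
            return Some(node);
        }
        // Enqueue the next level; shallower nodes are always visited first.
        queue.extend(node.children.iter());
    }
    None
}
```

Because the search ignores depth, a wrapper such as textstream is passed through without any special casing, and it is still found if the path names it explicitly.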
3.3 Path Component Types
A. Command Selector (/name)
Targets a descendant command by its name or alias.
- Syntax: /command
- Example: .../d
  - Finds the first d (Devanagari) command inside the marker.
B. Ordinal Selector (/name:N)
Targets the Nth instance of a command (0-indexed) found during the BFS traversal.
- Syntax: /command:N
- Example: .../p:1
  - Finds the second paragraph (p) inside the marker.
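A minimal sketch of how the command and ordinal selectors might be parsed and applied. The names `Selector`, `parse_selector`, and `select` are hypothetical. It assumes a bare `/name` is equivalent to `/name:0`, and it represents the traversal as a list of command names already in BFS order.

```rust
/// Parsed form of a path component such as `d` or `p:1` (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Selector<'a> {
    pub name: &'a str,
    pub index: usize,
}

/// Parses `name` or `name:N`; returns None for malformed components.
pub fn parse_selector(component: &str) -> Option<Selector<'_>> {
    match component.split_once(':') {
        Some((name, n)) if !name.is_empty() => {
            // The ordinal must be a non-negative integer.
            let index = n.parse::<usize>().ok()?;
            Some(Selector { name, index })
        }
        // Assumption: no ordinal means the first match.
        None if !component.is_empty() => Some(Selector { name: component, index: 0 }),
        _ => None,
    }
}

/// Given command names in BFS order, returns the position of the selected match.
pub fn select(bfs_order: &[&str], sel: &Selector<'_>) -> Option<usize> {
    bfs_order
        .iter()
        .enumerate()
        .filter(|(_, name)| **name == sel.name)
        .map(|(pos, _)| pos)
        .nth(sel.index)
}
```

For a traversal order of `p, d, p, i`, the selector `p:1` picks the entry at position 2, and `p:2` resolves to nothing because only two `p` commands exist.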
C. Interlinear Segment Selector (/name/s:N)
Targets a specific text segment within a command. This is used for split blocks where content is separated by pipes (|).
- Syntax: /command/s:N
- Virtual Node: s (Segment) is a virtual selector.
- Logic:
  - Resolve the parent command (e.g., d).
  - Collect all Text children, split by SegmentBreak nodes.
  - Return the Nth logical text chunk.
- Example: .../d/s:1
  - Given d[ A | B ]: Targets string “B”.
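The segment logic above could look roughly like the following. The `Child` enum is a hypothetical model of a command's children, and trimming the surrounding whitespace of each chunk is an assumption of this sketch, not something the RFC specifies.

```rust
/// Hypothetical child node kinds inside a command body.
#[derive(Debug, PartialEq)]
pub enum Child {
    Text(String),
    SegmentBreak,
}

/// Returns the Nth (0-indexed) logical text chunk among `children`.
pub fn nth_segment(children: &[Child], n: usize) -> Option<String> {
    // `split` yields the runs of children that lie between SegmentBreak nodes.
    let chunk = children
        .split(|c| matches!(c, Child::SegmentBreak))
        .nth(n)?;
    // Concatenate the Text children of the selected run.
    let text: String = chunk
        .iter()
        .filter_map(|c| match c {
            Child::Text(t) => Some(t.as_str()),
            Child::SegmentBreak => None,
        })
        .collect();
    // Assumption: surrounding whitespace is not part of the segment.
    Some(text.trim().to_string())
}
```

Modelling `d[ A | B ]` as `Text(" A ")`, `SegmentBreak`, `Text(" B ")`, the call `nth_segment(&body, 1)` yields "B", and an index past the last segment yields `None`.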
4. Examples
Source:
    `marker 1.1
    `textstream [
        `d [ धर्मक्षेत्रे कुरुक्षेत्रे | समवेता युयुत्सव: ... ]
        `i [ dharmakṣētrē | kurukṣētrē ... ]
    ]
Addressable URNs:
| Target | URN | Note |
|---|---|---|
| Marker | urn:vyasa:bg:1:1 | The root anchor. |
| Devanagari | urn:vyasa:bg:1:1/d | Finds nested d, skipping textstream. |
| Transliteration | urn:vyasa:bg:1:1/i | Finds nested i. |
| Segment 1 (Sanskrit) | urn:vyasa:bg:1:1/d/s:0 | “धर्मक्षेत्रे कुरुक्षेत्रे” |
| Segment 2 (Sanskrit) | urn:vyasa:bg:1:1/d/s:1 | “समवेता युयुत्सव:” |
5. Implementation Plan
- Resolver: Implement resolve_segment(root: &Node, path: &str) -> Option<&Node> in vyasac::models.
- Enricher: During compilation, optionally pre-calculate these URNs for leaf nodes if strict_addressing is enabled.
- Packer: Ensure these URNs are valid targets for external tooling.
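As a rough illustration of the Enricher step, the sketch below pre-calculates the addressable URNs for one marker. The function name `enumerate_urns` and its input shape (command name paired with segment count) are hypothetical. It assumes each command name occurs only once inside the marker, so no ordinal suffix is needed.

```rust
/// Builds the addressable URNs for a single marker.
/// `commands` pairs each command name with its number of segments (hypothetical input).
/// Assumption: command names are unique within the marker.
pub fn enumerate_urns(base: &str, commands: &[(&str, usize)]) -> Vec<String> {
    // The marker itself is always addressable by its Base URN.
    let mut urns = vec![base.to_string()];
    for (name, segments) in commands {
        // One URN for the command, then one per 0-indexed segment.
        urns.push(format!("{}/{}", base, name));
        for n in 0..*segments {
            urns.push(format!("{}/{}/s:{}", base, name, n));
        }
    }
    urns
}
```

Calling it with the base `urn:vyasa:bg:1:1` and a single `d` command of two segments produces the marker URN, `urn:vyasa:bg:1:1/d`, and the two segment URNs ending in `/d/s:0` and `/d/s:1`.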