Syntaxes Reference
Block syntax definitions for parsing different formats.
Overview
Syntaxes define how blocks are detected and parsed in text streams. Streamblocks includes three built-in syntaxes.
classDiagram
class BaseSyntax {
<<abstract>>
+detect(line: str) DetectionResult
+parse_metadata(text: str) dict
+parse_content(text: str) str
}
BaseSyntax <|-- DelimiterPreambleSyntax
BaseSyntax <|-- DelimiterFrontmatterSyntax
BaseSyntax <|-- MarkdownFrontmatterSyntax
DelimiterPreambleSyntax
Compact inline syntax with metadata in the opening marker.
Format
Usage
from hother.streamblocks import DelimiterPreambleSyntax

syntax = DelimiterPreambleSyntax()

# Custom delimiters
syntax = DelimiterPreambleSyntax(
    start_pattern=r"^<<(\w+):(\w+)$",
    end_marker="<<end",
)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| start_pattern | str | r"^!!(\w+):(\w+)$" | Regex for the opening marker |
| end_marker | str | "!!end" | Closing marker string |
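To make the defaults above concrete, here is a minimal sketch that applies the documented default pattern and end marker with the standard `re` module. The `classify` helper and the metadata key names are illustrative assumptions, not part of the Streamblocks API, and the library's own matching logic may differ.

```python
import re

# Default opening pattern and closing marker, copied from the table above.
START_PATTERN = re.compile(r"^!!(\w+):(\w+)$")
END_MARKER = "!!end"


def classify(line: str) -> tuple[str, dict[str, str]]:
    """Hypothetical helper: label a line as 'open', 'close', or 'content'."""
    match = START_PATTERN.match(line)
    if match:
        # The two capture groups are the block ID and the block type.
        return "open", {"id": match.group(1), "block_type": match.group(2)}
    if line == END_MARKER:
        return "close", {}
    return "content", {}


for line in ["!!task01:task", "Review the pull request", "!!end"]:
    print(classify(line))
```

The closing marker has no colon, so it never matches the opening pattern and the two checks do not interfere.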
Example
!!task01:task
Review the pull request
Add comments for improvement
!!end
!!code01:code
def hello():
    print("Hello!")
!!end
DelimiterFrontmatterSyntax
Delimiter syntax with YAML frontmatter metadata section.
Format
Usage
from hother.streamblocks import DelimiterFrontmatterSyntax

syntax = DelimiterFrontmatterSyntax()

# Custom delimiters
syntax = DelimiterFrontmatterSyntax(
    start_delimiter="[[START]]",
    metadata_end="[[META]]",
    end_delimiter="[[END]]",
)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| start_delimiter | str | "<<<BLOCK" | Opening marker |
| metadata_end | str | ">>>" | Metadata section end |
| end_delimiter | str | "<<<END>>>" | Closing marker |
Example
<<<BLOCK
id: message01
type: message
author: assistant
>>>
Hello! How can I help you today?
<<<END>>>
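As a rough illustration of how the three markers divide a block, the sketch below walks the example line by line and separates the metadata lines from the content lines. `split_block` is a hypothetical helper written for this page, not a Streamblocks function, and it leaves the metadata as raw text rather than parsing the YAML.

```python
def split_block(
    lines: list[str],
    start: str = "<<<BLOCK",
    metadata_end: str = ">>>",
    end: str = "<<<END>>>",
) -> tuple[list[str], list[str]]:
    """Hypothetical helper: return (metadata_lines, content_lines) of the first block."""
    metadata: list[str] = []
    content: list[str] = []
    state = "outside"
    for line in lines:
        if state == "outside":
            if line == start:
                state = "metadata"
        elif state == "metadata":
            if line == metadata_end:
                state = "content"  # the metadata section is complete
            else:
                metadata.append(line)
        else:  # state == "content"
            if line == end:
                break  # the block is closed
            content.append(line)
    return metadata, content


text = """<<<BLOCK
id: message01
type: message
author: assistant
>>>
Hello! How can I help you today?
<<<END>>>"""

metadata, content = split_block(text.splitlines())
print(metadata)
print(content)
```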
MarkdownFrontmatterSyntax
Standard Markdown frontmatter with YAML metadata.
Format
Usage
from hother.streamblocks import MarkdownFrontmatterSyntax

syntax = MarkdownFrontmatterSyntax()

# Custom delimiters
syntax = MarkdownFrontmatterSyntax(
    delimiter="+++",  # TOML-style
)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| delimiter | str | "---" | Frontmatter delimiter |
Example
---
id: article01
type: article
author: John Doe
tags:
- python
- tutorial
---
# Introduction
This is the article content.
---
Syntax Selection
Choose syntax based on your use case:
| Syntax | Best For | Pros | Cons |
|---|---|---|---|
| DelimiterPreambleSyntax | LLM output, tool calls | Compact, easy to generate | Limited metadata |
| DelimiterFrontmatterSyntax | Structured documents | Full YAML metadata | More verbose |
| MarkdownFrontmatterSyntax | Markdown documents | Standard format | Conflicts with content |
Creating Custom Syntaxes
Extend BaseSyntax for custom formats:
import re

from hother.streamblocks import DetectionResult
from hother.streamblocks.syntaxes.base import BaseSyntax


class XMLBlockSyntax(BaseSyntax):
    """XML-style block syntax."""

    def detect(self, line: str) -> DetectionResult:
        """Detect block markers in a line."""
        if line.startswith("<block"):
            # Parse the id and type attributes from the opening tag.
            match = re.match(r'<block\s+id="(\w+)"\s+type="(\w+)">', line)
            if match:
                return DetectionResult(
                    is_opening=True,
                    metadata={
                        "id": match.group(1),
                        "block_type": match.group(2),
                    },
                )
        if line == "</block>":
            return DetectionResult(is_closing=True)
        return DetectionResult()

    def parse_metadata(self, text: str) -> dict:
        """Parse metadata from text."""
        return {}  # Metadata is parsed in detect()

    def parse_content(self, text: str) -> str:
        """Parse content from text."""
        return text.strip()

    @property
    def name(self) -> str:
        return "xml_block"
Required Methods
| Method | Description |
|---|---|
| detect(line) | Detect block markers, return DetectionResult |
| parse_metadata(text) | Parse metadata section to dict |
| parse_content(text) | Parse content section |
| name | Syntax name property |
DetectionResult Fields
| Field | Type | Description |
|---|---|---|
| is_opening | bool | Line is block opening marker |
| is_closing | bool | Line is block closing marker |
| is_metadata_boundary | bool | Line ends metadata section |
| metadata | dict \| None | Inline metadata from marker |
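The fields above can be pictured with a small stand-in dataclass. This is only a sketch: `DetectionResultSketch` is not the library class, and its defaults (all flags `False`, `metadata` set to `None`) are an assumption inferred from the bare `DetectionResult()` call in the custom syntax example. The marker strings are the `DelimiterFrontmatterSyntax` defaults listed earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DetectionResultSketch:
    """Stand-in mirroring the four documented fields; not the library class."""

    is_opening: bool = False
    is_closing: bool = False
    is_metadata_boundary: bool = False
    metadata: Optional[dict] = None


def detect(line: str) -> DetectionResultSketch:
    """Hypothetical detector using the frontmatter markers documented above."""
    if line == "<<<BLOCK":
        return DetectionResultSketch(is_opening=True)
    if line == ">>>":
        return DetectionResultSketch(is_metadata_boundary=True)
    if line == "<<<END>>>":
        return DetectionResultSketch(is_closing=True)
    return DetectionResultSketch()  # an ordinary line sets no flag


for line in ["<<<BLOCK", "id: message01", ">>>", "Hello!", "<<<END>>>"]:
    print(repr(line), detect(line))
```

In this sketch each line sets at most one flag, and a result with no flags set means the line is ordinary metadata or content.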
Multiple Syntaxes
Use multiple syntaxes in a single processor:
from hother.streamblocks import (
    StreamBlockProcessor,
    Registry,
    DelimiterPreambleSyntax,
    MarkdownFrontmatterSyntax,
)

processor = StreamBlockProcessor(
    registry=Registry(),
    syntaxes=[
        DelimiterPreambleSyntax(),
        MarkdownFrontmatterSyntax(),
    ],
)
# Both formats are detected
text = """
!!task01:task
Do something
!!end
---
id: article01
type: article
---
Article content
---
"""
API Reference
hother.streamblocks.syntaxes.DelimiterPreambleSyntax
Bases: BaseSyntax
This syntax uses delimiter markers with inline metadata in the opening line. Metadata is extracted from the delimiter preamble, and all lines between opening and closing delimiters become the content.
Format
!!id:type[:param1:param2...]
Content lines here
!!end
The opening delimiter must include:
- Block ID (alphanumeric, required)
- Block type (alphanumeric, required)
- Additional parameters (optional, colon-separated)
Additional parameters are stored as param_0, param_1, etc. in metadata.
Examples:
>>> # Simple block with just ID and type
>>> '''
... !!patch001:patch
... Fix the login bug
... !!end
... '''
>>>
>>> # Block with parameters
>>> '''
... !!file123:operation:create:urgent
... Create new config file
... !!end
... '''
>>> # Metadata will be: {
>>> # "id": "file123",
>>> # "block_type": "operation",
>>> # "param_0": "create",
>>> # "param_1": "urgent"
>>> # }
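The mapping from the opening line to the metadata dict shown above can be sketched in a few lines. `parse_preamble` is a hypothetical helper for illustration only, and the library's actual parsing and validation may differ.

```python
from typing import Optional


def parse_preamble(line: str, delimiter: str = "!!") -> Optional[dict]:
    """Hypothetical helper: turn an opening line into a metadata dict."""
    if not line.startswith(delimiter):
        return None
    parts = line[len(delimiter):].split(":")
    # Block ID and block type are both required and must be alphanumeric.
    if len(parts) < 2 or not all(part.isalnum() for part in parts[:2]):
        return None
    metadata = {"id": parts[0], "block_type": parts[1]}
    # Any remaining colon-separated values become param_0, param_1, ...
    for index, value in enumerate(parts[2:]):
        metadata[f"param_{index}"] = value
    return metadata


print(parse_preamble("!!patch001:patch"))
print(parse_preamble("!!file123:operation:create:urgent"))
print(parse_preamble("!!end"))  # None: a closing marker carries no ID or type
```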
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| delimiter | str | Opening delimiter string (default: "!!") | '!!' |
detect_line
detect_line(
line: str, candidate: BlockCandidate | None = None
) -> DetectionResult
Detect delimiter-based markers.
should_accumulate_metadata
should_accumulate_metadata(
candidate: BlockCandidate,
) -> bool
No separate metadata section for this syntax.
extract_block_type
extract_block_type(candidate: BlockCandidate) -> str | None
Extract block_type from opening line.
parse_block
parse_block(
candidate: BlockCandidate,
block_class: type[Any] | None = None,
) -> ParseResult[BaseMetadata, BaseContent]
Parse the complete block using the specified block class.
validate_block
validate_block(
_block: ExtractedBlock[BaseMetadata, BaseContent],
) -> bool
Additional validation after parsing.
parse_metadata_early
parse_metadata_early(
candidate: BlockCandidate,
) -> dict[str, Any] | None
Parse metadata from inline preamble.
For this syntax, metadata is extracted from the opening line (e.g., !!id:type:param1:param2).
parse_content_early
parse_content_early(
candidate: BlockCandidate,
) -> dict[str, Any] | None
Parse content section early.
Returns raw content dict with the content text.
hother.streamblocks.syntaxes.DelimiterFrontmatterSyntax
Bases: BaseSyntax, YAMLFrontmatterMixin
This syntax uses simple delimiter markers with YAML frontmatter for metadata. The frontmatter section is delimited by --- markers and must be valid YAML.
Format
!!start
---
id: block_001
block_type: example
custom_field: value
---
Content lines here
!!end
The YAML frontmatter should include:
- id: Block identifier (required if using BaseMetadata)
- block_type: Block type (required if using BaseMetadata)
- Any additional custom fields defined in your metadata class
Examples:
>>> # Simple block with minimal metadata
>>> '''
... !!start
... ---
... id: msg001
... block_type: message
... ---
... Hello, world!
... !!end
... '''
>>>
>>> # Block with nested YAML metadata
>>> '''
... !!start
... ---
... id: task001
... block_type: task
... priority: high
... tags:
... - urgent
... - backend
... ---
... Implement user authentication
... !!end
... '''
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| start_delimiter | str | Opening delimiter string (default: "!!start") | '!!start' |
| end_delimiter | str | Closing delimiter string (default: "!!end") | '!!end' |
detect_line
detect_line(
line: str, candidate: BlockCandidate | None = None
) -> DetectionResult
Detect delimiter markers and frontmatter boundaries.
should_accumulate_metadata
should_accumulate_metadata(
candidate: BlockCandidate,
) -> bool
Check if we're still in metadata section.
extract_block_type
extract_block_type(candidate: BlockCandidate) -> str | None
Extract block_type from YAML frontmatter.
parse_block
parse_block(
candidate: BlockCandidate,
block_class: type[Any] | None = None,
) -> ParseResult[BaseMetadata, BaseContent]
Parse the complete block using the specified block class.
validate_block
validate_block(
_block: ExtractedBlock[BaseMetadata, BaseContent],
) -> bool
Additional validation after parsing.
parse_metadata_early
parse_metadata_early(
candidate: BlockCandidate,
) -> dict[str, Any] | None
Parse YAML metadata section early.
Returns parsed YAML frontmatter as a dict.
parse_content_early
parse_content_early(
candidate: BlockCandidate,
) -> dict[str, Any] | None
Parse content section early.
Returns raw content dict with the content text.
hother.streamblocks.syntaxes.MarkdownFrontmatterSyntax
Bases: BaseSyntax, YAMLFrontmatterMixin
This syntax uses Markdown-style fenced code blocks with optional YAML frontmatter for metadata. The info_string after the opening fence can be used as a fallback block_type when no frontmatter is present.
Format
```[info_string]
---
id: block_001
block_type: example
custom_field: value
---
Content lines here
```
The info_string is optional. When provided, it's used as the block_type if no YAML frontmatter is present. The YAML frontmatter is also optional - if omitted, all content becomes the block content.
Examples:
>>> # Block with frontmatter
>>> '''
... ```python
... ---
... id: code001
... block_type: code
... language: python
... ---
... def hello():
...     print("Hello, world!")
... ```
... '''
>>>
>>> # Block without frontmatter (info_string becomes block_type)
>>> '''
... ```patch
... diff --git a/file.py b/file.py
... - old line
... + new line
... ```
... '''
>>> # block_type will be "patch" from info_string
>>>
>>> # Block with nested YAML
>>> '''
... ```task
... ---
... id: task001
... block_type: task
... assignees:
... - alice
... - bob
... ---
... Implement user authentication
... ```
... '''
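The fallback described above, where the info string supplies the block_type when no frontmatter is present, might be sketched as follows. `resolve_block_type` is a hypothetical helper, not a library function. Falling back to the info string when frontmatter exists but has no `block_type` key is an assumption of this sketch.

```python
from typing import Optional

# Built from single backticks so the sketch can sit inside a fenced block.
FENCE = "`" * 3


def resolve_block_type(opening_line: str, frontmatter: Optional[dict]) -> Optional[str]:
    """Hypothetical helper: frontmatter block_type first, else the info string."""
    if frontmatter and "block_type" in frontmatter:
        return frontmatter["block_type"]
    if not opening_line.startswith(FENCE):
        return None
    # Whatever follows the fence on the opening line is the info string.
    info_string = opening_line[len(FENCE):].strip()
    return info_string or None


print(resolve_block_type(FENCE + "patch", None))  # patch
print(resolve_block_type(FENCE + "python", {"id": "code001", "block_type": "code"}))  # code
```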
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fence | str | Fence string (default: "```") | '```' |
| info_string | str \| None | Optional info string used as fallback block_type | None |
parse_metadata_early
parse_metadata_early(
candidate: BlockCandidate,
) -> dict[str, Any] | None
Parse metadata section early, before content accumulation.
This method is called when the metadata section completes, allowing early validation and processing. The result can be cached in the candidate for reuse during full block parsing.
Default implementation returns None (no early parsing). Override in subclasses to provide early metadata parsing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| candidate | BlockCandidate | The current block candidate with metadata accumulated | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] \| None | Parsed metadata dict if successful, None if parsing not supported or failed |
parse_content_early
parse_content_early(
candidate: BlockCandidate,
) -> dict[str, Any] | None
Parse content section early, before final block extraction.
This method is called when the content section completes (block closes), allowing early validation. The result can be cached in the candidate for reuse during full block parsing.
Default implementation returns None (no early parsing). Override in subclasses to provide early content parsing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| candidate | BlockCandidate | The complete block candidate with content accumulated | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] \| None | Parsed content dict if successful, None if parsing not supported or failed |
detect_line
detect_line(
line: str, candidate: BlockCandidate | None = None
) -> DetectionResult
Detect markdown fence markers and frontmatter boundaries.
should_accumulate_metadata
should_accumulate_metadata(
candidate: BlockCandidate,
) -> bool
Check if we're still in metadata section.
extract_block_type
extract_block_type(candidate: BlockCandidate) -> str | None
Extract block_type from YAML frontmatter.
parse_block
parse_block(
candidate: BlockCandidate,
block_class: type[Any] | None = None,
) -> ParseResult[BaseMetadata, BaseContent]
Parse the complete block using the specified block class.
validate_block
validate_block(
_block: ExtractedBlock[BaseMetadata, BaseContent],
) -> bool
Additional validation after parsing.