Syntaxes Reference

Block syntax definitions for parsing different formats.

Overview

Syntaxes define how blocks are detected and parsed in text streams. Streamblocks includes three built-in syntaxes.

classDiagram
    class BaseSyntax {
        <<abstract>>
        +detect(line: str) DetectionResult
        +parse_metadata(text: str) dict
        +parse_content(text: str) str
    }

    BaseSyntax <|-- DelimiterPreambleSyntax
    BaseSyntax <|-- DelimiterFrontmatterSyntax
    BaseSyntax <|-- MarkdownFrontmatterSyntax

DelimiterPreambleSyntax

Compact inline syntax with metadata in the opening marker.

Format

!!block_id:block_type
Block content here
!!end

Usage

from hother.streamblocks import DelimiterPreambleSyntax

syntax = DelimiterPreambleSyntax()

# Custom delimiters
syntax = DelimiterPreambleSyntax(
    start_pattern=r"^<<(\w+):(\w+)$",
    end_marker="<<end",
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| start_pattern | str | r"^!!(\w+):(\w+)$" | Regex for opening marker |
| end_marker | str | "!!end" | Closing marker string |
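
To see which lines the default `start_pattern` accepts, the regex from the table can be exercised directly with Python's `re` module. This is a standalone sketch that does not import Streamblocks: `parse_opening` is a hypothetical helper, and applying the pattern with `match` on each line is an assumption about how the library uses it.

```python
from __future__ import annotations

import re

# Default pattern copied from the parameters table above.
START_PATTERN = re.compile(r"^!!(\w+):(\w+)$")


def parse_opening(line: str) -> tuple[str, str] | None:
    """Return (block_id, block_type) if the line is an opening marker."""
    match = START_PATTERN.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_opening("!!task01:task"))    # ('task01', 'task')
print(parse_opening("!!end"))            # None: no id:type pair
print(parse_opening("  !!task01:task"))  # None: the pattern is anchored at the line start
```

Because `\w+` excludes hyphens and spaces, an ID such as `task-01` would not match the default pattern; a custom `start_pattern` would be needed for that.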

Example

!!task01:task
Review the pull request
Add comments for improvement
!!end

!!code01:code
def hello():
    print("Hello!")
!!end

DelimiterFrontmatterSyntax

Delimiter syntax with YAML frontmatter metadata section.

Format

<<<BLOCK
id: block01
type: task
priority: high
>>>
Block content here
<<<END>>>

Usage

from hother.streamblocks import DelimiterFrontmatterSyntax

syntax = DelimiterFrontmatterSyntax()

# Custom delimiters
syntax = DelimiterFrontmatterSyntax(
    start_delimiter="[[START]]",
    metadata_end="[[META]]",
    end_delimiter="[[END]]",
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| start_delimiter | str | "<<<BLOCK" | Opening marker |
| metadata_end | str | ">>>" | Metadata section end |
| end_delimiter | str | "<<<END>>>" | Closing marker |

Example

<<<BLOCK
id: message01
type: message
author: assistant
>>>
Hello! How can I help you today?
<<<END>>>

MarkdownFrontmatterSyntax

Standard Markdown frontmatter with YAML metadata.

Format

---
id: block01
type: task
priority: high
---
Block content here
---

Usage

from hother.streamblocks import MarkdownFrontmatterSyntax

syntax = MarkdownFrontmatterSyntax()

# Custom delimiters
syntax = MarkdownFrontmatterSyntax(
    delimiter="+++",  # TOML-style
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| delimiter | str | "---" | Frontmatter delimiter |

Example

---
id: article01
type: article
author: John Doe
tags:
  - python
  - tutorial
---
# Introduction

This is the article content.

---

Syntax Selection

Choose syntax based on your use case:

| Syntax | Best For | Pros | Cons |
| --- | --- | --- | --- |
| DelimiterPreambleSyntax | LLM output, tool calls | Compact, easy to generate | Limited metadata |
| DelimiterFrontmatterSyntax | Structured documents | Full YAML metadata | More verbose |
| MarkdownFrontmatterSyntax | Markdown documents | Standard format | Conflicts with content |

Creating Custom Syntaxes

Extend BaseSyntax for custom formats:

import re

from hother.streamblocks import DetectionResult
from hother.streamblocks.syntaxes.base import BaseSyntax

# Opening tag with id and type attributes, e.g. <block id="a1" type="task">
OPENING_TAG = re.compile(r'<block\s+id="(\w+)"\s+type="(\w+)">')


class XMLBlockSyntax(BaseSyntax):
    """XML-style block syntax."""

    def detect(self, line: str) -> DetectionResult:
        """Detect block markers in a line."""
        match = OPENING_TAG.match(line)
        if match:
            # Inline metadata comes straight from the tag attributes
            return DetectionResult(
                is_opening=True,
                metadata={
                    "id": match.group(1),
                    "block_type": match.group(2),
                },
            )
        if line == "</block>":
            return DetectionResult(is_closing=True)
        return DetectionResult()

    def parse_metadata(self, text: str) -> dict:
        """Parse metadata from text."""
        return {}  # Metadata parsed in detect()

    def parse_content(self, text: str) -> str:
        """Parse content from text."""
        return text.strip()

    @property
    def name(self) -> str:
        return "xml_block"

Required Methods

| Method | Description |
| --- | --- |
| detect(line) | Detect block markers, return DetectionResult |
| parse_metadata(text) | Parse metadata section to dict |
| parse_content(text) | Parse content section |
| name | Syntax name property |

DetectionResult Fields

| Field | Type | Description |
| --- | --- | --- |
| is_opening | bool | Line is block opening marker |
| is_closing | bool | Line is block closing marker |
| is_metadata_boundary | bool | Line ends metadata section |
| metadata | dict \| None | Inline metadata from marker |
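
The three boolean flags are enough to drive a small line-by-line state machine. The sketch below illustrates that idea with a stand-in dataclass, `DetectionSketch`, that only mirrors the documented fields; it is not the library's `DetectionResult`, and `detect` and `split_block` are hypothetical helpers that hard-code the `DelimiterFrontmatterSyntax` defaults listed earlier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DetectionSketch:
    """Stand-in with the four documented fields; not the library class."""

    is_opening: bool = False
    is_closing: bool = False
    is_metadata_boundary: bool = False
    metadata: dict | None = None


def detect(line: str) -> DetectionSketch:
    """Classify one line using the default frontmatter delimiters."""
    if line == "<<<BLOCK":
        return DetectionSketch(is_opening=True)
    if line == ">>>":
        return DetectionSketch(is_metadata_boundary=True)
    if line == "<<<END>>>":
        return DetectionSketch(is_closing=True)
    return DetectionSketch()


def split_block(lines: list[str]) -> tuple[list[str], list[str]]:
    """Route lines into (metadata_lines, content_lines) based on the flags."""
    metadata_lines: list[str] = []
    content_lines: list[str] = []
    state = "outside"  # outside -> metadata -> content -> outside
    for line in lines:
        result = detect(line)
        if state == "outside":
            if result.is_opening:
                state = "metadata"
        elif state == "metadata":
            if result.is_metadata_boundary:
                state = "content"
            else:
                metadata_lines.append(line)
        elif state == "content":
            if result.is_closing:
                state = "outside"
            else:
                content_lines.append(line)
    return metadata_lines, content_lines


sample = [
    "<<<BLOCK",
    "id: message01",
    "type: message",
    ">>>",
    "Hello! How can I help you today?",
    "<<<END>>>",
]
print(split_block(sample))
# (['id: message01', 'type: message'], ['Hello! How can I help you today?'])
```

Lines outside any block are ignored in this sketch; how the real processor reports such text is not covered here.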

Multiple Syntaxes

Use multiple syntaxes in a single processor:

from hother.streamblocks import (
    StreamBlockProcessor,
    Registry,
    DelimiterPreambleSyntax,
    MarkdownFrontmatterSyntax,
)

processor = StreamBlockProcessor(
    registry=Registry(),
    syntaxes=[
        DelimiterPreambleSyntax(),
        MarkdownFrontmatterSyntax(),
    ],
)

# Both formats are detected
text = """
!!task01:task
Do something
!!end

---
id: article01
type: article
---
Article content
---
"""

API Reference

hother.streamblocks.syntaxes.DelimiterPreambleSyntax

Bases: BaseSyntax

This syntax uses delimiter markers with inline metadata in the opening line. Metadata is extracted from the delimiter preamble, and all lines between opening and closing delimiters become the content.

Format

!!<id>:<type>[:param1:param2:...]
Content lines here
!!end

The opening delimiter must include:
  • Block ID (alphanumeric, required)
  • Block type (alphanumeric, required)
  • Additional parameters (optional, colon-separated)

Additional parameters are stored as param_0, param_1, etc. in metadata.

Examples:

>>> # Simple block with just ID and type
>>> '''
... !!patch001:patch
... Fix the login bug
... !!end
... '''
>>>
>>> # Block with parameters
>>> '''
... !!file123:operation:create:urgent
... Create new config file
... !!end
... '''
>>> # Metadata will be: {
>>> #     "id": "file123",
>>> #     "block_type": "operation",
>>> #     "param_0": "create",
>>> #     "param_1": "urgent"
>>> # }
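
The mapping from an opening line to metadata described above can be reproduced with plain string handling. This is a minimal sketch, not the library's implementation: `preamble_to_metadata` is a hypothetical helper, and raising `ValueError` on malformed lines is an assumption made for the example.

```python
def preamble_to_metadata(line: str, delimiter: str = "!!") -> dict[str, str]:
    """Split an opening line such as '!!id:type:p0:p1' into a metadata dict."""
    if not line.startswith(delimiter):
        raise ValueError(f"line does not start with {delimiter!r}")
    parts = line[len(delimiter):].split(":")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("opening line needs both a block ID and a block type")
    metadata = {"id": parts[0], "block_type": parts[1]}
    # Extra colon-separated fields become param_0, param_1, ...
    for index, value in enumerate(parts[2:]):
        metadata[f"param_{index}"] = value
    return metadata


print(preamble_to_metadata("!!file123:operation:create:urgent"))
# {'id': 'file123', 'block_type': 'operation', 'param_0': 'create', 'param_1': 'urgent'}
```

A closing line such as `!!end` has no `id:type` pair, so the sketch rejects it instead of treating it as an opening marker.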

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| delimiter | str | Opening delimiter string (default: "!!") | '!!' |

delimiter instance-attribute

delimiter = delimiter

detect_line

detect_line(
    line: str, candidate: BlockCandidate | None = None
) -> DetectionResult

Detect delimiter-based markers.

should_accumulate_metadata

should_accumulate_metadata(
    candidate: BlockCandidate,
) -> bool

No separate metadata section for this syntax.

extract_block_type

extract_block_type(candidate: BlockCandidate) -> str | None

Extract block_type from opening line.

parse_block

parse_block(
    candidate: BlockCandidate,
    block_class: type[Any] | None = None,
) -> ParseResult[BaseMetadata, BaseContent]

Parse the complete block using the specified block class.

validate_block

validate_block(
    _block: ExtractedBlock[BaseMetadata, BaseContent],
) -> bool

Additional validation after parsing.

parse_metadata_early

parse_metadata_early(
    candidate: BlockCandidate,
) -> dict[str, Any] | None

Parse metadata from inline preamble.

For this syntax, metadata is extracted from the opening line (e.g., !!id:type:param1:param2).

parse_content_early

parse_content_early(
    candidate: BlockCandidate,
) -> dict[str, Any] | None

Parse content section early.

Returns raw content dict with the content text.

hother.streamblocks.syntaxes.DelimiterFrontmatterSyntax

Bases: BaseSyntax, YAMLFrontmatterMixin

This syntax uses simple delimiter markers with YAML frontmatter for metadata. The frontmatter section is delimited by --- markers and must be valid YAML.

Format

!!start
---
id: block_001
block_type: example
custom_field: value
---
Content lines here
!!end

The YAML frontmatter should include:
  • id: Block identifier (required if using BaseMetadata)
  • block_type: Block type (required if using BaseMetadata)
  • Any additional custom fields defined in your metadata class

Examples:

>>> # Simple block with minimal metadata
>>> '''
... !!start
... ---
... id: msg001
... block_type: message
... ---
... Hello, world!
... !!end
... '''
>>>
>>> # Block with nested YAML metadata
>>> '''
... !!start
... ---
... id: task001
... block_type: task
... priority: high
... tags:
...   - urgent
...   - backend
... ---
... Implement user authentication
... !!end
... '''

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| start_delimiter | str | Opening delimiter string (default: "!!start") | '!!start' |
| end_delimiter | str | Closing delimiter string (default: "!!end") | '!!end' |

start_delimiter instance-attribute

start_delimiter = start_delimiter

end_delimiter instance-attribute

end_delimiter = end_delimiter

detect_line

detect_line(
    line: str, candidate: BlockCandidate | None = None
) -> DetectionResult

Detect delimiter markers and frontmatter boundaries.

should_accumulate_metadata

should_accumulate_metadata(
    candidate: BlockCandidate,
) -> bool

Check if we're still in metadata section.

extract_block_type

extract_block_type(candidate: BlockCandidate) -> str | None

Extract block_type from YAML frontmatter.

parse_block

parse_block(
    candidate: BlockCandidate,
    block_class: type[Any] | None = None,
) -> ParseResult[BaseMetadata, BaseContent]

Parse the complete block using the specified block class.

validate_block

validate_block(
    _block: ExtractedBlock[BaseMetadata, BaseContent],
) -> bool

Additional validation after parsing.

parse_metadata_early

parse_metadata_early(
    candidate: BlockCandidate,
) -> dict[str, Any] | None

Parse YAML metadata section early.

Returns parsed YAML frontmatter as a dict.

parse_content_early

parse_content_early(
    candidate: BlockCandidate,
) -> dict[str, Any] | None

Parse content section early.

Returns raw content dict with the content text.

hother.streamblocks.syntaxes.MarkdownFrontmatterSyntax

Bases: BaseSyntax, YAMLFrontmatterMixin

This syntax uses Markdown-style fenced code blocks with optional YAML frontmatter for metadata. The info_string after the opening fence can be used as a fallback block_type when no frontmatter is present.

Format

```[info_string]
---
id: block_001
block_type: example
custom_field: value
---
Content lines here
```

The info_string is optional. When provided, it's used as the block_type if no YAML frontmatter is present. The YAML frontmatter is also optional - if omitted, all content becomes the block content.

Examples:

>>> # Block with frontmatter
>>> '''
... ```python
... ---
... id: code001
... block_type: code
... language: python
... ---
... def hello():
...     print("Hello, world!")
... ```
... '''
>>>
>>> # Block without frontmatter (info_string becomes block_type)
>>> '''
... ```patch
... diff --git a/file.py b/file.py
... - old line
... + new line
... ```
... '''
>>> # block_type will be "patch" from info_string
>>>
>>> # Block with nested YAML
>>> '''
... ```task
... ---
... id: task001
... block_type: task
... assignees:
...   - alice
...   - bob
... ---
... Implement user authentication
... ```
... '''
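
The fallback rule for `block_type` can be sketched without a YAML parser by scanning the frontmatter for a top-level `block_type:` key. `resolve_block_type` is a hypothetical helper, not part of the library. Falling back to the info string when frontmatter exists but lacks `block_type` is an assumption, since the description above only covers the case with no frontmatter at all.

```python
from __future__ import annotations

FENCE = "`" * 3  # the default three-backtick fence


def resolve_block_type(lines: list[str], fence: str = FENCE) -> str | None:
    """Return block_type from the frontmatter if present, else the info string."""
    if not lines or not lines[0].startswith(fence):
        return None
    info_string = lines[0][len(fence):].strip() or None
    body = lines[1:]
    if body and body[0].strip() == "---":
        # Scan up to the closing '---' for a top-level block_type key.
        for line in body[1:]:
            if line.strip() == "---":
                break
            if line.startswith("block_type:"):
                return line.split(":", 1)[1].strip()
    return info_string


with_frontmatter = [
    FENCE + "python",
    "---",
    "id: code001",
    "block_type: code",
    "language: python",
    "---",
    "def hello():",
    '    print("Hello, world!")',
    FENCE,
]
without_frontmatter = [
    FENCE + "patch",
    "diff --git a/file.py b/file.py",
    "- old line",
    "+ new line",
    FENCE,
]

print(resolve_block_type(with_frontmatter))     # code
print(resolve_block_type(without_frontmatter))  # patch
```

The sketch only recognizes a flat `block_type: value` line; nested or quoted YAML would need a real YAML parser.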

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| fence | str | Fence string (default: "```") | '```' |
| info_string | str \| None | Optional info string used as fallback block_type | None |

fence instance-attribute

fence = fence

info_string instance-attribute

info_string = info_string

parse_metadata_early

parse_metadata_early(
    candidate: BlockCandidate,
) -> dict[str, Any] | None

Parse metadata section early, before content accumulation.

This method is called when the metadata section completes, allowing early validation and processing. The result can be cached in the candidate for reuse during full block parsing.

Default implementation returns None (no early parsing). Override in subclasses to provide early metadata parsing.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| candidate | BlockCandidate | The current block candidate with metadata accumulated | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Any] \| None | Parsed metadata dict if successful, None if parsing is not supported or failed |

parse_content_early

parse_content_early(
    candidate: BlockCandidate,
) -> dict[str, Any] | None

Parse content section early, before final block extraction.

This method is called when the content section completes (block closes), allowing early validation. The result can be cached in the candidate for reuse during full block parsing.

Default implementation returns None (no early parsing). Override in subclasses to provide early content parsing.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| candidate | BlockCandidate | The complete block candidate with content accumulated | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Any] \| None | Parsed content dict if successful, None if parsing is not supported or failed |

detect_line

detect_line(
    line: str, candidate: BlockCandidate | None = None
) -> DetectionResult

Detect markdown fence markers and frontmatter boundaries.

should_accumulate_metadata

should_accumulate_metadata(
    candidate: BlockCandidate,
) -> bool

Check if we're still in metadata section.

extract_block_type

extract_block_type(candidate: BlockCandidate) -> str | None

Extract block_type from YAML frontmatter.

parse_block

parse_block(
    candidate: BlockCandidate,
    block_class: type[Any] | None = None,
) -> ParseResult[BaseMetadata, BaseContent]

Parse the complete block using the specified block class.

validate_block

validate_block(
    _block: ExtractedBlock[BaseMetadata, BaseContent],
) -> bool

Additional validation after parsing.