Block Types
Blocks are structured units of content extracted from text streams. This guide covers the block model, built-in types, and creating custom blocks.
Block Structure
Every block has two components:
flowchart TB
Block[Block] --> Metadata[Metadata]
Block --> Content[Content]
Metadata --> ID[id: str]
Metadata --> Type[block_type: str]
Metadata --> Custom[Custom fields...]
Content --> Raw[raw_content: str]
Content --> Parsed[Parsed data...]
Block Class
from streamblocks import Block
block = Block(
metadata=TaskMetadata(id="task01", block_type="task"),
content=TaskContent(raw_content="Implement feature"),
)
# Access data
block.metadata.id # "task01"
block.metadata.block_type # "task"
block.content.raw_content # "Implement feature"
Metadata
Metadata is parsed from the block header or frontmatter.
BaseMetadata
All metadata classes inherit from BaseMetadata:
from streamblocks import BaseMetadata
from typing import Literal
class BaseMetadata:
"""Base class for block metadata."""
id: str # Required: unique identifier
block_type: str # Required: type discriminator
Custom Metadata
Create custom metadata with additional fields:
from streamblocks import BaseMetadata
from typing import Literal
class TaskMetadata(BaseMetadata):
"""Metadata for task blocks."""
block_type: Literal["task"] = "task"
priority: str = "normal"
assignee: str | None = None
due_date: str | None = None
tags: list[str] = []
class CodeMetadata(BaseMetadata):
"""Metadata for code blocks."""
block_type: Literal["code"] = "code"
language: str = "python"
filename: str | None = None
line_start: int | None = None
Metadata Validation
Pydantic validates metadata automatically:
from pydantic import field_validator
class TaskMetadata(BaseMetadata):
block_type: Literal["task"] = "task"
priority: str = "normal"
@field_validator("priority")
@classmethod
def validate_priority(cls, v: str) -> str:
valid = {"low", "normal", "high", "critical"}
if v not in valid:
raise ValueError(f"priority must be one of {valid}")
return v
Content
Content holds the parsed block body.
BaseContent
All content classes inherit from BaseContent:
from streamblocks import BaseContent
class BaseContent:
"""Base class for block content."""
raw_content: str # The raw text content
@classmethod
def parse(cls, raw_text: str) -> "BaseContent":
"""Parse raw text into content."""
return cls(raw_content=raw_text)
Custom Content
Create content classes with custom parsing:
from streamblocks import BaseContent
import json
class JSONContent(BaseContent):
"""Content parsed as JSON."""
data: dict | list | None = None
@classmethod
def parse(cls, raw_text: str) -> "JSONContent":
try:
data = json.loads(raw_text)
except json.JSONDecodeError:
data = None
return cls(raw_content=raw_text, data=data)
class MarkdownContent(BaseContent):
"""Content with Markdown parsing."""
title: str | None = None
sections: list[str] = []
@classmethod
def parse(cls, raw_text: str) -> "MarkdownContent":
lines = raw_text.strip().split("\n")
# Extract title (first # heading)
title = None
for line in lines:
if line.startswith("# "):
title = line[2:].strip()
break
# Extract sections (## headings)
sections = [
line[3:].strip()
for line in lines
if line.startswith("## ")
]
return cls(raw_content=raw_text, title=title, sections=sections)
Block Definition
Combine metadata and content into a block type:
from streamblocks import Block, BaseMetadata, BaseContent
from typing import Literal
# 1. Define metadata
class TaskMetadata(BaseMetadata):
block_type: Literal["task"] = "task"
priority: str = "normal"
assignee: str | None = None
# 2. Define content
class TaskContent(BaseContent):
@classmethod
def parse(cls, raw_text: str) -> "TaskContent":
return cls(raw_content=raw_text.strip())
# 3. Define block type alias
TaskBlock = Block[TaskMetadata, TaskContent]
Block Registry
The registry maps block types to their definitions:
from streamblocks import BlockRegistry
registry = BlockRegistry()
# Register block types
registry.register(TaskBlock)
registry.register(CodeBlock, priority=10) # Higher priority
registry.register(MessageBlock)
# With custom type name
registry.register(TaskBlock, type_name="todo")
Priority
Higher priority blocks are matched first:
registry.register(HighPriorityBlock, priority=100)
registry.register(NormalBlock, priority=50)
registry.register(FallbackBlock, priority=1)
# If multiple blocks match, HighPriorityBlock is used
Default Block
Set a fallback for unknown types:
Built-in Block Types
Streamblocks provides common block types:
GenericBlock
Default block for any content:
from streamblocks.blocks import GenericBlock
# Works with any metadata
block = GenericBlock(
metadata=BaseMetadata(id="gen01", block_type="unknown"),
content=BaseContent(raw_content="Any content"),
)
TextBlock
Plain text content:
from streamblocks.blocks import TextBlock
block = TextBlock(
metadata=TextMetadata(id="text01", block_type="text"),
content=TextContent(raw_content="Plain text here"),
)
Creating Custom Blocks
Step-by-Step
from streamblocks import Block, BaseMetadata, BaseContent, BlockRegistry
from typing import Literal
from pydantic import field_validator
# Step 1: Define metadata
class ReviewMetadata(BaseMetadata):
"""Metadata for code review blocks."""
block_type: Literal["review"] = "review"
file_path: str
line_start: int
line_end: int | None = None
severity: str = "info"
@field_validator("severity")
@classmethod
def validate_severity(cls, v: str) -> str:
valid = {"info", "warning", "error"}
if v not in valid:
raise ValueError(f"severity must be one of {valid}")
return v
# Step 2: Define content
class ReviewContent(BaseContent):
"""Content for code review blocks."""
suggestion: str | None = None
code_snippet: str | None = None
@classmethod
def parse(cls, raw_text: str) -> "ReviewContent":
lines = raw_text.strip().split("\n")
suggestion = None
code_lines = []
in_code = False
for line in lines:
if line.startswith("```"):
in_code = not in_code
elif in_code:
code_lines.append(line)
else:
if suggestion is None:
suggestion = line
return cls(
raw_content=raw_text,
suggestion=suggestion,
code_snippet="\n".join(code_lines) if code_lines else None,
)
# Step 3: Create block type
ReviewBlock = Block[ReviewMetadata, ReviewContent]
# Step 4: Register
registry = BlockRegistry()
registry.register(ReviewBlock)
Using the Custom Block
!!review01
---
type: review
file_path: src/main.py
line_start: 42
severity: warning
---
Consider using a context manager here.
```python
# Instead of:
f = open("file.txt")
f.close()
# Use:
with open("file.txt") as f:
...
## Block Lifecycle
```mermaid
sequenceDiagram
participant Stream
participant Processor
participant Registry
participant Block
Stream->>Processor: Text chunk
Processor->>Processor: Detect block start
Processor->>Processor: Accumulate content
Processor->>Processor: Detect block end
Processor->>Registry: Get block type
Registry->>Processor: Block class
Processor->>Block: Create with metadata + content
Block->>Processor: Validated block
Processor->>Stream: BLOCK_EXTRACTED event
Block Validation
Blocks are validated during creation:
Metadata Validation
class TaskMetadata(BaseMetadata):
block_type: Literal["task"] = "task"
priority: str = "normal"
@field_validator("priority")
@classmethod
def validate_priority(cls, v: str) -> str:
if v not in {"low", "normal", "high"}:
raise ValueError("Invalid priority")
return v
Content Validation
class JSONContent(BaseContent):
data: dict | None = None
@model_validator(mode="after")
def validate_json(self) -> "JSONContent":
if self.raw_content and self.data is None:
raise ValueError("Content must be valid JSON")
return self
Handling Validation Failures
async for event in processor.process_stream(stream):
if event.type == EventType.BLOCK_REJECTED:
rejection = event.rejection
if rejection.reason == BlockErrorCode.VALIDATION_FAILED:
print(f"Validation error: {rejection.message}")
Block Serialization
To Dictionary
block_dict = block.model_dump()
# {
# "metadata": {"id": "task01", "block_type": "task", ...},
# "content": {"raw_content": "...", ...}
# }
To JSON
From Dictionary
Best Practices
Keep Metadata Simple
Put complex data in content, not metadata. Metadata should be quick to parse.
Validate Early
Use Pydantic validators to catch errors during parsing, not later.
Use Literal Types
Use Literal["type"] for block_type to enable type checking.
Document Block Formats
Include examples in your block class docstrings.
Next Steps
- Validation - Validation system details
- Events - Block events
- Patterns - Common usage patterns