Extraction Pipeline¶

The LLM-powered entity and relation extraction system.

ExtractionPipeline¶

ExtractionPipeline ¶

Extracts entities and relations from conversational messages using an LLM.

Usage

pipeline = ExtractionPipeline(llm_client) result = pipeline.extract("My wife's name is Lena")

extract `async` ¶

extract(message: str) -> ExtractionResult

Extract entities and relations from a user message.

Parameters:

Name	Type	Description	Default
`message`	`str`	The user's conversational message.	required

Returns:

Type	Description
`ExtractionResult`	ExtractionResult with entities, relations, and timing info.
`ExtractionResult`	On failure, returns an empty result (never raises).

Result Types¶

ExtractionResult¶

ExtractionResult `dataclass` ¶

Complete result from processing a single message.

ExtractedEntity¶

ExtractedEntity `dataclass` ¶

An entity extracted from a user message.

ExtractedRelation¶

ExtractedRelation `dataclass` ¶

A relation extracted between two entities.

LLM Clients¶

LLMClient Protocol¶

LLMClient ¶

Bases: Protocol

Protocol for LLM clients used by the extraction pipeline.

extract `async` ¶

extract(system_prompt: str, user_message: str) -> str

Send extraction prompt to LLM and return raw text response.

Parameters:

Name	Type	Description	Default
`system_prompt`	`str`	System instructions for extraction.	required
`user_message`	`str`	The user's conversational message to extract from.	required

Returns:

Type	Description
`str`	Raw LLM response text (expected to be JSON).

Raises:

Type	Description
`LLMError`	If the LLM call fails.

MockLLMClient¶

MockLLMClient ¶

Mock LLM that returns predetermined extraction results.

Register responses with set_response(message_substring, json_response). Falls back to an empty extraction if no match is found.

call_count `property` ¶

call_count: int

last_system_prompt `property` ¶

last_system_prompt: str

last_user_message `property` ¶

last_user_message: str

set_response ¶

set_response(message_contains: str, response: dict[str, Any]) -> None

Register a canned response for messages containing the given substring.

extract `async` ¶

extract(system_prompt: str, user_message: str) -> str

AnthropicLLMClient¶

AnthropicLLMClient ¶

LLM client using the Anthropic async API (Claude Haiku for extraction).

extract `async` ¶

extract(system_prompt: str, user_message: str) -> str

Utilities¶

repair_llm_json¶

repair_llm_json ¶

repair_llm_json(raw_output: str) -> dict[str, Any] | list[Any] | None

Attempt to parse and repair common LLM JSON output issues.

Handles

Markdown fences / prose around fenced blocks
Preamble / trailing extra text (extracts first complete JSON object/array)
Trailing commas before closing braces/brackets
Unclosed top-level brackets/braces (best-effort)

Returns:

Type	Description
`dict[str, Any] \| list[Any] \| None`	Parsed dict/list or None if repair fails.

Extraction Pipeline¶