scriptling.ai.agent

An agentic loop for building AI agents with automatic tool execution. The agent runs the complete loop: tool calling, tool execution, and response formatting.

Available Classes & Methods

Class/Method Description
Agent(client, tools, system_prompt, model, memory, max_tokens, compaction_threshold) Create AI agent
agent.trigger(message, max_iterations) One-shot trigger with response
agent.interact(max_iterations) Start interactive session
agent.get_messages() Get conversation history
agent.set_messages(messages) Set conversation history

For tool registry documentation, see AI Library.

Quick Start

import scriptling.ai as ai
import scriptling.ai.agent as agent

# Create AI client
client = ai.Client("http://127.0.0.1:1234/v1")

# Create tool registry
tools = ai.ToolRegistry()
tools.add("calculate", "Calculate square root", {"number": "number"}, lambda args: str(args["number"] ** 0.5))

# Create agent
bot = agent.Agent(client, tools=tools, system_prompt="You are a helpful assistant", model="gpt-4")

# One-shot trigger
response = bot.trigger("What is the square root of 144?", max_iterations=10)
print(response.content)

# Interactive session (requires scriptling.console)
bot.interact()

Agent Class

Agent(client, tools=None, system_prompt="", model="", memory=None, max_tokens=32000, compaction_threshold=80)

Creates an AI agent with automatic tool execution.

Parameters:

  • client (AIClient): AI client instance from ai.Client()
  • tools (ToolRegistry, optional): Tool registry with available tools
  • system_prompt (str, optional): System prompt for the agent
  • model (str, optional): Model to use
  • memory (memory object, optional): Memory store from memory.new() — see Memory Integration
  • max_tokens (int, optional): Maximum token budget for the conversation. When estimated token usage reaches the compaction threshold, the conversation history is automatically compacted (summarized). Default: 32000
  • compaction_threshold (int, optional): Percentage of max_tokens at which auto-compaction triggers (0-100). For example, with max_tokens=32000 and compaction_threshold=80, compaction triggers at ~25600 tokens. Default: 80

Example:

import scriptling.ai as ai
import scriptling.ai.agent as agent

client = ai.Client("http://127.0.0.1:1234/v1")
tools = ai.ToolRegistry()
tools.add("reverse", "Reverse text", {"text": "string"}, lambda args: args["text"][::-1])

bot = agent.Agent(
    client,
    tools=tools,
    system_prompt="You are a coding assistant",
    model="gpt-4"
)

# With custom compaction settings (compact at 50% of 16k tokens)
bot = agent.Agent(
    client,
    tools=tools,
    max_tokens=16000,
    compaction_threshold=50
)

agent.trigger(message, max_iterations=1)

Processes a message with the agent, executing tools as needed.

Parameters:

  • message (str or dict): User message to process
  • max_iterations (int): Maximum tool call rounds (default: 1)

Returns: dict — agent’s response message

Behavior:

  • Strips <think>...</think> blocks from responses
  • Executes tools automatically
  • Maintains conversation history
  • Uses automatic request timeouts for model calls
  • Stops after max_iterations or when no more tool calls

Example:

response = bot.trigger("What is 2+2?")
print(response.content)

response = bot.trigger("Reverse the word 'hello'", max_iterations=10)
print(response.content)

agent.interact(max_iterations=25)

Runs an interactive CLI session. Requires the scriptling.console library.

Parameters:

  • max_iterations (int, optional): Maximum tool call rounds per message. Default: 25

Behavior:

  • Streams reasoning and assistant text into the main console panel as it arrives
  • Keeps the spinner active for the full request lifecycle
  • Shows tool call and result messages with status and preview
  • Uses streaming via ai.collect_stream() with configurable timeouts
  • Preserves conversation history between turns

Example:

bot = agent.Agent(client, tools=tools, system_prompt="Coding assistant")
bot.interact()

agent.get_messages() / agent.set_messages(messages)

Get or replace the conversation history.

messages = bot.get_messages()
bot.set_messages([
    {"role": "system", "content": "You are helpful"},
    {"role": "user", "content": "Hello"},
])
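Because the history is a plain list of role/content dicts, it can be persisted between runs with ordinary JSON. A minimal sketch — the file path and helper names are illustrative, not part of the agent API:

```python
import json

# Hypothetical helpers for saving/restoring a conversation between runs.
def save_history(messages, path):
    with open(path, "w") as f:
        json.dump(messages, f)

def load_history(path):
    with open(path) as f:
        return json.load(f)

# Usage sketch:
# save_history(bot.get_messages(), "./session.json")
# bot.set_messages(load_history("./session.json"))
```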

Memory Integration

Pass a memory store to Agent via the memory= kwarg. The agent automatically:

  1. Registers memory_remember, memory_recall, and memory_forget as tools
  2. Appends memory usage instructions to the system prompt
  3. Pre-loads all stored preference memories into the system prompt so the LLM has immediate context on the first message without a tool call round-trip

Example:

import scriptling.ai as ai
import scriptling.ai.agent as agent
import scriptling.ai.memory as memory
import scriptling.runtime.kv as kv

client = ai.Client("http://127.0.0.1:1234/v1")
mem = memory.new(kv.open("./memory-db"))

bot = agent.Agent(
    client,
    model="gpt-4",
    system_prompt="You are a helpful assistant.",
    memory=mem
)

bot.interact()

You can combine memory= with your own tools — the memory tools are added to the existing registry:

tools = ai.ToolRegistry()
tools.add("search", "Search the web", {"query": "string"}, search_handler)

bot = agent.Agent(client, tools=tools, memory=mem, model="gpt-4")
# bot.tool_schemas now contains: search, memory_remember, memory_recall, memory_forget

Memory Tools

When memory= is provided, the following tools are registered automatically:

Tool Parameters Description
memory_remember content, type?, importance? Store a fact, preference, event or note
memory_recall query?, limit?, type? Search memories by keyword; omit query for recent context
memory_forget id Remove a memory by ID
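To make the recall semantics concrete, here is a tiny pure-Python model of memory_recall's keyword search — purely illustrative, not the actual store implementation:

```python
# Illustrative model of memory_recall: keyword search with an optional type
# filter; with no query, the most recent memories are returned.
def recall(memories, query=None, limit=5, type=None):
    candidates = [m for m in memories if type is None or m["type"] == type]
    if query:
        candidates = [m for m in candidates if query.lower() in m["content"].lower()]
    return candidates[-limit:]  # most recent entries last

memories = [
    {"id": 1, "type": "preference", "content": "Prefers dark mode"},
    {"id": 2, "type": "fact", "content": "Works in UTC+2"},
    {"id": 3, "type": "preference", "content": "Likes concise answers"},
]
recall(memories, query="dark")        # matches only memory 1
recall(memories, type="preference")   # both preference entries
```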

System Prompt Augmentation

The agent appends a ## Memory block to the system prompt explaining when and how to use the memory tools. It also injects a ## Remembered Preferences block containing all stored preference memories, so the LLM has user preferences available immediately.

The original system_prompt you pass is always preserved — the memory content is appended after it.

Auto-Compaction

The agent automatically compacts conversation history when it grows too large, preventing context window overflow and reducing API costs.

How it works:

  1. Before each completion call, the agent estimates the token count of the current messages
  2. If the estimated tokens reach the compaction threshold (percentage of max_tokens), the conversation is compacted
  3. Compaction asks the AI to summarize the conversation so far, preserving key facts and context
  4. The history is rebuilt as: system prompt + summary + protected recent context
  5. Active tool rounds are preserved so assistant tool calls remain paired with their tool results
  6. The agent continues normally with the compacted history
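The trigger condition in steps 1-2 can be sketched with the common chars/4 token heuristic (the agent's real estimator may differ):

```python
# Sketch of the compaction check: estimate tokens for the current messages
# and compare against the threshold percentage of the token budget.
def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token.
    return sum(len(m.get("content", "")) for m in messages) // 4

def should_compact(messages, max_tokens=32000, compaction_threshold=80):
    if max_tokens == 0 or compaction_threshold == 0:
        return False  # compaction disabled
    return estimate_tokens(messages) >= max_tokens * compaction_threshold // 100
```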

Parameters:

  • max_tokens (int): Maximum token budget. Default: 32000
  • compaction_threshold (int): Percentage of max_tokens at which compaction triggers. Default: 80

Example:

import scriptling.ai as ai
import scriptling.ai.agent as agent

client = ai.Client("http://127.0.0.1:1234/v1")

# Default: compact at 80% of 32000 tokens (25600 tokens)
bot = agent.Agent(client, model="gpt-4")

# Custom: compact at 50% of 16000 tokens (8000 tokens)
bot = agent.Agent(client, model="gpt-4", max_tokens=16000, compaction_threshold=50)

Note: Set max_tokens=0 or compaction_threshold=0 to disable auto-compaction entirely.

With LLM-based Deduplication

Pass an AI client to memory.new() to enable intelligent deduplication. When similar memories are found during remember() or compact(), the LLM decides whether to merge them or keep them separate:

client = ai.Client("http://127.0.0.1:1234/v1")
mem = memory.new(kv.open("./memory-db"), ai_client=client, model="qwen3-8b")

bot = agent.Agent(client, model="qwen3-8b", memory=mem)

Without an AI client, deduplication is rule-based only (MinHash similarity ≥ 85% auto-merges, otherwise keeps separate).
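The rule-based path can be illustrated with plain Jaccard similarity over word sets, which MinHash approximates — a sketch of the decision rule, not the store's actual code:

```python
# Illustrative decision rule: merge when word-set similarity >= 0.85,
# otherwise keep both memories. MinHash is a fast estimator of this
# Jaccard score over the memory text.
def jaccard(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 1.0

def dedup_decision(existing, incoming, threshold=0.85):
    return "merge" if jaccard(existing, incoming) >= threshold else "keep_separate"
```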

See ai.memory for full memory store documentation.

Complete Example

#!/usr/bin/env scriptling
import scriptling.ai as ai
import scriptling.ai.agent as agent
import scriptling.ai.memory as memory
import scriptling.runtime.kv as kv
import os

client = ai.Client("http://127.0.0.1:1234/v1", api_key=os.getenv("OPENAI_API_KEY", ""))

# Tools
tools = ai.ToolRegistry()
tools.add("sqrt", "Calculate square root", {"number": "number"}, lambda args: str(args["number"] ** 0.5))
tools.add("reverse", "Reverse a text string", {"text": "string"}, lambda args: args["text"][::-1])

# Memory
mem = memory.new(kv.open("./memory-db"))

# Agent with tools and memory
bot = agent.Agent(
    client,
    tools=tools,
    memory=mem,
    system_prompt="You are a helpful math and text assistant.",
    model="gpt-4"
)

bot.interact()

Tool Handler Interface

Tool handlers receive a dict of arguments and can return any value — complex types are automatically JSON-encoded for the LLM.

def get_time(args):
    import datetime
    return str(datetime.datetime.now())

def calculate_safe(args):
    try:
        import math
        return str(math.sqrt(args["number"]))
    except ValueError as e:
        return f"Error: {e}"
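A handler returning a complex value could be serialized before it reaches the LLM roughly like this (illustrative — the agent performs the encoding internally; encode_tool_result is not a library function):

```python
import json

def get_stats(args):
    # Returning a dict is fine; complex values are JSON-encoded for the LLM.
    return {"count": 3, "items": ["a", "b", "c"]}

def encode_tool_result(value):
    # Sketch of the encoding step: strings pass through unchanged,
    # everything else becomes JSON text.
    return value if isinstance(value, str) else json.dumps(value)

encode_tool_result(get_stats({}))  # -> '{"count": 3, "items": ["a", "b", "c"]}'
```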

Thinking Blocks

The agent automatically handles <think>...</think> blocks:

  • In trigger(): strips thinking blocks from responses
  • In interact(): displays thinking in purple, then strips from final output

Manual Extraction

import scriptling.ai as ai

result = ai.extract_thinking(response_text)
thinking_blocks = result["thinking"]
clean_content = result["content"]
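A conceptually equivalent pure-Python version of this extraction (illustrative only; use ai.extract_thinking in scriptling code):

```python
import re

# Split <think>...</think> blocks from the visible answer text.
def extract_thinking(text):
    thinking = re.findall(r"<think>(.*?)</think>", text, re.DOTALL)
    content = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return {"thinking": thinking, "content": content}

extract_thinking("<think>144 = 12*12</think>The answer is 12.")
# -> {"thinking": ["144 = 12*12"], "content": "The answer is 12."}
```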

See Also