Native Tools
29 built-in tools available to the agent out of the box. No extensions needed.
Overview
Native tools are built into the Omni runtime and available to the agent immediately. The LLM can call these tools during conversations to interact with the filesystem, make web requests, manage memory, send messages, and more.
Every native tool is permission-gated. The required capability is listed next to each tool. If the agent hasn't been granted the capability, a permission prompt appears in the UI.
Total tools: 29 (System: 7, Web: 3, Dev Tools: 8, Other: 11)
System Tools
exec (process.spawn): Execute shell commands on the host system. Returns stdout, stderr, and exit code.
Parameters: command (string), args (string[], optional), cwd (string, optional)
read_file (filesystem.read): Read the contents of a file as text. Supports any text-based file format.
Parameters: path (string)
write_file (filesystem.write): Write content to a file. Creates the file if it doesn't exist, overwrites it if it does.
Parameters: path (string), content (string)
edit_file (filesystem.write): Edit a file by replacing a specific string with new content. Fails if the old string is not found.
Parameters: path (string), old_string (string), new_string (string)
list_files (filesystem.read): List files and directories at a given path. Returns names, types, and sizes.
Parameters: path (string), recursive (bool, optional)
apply_patch (filesystem.write): Apply a unified diff patch to one or more files. Supports standard patch format.
Parameters: patch (string)
grep_search (filesystem.read): Search file contents using regex patterns. Returns matching lines with file paths and line numbers.
Parameters: pattern (string), path (string, optional), include (string, optional)
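The edit_file contract (replace a specific string, fail when it is missing) can be illustrated with a small sketch. This is not the Omni implementation; in particular, replacing only the first occurrence is an assumption, since the description above does not say how repeated matches are handled.

```python
from pathlib import Path


def edit_file(path: str, old_string: str, new_string: str) -> None:
    """Replace old_string with new_string in a text file; fail if it is absent."""
    file_path = Path(path)
    text = file_path.read_text(encoding="utf-8")
    if old_string not in text:
        # Mirrors the documented behavior: the edit fails when old_string is not found.
        raise ValueError(f"old_string not found in {path}")
    # Assumption: only the first occurrence is replaced.
    file_path.write_text(text.replace(old_string, new_string, 1), encoding="utf-8")
```

Failing loudly on a missing old_string lets the agent notice a stale view of the file instead of silently writing nothing.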
Web Tools
web_fetch (network.http): Fetch content from a URL via HTTP. Supports GET, POST, PUT, and DELETE with custom headers and body.
Parameters: url (string), method (string, optional), headers (object, optional), body (string, optional)
web_search (search.web): Search the web and return results. Returns titles, URLs, and snippets.
Parameters: query (string), num_results (integer, optional)
web_scrape (browser.scrape): Scrape web content with 3 modes: extract (fast HTML parsing), browser (Puppeteer with anti-bot stealth), or crawl (BFS multi-page). Converts HTML to Markdown.
Parameters: url (string), mode (string), selector (string, optional), max_pages (integer, optional), max_depth (integer, optional), url_pattern (string, optional)
Memory Tools
memory_save (storage.persistent): Save text to the agent's memory store. Persists across sessions for long-term recall.
Parameters: key (string), content (string), tags (string[], optional)
memory_search (storage.persistent): Search saved memories by keyword or tag. Returns matching entries sorted by relevance.
Parameters: query (string), limit (integer, optional)
memory_get (storage.persistent): Retrieve a specific memory entry by its key.
Parameters: key (string)
Vision Tools
image_analyze (ai.inference): Analyze an image using the LLM's vision capabilities. Describe, extract text, or answer questions about the image.
Parameters: image_path (string), prompt (string, optional)
Messaging Tools
send_message (messaging.chat): Send a message through a connected channel. The channel instance and recipient are specified by the agent. Checks channel bindings before sending.
Parameters: channel_id (string), recipient (string), text (string), media_url (string, optional)
list_channels (messaging.chat): List all connected channel instances with their status and features.
Parameters: none
Notifications & Scheduling Tools
notify (system.notifications): Send a system notification to the user's desktop. Returns structured JSON for the UI to display.
Parameters: title (string), body (string)
cron_schedule (system.scheduling): Schedule a recurring task using a cron expression. The task is stored and executed at the specified intervals.
Parameters: name (string), cron_expression (string), action (string)
Sessions Tools
session_list (storage.persistent): List all chat sessions with their IDs, creation time, and metadata. Requires database access.
Parameters: limit (integer, optional)
session_history (storage.persistent): Retrieve the full message history for a specific session. Requires database access.
Parameters: session_id (string), limit (integer, optional)
Desktop Automation Tools
app_interact (app.automation): Launch and control desktop applications via Windows UI Automation APIs. Supports 11 actions: launch, list_windows, find_element, find_elements, click, type_text, read_text, get_tree, get_subtree, screenshot, and close. Security-hardened with LOLBIN blocklist, password field protection, rate limiting, and audit logging.
Parameters: action (string), executable (string, optional), window_title (string, optional), process_name (string, optional), element_name (string, optional), element_type (string, optional), automation_id (string, optional), element_ref (string, optional), text (string, optional), max_depth (integer, optional), max_results (integer, optional), timeout_ms (integer, optional), args (string[], optional)
Version Control Tools
git (vcs.operations): Version control operations returning structured JSON. 10 actions: status, diff, log, commit, branch, checkout, stash, merge, show_conflict, resolve. Includes automatic secret scanning before commits and conflict marker parsing.
Parameters: action (string), repo_path (string, optional), message (string, for commit), files (string[], for commit), branch (string), name (string), create (bool), delete (bool), list (bool), staged (bool), file (string), content (string), count (integer), since (string), author (string), pop (bool)
Testing Tools
test_runner (process.spawn): Run tests with automatic framework detection and structured output. 3 actions: run (execute tests and parse results), list (discover available tests), coverage (run with coverage enabled). Auto-detects: cargo test (Rust), jest/vitest/mocha (JS/TS), pytest (Python), go test (Go), dotnet test (.NET).
Parameters: action (string), framework (string, optional; auto-detected), file (string, optional), pattern (string, optional), coverage (bool, optional), working_dir (string, optional)
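How auto-detection works is not documented here. One plausible approach is to look for well-known project marker files in working_dir, as in the sketch below. The marker-file table and the detect_framework helper are hypothetical and only illustrate the idea.

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker files; the real detection rules are not documented.
MARKERS = [
    ("Cargo.toml", "cargo test"),
    ("go.mod", "go test"),
    ("pytest.ini", "pytest"),
    ("pyproject.toml", "pytest"),
    # Choosing between jest, vitest, and mocha would require reading package.json.
    ("package.json", "jest/vitest/mocha"),
]


def detect_framework(working_dir: str) -> Optional[str]:
    """Return a framework label for the project, or None if nothing matches."""
    root = Path(working_dir)
    for marker, framework in MARKERS:
        if (root / marker).is_file():
            return framework
    # .NET projects are identified by a project file rather than a fixed name.
    if any(root.glob("*.csproj")):
        return "dotnet test"
    return None
```

Passing framework explicitly skips detection, which is useful in repositories that mix several languages.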
Clipboard Tools
clipboard (clipboard.read): Read from or write to the system clipboard. 2 actions: read (get current clipboard text) and write (set clipboard text). Maximum content size: 1 MB.
Parameters: action (string: read | write), content (string, required for write)
Code Intelligence Tools
code_search (filesystem.read): Offline code intelligence using syntax-aware regex analysis. 4 actions: index (build symbol index for a project), search (query symbols by name with type/language filters), symbols (list all symbols in a file), dependencies (show imports/uses for a file). Supports 9 languages: Rust, TypeScript, JavaScript, Python, Go, C, C++, Java, C#. Works without a language server.
Parameters: action (string), root_path (string), languages (string[], optional), query (string), type (string, optional), language (string, optional), limit (integer, optional), file (string)
lsp (code.intelligence): Language Server Protocol client for real-time code intelligence. 8 actions: start (launch a language server), stop, goto_definition, find_references, hover, diagnostics, symbols (document or workspace), rename_preview. Auto-detects servers: rust-analyzer, typescript-language-server, pyright, gopls.
Parameters: action (string), language (string), root_path (string), file (string), position ({ line, character }), query (string, for workspace symbols)
Agent Orchestration Tools
agent_spawn (agent.spawn): Spawn a sub-agent to handle a task in parallel. The sub-agent gets its own conversation context and tool access (except agent_spawn, to prevent recursion). Set wait=true to block until the sub-agent completes, or wait=false to get a task ID for later retrieval.
Parameters: task (string), context_files (string[], optional), model (string, optional), max_iterations (integer, optional; default 15), wait (bool, optional; default true)
Debugging Tools
debugger (debug.session): Debug Adapter Protocol (DAP) client for controlling debug sessions. 11 actions: launch (start debug session), attach (connect to running process by PID), set_breakpoints, continue, step_over, step_into, step_out, evaluate (evaluate expression in frame), variables (list variables in scope), stack_trace, disconnect.
Parameters: action (string), program (string), adapter (string, optional; auto-detected), file (string), breakpoints (array of { line }), expression (string), frame_id (integer), process_id (integer, for attach)
Interactive Execution Tools
repl (process.spawn): Persistent REPL sessions for interactive code execution. 4 actions: execute (run code in a session), list (show active sessions), reset (clear session state), close (terminate session). Supports Python and Node.js. Up to 3 concurrent sessions, 30-second execution timeout.
Parameters: action (string), language (string: python | javascript), code (string), session_id (string, optional; auto-generated)
Web Scrape Modes
The web_scrape tool supports three modes with increasing capability and resource usage.
extract: Fast HTML parsing using the scraper crate. No browser needed. Best for static pages with predictable HTML structure.
Limits: 500 KB/page, 2 MB download
browser: Full Puppeteer browser with stealth plugins. Handles JavaScript rendering, anti-bot protection, and dynamic content. Uses Mozilla Readability + Turndown for content extraction.
Limits: 500 KB/page; random viewport and delays
crawl: BFS multi-page crawl. Follows links matching a URL pattern up to a configurable depth. Combines content from all visited pages.
Limits: 100 pages max, depth 5, 5 MB total
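The crawl mode's traversal can be sketched as a breadth-first search bounded by max_pages and max_depth, following only links that match url_pattern. The crawl function and its get_links callback are hypothetical stand-ins for fetching a page and extracting its links; the size limits (500 KB/page, 5 MB total) are not modeled.

```python
import re
from collections import deque
from typing import Callable, Iterable, List


def crawl(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],
    url_pattern: str = ".*",
    max_pages: int = 100,
    max_depth: int = 5,
) -> List[str]:
    """Return URLs in breadth-first visit order, bounded by page count and depth."""
    pattern = re.compile(url_pattern)
    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link in seen or not pattern.search(link):
                continue
            if len(visited) >= max_pages:
                return visited  # page budget exhausted
            seen.add(link)
            visited.append(link)
            queue.append((link, depth + 1))
    return visited
```

Breadth-first order means pages closest to the start URL are collected first, so hitting the page cap drops the deepest pages rather than arbitrary ones.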
App Interact Actions
The app_interact tool supports 11 actions for full desktop application control. Windows only (uses native UI Automation APIs).
launch: Start a desktop application. Returns PID and window title.
Parameters: executable (required), args (optional)
Returns: { pid, executable, window_title }
list_windows: List all visible top-level windows with title, process name, PID, and bounds.
Parameters: process_name (optional filter)
Returns: { windows: [...], count }
find_element: Find a single UI element by name, type, or automation ID. Returns an opaque element_ref for use in subsequent actions.
Parameters: window_title, process_name, element_name, element_type, automation_id, timeout_ms (default 5000)
Returns: { element_ref, name, control_type, automation_id, is_enabled, patterns }
find_elements: Find multiple matching elements. Returns up to max_results matches.
Parameters: same as find_element, plus max_results (default 20, max 100)
Returns: { elements: [...], count }
click: Click a UI element using semantic patterns (InvokePattern, TogglePattern, SelectionItemPattern). Never uses screen coordinates.
Parameters: element_ref (required)
Returns: { status: "clicked" }
type_text: Type text into an input element. Uses ValuePattern with SendKeys fallback. Blocked on password fields.
Parameters: element_ref (required), text (required)
Returns: { status: "typed" }
read_text: Read text from an element. Tries ValuePattern, TextPattern, then element name. Blocked on password fields.
Parameters: element_ref (required)
Returns: { text: "..." }
get_tree: Get the UI element tree of a window. Includes truncation reporting when the element cap (500) or depth limit is hit.
Parameters: window_title or process_name, max_depth (default 4, max 8)
Returns: { root: { name, control_type, children: [...] }, total_elements, depth_reached, truncated }
get_subtree: Get a subtree starting from a specific element. Useful for exploring deeper when get_tree is truncated.
Parameters: element_ref (required), max_depth (default 4, max 8)
Returns: same structure as get_tree
screenshot: Capture a window as PNG. Uses Windows GDI PrintWindow (works for occluded windows) with BitBlt fallback. Capped at 4K. Returns base64 image via multimodal pipeline.
Parameters: window_title or process_name
Returns: { window_title, width, height, _image_data: [{ mime_type, data }] }
close: Close a window. Tries graceful close first, then force-kills by PID if that fails.
Parameters: window_title or process_name
Returns: { status: "closed" | "force_closed" }
App Interact Security
Desktop app automation is a high-risk capability. The app_interact tool enforces 12 layers of defense-in-depth to prevent misuse.
The entire tool is gated by the app.automation capability. Requires explicit user approval before any action.
43 dangerous Windows executables (cmd.exe, powershell.exe, rundll32.exe, certutil.exe, mshta.exe, etc.) are permanently blocked from being launched. Case-insensitive, checked against filename regardless of path.
The app.automation scope can restrict which applications are launchable via allowed_apps. Only apps on the list can be opened.
The Windows backend checks the IsPassword property before any read or write. Password fields cannot be typed into or read from.
Regex patterns detect element names containing password, secret, token, api_key, credit_card, cvv, ssn, pin_code, 2fa, otp, and similar. These elements are blocked for click, type_text, and read_text.
Rate limiting uses a 60-second sliding window per app, with a default of 60 actions per minute. Configurable via scope. Prevents rapid-fire automation.
By default, at most 3 managed processes may run simultaneously. Configurable via scope. Prevents resource exhaustion.
UI tree walks are capped at depth 8 and 500 elements to prevent LLM context overflow. Truncation is reported with actionable suggestions.
Password field values are automatically replaced with "[REDACTED]" in tree output. Sensitive data never enters the LLM context.
Interactions use UI Automation patterns (InvokePattern, ValuePattern), never raw screen coordinates or simulated mouse events. No way to bypass UI structure.
All text scraped from desktop apps passes through the existing 4-layer Guardian pipeline at scan point SP-5, preventing prompt injection via app content.
Every action (launch, click, type_text, screenshot, etc.) emits an AppAutomationAction audit event with action type, target app, target element, and success/failure status.
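Two of the layers above are simple enough to sketch: the executable blocklist (case-insensitive, matched on filename regardless of path) and the sliding-window rate limit. This is an illustration only, not the Omni code. The BLOCKED set contains just the five names quoted above, and is_blocked and SlidingWindowLimiter are hypothetical names.

```python
import ntpath
import time
from collections import deque
from typing import Deque, Optional

# Only the examples named in the text; the real blocklist has 43 entries.
BLOCKED = {"cmd.exe", "powershell.exe", "rundll32.exe", "certutil.exe", "mshta.exe"}


def is_blocked(executable: str) -> bool:
    """Compare the lowercased filename only, so the directory part is irrelevant."""
    # ntpath applies Windows path rules (both "\\" and "/") on any host OS.
    return ntpath.basename(executable).lower() in BLOCKED


class SlidingWindowLimiter:
    """Allow at most max_actions within any window_seconds span (one instance per app)."""

    def __init__(self, max_actions: int = 60, window_seconds: float = 60.0) -> None:
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have slid out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_actions:
            return False
        self._timestamps.append(now)
        return True
```

Matching on the filename alone means a copy of a blocked binary in another directory is still rejected, although a renamed copy would need a different check.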
App Automation Scope
The app.automation capability accepts a scope with 4 configurable fields to restrict what the tool can do.
Element References
When you call find_element or find_elements, each result includes an opaque element_ref string. This reference is used in subsequent actions like click, type_text, read_text, and get_subtree.
Element references are re-resolved on each use by re-searching the window for the matching element. This means references remain valid even if the window is restructured between calls. If the element is no longer found, the tool returns a descriptive error.
Do not parse or construct element references manually. Always obtain them from find_element, find_elements, or get_tree results.
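A minimal sketch of the intended flow: obtain element_ref from find_element and pass it, unmodified, to a later action. The call_tool callable is a hypothetical stand-in for however the host dispatches tool calls. fake_call_tool only imitates the documented return shapes so the example runs without Windows.

```python
from typing import Any, Callable, Dict

ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def type_into(call_tool: ToolCall, window_title: str, element_name: str, text: str) -> Dict[str, Any]:
    """Find an element by name, then type into it using the returned reference."""
    found = call_tool("app_interact", {
        "action": "find_element",
        "window_title": window_title,
        "element_name": element_name,
    })
    # The reference is opaque: pass it through exactly as received.
    return call_tool("app_interact", {
        "action": "type_text",
        "element_ref": found["element_ref"],
        "text": text,
    })


def fake_call_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Test double that imitates the documented response shapes."""
    if args["action"] == "find_element":
        return {"element_ref": "opaque-ref-1", "name": args["element_name"]}
    if args["action"] == "type_text" and args["element_ref"] == "opaque-ref-1":
        return {"status": "typed"}
    raise ValueError("unexpected call")
```

For example, type_into(fake_call_tool, "Untitled - Notepad", "Text Editor", "hello") returns the status object from the second call.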
MCP Client (Model Context Protocol)
Omni includes a built-in MCP client that can connect to external MCP servers and expose their tools to the agent. MCP tools are automatically namespaced as mcp_<server>_<tool> and appear alongside native tools in the agent loop. All MCP tool output is scanned by Guardian at SP-6.
Communicates with MCP servers over stdin/stdout using JSON-RPC 2.0. No HTTP server needed — fully local, no network surface.
MCP servers listed in [mcp.servers] config with auto_start=true are launched automatically on startup.
On connection, Omni sends tools/list to discover available tools and their JSON schemas. Tools are registered dynamically.
Each MCP tool is prefixed with the server name (e.g., filesystem server's read tool becomes mcp_filesystem_read) to prevent collisions.
MCP tool execution requires the mcp.server capability. Scoped by server name and allowed tools list.
All MCP tool responses are scanned at SP-6 before being returned to the LLM, preventing prompt injection via external tool output.
McpManager supports add, remove, restart, list, and shutdown operations. Servers are killed on drop if unresponsive.
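The discovery and namespacing steps can be sketched as follows. The request uses the JSON-RPC 2.0 envelope and the MCP tools/list method; the helper names tools_list_request and namespaced_tools are hypothetical, and the sample response is invented for illustration.

```python
import json
from typing import Any, Dict, List


def tools_list_request(request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 tools/list request, to be written to the server's stdin."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def namespaced_tools(server: str, response: Dict[str, Any]) -> List[str]:
    """Prefix each discovered tool as mcp_<server>_<tool> to avoid name collisions."""
    return [f"mcp_{server}_{tool['name']}" for tool in response["result"]["tools"]]


# Illustrative response from a server named "filesystem".
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [{"name": "read", "inputSchema": {"type": "object"}}]},
}
```

With this sample, namespaced_tools("filesystem", sample_response) yields mcp_filesystem_read, matching the naming example above.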
Git Tool Actions
The git tool provides 10 structured version control actions. Prefer this over exec git ... for parsed, JSON-structured output.
Actions: status, diff, log, commit, branch, checkout, stash, merge, show_conflict, resolve
Secret scanning: The commit action automatically scans staged content for API keys, tokens, passwords, and other secrets before committing. If secrets are detected, the commit is blocked with a detailed warning.
Conflict resolution: The show_conflict action parses conflict markers into structured JSON (ours/theirs/ancestor sections). The resolve action writes the final resolved content.
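To make the show_conflict idea concrete, here is a sketch of parsing Git conflict markers into ours/ancestor/theirs sections. It is not the tool's parser, and the exact JSON it returns is not documented here. The ancestor section only appears when Git writes diff3-style markers.

```python
from typing import Dict, List


def parse_conflicts(text: str) -> List[Dict[str, str]]:
    """Split each conflict hunk into 'ours', 'ancestor', and 'theirs' text."""
    conflicts: List[Dict[str, str]] = []
    current: Dict[str, List[str]] = {}
    section = None  # None outside a hunk, else "ours", "ancestor", or "theirs"
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            current = {"ours": [], "ancestor": [], "theirs": []}
            section = "ours"
        elif line.startswith("|||||||") and section == "ours":
            section = "ancestor"  # diff3-style common ancestor
        elif line.startswith("=======") and section in ("ours", "ancestor"):
            section = "theirs"
        elif line.startswith(">>>>>>>") and section == "theirs":
            conflicts.append({key: "\n".join(lines) for key, lines in current.items()})
            section = None
        elif section is not None:
            current[section].append(line)
    return conflicts
```

Lines outside any hunk are ignored, so the result contains only the regions that need a decision before calling resolve.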
Debugger Actions
The debugger tool implements the Debug Adapter Protocol (DAP) for controlling debug sessions across languages. It auto-detects debug adapters for Rust (codelldb), Python (debugpy), Node.js (node-debug), and Go (dlv-dap).
launch: Start a debug session for a program
attach: Attach to a running process by PID
set_breakpoints: Set breakpoints in a source file
continue: Resume execution until the next breakpoint
step_over: Step over to the next line
step_into: Step into a function call
step_out: Step out of the current function
evaluate: Evaluate an expression in the current frame
variables: List variables in the current scope
stack_trace: Get the current call stack
disconnect: End the debug session
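Under the hood, a DAP client sends JSON messages framed with a Content-Length header. The sketch below shows what a set_breakpoints action might translate to on the wire, using the DAP setBreakpoints request. How Omni maps tool parameters to DAP arguments is an assumption; dap_request is a hypothetical helper and the file path is illustrative.

```python
import json
from typing import Any, Dict


def dap_request(seq: int, command: str, arguments: Dict[str, Any]) -> bytes:
    """Encode a DAP request as a Content-Length header followed by a JSON body."""
    body = json.dumps({
        "seq": seq,
        "type": "request",
        "command": command,
        "arguments": arguments,
    }).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Assumed mapping: file becomes source.path, breakpoints is passed through as-is.
message = dap_request(
    1,
    "setBreakpoints",
    {"source": {"path": "src/main.py"}, "breakpoints": [{"line": 12}]},
)
```

The header counts bytes rather than characters, which is why the body is encoded before its length is measured.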
LSP Tool Actions
The lsp tool manages Language Server Protocol connections and exposes real-time code intelligence. Auto-detects servers: rust-analyzer (Rust), typescript-language-server (TS/JS), pyright (Python), gopls (Go).
start: Launch a language server for a project
stop: Shut down a running language server
goto_definition: Jump to the definition of a symbol
find_references: Find all references to a symbol
hover: Get type info and docs for a position
diagnostics: Get compiler errors and warnings
symbols: List symbols in a file or workspace
rename_preview: Preview renames across files
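As an illustration of what goto_definition corresponds to at the protocol level, the sketch below builds an LSP textDocument/definition request. LSP positions are zero-based; whether the tool's position parameter uses the same convention is an assumption. definition_request is a hypothetical helper, and the Content-Length message framing is omitted.

```python
import json
from pathlib import Path
from typing import Any, Dict


def definition_request(request_id: int, file: str, line: int, character: int) -> Dict[str, Any]:
    """Build a JSON-RPC request asking the server where a symbol is defined."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": {
            # LSP identifies documents by URI, so the path is made absolute first.
            "textDocument": {"uri": Path(file).resolve().as_uri()},
            # Zero-based line and character offsets, per the LSP specification.
            "position": {"line": line, "character": character},
        },
    }


request = definition_request(1, "src/main.rs", 10, 4)
payload = json.dumps(request)
```

find_references and hover follow the same shape with textDocument/references and textDocument/hover as the method.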