trycua/cua

cua-driver get_window_state omits per-element geometry

Open

#1564 opened on May 18, 2026

enhancement, good first issue

Description

Summary

get_window_state returns tree_markdown and screenshot dimensions, but no per-element geometry (bounds, frame, x/y/width/height, AXPosition, AXSize, etc.).

Without that geometry, downstream clients cannot map the element indexes/refs in tree_markdown to real screen coordinates.

Reproduction

list_windows includes window bounds

cua-driver call list_windows '{"on_screen_only":false}'

Example result:

{
  "app_name": "Safari浏览器",
  "bounds": { "x": 0, "y": 0, "width": 1920, "height": 30 },
  "pid": 2395,
  "window_id": 3882
}

get_window_state does not include element geometry

cua-driver call get_window_state '{"pid":2395,"window_id":140}' --raw

Observed structuredContent:

{
  "bundle_id": "com.apple.Safari",
  "element_count": 474,
  "name": "Safari浏览器",
  "pid": 2395,
  "screenshot_height": 304,
  "screenshot_scale_factor": 2,
  "screenshot_width": 388,
  "tree_markdown": "...",
  "turn_id": 8
}

Missing fields:

  • bounds
  • frame
  • elements
  • AXPosition
  • AXSize
  • per-element x/y/width/height
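The check behind this list can be sketched in Python. This is only an illustration: the payload is the observed structuredContent copied from above, and the helper name `missing_geometry_keys` is made up for this example. It only inspects top-level keys, so it says nothing about what tree_markdown itself contains.

```python
import json

# Geometry-related top-level keys this issue reports as absent.
GEOMETRY_KEYS = ["bounds", "frame", "elements", "AXPosition", "AXSize"]


def missing_geometry_keys(structured_content: dict) -> list:
    """Return the geometry keys that are not top-level keys of the payload."""
    return [key for key in GEOMETRY_KEYS if key not in structured_content]


# Observed structuredContent, copied from the reproduction above.
observed = json.loads("""
{
  "bundle_id": "com.apple.Safari",
  "element_count": 474,
  "name": "Safari浏览器",
  "pid": 2395,
  "screenshot_height": 304,
  "screenshot_scale_factor": 2,
  "screenshot_width": 388,
  "tree_markdown": "...",
  "turn_id": 8
}
""")

print(missing_geometry_keys(observed))
```

For the payload shown, every key in `GEOMETRY_KEYS` is reported as missing.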

Expected

Either:

  1. return a structured element list with geometry, or
  2. expose a geometry map keyed by element index, or
  3. document explicitly that get_window_state does not provide per-element geometry.
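To make option 2 concrete, here is a rough Python sketch of how a client might consume a geometry map. Everything about the response shape is an assumption and not part of the current API: the field name `element_geometry`, the string keys, and the rectangle layout (borrowed from the `bounds` object that list_windows returns) are hypothetical, and the values are invented. The coordinate space is also assumed to match list_windows bounds.

```python
from typing import Dict, Tuple

# Hypothetical rectangle shape, mirroring the "bounds" object from list_windows.
Rect = Dict[str, float]


def element_center(geometry: Dict[str, Rect], index: int) -> Tuple[float, float]:
    """Return the center point of the element with the given index."""
    rect = geometry[str(index)]  # assumes keys are stringified element indexes
    return (rect["x"] + rect["width"] / 2, rect["y"] + rect["height"] / 2)


# Invented example response; "element_geometry" does not exist in
# get_window_state today and only illustrates the proposal.
response = {
    "element_count": 2,
    "tree_markdown": "...",
    "element_geometry": {
        "0": {"x": 0, "y": 0, "width": 1920, "height": 30},
        "1": {"x": 100, "y": 40, "width": 200, "height": 20},
    },
}

print(element_center(response["element_geometry"], 1))  # (200.0, 50.0)
```

A map keyed by index would leave tree_markdown unchanged, so existing clients would not need to change their parsing.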

Environment

  • macOS 26.5
  • CuaDriver.app version 0.2.0

Question

Is this intentional API design, or should get_window_state expose per-element geometry?
