Sharingan: Extract User Action Sequence from Desktop Recordings
Categories: cs.CV, cs.AI
Released Date: November 13, 2024
Authors: Yanting Chen, Yi Ren, Xiaoting Qin, Jue Zhang, Kehong Yuan, Lu Han, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang
![[Uncaptioned image]](https://arxiv.org/html/2411.08768v1/extracted/5997676/figure/sharingan.png)
| Method | Model | Recall (Operation) | Precision (Operation) | Recall (All) | Precision (All) |
| --- | --- | --- | --- | --- | --- |
| DF | Gemini1.5-Pro | 0.71 | 0.73 | 0.49 | 0.51 |
| DF | Gemini1.5-Flash | 0.69 | 0.59 | 0.30 | 0.26 |
| DF | GPT-4o | 0.83 | 0.81 | 0.71 | 0.68 |
| DF | GPT-4o-mini | 0.63 | 0.33 | 0.38 | 0.17 |
| DiffF | Gemini1.5-Pro | 0.75 | 0.48 | 0.45 | 0.24 |
| DiffF | Gemini1.5-Flash | 0.74 | 0.37 | 0.54 | 0.27 |
| DiffF | GPT-4o | 0.87 | 0.66 | 0.76 | 0.59 |
| DiffF | GPT-4o-mini | 0.59 | 0.26 | 0.45 | 0.19 |