🎯 VISTA: GUI Grounding

Upload a GUI screenshot and describe the element you want to click. VISTA-9B predicts the click coordinate and marks it on the image.
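
The marking step can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: it assumes the model returns a normalized (x, y) pair in [0, 1], which the page does not state, and the helper names `to_pixel` and `marker_box` are hypothetical.

```python
def to_pixel(norm_xy, image_size):
    """Map a normalized (x, y) point to integer pixel coordinates.

    Assumption: the prediction is normalized to [0, 1]; the real model
    output format may differ.
    """
    x, y = norm_xy
    width, height = image_size
    # Clamp so that a prediction slightly outside [0, 1] still lands on the image.
    px = min(max(int(x * width), 0), width - 1)
    py = min(max(int(y * height), 0), height - 1)
    return px, py


def marker_box(center, image_size, radius=8):
    """Return the (left, top, right, bottom) box of a marker around `center`,
    clipped to the image bounds."""
    cx, cy = center
    width, height = image_size
    left = max(cx - radius, 0)
    top = max(cy - radius, 0)
    right = min(cx + radius, width - 1)
    bottom = min(cy + radius, height - 1)
    return left, top, right, bottom


if __name__ == "__main__":
    size = (1920, 1080)            # screenshot width and height in pixels
    click = to_pixel((0.5, 0.25), size)
    print(click, marker_box(click, size))
```

The resulting box can be handed to any drawing routine that accepts a bounding box, for example Pillow's `ImageDraw.ellipse`, to render the marker on the screenshot.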

Based on VISTA: View-Consistent Self-Verified Training for GUI Grounding | Model Card | GitHub

Examples