🎯 VISTA: GUI Grounding
Upload a GUI screenshot and describe the element you want to click. VISTA-9B predicts the click coordinate and marks it on the image.
Based on VISTA: View-Consistent Self-Verified Training for GUI Grounding | Model Card | GitHub
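The marking step described above can be sketched in a few lines. This is only an illustration, not the demo's actual code: `mark_click`, the `arm` parameter, the nested-list image representation, and the sample coordinate are all hypothetical, and the sketch assumes the predicted click is already given in integer pixel coordinates.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels, indexed as image[y][x]


def mark_click(image: Image, x: int, y: int, arm: int = 2,
               color: Pixel = (255, 0, 0)) -> Image:
    """Return a copy of `image` with a crosshair centered on (x, y).

    Hypothetical helper: assumes (x, y) is a pixel coordinate, not a
    normalized one.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError(f"click ({x}, {y}) is outside the {width}x{height} image")

    # Copy each row so the original screenshot is left untouched.
    marked = [list(row) for row in image]
    for d in range(-arm, arm + 1):
        if 0 <= x + d < width:       # horizontal arm, clipped at the edges
            marked[y][x + d] = color
        if 0 <= y + d < height:      # vertical arm, clipped at the edges
            marked[y + d][x] = color
    return marked


# A 5x5 white "screenshot" with a made-up predicted click at (1, 2).
screenshot = [[(255, 255, 255)] * 5 for _ in range(5)]
marked = mark_click(screenshot, x=1, y=2, arm=1)
```

In a real pipeline the same idea would normally be applied with an imaging library, and a model that emits normalized coordinates would need them scaled by the image width and height first.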
Examples