Visual Studio Open SQLite Database

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Abstract: 3D visual grounding is a critical skill for household robots, enabling them to navigate, manipulate objects, and answer questions based on their environment. While existing approaches often ...

IEEE

Learning Visual-Inertial Odometry With Robocentric Iterated Extended Kalman Filter

Abstract: In recent years, deep learning methodologies have been increasingly applied to the intricate challenges of visual-inertial odometry (VIO), especially in scenarios with rapid movements and ...

GitHub

VAR: a new visual generation method elevates GPT-style models beyond diffusion & Scaling laws observed

🕹️ Try and Play with VAR! We provide a demo website for you to play with VAR models and generate images interactively. Enjoy the fun of visual autoregressive modeling!
