Hi there 👋🏻,
🔭 I am currently focused on developing multimodal models that natively generate and understand both text and images.
My specific areas of interest include:
- Path planning
PS: I view all of these tasks as path-planning tasks. Check this blog for more details.
- Auto-regressive Generation
- Diffusion
PS: I view all of these tasks as diffusion and next-token generation tasks.
- Document Understanding & Layout Analysis
- Optical Character Recognition
- Object Detection
In addition to my current work, I have prior experience in Robotics Perception from my Master's studies. I hope my work can be helpful to you.
Feel free to reach out if you have any questions or if there's anything I can assist with!