
Commit bda2b82 (1 parent: 4bcff97)

Update research modules: fix YouTube embed URLs, improve video sizing and alignment, add project lead styling, update content formatting

25 files changed: +147 −35 lines

_data/masters_undergraduates.yml

Lines changed: 11 additions & 0 deletions

@@ -25,6 +25,17 @@
     description:
     description1:

+  - name: Justin Nguyen
+    degrees: BE Student
+    image: Nguyen.png
+    altimage:
+    position: Electrical and Computer Engineering, Drexel University
+    email:
+    website:
+    linkedin:
+    scholar:
+
   - name: Logan Voravong
     degrees: BS Student
     image: Logan.jpeg

_pages/about.md

Lines changed: 8 additions & 5 deletions

@@ -35,17 +35,20 @@ social: true # includes social icons at the bottom of the page
       <img class="d-block w-100" src="assets/img/labmempic.jpg">
     </div>
     <div class="carousel-item ">
-      <img class="d-block w-100" src="assets/img/huskydrone.png">
-    </div>
+      <img class="d-block w-100" src="assets/img/airground1.jpg">
+    </div>
     <div class="carousel-item ">
       <img class="d-block w-100" src="assets/img/modalaidrone.png">
     </div>
+    <!-- <div class="carousel-item ">
+      <img class="d-block w-100" src="assets/img/huskydrone.png">
+    </div> -->
     <div class="carousel-item">
-      <img class="d-block w-100" src="assets/img/autovehicle.jpeg">
+      <img class="d-block w-100" src="assets/img/autodrive.jpg">
     </div>
     <div class="carousel-item ">
-      <img class="d-block w-100" src="assets/img/robottargt.jpeg">
-    </div>
+      <img class="d-block w-100" src="assets/img/robotarm.jpg">
+    </div>

   </div>
   <a class="carousel-control-prev" href="#carouselExampleIndicators" role="button" data-slide="prev">

_pages/research.md

Lines changed: 7 additions & 2 deletions

@@ -8,9 +8,14 @@ nav_order: 1
 display_categories: [current, past]
 horizontal: false
 ---
-Today, robotics and autonomous systems have been increasingly used in various areas such as manufacturing, military, agriculture, medical sciences, and environmental monitoring. However, most of these systems are fragile and vulnerable to adversarial attacks and uncertain environmental conditions. In most cases, even if a part of the system fails, the entire system performance can be significantly undermined. As robots start to coexist with humans, we need algorithms that can be trusted under real-world (not just ideal) conditions. To this end, our research focuses on enabling **security, trustworthiness, and long-term autonomy** in robotics and autonomous systems. We devise _efficient coordination algorithms with rigorous theoretical guarantees_ to make robots resilient to attacks and aware of the loss from uncertainty. Our long-term goal is to investigate **secure, reliable, and scalable multi-robot autonomy** when robots use data-driven machine learning techniques in the areas of **cyber-physical systems, the Internet of Things, precision agriculture, and smart cities**.
+<!-- Today, robotics and autonomous systems have been increasingly used in various areas such as manufacturing, military, agriculture, medical sciences, and environmental monitoring. However, most of these systems are fragile and vulnerable to adversarial attacks and uncertain environmental conditions. In most cases, even if a part of the system fails, the entire system performance can be significantly undermined. As robots start to coexist with humans, we need algorithms that can be trusted under real-world (not just ideal) conditions. To this end, our research focuses on enabling **security, trustworthiness, and long-term autonomy** in robotics and autonomous systems. We devise _efficient coordination algorithms with rigorous theoretical guarantees_ to make robots resilient to attacks and aware of the loss from uncertainty. Our long-term goal is to investigate **secure, reliable, and scalable multi-robot autonomy** when robots use data-driven machine learning techniques in the areas of **cyber-physical systems, the Internet of Things, precision agriculture, and smart cities**. -->

-[Current Robotics Reports](https://link.springer.com/article/10.1007/s43154-021-00046-5): a survey of multi-robot coordination and planning in uncertain and adversarial environments.
+Our lab explores the foundations of **robust, reliable, and scalable autonomy** for robotic and multi-robot systems. We aim to build intelligent systems that can **perceive, reason, and act** in dynamic and unstructured environments, enabling applications in environmental monitoring, precision agriculture, disaster response, urban mobility, and beyond. We are particularly interested in how **coordination, perception, planning, and control** can be enhanced by combining algorithmic tools with **data-driven approaches**, including emerging **foundation models**. This includes investigating how robot teams can operate effectively under limited communication, uncertainty, and evolving tasks, while generalizing across diverse environments and missions.
+
+
+[Preprint](https://arxiv.org/abs/2502.03814): A survey of large language models (LLMs) for multi-robot coordination.
+
+[Current Robotics Reports](https://link.springer.com/article/10.1007/s43154-021-00046-5): A survey of multi-robot coordination and planning in uncertain and adversarial environments.


 <!-- pages/projects.md -->

_projects/adverallo5.md

Lines changed: 1 addition & 3 deletions

@@ -7,8 +7,6 @@ importance: 5
 category: past
 ---

-To be updated.
-
-[arXiv:2505.06319](https://arxiv.org/abs/2505.06319): Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs.
+[Preprint](https://arxiv.org/abs/2505.06319): Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs.

 [T-RO](https://ieeexplore.ieee.org/abstract/document/10989573): Double oracle algorithm for game-theoretic robot allocation on graphs.

_projects/airground3.md

Lines changed: 27 additions & 4 deletions

@@ -2,13 +2,36 @@
 layout: page
 title: VLMs for Air Ground Systems
 description:
-img: assets/img/airground.jpg
+img: assets/img/airground1.jpg
 importance: 2
 category: current
 ---

-To be updated.
+#### *Project Lead: [Bill Cai](https://scholar.google.com/citations?user=9OTtpc8AAAAJ&hl=en)*

-[arXiv:2505.06399](https://arxiv.org/abs/2505.06399): LLM-Land: Large Language Models for Context-Aware Drone Landing.
+## LLM-Land: Large Language Models for Context-Aware Drone Landing

-[arXiv:2310.07729](https://arxiv.org/abs/2310.07729): Energy-aware routing algorithm for mobile ground-to-air charging.
+<div class="row">
+  <div class="col-sm-7 mt-3 mt-md-0">
+    {% include figure.html path="assets/img/land.png" title="example image" class="img-fluid rounded z-depth-1" style="height: 210px; object-fit: cover;" %}
+  </div>
+  <div class="col-sm-5 mt-3 mt-md-0">
+    <iframe width="100%" height="200" src="https://www.youtube.com/embed/9yGEpqmCtdA" frameborder="0" allowfullscreen></iframe>
+  </div>
+</div>
+
+Autonomous landing is essential for drones deployed in emergency deliveries, post-disaster response, and other large-scale missions. By enabling self-docking on charging platforms, it facilitates continuous operation and significantly extends mission endurance. However, traditional approaches often fall short in dynamic, unstructured environments due to limited semantic awareness and reliance on fixed, context-insensitive safety margins. To address these limitations, we propose a hybrid framework that integrates large language models (LLMs) with model predictive control (MPC). Our approach begins with a vision–language encoder (VLE) (e.g., BLIP), which transforms real-time images into concise textual scene descriptions. These descriptions are processed by a lightweight LLM (e.g., Qwen 2.5 1.5B or LLaMA 3.2 1B) equipped with retrieval-augmented generation (RAG) to classify scene elements and infer context-aware safety buffers, such as 3 meters for pedestrians and 5 meters for vehicles. The resulting semantic flags and unsafe regions are then fed into an MPC module, enabling real-time trajectory replanning that avoids collisions while maintaining high landing precision. We validate our framework in the ROS-Gazebo simulator, where it consistently outperforms conventional vision-based MPC baselines. Our results show a significant reduction in near-miss incidents with dynamic obstacles, while preserving accurate landings in cluttered environments. [Preprint](https://arxiv.org/abs/2505.06399).
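As a toy illustration of the context-aware safety buffers described in the LLM-Land summary, the snippet below maps scene labels to keep-out radii and checks a candidate landing point against them. Only the two quoted figures (3 m for pedestrians, 5 m for vehicles) come from the text; the function names, default buffer, and planner interface are invented assumptions, not the paper's code.

```python
# Sketch of semantic safety buffers feeding an MPC-style planner.
# Buffer values for pedestrians/vehicles follow the summary above;
# everything else here is an illustrative assumption.
SAFETY_BUFFERS_M = {
    "pedestrian": 3.0,
    "vehicle": 5.0,
}
DEFAULT_BUFFER_M = 1.0  # fallback for unlisted classes (assumption)

def unsafe_regions(detections):
    """Turn (label, x, y) detections into circular keep-out zones
    that a trajectory planner could treat as collision constraints."""
    zones = []
    for label, x, y in detections:
        radius = SAFETY_BUFFERS_M.get(label, DEFAULT_BUFFER_M)
        zones.append({"center": (x, y), "radius": radius})
    return zones

def is_landing_point_safe(point, zones):
    """Check a candidate landing point against all keep-out zones."""
    px, py = point
    for z in zones:
        cx, cy = z["center"]
        if ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5 < z["radius"]:
            return False
    return True
```

In the actual framework these zones would be regenerated each frame from the LLM's semantic flags and passed to the MPC module as constraints.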
+
+## An Energy-Aware Routing Algorithm for Mobile Ground-to-Air Charging
+
+<div class="row">
+  <div class="col-sm-8 mt-3 mt-md-0">
+    {% include figure.html path="assets/img/routing.png" title="example image" class="img-fluid rounded z-depth-1" style="height: 210px; object-fit: cover;" %}
+  </div>
+  <div class="col-sm-4 mt-3 mt-md-0">
+    <iframe width="100%" height="190" src="https://www.youtube.com/embed/eYPMPYThhKE" frameborder="0" allowfullscreen></iframe>
+  </div>
+</div>
+
+We investigate the problem of energy-constrained planning for a cooperative system consisting of an Unmanned Ground Vehicle (UGV) and an Unmanned Aerial Vehicle (UAV). In scenarios where the UGV serves as a mobile base to ferry the UAV and as a charging station to recharge the UAV, we formulate a novel energy-constrained routing problem. To tackle this problem, we design an energy-aware routing algorithm, aiming to minimize the overall mission duration under the energy limitations of both vehicles. The algorithm first solves a Traveling Salesman Problem (TSP) to generate a guided tour. Then, it employs the Monte-Carlo Tree Search (MCTS) algorithm to refine the tour and generate paths for the two vehicles, taking into account multiple physical constraints such as charging speed, total energy expenditure, travel time, and other operational requirements. We evaluate the performance of our algorithm through extensive simulations and a proof-of-concept experiment. The results show that our algorithm consistently achieves near-optimal mission time and maintains fast running time across a wide range of problem instances. [ISRR'24](https://arxiv.org/abs/2310.07729).
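The two-stage structure above (a TSP tour that guides an MCTS refinement) can be illustrated with the first stage alone. The nearest-neighbor heuristic below is a generic stand-in for the TSP step, not the authors' solver, and it ignores the charging and energy constraints the paper handles:

```python
import math

def nearest_neighbor_tour(points, start=0):
    """Greedy TSP heuristic: repeatedly visit the closest unvisited
    waypoint. The resulting order could serve as the 'guided tour'
    that a search stage (e.g., MCTS) then refines."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        cur = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(cur, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(points, tour):
    """Total Euclidean length of the closed tour (returns to start)."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))
```

A refinement stage would then search over modifications of this order (and over where the UGV recharges the UAV) to minimize mission duration.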

_projects/autodrive.md

Lines changed: 48 additions & 0 deletions

@@ -0,0 +1,48 @@
+---
+layout: page
+title: VLMs for Autonomous Driving
+description:
+img: assets/img/autodrive.jpg
+importance: 3
+category: current
+---
+
+#### *Project Lead: [Amirhosein Chahe](https://scholar.google.com/citations?user=MeK_1LUAAAAJ&hl=en)*
+
+## ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models
+
+<div class="row">
+  <div class="col-sm mt-3 mt-md-0">
+    {% include figure.html path="assets/img/reasondrive.jpg" title="example image" class="img-fluid rounded z-depth-1" %}
+  </div>
+</div>
+
+Vision-language models (VLMs) show promise for autonomous driving but often lack transparent reasoning capabilities that are critical for safety. We investigate whether explicitly modeling reasoning during fine-tuning enhances VLM performance on driving decision tasks. Using GPT-4o, we generate structured reasoning chains for driving scenarios from the DriveLM benchmark with category-specific prompting strategies. We compare reasoning-based fine-tuning, answer-only fine-tuning, and baseline instruction-tuned models across multiple small VLM families (Llama 3.2, Llava 1.5, and Qwen 2.5VL). Our results demonstrate that reasoning-based fine-tuning consistently outperforms alternatives, with Llama3.2-11B-reason achieving the highest performance. Models fine-tuned with reasoning show substantial improvements in accuracy and text generation quality, suggesting explicit reasoning enhances internal representations for driving decisions. These findings highlight the importance of transparent decision processes in safety-critical domains and offer a promising direction for developing more interpretable autonomous driving systems. [CVPR'25 WDFM-AD Workshop](https://openaccess.thecvf.com/content/CVPR2025W/WDFM-AD/html/Chahe_ReasonDrive_Efficient_Visual_Question_Answering_for_Autonomous_Vehicles_with_Reasoning-Enhanced_CVPRW_2025_paper.html) & [GitHub](https://github.com/Zhourobotics/ReasonDrive).
+
+## Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussians
+
+<div class="row">
+  <div class="col-sm mt-3 mt-md-0">
+    {% include figure.html path="assets/img/query3d.png" title="example image" class="img-fluid rounded z-depth-1" %}
+  </div>
+</div>
+
+This paper introduces a novel method for open-vocabulary 3D scene querying in autonomous driving by combining Language Embedded 3D Gaussians with Large Language Models (LLMs). We propose utilizing LLMs to generate both contextually canonical phrases and helping positive words for enhanced segmentation and scene interpretation. Our method leverages GPT-3.5 Turbo as an expert model to create a high-quality text dataset, which we then use to fine-tune smaller, more efficient LLMs for on-device deployment.
+Our comprehensive evaluation on the WayveScenes101 dataset demonstrates that LLM-guided segmentation significantly outperforms traditional approaches based on predefined canonical phrases. Notably, our fine-tuned smaller models achieve performance comparable to larger expert models while maintaining faster inference times. Through ablation studies, we discover that the effectiveness of helping positive words correlates with model scale, with larger models better equipped to leverage additional semantic information.
+This work represents a significant advancement towards more efficient, context-aware autonomous driving systems, effectively bridging 3D scene representation with high-level semantic querying while maintaining practical deployment considerations. [WACV'25 LLVM-AD Workshop](https://openaccess.thecvf.com/content/WACV2025W/LLVMAD/html/Chahe_Query3D_LLM-Powered_Open-Vocabulary_Scene_Segmentation_with_Language_Embedded_3D_Gaussians_WACVW_2025_paper.html) & [GitHub](https://github.com/Zhourobotics/Query-3DGS-LLM).
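The querying idea above, pairing a target phrase with LLM-generated canonical phrases and helping positive words, can be sketched abstractly. The phrase lists and the max-similarity decision rule below are invented simplifications; the paper's actual relevancy computation over language-embedded 3D Gaussians is more involved:

```python
# Toy sketch of open-vocabulary query assembly and a simplified
# relevancy test. Phrases and scoring rule are illustrative only.
def build_query(target, helping_words, canonical_phrases):
    """Assemble the positive and canonical (negative) phrase sets that
    a language-embedded scene is scored against."""
    return {"positives": [target] + helping_words,
            "canonicals": canonical_phrases}

def is_relevant(sims_to_positives, sims_to_canonicals):
    """Label a scene element relevant when its best similarity to any
    positive phrase beats its best similarity to every canonical."""
    return max(sims_to_positives) > max(sims_to_canonicals)
```

In the paper, an LLM (GPT-3.5 Turbo, or a fine-tuned smaller model on-device) proposes both phrase sets per query instead of relying on a fixed predefined list.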
+
+## Dynamic Adversarial Attacks on Autonomous Driving Systems
+
+<div class="row">
+  <div class="col-sm-7 mt-3 mt-md-0">
+    {% include figure.html path="assets/img/rssadversiral.png" title="example image" class="img-fluid rounded z-depth-1" style="height: 210px; object-fit: cover;" %}
+  </div>
+  <div class="col-sm-5 mt-3 mt-md-0">
+    <iframe width="100%" height="165" src="https://www.youtube.com/embed/Wh2sPYpWczQ" frameborder="0" allowfullscreen></iframe>
+  </div>
+</div>
+
+This paper introduces an attacking mechanism to challenge the resilience of autonomous driving systems. Specifically, we manipulate the decision-making processes of an autonomous vehicle by dynamically displaying adversarial patches on a screen mounted on another moving vehicle. These patches are optimized to deceive the object detection models into misclassifying targeted objects, e.g., traffic signs. Such manipulation has significant implications for critical multi-vehicle interactions such as intersection crossing, which are vital for safe and efficient autonomous driving systems.
+Particularly, we make four major contributions. First, we introduce a novel adversarial attack approach where the patch is not co-located with its target, enabling more versatile and stealthy attacks. Moreover, our method utilizes dynamic patches displayed on a screen, allowing for adaptive changes and movements, enhancing the flexibility and performance of the attack. To do so, we design a Screen Image Transformation Network (SIT-Net), which simulates environmental effects on the displayed images, narrowing the gap between simulated and real-world scenarios. Further, we integrate a positional loss term into the adversarial training process to increase the success rate of the dynamic attack. Finally, we shift the focus from merely attacking perceptual systems to influencing the decision-making algorithms of self-driving systems. Our experiments demonstrate the first successful implementation of such dynamic adversarial attacks in real-world autonomous driving scenarios, paving the way for advancements in the field of robust and secure autonomous driving. [RSS'24](https://www.roboticsproceedings.org/rss20/p076.pdf) & [GitHub](https://github.com/amirhoseinch/dynamicpatch).
