- Project Goal and Overview
- System Architecture
- Hardware Setup
- Directory Structure
- Installation
- Usage
- Future Improvements
- Acknowledgments
The Home Service Robot is a groundbreaking project aimed at assisting the elderly and individuals with limited mobility by autonomously performing daily tasks. This robot integrates cutting-edge multi-vision models, natural language processing (NLP), and language grounding techniques to ensure seamless human-robot interaction in home environments.
Key components include:
- Multi-Vision Models for environment recognition and object detection
- Natural Language Understanding (NLU) for command interpretation
- PDDL-based Planning for decision-making and task execution
- MoveIt for motion planning and manipulation with the 4-DOF robotic arm
- Human-Robot Interaction: Communicate with the robot using natural language through a messaging interface.
- Integrated Multi-Vision Models: Combines VQA, DOPE, and DINO models for rich environmental understanding.
- Language Grounding: Utilizes NLU and language models to convert language into actionable tasks.
- Autonomous Navigation: Leverages LiDAR and the ROS navigation stack for safe movement.
- Object Manipulation: Uses MoveIt and a 4-DOF robotic arm for complex object handling.
- RGBD Camera: Provides color (RGB) and depth information for object detection and human-robot interaction.
- 4DOF Arm: A 4-degree-of-freedom robotic arm that allows for object manipulation and grasping tasks.
- RPLidar: Used for precise environment mapping and obstacle avoidance, ensuring safe navigation.
- Mobile Base: The mobile platform that enables movement across the environment.
- Onboard Computer: Handles all processing tasks, running the robot's software stack, vision models, and navigation algorithms.
Home_Service_robot/
├── src/ # Source code directory
│ ├── Classification # Classification package
│ ├── detection # GroundingDINO package
│ ├── dope # Deep Pose estimation package
│ ├── navigation # Navigation package
│ ├── VQA # Visual question answering package
│ ├── LLM # Large Language Model package
│ ├── planning # PDDLStream package
│ ├── manipulation # Manipulation package
│ ├── Project_environment.yml # List of project dependencies for the conda environment
│ └── DOPE.yml # List of DOPE package dependencies for the conda environment
└── README.md # Project README
Ensure the following software is installed on your system:
- ROS (Robot Operating System): Required for robot control and navigation.
- Installation instructions: ROS Installation Guide
- Python 3.x: The project requires Python 3 for compatibility with various models and libraries.
- Anaconda: Recommended for managing Python environments.
- Installation instructions: Anaconda Installation Guide
Start by cloning this repository to your local machine:
git clone https://github.com/yourusername/Home_Service_robot.git
cd Home_Service_robot
catkin_make
The project uses multiple Conda environments to manage dependencies for different modules.
conda env create -f Project_environment.yml # This file includes dependencies for NLP, navigation, and general utilities.
conda env create -f DOPE.yml # This environment specifically handles the DOPE (Deep Object Pose Estimation) model dependencies.
- Turtlebot package: Required for robot control and navigation.
- Installation instructions: Turtlebot Installation Guide
- Place the package in the main source directory:
src/
- DOPE Model: Required for Object Pose Estimation.
- Installation instructions: DOPE Installation Guide
- Place the model in
src/dope/
- VQA Model: Required for Visual Question Answering.
- Installation instructions: VQA Installation Guide
- Place the model in
src/VQA/
- GroundingDINO Model: Required for Object Detection.
- Installation instructions: GroundingDINO Installation Guide
- Place the model in
src/detection/
- LLM Model: Required for human-robot language interaction.
- Installation instructions: LLM Installation Guide
- Place the model in
src/LLM/
- PDDLStream Package: Required for decision-making and task execution.
- Installation instructions: PDDLStream Installation Guide
- Place the package in
src/planning/
- Finally, build the workspace again from the main directory:
catkin_make
- LLM and Communication Node:
- This node enables communication and language processing for user commands (a sketch of the idea follows the commands):
conda activate Home_Service_robot
rosrun llm llm.py
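The repository's `llm.py` is not reproduced here; the following is only a minimal sketch of the relay pattern such a node could follow, assuming hypothetical `/user_message` and `/robot_reply` topics and a placeholder `generate_reply` function standing in for the actual language model.

```python
#!/usr/bin/env python3
# Minimal sketch of a chat-relay node (hypothetical topics; not the actual llm.py).
import rospy
from std_msgs.msg import String

def generate_reply(text):
    # Placeholder for the real language-model call used by the project.
    return "Received command: " + text

def on_user_message(msg, reply_pub):
    reply_pub.publish(String(data=generate_reply(msg.data)))

if __name__ == "__main__":
    rospy.init_node("llm_chat_relay")
    reply_pub = rospy.Publisher("/robot_reply", String, queue_size=10)
    rospy.Subscriber("/user_message", String, on_user_message, callback_args=reply_pub)
    rospy.spin()
```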
- NLU (Natural Language Understanding) Node:
- For command parsing and classification (see the sketch below):
conda activate Home_Service_robot
rosrun grounding rnn_classification.py
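As a rough illustration of what `rnn_classification.py` does, the sketch below maps a free-form command to an intent label. Simple keyword rules stand in for the project's RNN classifier, and the topic and intent names are assumptions.

```python
#!/usr/bin/env python3
# Sketch of an intent-classification node. Keyword rules stand in for the
# project's RNN classifier; topic and intent names are illustrative only.
import rospy
from std_msgs.msg import String

INTENTS = {
    "bring": "fetch_object",
    "go": "navigate",
    "find": "detect_object",
}

def classify(text):
    for keyword, intent in INTENTS.items():
        if keyword in text.lower():
            return intent
    return "unknown"

def on_command(msg, intent_pub):
    intent_pub.publish(String(data=classify(msg.data)))

if __name__ == "__main__":
    rospy.init_node("command_classifier")
    intent_pub = rospy.Publisher("/command_intent", String, queue_size=10)
    rospy.Subscriber("/user_command", String, on_command, callback_args=intent_pub)
    rospy.spin()
```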
- Planning Node:
- Use this node for planning tasks and goal management; a dispatch sketch follows:
conda activate Home_Service_robot
rosrun planning main_node.py
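The sketch below shows the dispatch side of planning only: a hand-written action sequence stands in for the plan that PDDLStream would actually compute, and all topic and action names are assumptions.

```python
#!/usr/bin/env python3
# Sketch of a task-dispatch node: a fixed action sequence stands in for a
# PDDLStream-generated plan; topics and action names are illustrative.
import rospy
from std_msgs.msg import String

# A toy "plan" for a fetch task: (action, argument) pairs executed in order.
FETCH_PLAN = [("navigate", "kitchen"), ("detect", "cup"), ("pick", "cup"),
              ("navigate", "living_room"), ("place", "cup")]

def on_goal(msg, action_pub):
    if msg.data == "fetch_object":
        for action, arg in FETCH_PLAN:
            action_pub.publish(String(data="%s %s" % (action, arg)))
            rospy.sleep(1.0)  # the real system would wait for action feedback instead

if __name__ == "__main__":
    rospy.init_node("task_dispatcher")
    action_pub = rospy.Publisher("/planned_action", String, queue_size=10)
    rospy.Subscriber("/command_intent", String, on_goal, callback_args=action_pub)
    rospy.spin()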
- Navigation Node:
- For robot navigation, launch the navigation stack (a goal-sending sketch follows the commands):
roslaunch <robot_bringup_package> <robot_bringup_launch>.launch
roslaunch <robot_navigation_package> <navigation_launch>.launch map_file:=<path_to_map>
roslaunch <rviz_launchers_package> <view_navigation_launch>.launch
rosrun navigation navigation_node.py
rosrun navigation move_motors_node.py
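Once the navigation stack is up, goals can be sent through the standard `move_base` action interface. The coordinates and frame below are example values, not taken from the project.

```python
#!/usr/bin/env python3
# Sketch: send a single goal to the ROS navigation stack via the standard
# move_base action interface. Coordinates and frame are illustrative.
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

if __name__ == "__main__":
    rospy.init_node("send_nav_goal")
    client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
    client.wait_for_server()

    goal = MoveBaseGoal()
    goal.target_pose.header.frame_id = "map"
    goal.target_pose.header.stamp = rospy.Time.now()
    goal.target_pose.pose.position.x = 1.0    # example target, in metres
    goal.target_pose.pose.position.y = 0.5
    goal.target_pose.pose.orientation.w = 1.0  # face forward

    client.send_goal(goal)
    client.wait_for_result()
    rospy.loginfo("Navigation finished with state %s", client.get_state())
```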
- Object Detection Node:
- Run this node for camera activation and object detection using DINO (see the sketch below):
conda activate Home_Service_robot
roslaunch <robot_camera_package> <robot_camera_launch>.launch
rosrun detection read_camera.py
rosrun detection dino.py
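The hand-off from camera to detector follows the usual cv_bridge pattern. In this sketch the GroundingDINO inference is abstracted behind a placeholder `detect()` call, and the image topic name is an assumption about the camera driver.

```python
#!/usr/bin/env python3
# Sketch of the camera-to-detector hand-off: subscribe to the camera image,
# convert it with cv_bridge, and pass it to a detector. detect() is a
# placeholder for the GroundingDINO inference used by dino.py.
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def detect(frame, prompt):
    # Placeholder: run GroundingDINO on `frame` with the given text prompt.
    return []

def on_image(msg):
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    boxes = detect(frame, "cup . bottle . remote")
    rospy.loginfo("Detected %d objects", len(boxes))

if __name__ == "__main__":
    rospy.init_node("dino_detection_sketch")
    rospy.Subscriber("/camera/color/image_raw", Image, on_image)  # topic name is an assumption
    rospy.spin()
```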
- Vision-Language Model (VQA):
- Use this node for visual question answering with the vision-language model (sketch below):
conda activate Home_Service_robot
rosrun vqa vqa_ros.py
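A VQA node of this kind typically caches the latest camera frame and answers incoming questions about it. The sketch below assumes hypothetical `/vqa/question` and `/vqa/answer` topics, and `answer_question()` is a placeholder for the project's vision-language model.

```python
#!/usr/bin/env python3
# Sketch of a VQA node: cache the latest camera frame and answer questions
# about it. answer_question() is a placeholder for the actual VQA model;
# topic names are illustrative.
import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import String
from cv_bridge import CvBridge

bridge = CvBridge()
latest_frame = None

def answer_question(frame, question):
    # Placeholder for the real vision-language model inference.
    return "unknown"

def on_image(msg):
    global latest_frame
    latest_frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")

def on_question(msg, answer_pub):
    if latest_frame is not None:
        answer_pub.publish(String(data=answer_question(latest_frame, msg.data)))

if __name__ == "__main__":
    rospy.init_node("vqa_sketch")
    answer_pub = rospy.Publisher("/vqa/answer", String, queue_size=10)
    rospy.Subscriber("/camera/color/image_raw", Image, on_image)
    rospy.Subscriber("/vqa/question", String, on_question, callback_args=answer_pub)
    rospy.spin()
```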
- Manipulation Setup:
- Set up the robotic arm and camera for manipulation tasks; an equivalent Python transform broadcaster is sketched below:
roslaunch <robot_arm_bringup_package> arm_with_group.launch
roslaunch <robot_arm_bringup_package> moveit_bringup.launch
roslaunch astra_camera astra.launch
rosrun tf static_transform_publisher x y z yaw pitch roll parent_frame camera_frame period_in_ms
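The `static_transform_publisher` step tells TF where the camera sits relative to the robot. The same transform can be broadcast from Python with tf2; the frame names and offsets below are example values only, not the project's calibration.

```python
#!/usr/bin/env python3
# Equivalent of the static_transform_publisher step, using tf2 in Python.
# Frame names and the camera offset below are example values only.
import rospy
import tf2_ros
from geometry_msgs.msg import TransformStamped

if __name__ == "__main__":
    rospy.init_node("camera_static_tf")
    broadcaster = tf2_ros.StaticTransformBroadcaster()

    t = TransformStamped()
    t.header.stamp = rospy.Time.now()
    t.header.frame_id = "base_link"    # parent frame (assumption)
    t.child_frame_id = "camera_link"   # camera frame (assumption)
    t.transform.translation.x = 0.10   # camera 10 cm ahead of the base
    t.transform.translation.z = 0.50   # and 50 cm up
    t.transform.rotation.w = 1.0       # no rotation

    broadcaster.sendTransform(t)
    rospy.spin()
```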
- DOPE for Object Pose Estimation:
- Activate the DOPE environment and launch it for pose estimation tasks (a pose-listener sketch follows):
conda activate dope
roslaunch dope dope.launch
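DOPE publishes estimated object poses as `geometry_msgs/PoseStamped` messages on per-object topics. The exact topic name depends on the objects configured in `dope.launch`; `/dope/pose_soup` below is only an assumed example of that naming.

```python
#!/usr/bin/env python3
# Sketch: consume a pose estimate published by DOPE. The topic name depends
# on the objects configured in dope.launch; "/dope/pose_soup" is an example.
import rospy
from geometry_msgs.msg import PoseStamped

def on_pose(msg):
    p = msg.pose.position
    rospy.loginfo("Object at x=%.2f y=%.2f z=%.2f (frame %s)",
                  p.x, p.y, p.z, msg.header.frame_id)

if __name__ == "__main__":
    rospy.init_node("dope_pose_listener")
    rospy.Subscriber("/dope/pose_soup", PoseStamped, on_pose)
    rospy.spin()
```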
- Manipulation Node:
- Run this node to enable robotic arm manipulation; a MoveIt sketch follows the command:
rosrun <robot_arm_demos_package> grasp.py
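The project's `grasp.py` is not shown here; the sketch below is a minimal MoveIt-based reach in the same spirit, assuming a planning group named "arm" and an example target pose.

```python
#!/usr/bin/env python3
# Sketch of a MoveIt-based reach, in the spirit of grasp.py. The planning
# group name ("arm") and the target pose are assumptions for illustration.
import sys
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

if __name__ == "__main__":
    moveit_commander.roscpp_initialize(sys.argv)
    rospy.init_node("simple_reach_sketch")
    arm = moveit_commander.MoveGroupCommander("arm")

    target = Pose()
    target.position.x = 0.20   # example pre-grasp position, in metres
    target.position.z = 0.15
    target.orientation.w = 1.0

    arm.set_pose_target(target)
    success = arm.go(wait=True)
    arm.stop()
    arm.clear_pose_targets()
    rospy.loginfo("Reach %s", "succeeded" if success else "failed")
```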
- Decision Trees for Improved Planning: We plan to incorporate decision trees to enhance task-planning capabilities, allowing the robot to make smarter choices when performing tasks.
- Additional Multi-Modal Models: Adding other vision and language models for improved perception and interaction.
- This project is inspired by the work of Sebastian Castro. His contributions and insightful blog post, 2020 Review: Service Robotics – MIT CSAIL, provided invaluable inspiration and guidance for our team.
- We extend our gratitude for his dedication to advancing the field of robotics, which has greatly influenced our approach and the development of this project.