Getting Started
contact: vivek.raghuram@berkeley.edu
Depending on your interests, you'll need to download or clone different repositories of the general ECG system. This page describes the main use cases, along with the repositories and dependencies each one requires.
- System Requirements
- Grammar Design
- Viewing ActSpecs
- Text-based Robot Demo
- Simulated Robot Demo
- Integrating to a New Product
- Installing and Running Speech Recognition
The system requirements below apply to every part of this page except Grammar Design. The system has been tested on OS X and Linux.
The following packages are required:
- Python (versions 2.7 and up should work)
- Pyre, a Python port of Zyre. Install it with:
pip install https://github.com/zeromq/pyre/archive/master.zip
- Jython, Python for the Java platform
- Six, a Python package for Python 2/3 compatibility
- Java
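As a quick sanity check on the list above, the following is a minimal sketch (not part of the ECG repositories) that reports which requirements appear to be installed. It assumes Python 3, that Pyre and Six import as `pyre` and `six`, and that Java and Jython expose `java` and `jython` launchers on your PATH; adjust the names if your installation differs.

```python
import importlib
import shutil


def check_requirements(modules, executables):
    """Return {name: True/False} for each Python module and command-line tool."""
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    for name in executables:
        # shutil.which returns None when the tool is not found on PATH.
        status[name] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    # Assumed names: module names "pyre" and "six", launchers "java" and "jython".
    report = check_requirements(["pyre", "six"], ["java", "jython"])
    for name, ok in sorted(report.items()):
        print("{:<8} {}".format(name, "found" if ok else "MISSING"))
```

Anything reported as MISSING should be installed before moving on to the sections below.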
You will also need the following repositories from GitHub:
- ECG Framework
- ecg_grammars
Note: the analyzer.sh script assumes this is installed in the same directory as ecg_framework_code.
Note: Once you clone the repositories, you'll also want to set your PYTHONPATH in your .bash_profile to point to:
{INSTALL_PATH}/ecg_framework_code/src/main
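To confirm that the PYTHONPATH setting took effect, you could run something like the sketch below in a new terminal. It is only an illustration: the `~/ecg` install location is hypothetical and stands in for your own {INSTALL_PATH}.

```python
import os


def on_pythonpath(directory, pythonpath):
    """Return True if `directory` is one of the entries in a PYTHONPATH string."""
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [e for e in pythonpath.split(os.pathsep) if e]
    return any(os.path.normpath(os.path.expanduser(e)) == target for e in entries)


if __name__ == "__main__":
    # Hypothetical install location; substitute your own {INSTALL_PATH}.
    install_path = os.path.expanduser("~/ecg")
    wanted = os.path.join(install_path, "ecg_framework_code", "src", "main")
    current = os.environ.get("PYTHONPATH", "")
    if on_pythonpath(wanted, current):
        print("PYTHONPATH includes " + wanted)
    else:
        print("PYTHONPATH is missing " + wanted)
```

Paths are normalized before comparison, so a trailing slash in your .bash_profile entry does not cause a false negative.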
If your primary interest is in viewing and modifying Embodied Construction Grammar (ECG) grammars, you may not need to download the rest of the "full-path" system. This is the functionality of earlier ECG releases. Instead, you'll need to clone two repositories: ecg_grammars and ecg_workbench_release.
The first is a repository of hand-built ECG grammars. These grammars are necessary for the ECG Analyzer to produce a Semantic Specification (SemSpec) of an input utterance. To clone this repository, navigate to the directory of your choice and enter the following command:
git clone https://github.com/icsi-berkeley/ecg_grammars.git
This will create a new folder called ecg_grammars on your machine, located in the directory in which you entered the command. If there are updates to the origin repository, you can retrieve them with the following commands:
cd ecg_grammars
git pull
The ECG Workbench is a tool for editing ECG grammars and visualizing SemSpecs (more info here). To clone this repository, navigate to the directory of your choice (ideally the same directory in which you cloned the ecg_grammars repository) and enter this command:
git clone https://github.com/icsi-berkeley/ecg_workbench_release.git
This will create a new folder called ecg_workbench_release. It's a large repository, so it may take longer to clone than the ecg_grammars repository. Once it's finished, you'll have access to the ECG Workbench. By default, this repository comes with three builds of the workbench, one for each of the following platforms:
- Linux
- Mac OS X
- Windows
In the ecg_workbench_release folder, open the workbench directory, then navigate into the folder corresponding to your machine and open the application.
For more information about using the ECG Workbench and viewing SemSpecs, check out additional documentation.
If your primary interest is in viewing the n-tuple data structure, which is the foundational communication paradigm for our natural language understanding (NLU) system, then you will need at least the ecg_grammars and ecg_framework_code repositories.
For the first, see above for information on cloning the ECG Grammars repository.
The ECG Framework repository contains code for the core modules of our NLU system. It requires the ECG Grammars repository to run, as well as other dependencies. Once you've cloned ecg_grammars, clone the ecg_framework_code repository in the same directory:
git clone https://github.com/icsi-berkeley/ecg_framework_code.git
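Because the two repositories are expected to sit side by side, it can help to plan the git commands from a single parent directory. The sketch below is illustrative only: it prints the command to run for each repository (clone if the folder is missing, pull if it already exists) and does not execute anything. The function name and structure are our own, not part of the ECG code.

```python
import os

REPOSITORIES = [
    "https://github.com/icsi-berkeley/ecg_grammars.git",
    "https://github.com/icsi-berkeley/ecg_framework_code.git",
]


def plan_git_commands(parent_dir, urls=REPOSITORIES):
    """Return (working_directory, command) pairs: clone if missing, pull if present."""
    plan = []
    for url in urls:
        name = url.rsplit("/", 1)[-1]
        if name.endswith(".git"):
            name = name[:-4]
        target = os.path.join(parent_dir, name)
        if os.path.isdir(target):
            # Repository already cloned: update it in place.
            plan.append((target, ["git", "pull"]))
        else:
            # Not cloned yet: clone into the shared parent directory.
            plan.append((parent_dir, ["git", "clone", url]))
    return plan


if __name__ == "__main__":
    for cwd, command in plan_git_commands(os.getcwd()):
        print("cd {} && {}".format(cwd, " ".join(command)))
```

To actually run the plan, each pair could be passed to `subprocess.check_call(command, cwd=cwd)`.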
See here for information on running the system; this section explicitly covers the scripts required to view JSON n-tuples.
As mentioned under System Requirements, you'll want to point your PYTHONPATH (in your .bash_profile) to the following. Note that bash does not allow spaces around the equals sign:
export PYTHONPATH={PATH_TO_FRAMEWORK}/src/main:$PYTHONPATH
For a general wiki on the framework, see here.
We have created a text-based demo that allows a user to type commands and questions into a Terminal prompt, and receive either answers or printed-out information about the robot's movements, e.g.:
> Robot1, move to the big red box!
FED1_ProblemSolver: robot1_instance is moving to (6.0, 6.0, 0.0)
If you're interested in using this, you'll need at least the following three repositories: ecg_grammars, ecg_framework_code, and ecg_robot_code.
The instructions here provide an overview for installing, setting up, and running the demo.
As with ecg_framework_code, you'll want to manually add the robot code to your PYTHONPATH:
export PYTHONPATH={PATH_TO_ROBOT_CODE}/src/main:$PYTHONPATH
NOTE: If you don't want to install each repository separately, we suggest you use the ecg_interface repository, which contains the others as submodules. Follow the instructions in the README for installation.
We have also used this NLU system for a simulated robot demo (video). The bulk of this work was done using the Morse robotics simulator, but we have also implemented a version in ROS.
For both simulations, you'll need the following repositories:
If you're interested in running the system with the Morse simulator, you'll want to follow the installation instructions here.
For information on how to install and run our ROS demo, check out these instructions. Note that these instructions include the installation of ecg_interface, which includes the ecg_framework_code, ecg_grammars, and ecg_robot_code directories (but does not include ecg_workbench_release).
Ultimately, it's possible that you are interested in adapting the NLU system to an entirely new application. Part of the fundamental motivation for this research is to facilitate relatively simple re-targeting of a language understanding system to new domains. We have done this between Morse and ROS (see above), which are both in the "robotics" domain, and are working on an implementation for Starcraft, which is both a new domain and a new application.
Click here for a tutorial on retargeting this system to a new application.
We've integrated the Kaldi open source speech recognition toolkit to allow you to speak commands to an autonomous system. It's in the preliminary stages, but we have some files available for experimentation with the Robots demo.
To begin, you will have to download speech recognition models. In the instructions below, replace the single instance of /path/to/directory with the path where you want to store the models. The models take about 600 MB.
asrdir=/path/to/directory
mkdir -p "$asrdir"
cd "$asrdir"
wget https://github.com/icsi-berkeley/ecg_asr/releases/download/alpha1/asr_models_20160627.tar.gz -O - | tar xvzf -
cp online_nnet2_decoding.conf online_nnet2_decoding.conf.orig
sed "s:/t/janin/ecg/asr/:$(pwd)/:" < online_nnet2_decoding.conf.orig > online_nnet2_decoding.conf
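The final sed command rewrites the configuration so that paths hard-coded for the original build machine point at your own model directory instead. For readers less familiar with sed, here is a rough Python equivalent of that one step. It is a sketch only: the sample configuration line is hypothetical, since the actual contents of online_nnet2_decoding.conf are not reproduced here.

```python
ORIGINAL_PREFIX = "/t/janin/ecg/asr/"  # path hard-coded in the released config


def rewrite_model_paths(config_text, model_dir):
    """Mimic the sed command: replace the first ORIGINAL_PREFIX on each line."""
    new_prefix = model_dir.rstrip("/") + "/"
    lines = config_text.splitlines(True)  # True keeps the line endings
    return "".join(line.replace(ORIGINAL_PREFIX, new_prefix, 1) for line in lines)


if __name__ == "__main__":
    # Hypothetical config line, for illustration only.
    sample = "--mfcc-config=/t/janin/ecg/asr/conf/mfcc.conf\n"
    print(rewrite_model_paths(sample, "/home/me/asr"), end="")
```

Like the sed expression (which has no `g` flag), this replaces only the first occurrence on each line.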
Because later steps refer to $asrdir, you may want to set it in your .bash_profile so that it is available in new terminal sessions.
Next, you'll need to install Kaldi. It is packaged to be fairly easy to install for an experienced Unix programmer, but it's quite big. You only need a single program, called online2-wav-nnet2-latgen-faster, so for your convenience we've generated precompiled binaries for a few platforms. Note that these will likely become out of date, and we do not intend to provide binaries for additional platforms. See http://kaldi-asr.org for how to install Kaldi yourself.
As of June 2016, we have three versions available: OSX_ELCAPITAN is for Macintosh OS X 10.11 "El Capitan"; LINUX_GLIBC_2.23 is for reasonably recent versions of Linux; LINUX_GLIBC_2.12 is for older versions of Linux.
In the commands below, replace OSX_ELCAPITAN with the version you want.
wget https://github.com/icsi-berkeley/ecg_asr/releases/download/alpha1/online2-wav-nnet2-latgen-faster_OSX_ELCAPITAN -O online2-wav-nnet2-latgen-faster
chmod uog+x online2-wav-nnet2-latgen-faster
export PATH=$asrdir:$PATH
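If you are unsure which of the three versions applies to your machine, a small script can make a first guess. The sketch below is an assumption-laden illustration: it treats the glibc numbers in the version names as minimum requirements and treats any Mac as an El Capitan candidate, neither of which is stated by the release itself. Treat its output as a suggestion only.

```python
import platform

BASE_URL = "https://github.com/icsi-berkeley/ecg_asr/releases/download/alpha1/"
PROGRAM = "online2-wav-nnet2-latgen-faster"


def pick_build(system, glibc_version=""):
    """Map an OS name and glibc version string to a build suffix, or None."""
    if system == "Darwin":
        return "OSX_ELCAPITAN"
    if system == "Linux":
        try:
            major, minor = (int(part) for part in glibc_version.split(".")[:2])
        except ValueError:
            return None  # glibc version unknown or unparsable
        # Assumption: the number in each name is the minimum glibc required.
        if (major, minor) >= (2, 23):
            return "LINUX_GLIBC_2.23"
        if (major, minor) >= (2, 12):
            return "LINUX_GLIBC_2.12"
    return None


if __name__ == "__main__":
    libc_name, libc_version = platform.libc_ver()
    build = pick_build(platform.system(), libc_version if libc_name == "glibc" else "")
    if build is None:
        print("No precompiled binary matches; see http://kaldi-asr.org to build Kaldi.")
    else:
        print(BASE_URL + PROGRAM + "_" + build)
```

The printed URL can be substituted into the wget command above.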
Finally, you need to have the program sox installed. It's an open source audio processing package available at http://sox.sourceforge.net. As usual for open source software, it's easiest to install it with a package manager (e.g. brew install sox or apt-get install sox).
At this point, you should be able to run speechagent.py, which comes packaged in the ecg_framework_code repository.
speechagent.py -asr $asrdir