Merge remote-tracking branch 'vocodedev/main'

ArtisanLabs · Jun 20, 2024 · 4ef3813
2 parents 9e5a207 + b74c85b
Showing 257 changed files with 18,699 additions and 10,855 deletions.
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
@@ -15,8 +15,6 @@ jobs:
       fail-fast: false
       matrix:
         python-version:
-          - "3.8"
-          - "3.9"
           - "3.10"
           - "3.11"
         poetry-version:

diff --git a/Makefile b/Makefile
@@ -9,6 +9,12 @@ transcribe:
 synthesize:
 	poetry run python playground/streaming/synthesizer/synthesize.py
 
+turn_based_conversation:
+	poetry run python quickstarts/turn_based_conversation.py
+
+streaming_conversation:
+	poetry run python quickstarts/streaming_conversation.py
+
 PYTHON_FILES=.
 lint: PYTHON_FILES=vocode/ quickstarts/ playground/
 lint_diff typecheck_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d main | grep -E '\.py$$')

diff --git a/README.md b/README.md
@@ -2,10 +2,11 @@
 
 ![Hero](https://user-images.githubusercontent.com/6234599/228337850-e32bb01d-3701-47ef-a433-3221c9e0e56e.png)
 
-[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/vocodehq.svg?style=social&label=Follow%20%40vocodehq)](https://twitter.com/vocodehq) [![GitHub Repo stars](https://img.shields.io/github/stars/vocodedev/vocode-python?style=social)](https://github.com/vocodedev/vocode-python)
+[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/vocodehq.svg?style=social&label=Follow%20%40vocodehq)](https://twitter.com/vocodehq) [![GitHub Repo stars](https://img.shields.io/github/stars/vocodedev/vocode-core?style=social)](https://github.com/vocodedev/vocode-core)
+[![pypi](https://img.shields.io/pypi/v/vocode.svg)](https://pypi.python.org/pypi/vocode)
 [![Downloads](https://static.pepy.tech/badge/vocode/month)](https://pepy.tech/project/vocode)
 
-[Community](https://discord.gg/NaU4mMgcnC) | [Docs](https://docs.vocode.dev) | [Dashboard](https://app.vocode.dev)
+[Community](https://discord.gg/NaU4mMgcnC) | [Docs](https://docs.vocode.dev/open-source) | [Dashboard](https://app.vocode.dev)
 
 </div>
 
@@ -19,11 +20,11 @@ We're actively looking for community maintainers, so please reach out if interes
 
 # ⭐️ Features
 
-- 🗣 [Spin up a conversation with your system audio](https://docs.vocode.dev/python-quickstart)
-- ➡️ 📞 [Set up a phone number that responds with a LLM-based agent](https://docs.vocode.dev/telephony#inbound-calls)
-- 📞 ➡️ [Send out phone calls from your phone number managed by an LLM-based agent](https://docs.vocode.dev/telephony#outbound-calls)
-- 🧑‍💻 [Dial into a Zoom call](https://github.com/vocodedev/vocode-python/blob/main/vocode/streaming/telephony/hosted/zoom_dial_in.py)
-- 🤖 [Use an outbound call to a real phone number in a Langchain agent](https://docs.vocode.dev/langchain-agent)
+- 🗣 [Spin up a conversation with your system audio](https://docs.vocode.dev/open-source/python-quickstart)
+- ➡️ 📞 [Set up a phone number that responds with a LLM-based agent](https://docs.vocode.dev/open-source/telephony#inbound-calls)
+- 📞 ➡️ [Send out phone calls from your phone number managed by an LLM-based agent](https://docs.vocode.dev/telephony/open-source/#outbound-calls)
+- 🧑‍💻 [Dial into a Zoom call](https://github.com/vocodedev/vocode-core/blob/53b01dab0b59f71961ee83dbcaf3653a6935c2e3/vocode/streaming/telephony/conversation/zoom_dial_in.py)
+- 🤖 [Use an outbound call to a real phone number in a Langchain agent](https://docs.vocode.dev/open-source/langchain-agent)
 - Out of the box integrations with:
   - Transcription services, including:
     - [AssemblyAI](https://www.assemblyai.com/)
@@ -34,19 +35,16 @@ We're actively looking for community maintainers, so please reach out if interes
     - [RevAI](https://www.rev.ai/)
     - [Whisper](https://openai.com/blog/introducing-chatgpt-and-whisper-apis)
     - [Whisper.cpp](https://github.com/ggerganov/whisper.cpp)
-
   - LLMs, including:
-    - [ChatGPT](https://openai.com/blog/chatgpt)
-    - [GPT-4](https://platform.openai.com/docs/models/gpt-4)
+    - [OpenAI](https://platform.openai.com/docs/models)
     - [Anthropic](https://www.anthropic.com/)
-    - [GPT4All](https://github.com/nomic-ai/gpt4all)
   - Synthesis services, including:
     - [Rime.ai](https://rime.ai)
     - [Microsoft Azure](https://azure.microsoft.com/en-us/products/cognitive-services/text-to-speech/)
     - [Google Cloud](https://cloud.google.com/text-to-speech)
     - [Play.ht](https://play.ht)
     - [Eleven Labs](https://elevenlabs.io/)
-    - [Coqui](https://coqui.ai/)
+    - [Cartesia](https://cartesia.ai/)
     - [Coqui (OSS)](https://github.com/coqui-ai/TTS)
     - [gTTS](https://gtts.readthedocs.io/)
     - [StreamElements](https://streamelements.com/)
@@ -59,45 +57,63 @@ Check out our React SDK [here](https://github.com/vocodedev/vocode-react-sdk)!
 
 We're an open source project and are extremely open to contributors adding new features, integrations, and documentation! Please don't hesitate to reach out and get started building with us.
 
-For more information on contributing, see our [Contribution Guide](https://github.com/vocodedev/vocode-python/blob/main/contributing.md).
+For more information on contributing, see our [Contribution Guide](https://github.com/vocodedev/vocode-core/blob/main/contributing.md).
 
-And check out our [Roadmap](https://github.com/vocodedev/vocode-python/blob/main/roadmap.md).
+And check out our [Roadmap](https://github.com/vocodedev/vocode-core/blob/main/roadmap.md).
 
 We'd love to talk to you on [Discord](https://discord.gg/NaU4mMgcnC) about new ideas and contributing!
 
 # 🚀 Quickstart
 
 ```bash
-pip install 'vocode'
+pip install vocode
 ```
 
 ```python
 import asyncio
-import logging
 import signal
-from vocode.streaming.streaming_conversation import StreamingConversation
+
+from pydantic_settings import BaseSettings, SettingsConfigDict
+
 from vocode.helpers import create_streaming_microphone_input_and_speaker_output
-from vocode.streaming.transcriber import *
-from vocode.streaming.agent import *
-from vocode.streaming.synthesizer import *
-from vocode.streaming.models.transcriber import *
-from vocode.streaming.models.agent import *
-from vocode.streaming.models.synthesizer import *
+from vocode.logging import configure_pretty_logging
+from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
+from vocode.streaming.models.agent import ChatGPTAgentConfig
 from vocode.streaming.models.message import BaseMessage
-import vocode
-
-# these can also be set as environment variables
-vocode.setenv(
-    OPENAI_API_KEY="<your OpenAI key>",
-    DEEPGRAM_API_KEY="<your Deepgram key>",
-    AZURE_SPEECH_KEY="<your Azure key>",
-    AZURE_SPEECH_REGION="<your Azure region>",
+from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
+from vocode.streaming.models.transcriber import (
+    DeepgramTranscriberConfig,
+    PunctuationEndpointingConfig,
 )
+from vocode.streaming.streaming_conversation import StreamingConversation
+from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
+from vocode.streaming.transcriber.deepgram_transcriber import DeepgramTranscriber
 
+configure_pretty_logging()
 
-logging.basicConfig()
-logger = logging.getLogger(__name__)
-logger.setLevel(logging.DEBUG)
+
+class Settings(BaseSettings):
+    """
+    Settings for the streaming conversation quickstart.
+    These parameters can be configured with environment variables.
+    """
+
+    openai_api_key: str = "ENTER_YOUR_OPENAI_API_KEY_HERE"
+    azure_speech_key: str = "ENTER_YOUR_AZURE_KEY_HERE"
+    deepgram_api_key: str = "ENTER_YOUR_DEEPGRAM_API_KEY_HERE"
+
+    azure_speech_region: str = "eastus"
+
+    # This means a .env file can be used to overload these settings
+    # ex: "OPENAI_API_KEY=my_key" will set openai_api_key over the default above
+    model_config = SettingsConfigDict(
+        env_file=".env",
+        env_file_encoding="utf-8",
+        extra="ignore",
+    )
+
+
+settings = Settings()
 
 
 async def main():
@@ -106,8 +122,6 @@ async def main():
         speaker_output,
     ) = create_streaming_microphone_input_and_speaker_output(
         use_default_devices=False,
-        logger=logger,
-        use_blocking_speaker_output=True
     )
 
     conversation = StreamingConversation(
@@ -116,24 +130,25 @@ async def main():
             DeepgramTranscriberConfig.from_input_device(
                 microphone_input,
                 endpointing_config=PunctuationEndpointingConfig(),
-            )
+                api_key=settings.deepgram_api_key,
+            ),
         ),
         agent=ChatGPTAgent(
             ChatGPTAgentConfig(
+                openai_api_key=settings.openai_api_key,
                 initial_message=BaseMessage(text="What up"),
                 prompt_preamble="""The AI is having a pleasant conversation about life""",
             )
         ),
         synthesizer=AzureSynthesizer(
-            AzureSynthesizerConfig.from_output_device(speaker_output)
+            AzureSynthesizerConfig.from_output_device(speaker_output),
+            azure_speech_key=settings.azure_speech_key,
+            azure_speech_region=settings.azure_speech_region,
         ),
-        logger=logger,
     )
     await conversation.start()
     print("Conversation started, press Ctrl+C to end")
-    signal.signal(
-        signal.SIGINT, lambda _0, _1: asyncio.create_task(conversation.terminate())
-    )
+    signal.signal(signal.SIGINT, lambda _0, _1: asyncio.create_task(conversation.terminate()))
     while conversation.is_active():
         chunk = await microphone_input.get_audio()
         conversation.receive_audio(chunk)
@@ -145,8 +160,8 @@ if __name__ == "__main__":
 
 # 📞 Phone call quickstarts
 
-- [Telephony Server - Self-hosted](https://docs.vocode.dev/telephony)
+- [Telephony Server - Self-hosted](https://docs.vocode.dev/open-source/telephony)
 
 # 🌱 Documentation
 
-[docs.vocode.dev](https://docs.vocode.dev/)
+[docs.vocode.dev](https://docs.vocode.dev/open-source)
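
Review note on the README quickstart above: the new `Settings` class says its defaults "can be configured with environment variables". The snippet below is a minimal, stdlib-only sketch of that override behaviour. It is not the pydantic-settings implementation. `EnvSettings` and `load_settings` are hypothetical names introduced here for illustration, the sketch matches only upper-cased variable names, and it does not read a `.env` file.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class EnvSettings:
    """Hypothetical stand-in for the quickstart's Settings class."""

    openai_api_key: str = "ENTER_YOUR_OPENAI_API_KEY_HERE"
    azure_speech_region: str = "eastus"


def load_settings(environ=None) -> EnvSettings:
    """Build settings, letting upper-cased variable names override the defaults."""
    environ = os.environ if environ is None else environ
    overrides = {
        field.name: environ[field.name.upper()]
        for field in fields(EnvSettings)
        if field.name.upper() in environ
    }
    # Fields without a matching variable keep their class-level default.
    return EnvSettings(**overrides)


settings = load_settings({"OPENAI_API_KEY": "my_key"})
print(settings.openai_api_key)       # my_key
print(settings.azure_speech_region)  # eastus
```

Passing a plain dict instead of `os.environ` keeps the sketch testable without touching the real process environment.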
diff --git a/apps/client_backend/Dockerfile b/apps/client_backend/Dockerfile
@@ -1,4 +1,4 @@
-FROM python:3.9-bullseye
+FROM python:3.11-bullseye
 
 # get portaudio and ffmpeg
 RUN apt-get update \
@@ -15,4 +15,4 @@ RUN poetry config virtualenvs.create false
 RUN poetry install --no-dev --no-interaction --no-ansi
 COPY main.py /code/main.py
 
-CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "3000"]
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "3000"]
diff --git a/apps/client_backend/main.py b/apps/client_backend/main.py
@@ -1,23 +1,19 @@
-import logging
+from dotenv import load_dotenv
 from fastapi import FastAPI
 
-from vocode.streaming.models.agent import ChatGPTAgentConfig
-from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
-from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
-
+from vocode.logging import configure_pretty_logging
 from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
 from vocode.streaming.client_backend.conversation import ConversationRouter
+from vocode.streaming.models.agent import ChatGPTAgentConfig
 from vocode.streaming.models.message import BaseMessage
-
-from dotenv import load_dotenv
+from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
+from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
 
 load_dotenv()
 
 app = FastAPI(docs_url=None)
 
-logging.basicConfig()
-logger = logging.getLogger(__name__)
-logger.setLevel(logging.DEBUG)
+configure_pretty_logging()
 
 conversation_router = ConversationRouter(
     agent_thunk=lambda: ChatGPTAgent(
@@ -31,7 +27,6 @@
             output_audio_config, voice_name="en-US-SteffanNeural"
         )
     ),
-    logger=logger,
 )
 
 app.include_router(conversation_router.get_router())