Project Todo List

Tech Stack used

  • Autogen
  • Models:
    • LLM
      • GPT-3.5-turbo (OpenAI)
      • Claude-2.0 (Anthropic)
      • Gemma-2b-it (Google)
    • Embedding
      • mxbai-embed-large-v1 (Mixedbread AI)
    • CrossEmbedding
      • mxbai-rerank-base-v1 (Mixedbread AI)
    • Safety
      • Toxicity
        • bias-detection-model (d4data)
      • Bias
        • bias-detection-model (d4data)

Project: Coder Assistants

  • Code Assistant agent templates

    • Python Dev Team

      • Co-ordinating Manager
        • Technical Manager: Identifies which agent is required for each successive task.
      • Developer Team
        • Project Planner: Breaks the initial problem down into simpler chunks for the Code Writer to implement.
        • Code Writer: Writes the code.
        • QA Developer: Writes the tests that the code must pass.
      • Support Tools
        • Code Repo: Identifies existing code repositories that can be reused for development.
        • Code Testing Support: Runs the code and checks that it works as intended.
    • SQL Dev Team

      • Co-ordinating Manager
        • Technical Manager: Identifies which agent is required for each successive task.
      • Developer Team
        • Lead Analyst: Breaks complex issues down into smaller queries for the Senior Analyst and Junior Analyst to write.
        • Senior Analyst: Writes queries for complex tasks.
        • Junior Analyst: Writes queries for simpler tasks.
      • Support Tools
        • Database Admin: Helps the analyst team write queries by providing context about the tables and columns available for building them.
        • Query Testing Support: Runs the queries and checks that they work as intended.
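The two team templates above share one shape: a co-ordinating manager, a developer team, and support tools. A minimal sketch of that shape as plain Python data follows. The class and field names (`AgentTemplate`, `TeamTemplate`, `roles`) are illustrative assumptions, not part of Autogen or of this project's code; only the role names and responsibilities come from the list.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentTemplate:
    """One role in a team (hypothetical structure for illustration)."""
    name: str
    responsibility: str


@dataclass
class TeamTemplate:
    """A team: one manager, a developer team, and support tools."""
    name: str
    manager: AgentTemplate
    developers: list[AgentTemplate] = field(default_factory=list)
    support_tools: list[AgentTemplate] = field(default_factory=list)

    def roles(self) -> list[str]:
        """Return every role name, manager first."""
        members = [self.manager, *self.developers, *self.support_tools]
        return [member.name for member in members]


# Both teams use the same manager role.
TECHNICAL_MANAGER = AgentTemplate(
    "Technical Manager",
    "Identifies which agent is required for each successive task.",
)

PYTHON_DEV_TEAM = TeamTemplate(
    name="Python Dev Team",
    manager=TECHNICAL_MANAGER,
    developers=[
        AgentTemplate("Project Planner", "Breaks the problem into simpler chunks."),
        AgentTemplate("Code Writer", "Writes the code."),
        AgentTemplate("QA Developer", "Writes the tests the code must pass."),
    ],
    support_tools=[
        AgentTemplate("Code Repo", "Identifies existing repositories to reuse."),
        AgentTemplate("Code Testing Support", "Runs the code and checks it."),
    ],
)

SQL_DEV_TEAM = TeamTemplate(
    name="SQL Dev Team",
    manager=TECHNICAL_MANAGER,
    developers=[
        AgentTemplate("Lead Analyst", "Breaks complex issues into smaller queries."),
        AgentTemplate("Senior Analyst", "Writes queries for complex tasks."),
        AgentTemplate("Junior Analyst", "Writes queries for simpler tasks."),
    ],
    support_tools=[
        AgentTemplate("Database Admin", "Provides table and column context."),
        AgentTemplate("Query Testing Support", "Runs the queries and checks them."),
    ],
)

if __name__ == "__main__":
    for team in (PYTHON_DEV_TEAM, SQL_DEV_TEAM):
        print(team.name, "->", ", ".join(team.roles()))
```

Keeping the templates as plain data separates the role definitions from whichever agent framework instantiates them. How the Technical Manager actually selects the next agent is deliberately left out, since the list does not specify it.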
  • Code Migration

    • SAS to PySpark (TBD)

      • Create a SQL Code and Assets repository (borrows Database Admin from SQL Dev Team)
        • Tables and related metadata
          • Tables and their descriptions
          • Table columns and their descriptions (this closely resembles a data dictionary)
        • Relationships between tables
          • Join conditions from existing query repo
          • Key mappings (FK, CK, PK, etc.)
        • Pre-existing queries as baseline
          • Example queries for specific requests (especially requests that are closely related or similar)
      • Create a PySpark Code repository (TBD)
    • SAS to SQL (TBD)

    • Python (Pandas) to PySpark (TBD)
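The SQL Code and Assets repository outlined under SAS to PySpark has three parts: table metadata, relationships between tables, and baseline queries. One possible in-memory shape for it is sketched below. Every class, field, and method name is an assumption made for illustration, and the `customers` and `orders` tables are invented sample data, not part of the project.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    description: str


@dataclass
class Table:
    name: str
    description: str
    columns: list[Column] = field(default_factory=list)
    primary_key: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A join between two tables, e.g. recovered from an existing query."""
    left_table: str
    right_table: str
    join_condition: str


@dataclass
class BaselineQuery:
    """A pre-existing query kept as an example for similar requests."""
    request: str
    sql: str


@dataclass
class SqlAssetRepository:
    tables: dict[str, Table] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)
    baseline_queries: list[BaselineQuery] = field(default_factory=list)

    def add_table(self, table: Table) -> None:
        self.tables[table.name] = table

    def describe(self, table_name: str) -> str:
        """Return a data-dictionary style summary of one table."""
        table = self.tables[table_name]
        lines = [f"{table.name}: {table.description}"]
        lines += [f"  {col.name}: {col.description}" for col in table.columns]
        return "\n".join(lines)

    def joins_for(self, table_name: str) -> list[str]:
        """Return the join conditions that involve the given table."""
        return [
            rel.join_condition
            for rel in self.relationships
            if table_name in (rel.left_table, rel.right_table)
        ]


# Invented sample data, for illustration only.
repo = SqlAssetRepository()
repo.add_table(Table(
    name="customers",
    description="One row per customer.",
    columns=[Column("customer_id", "Unique customer identifier.")],
    primary_key=["customer_id"],
))
repo.add_table(Table(
    name="orders",
    description="One row per customer order.",
    columns=[
        Column("order_id", "Unique order identifier."),
        Column("customer_id", "Customer who placed the order."),
    ],
    primary_key=["order_id"],
))
repo.relationships.append(Relationship(
    "orders", "customers", "orders.customer_id = customers.customer_id",
))
repo.baseline_queries.append(BaselineQuery(
    request="Count orders per customer",
    sql="SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id",
))

if __name__ == "__main__":
    print(repo.describe("orders"))
    print(repo.joins_for("customers"))
```

A structure like this would let the Database Admin role answer questions such as which columns exist and how two tables join, by looking them up rather than relying on the model's memory.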

Learning for future updates