idiap · qanastek · Jul 23, 2025
diff --git a/.flake8 b/.flake8
@@ -2,5 +2,5 @@
 max-line-length = 120
 exclude =
     .venv,
-    ./dscaper
+    ./tutorials/dscaper
     ./git
diff --git a/.gitignore b/.gitignore
@@ -6,10 +6,27 @@ docs_html
 __*__
 
 dscaper_data/
-
-#audio files
 *.wav
 *.mp3
+*.png
 *.mp4
 *.flac
 *.ogg
+*.zip
+!tests/data/my_custom_voices.zip
+!tests/data/demo_dialog_doctor_patient.json
+!tests/data/customer_support_dialogue.json
+*.tar
+old.git
+tutorials/demo_dialog_doctor_patient.json
+tutorials/demo_dialog_doctor_patient_no_age_no_gender.json
+tutorials/customer_support_dialogue.json
+tutorials/=0.9.4
+tutorials/dscaper
+tutorials/dscaper_data
+tutorials/dscaper_data_customer_support
+tutorials/audio_outputs
+tutorials/audio_outputs_customer_support
+tutorials/my_custom_voices
+tutorials/room.png
+*audio_dialog.json
diff --git a/docs/about/changelog.rst b/docs/about/changelog.rst
@@ -0,0 +1,150 @@
+
+ChangeLog
+=========
+
+All notable changes to SDialog will be documented here.
+
+----
+
+[0.3.0] 2025-09-03 🚀
+---------------------
+
+Added
+^^^^^
+
+
+* **sdialog**\ : 
+
+  * ``Context``\ : new class to explicitly model the common/shared context of conversations (#73)
+  * ``Dialog``\ : merge functionality - Added option to merge consecutive turns of the same speaker when loading a dialog (#77)
+  * ``Dialog``\ : built-in string support - Added support to built-in str functions for ``Dialog`` class (#83)
+
+* **sdialog.agents**\ : Added new ``sdialog.agents`` module and moved ``Agent`` class inside (#81)
+
+  * ``Agent``\ : thinking capabilities - Agents can now handle internal thinking processes (#95)
+  * ``Agent``\ : tools support - Added tools capabilities to Agents (e.g. RAG or any other function) (#84)
+
+    * New tutorial for agents with tools and thoughts.
+
+* **sdialog.generators**\ : 
+
+  * ``ContextGenerator``\ : new class added to explicitly model the common/shared context of conversations (#73)
+  * ``Paraphraser``\ : new class to paraphrase dialogues (#76)
+
+* **sdialog.evaluation**\ : 
+
+  * ``LinguisticFeatureScore``\ : new class added to compute Flesch reading ease, Gunning fog, hesitation rate, and/or mean turn length (#63)
+
+* **sdialog.personas**\ : 
+
+  * ``Customer`` and ``SupportAgent``\ : new personas added for customer service dialogues (#85)
+  * ``Persona``\ : Added static method to get the list of all attributes in ``Persona`` class (#79)
+
+Changed
+^^^^^^^
+
+
+* **sdialog**\ : Improved metadata handling (#66)
+* **sdialog.interpretability**\ : Improved and simplified the way inspection targets are defined in ``interpretability`` submodule (#78)
+* **sdialog.evaluation.base**\ : 
+
+  * ``LLMJudgeYesNoOutput``\ : Renamed attribute ``yes`` to ``positive`` (#86)
+  * ``LLMJudgeScoreOutput``\ : Renamed attribute ``feedback`` to ``reason`` (#86)
+
+Fixed
+^^^^^
+
+
+* **sdialog.generators**\ : Fixed potential bug in ``PersonaDialogGenerator`` class (#67)
+
+Enhanced
+^^^^^^^^
+
+
+* **sdialog.agents**\ : Added ``base_model`` attribute to ``Agent`` to directly access the LLM's underlying model for mechanistic interpretability (#74)
+* **sdialog.config**\ : Added ``clear_cache()`` method to config (#75)
+
+Documentation
+^^^^^^^^^^^^^
+
+
+* API Documentation: Refactored/cleaned all components and added docstrings with examples (#82, #88)
+* Updated all tutorials to work with new code and added "Open in Colab" badges
+* Completed API documentation for initial official release (#87)
+* Automatic generation of ``llm.txt`` from API documentation (24f6ee6)
+
+----
+
+[0.1.0] 2025-08-05 🌱
+---------------------
+
+Added
+^^^^^
+
+
+* Multi-backend support (Hugging Face, Ollama, OpenAI, AWS)
+* Enhanced persona generation (beyond initial ``PersonaDialogGenerator``\ )
+* Interpretability module (\ ``sdialog.interpretability``\ ): inspectors, steerers, hooks, intruders
+* Evaluation module (\ ``sdialog.evaluation``\ ): metrics, LLM-as-a-judge scoring, evaluators, dataset comparators
+
+Changed
+^^^^^^^
+
+
+* Standardized / improved dialog format
+
+Notes
+^^^^^
+
+
+* 
+  ..
+
+     500 commits since 0.0.2 (post-JSALT 2025 consolidation)
+
+
+Pending
+^^^^^^^
+
+
+* Audio module (\ ``sdialog.audio``\ ) integration
+* Documentation updates
+
+----
+
+[0.0.2] 2025-06-03 🔧
+---------------------
+
+Added
+^^^^^
+
+
+* ``language`` attribute to ``Persona`` class
+* 
+  ``PersonaDialogGenerator`` to ``generators`` module to support persona-based dialogue generation with a single LLM:
+
+  .. code-block:: python
+
+     from sdialog.generators import PersonaDialogGenerator
+
+     dialog_generator = PersonaDialogGenerator(
+         model=MODEL_NAME,
+         persona_a=bob_persona,
+         persona_b=alice_persona,
+     )
+
+     dialog_generator.generate().print()
+
+Fixed
+^^^^^
+
+
+* Python 2 and 3 compatibility problem with scikit-learn (using version 0.20.1 from now on)
+* PyPI: setup.py: ``long_description_content_type`` set to ``'text/markdown'``
+
+----
+
+[0.0.1] 2025-05-22 🎉
+---------------------
+
+*(initial release)*
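Note on the 0.3.0 entry for ``Dialog`` (#77): the idea of merging consecutive turns of the same speaker can be illustrated independently of SDialog. The following is a minimal sketch, not the library's implementation. The function name ``merge_consecutive_turns``, the ``(speaker, text)`` tuple representation, and the ``separator`` parameter are all hypothetical and chosen only for illustration.

```python
from itertools import groupby


def merge_consecutive_turns(turns, separator="\n"):
    """Merge adjacent turns spoken by the same speaker.

    `turns` is a list of (speaker, text) tuples; this layout is an
    assumption for the sketch, not SDialog's actual data model.
    """
    merged = []
    # groupby only groups *adjacent* items with the same key, which is
    # exactly the "consecutive turns" behaviour described in the changelog.
    for speaker, group in groupby(turns, key=lambda turn: turn[0]):
        merged.append((speaker, separator.join(text for _, text in group)))
    return merged


turns = [
    ("Alice", "Hi."),
    ("Alice", "Are you there?"),
    ("Bob", "Yes."),
    ("Alice", "Great."),
]
merged = merge_consecutive_turns(turns)
```

Here the two opening turns by Alice collapse into one, while her later turn stays separate because Bob speaks in between.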
diff --git a/docs/about/contributing.rst b/docs/about/contributing.rst
@@ -0,0 +1,127 @@
+
+Contributing
+============
+
+Thanks for your interest in the project — you're awesome! 😎🎉
+
+Any kind of help is welcome (Code, Bug reports, Content, Data, Documentation, Design, Examples, Ideas, Feedback, etc.). Issues and Pull Requests are encouraged: from a tiny typo fix to a new feature. Help us make SDialog better 👍
+
+You can use the Edit button (pencil icon) on GitHub to quickly propose changes to any file via the web UI.
+
+We follow `Chris Beams' guidelines <https://chris.beams.io/posts/git-commit/>`_ for commit messages.
+
+Development installation
+------------------------
+
+.. code-block:: bash
+
+   git clone git@github.com:idiap/sdialog.git
+   cd sdialog
+   pip install -e .
+
+Running tests & style
+---------------------
+
+.. code-block:: bash
+
+   flake8 --ignore=W503 --max-line-length=120
+   pytest -v
+
+Coverage (HTML + terminal):
+
+.. code-block:: bash
+
+   pytest -v --cov=src/sdialog --cov-report=term-missing --cov-report=html
+   # Open htmlcov/index.html
+
+Manual documentation build
+--------------------------
+
+Generate HTML:
+
+.. code-block:: bash
+
+   cd docs
+   python -m sphinx -T -b html -d _build/doctrees -D language=en . ../docs_html
+
+Regenerate API reference (only needed if new submodules are added):
+
+.. code-block:: bash
+
+   cd docs
+   sphinx-apidoc -f --ext-autodoc -o api ../src/sdialog
+
+ReadTheDocs latest build list: https://app.readthedocs.org/projects/sdialog/
+
+Release (PyPI)
+--------------
+
+
+#. Update version in ``src/sdialog/util.py`` (follow semver)
+#. Update CHANGELOG (if present)
+#. Tag & push:
+
+   .. code-block:: bash
+
+      git commit -m "Release v0.x.x"
+      git tag v0.x.x
+      git push origin main --tags
+
+#. Build & upload:
+
+   .. code-block:: bash
+
+      python -m build
+      python -m twine upload dist/*
+
+Guidelines
+----------
+
+
+* Keep functions/classes small & composable
+* Add/extend tests for new features or bug fixes
+* Document public APIs (docstrings + docs reference where appropriate)
+* Prefer pure functions where state is not needed
+* Avoid introducing heavy deps without discussion (open issue first)
+* Use meaningful names; avoid abbreviations except standard ones (LLM, NLP, etc.)
+
+Adding tutorials / notebooks
+----------------------------
+
+Place new notebooks under ``tutorials/`` and keep naming numeric + descriptive (e.g., ``8.new_feature_example.ipynb``\ ). Ensure they run top-to-bottom in Colab. Use lightweight models or a small number of elements to keep runtime short.
+
+Opening an issue
+----------------
+
+Provide:
+
+
+* Summary
+* Steps to reproduce (if bug)
+* Expected vs actual
+* Environment (Python version, OS, backend model)
+* Minimal reproducible code snippet
+
+Pull request checklist
+----------------------
+
+
+* [ ] Feature / bug issue linked (if applicable)
+* [ ] Tests added or updated
+* [ ] Docs / examples updated
+* [ ] No lint errors
+* [ ] Local tests pass
+* [ ] Changelog updated (if user-facing change)
+
+Communication
+-------------
+
+Use GitHub Issues / Discussions for feature proposals. For larger changes, open a draft PR early for feedback.
+
+AI-assisted development
+-----------------------
+
+This project provides an `llm.txt file <https://sdialog.readthedocs.io/en/latest/llm.txt>`_ following the `llms.txt specification <https://llmstxt.org/>`_ for AI coding assistants. GitHub Copilot and other AI tools can fetch structured project information with: ``#fetch https://sdialog.readthedocs.io/en/latest/llm.txt``
+
+Thanks
+------
+
+Your contributions make the project better for everyone. 🙏
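Note on the release steps in the contributing guide, which ask to "follow semver" when updating the version: a minimal sketch of what a semantic-version bump looks like for a plain ``MAJOR.MINOR.PATCH`` string. The helper ``bump_version`` is hypothetical and is not part of SDialog or its release tooling. It also assumes no pre-release or build suffixes.

```python
def bump_version(version, part):
    """Return `version` with the given part incremented.

    Hypothetical helper for illustration only. Assumes a plain
    "MAJOR.MINOR.PATCH" string without pre-release or build metadata.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Incompatible changes: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward-compatible features: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backward-compatible bug fixes.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


next_minor = bump_version("0.1.0", "minor")
next_patch = bump_version("0.1.0", "patch")
```

Which part to bump depends on the kind of change in the release, so the choice is left to the caller.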