Releases: Arize-ai/phoenix
v0.0.24
This release expands Phoenix's capabilities for cluster-based analysis, providing more metrics to help you assess the performance and data quality of your unstructured data.
✨ Cluster Performance Metrics
Clusters can now be analyzed for model performance degradation! This release adds accuracy_score as a model performance metric. Using accuracy as the base metric on the embedding projection lets you drill into clusters that map to bad predictions more quickly. Finding pockets of bad performance is as simple as picking the metric and sorting the clusters from worst to best performing. If you are using Phoenix to identify production data that should be re-labeled and fed back into your training pipeline, this is the feature for you.
cluster_performance.mp4
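To make the idea of ranking clusters by accuracy concrete, here is a minimal sketch in plain Python. It is not Phoenix's implementation: the function name `accuracy_by_cluster` and the toy data are assumptions made for illustration, and it simply computes the fraction of correct predictions per cluster and sorts the worst clusters first.

```python
from collections import defaultdict


def accuracy_by_cluster(cluster_ids, predictions, actuals):
    """Return (cluster_id, accuracy) pairs sorted from worst to best.

    Hypothetical helper for illustration only; not part of Phoenix.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for cid, pred, actual in zip(cluster_ids, predictions, actuals):
        total[cid] += 1
        correct[cid] += int(pred == actual)
    scores = {cid: correct[cid] / total[cid] for cid in total}
    # Ascending sort puts the lowest-accuracy clusters first.
    return sorted(scores.items(), key=lambda item: item[1])


# Toy data: one cluster assignment, prediction, and actual label per point.
cluster_ids = ["a", "a", "a", "b", "b", "c", "c", "c", "c"]
predictions = ["cat", "dog", "cat", "dog", "dog", "cat", "cat", "dog", "dog"]
actuals = ["cat", "cat", "dog", "dog", "dog", "cat", "dog", "dog", "dog"]

ranked = accuracy_by_cluster(cluster_ids, predictions, actuals)
for cid, acc in ranked:
    print(f"cluster {cid}: accuracy {acc:.2f}")
```

The points in the first few clusters of such a ranking are natural candidates for inspection or re-labeling.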
✨ Cluster Data Quality / Custom Metrics
Clusters can now be analyzed via ad-hoc metrics! You can now calculate the average of any numeric feature, tag, prediction, or actual sent into Phoenix. This means you can find "low-quality" clusters via the heuristic of your choosing! Below is an example of how precision@k for document retrieval (from a vector store) is used to identify clusters of chatbot queries that are failing to provide a good answer. The neat thing about this feature is that you can use Phoenix to build your own EDA heuristic! Care about ROUGE scores or LLM-assisted evaluations? You can now use these to analyze your embeddings and to discover anomalies by simply sorting your clusters! This feature gives you, the data scientist, a powerful tool to formulate bespoke heuristics for identifying clusters of low performance, quality, and/or drift. We hope you like it!
context_retrieval.mp4
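As a rough illustration of the custom-metric workflow, the sketch below computes precision@k per query, averages it per cluster, and sorts the clusters so the lowest averages come first. It is an assumption-laden example rather than Phoenix code: the helper names, the cluster labels, and the relevance lists are all hypothetical. Clusters with no metric values are placed last, mirroring the behavior listed in the changelog for clusters with empty metrics.

```python
from collections import defaultdict


def precision_at_k(relevance, k):
    """Fraction of the top-k retrieved documents that are relevant.

    `relevance` is a list of 0/1 flags in retrieval order (hypothetical format).
    """
    return sum(relevance[:k]) / k


def mean_metric_by_cluster(cluster_ids, values):
    """Average a numeric value per cluster, skipping missing (None) values.

    Returns (cluster_id, mean) pairs sorted ascending, with clusters that
    have no values (mean is None) placed at the bottom.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cid, value in zip(cluster_ids, values):
        if value is None:
            continue
        sums[cid] += value
        counts[cid] += 1
    means = {
        cid: (sums[cid] / counts[cid] if counts[cid] else None)
        for cid in dict.fromkeys(cluster_ids)
    }
    return sorted(
        means.items(),
        key=lambda item: (item[1] is None, item[1] if item[1] is not None else 0.0),
    )


# Toy data: (cluster, relevance flags of retrieved documents) per chatbot query.
queries = [
    ("billing", [1, 1, 0]),
    ("billing", [1, 0, 1]),
    ("shipping", [0, 0, 0]),
    ("shipping", [0, 1, 0]),
    ("returns", None),  # no retrieval recorded for this query
]

k = 3
clusters = [cid for cid, _ in queries]
scores = [precision_at_k(rel, k) if rel is not None else None for _, rel in queries]
ranked_quality = mean_metric_by_cluster(clusters, scores)
for cid, mean in ranked_quality:
    print(cid, "n/a" if mean is None else f"{mean:.2f}")
```

Any other per-row numeric score, such as a ROUGE score or an LLM-assisted evaluation result, could be substituted for precision@k in the same way.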
What's Changed
- docs: dolly vs. pythia by @axiomofjoy in #818
- feat: data quality metric by cluster by @RogerHYang in #804
- feat(dimensions): Add the ability to filter by data_type by @mikeldking in #822
- feat(embeddings): metric selector by @mikeldking in #821
- fix: nan bug for gql by @RogerHYang in #832
- feat: add stand-alone clusters endpoint for GraphQL query by @RogerHYang in #831
- feat(embeddings): cluster sorting by @mikeldking in #830
- chore: make placeholder text more obvious by @mikeldking in #833
- fix: change float16 to float32 as dtype for the nan series by @RogerHYang in #837
- fix: return nan on NotImplementedError (when binning on np.float16) by @RogerHYang in #838
- docs: sync 06-09-2023 by @mikeldking in #840
- feat(gql): add prediction id to event metadata by @RogerHYang in #843
- fix: coerce lists to arrays by @RogerHYang in #845
- feat: add performance metrics to each cluster by @RogerHYang in #828
- feat: accuracy timeseries by @RogerHYang in #842
- feat(embeddings): cluster data quality metrics by @mikeldking in #846
- docs: Update DEVELOPMENT.md with pypi publish changes. by @mikeldking in #849
- fix(embeddings): always place clusters with empty metrics at the bottom by @mikeldking in #850
- fix: show not found error when server is no longer running by @mikeldking in #853
- fix: guess whether a column contains any vector or all scalars by @RogerHYang in #854
- chore: camel-case metrics by @mikeldking in #856
- fix: skip empty interval bin with infinity endpoints (when all data are missing values) by @RogerHYang in #857
- feat(embeddings): cluster performance metrics by @mikeldking in #855
- fix(embeddings): force re-render clusters when opacity changes by @mikeldking in #858
- feat: show prediction id in selection details by @RogerHYang in #860
- fix: hide data quality metrics if empty by @mikeldking in #861
- fix: use random init when spectral init (the default) cannot be used by @RogerHYang in #862
- fix: replace NaT (Not a Time) with now (when dataset is empty) by @RogerHYang in #863
- fix(ui): cleanup event details for llm use-case by @mikeldking in #865
Full Changelog: 0.0.23...v0.0.24
0.0.23
❇️ HDBSCAN Tuning! ❇️
Dynamically adjust HDBSCAN parameters to get your clusters just right.
hdbscan_short.mp4
What's Changed
- ci: convert numpy scalars before graphql sees them by @RogerHYang in #788
- feat: add dimension filters to graphql model endpoint by @RogerHYang in #796
- ci: rename function by @RogerHYang in #797
- fix: numba.jit() deprecation warning by @pbadhe in #799
- feat: hdbscan tuning by @mikeldking in #798
- fix: single point selection selecting the wrong ID by @mikeldking in #803
- fix(embeddings): keep point-selection possible during move by @mikeldking in #805
- feat(embeddings): show the dataset in the table by @mikeldking in #812
Full Changelog: v0.0.22...0.0.23
v0.0.23rc1
What's Changed
- fix: single point selection selecting the wrong ID by @mikeldking in #803
Full Changelog: v0.0.23rc0...v0.0.23rc1
v0.0.23rc0
✨ HDBSCAN Tuning
hdbscan.mp4
What's Changed
- ci: convert numpy scalars before graphql sees them by @RogerHYang in #788
- feat: add dimension filters to graphql model endpoint by @RogerHYang in #796
- ci: rename function by @RogerHYang in #797
- fix: numba.jit() deprecation warning by @pbadhe in #799
- feat: hdbscan tuning by @mikeldking in #798
Full Changelog: v0.0.22...v0.0.23rc0
v0.0.22
Fixes UMAP retrieval. 0.0.21 and 0.0.20 have been yanked.
What's Changed
- fix: bug with np.floating by @RogerHYang in #787
Full Changelog: v0.0.21...v0.0.22
v0.0.21
This is a release to unblock conda
What's Changed
- ci: re-pin version for strawberry-graphql by @RogerHYang in #783
- feat: raise the max point size by @mikeldking in #786
Full Changelog: v0.0.20...v0.0.21
v0.0.20
✨ New Dimension Details for Tabular data!
Troubleshoot data quality and drift problems of your tabular features and tags!
v0.0.20.mp4
What's Changed
- docs: Update DEVELOPMENT.md with node/npm deps by @mikeldking in #707
- chore: main to docs sync 05-10-23 by @mikeldking in #712
- chore: add cla and automation by @axiomofjoy in #713
- chore: sync main into docs 05-11-2023 by @mikeldking in #717
- docs: taylor swift colab by @kryskirkland in #719
- chore: cleanup notebooks by @mikeldking in #720
- ci: enable autofix for ruff pre-commit hook by @RogerHYang in #723
- fix: add random sampling for umap by @RogerHYang in #724
- fix: retain filtered index at timestamp creation by @RogerHYang in #726
- feat: graphql endpoint for histograms by @RogerHYang in #722
- docs: update llm-observability.md by @eltociear in #690
- fix: buggy formula for drift ratio by @RogerHYang in #729
- feat: add quantile metrics to graphql endpoints for data quality by @RogerHYang in #728
- feat: chart stat asides by @mikeldking in #715
- feat(dimension): dimension segments bar chart by @mikeldking in #731
- docs: Create CODE_OF_CONDUCT.md by @mikeldking in #735
- feat(dimension): drift breakdown chart to isolate the difference in dimension segments distribution by @mikeldking in #736
- ci: upgrade mypy and strawberry by @RogerHYang in #738
- feat: support ISO8601 formatted strings in timestamp column by @pbadhe in #737
- docs: sync 5-18-2023 by @mikeldking in #739
- chore: "docs: sync 5-18-2023" by @mikeldking in #740
- docs: sync 5-18-2023 by @mikeldking in #741
- feat(dimension): Quantiles stream chart by @mikeldking in #749
- ci: allow python 3.11 by @RogerHYang in #762
- feat: graphql endpoint for hdbscan by @RogerHYang in #761
- ci: replace float(...) with numpy constants by @RogerHYang in #764
- ci: remove unused code by @RogerHYang in #765
- fix: missing validation (vectors should have same length between datasets) by @RogerHYang in #763
- fix: segment summary for single dataset by @RogerHYang in #769
- feat: add number formatting functions by @mikeldking in #766
- build: Cross-platform support npm scripts by @pbadhe in #772
- fix(dimension): UX fixes for dimension details by @mikeldking in #771
- fix: buggy categories (need to convert everything to string) by @RogerHYang in #767
- feat(dimension): show reference dataset data quality stats by @mikeldking in #774
- ci: use 2 bytes each instead of 8 for nan series by @RogerHYang in #773
New Contributors
- @kryskirkland made their first contribution in #719
- @eltociear made their first contribution in #690
- @pbadhe made their first contribution in #737
Full Changelog: v0.0.19...v0.0.20
v0.0.19
What's Changed
- fix: saving fixtures locally (for development) by @RogerHYang in #656
- docs: add docs badge to readme by @axiomofjoy in #659
- refactor: move to time axis by @mikeldking in #677
- feat(ui): time formatting on x axis by @mikeldking in #679
- ci: lift casting to float by @mikeldking in #681
- feat(gql): distinguish discrete vs continuous data by @mikeldking in #685
- docs: Update README.md w/ twitter links by @mikeldking in #686
- docs: anthropic evals exploration use-case by @mikeldking in #693
- ci: fix formatting by @RogerHYang in #695
- feat: add dataset role to graphql endpoints for data quality metrics by @RogerHYang in #694
- feat(gql): basic segment interface by @mikeldking in #696
- feat: dimension details route by @mikeldking in #665
- docs: sync docs to main 05-09-2023 by @mikeldking in #699
- feat(embeddings): raise dataset sample size max to 10000 by @mikeldking in #705
- fix(metrics): coerce timestamp indexes to be UTC for Ubuntu by @RogerHYang in #711
Full Changelog: v0.0.18...v0.0.19
v0.0.18rc1
What's Changed
- fix: saving fixtures locally (for development) by @RogerHYang in #656
- docs: add docs badge to readme by @axiomofjoy in #659
- refactor: move to time axis by @mikeldking in #677
- feat(ui): time formatting on x axis by @mikeldking in #679
- ci: lift casting to float by @mikeldking in #681
- feat(gql): distinguish discrete vs continuous data by @mikeldking in #685
- docs: Update README.md w/ twitter links by @mikeldking in #686
- docs: anthropic evals exploration use-case by @mikeldking in #693
- ci: fix formatting by @RogerHYang in #695
- feat: add dataset role to graphql endpoints for data quality metrics by @RogerHYang in #694
- feat(gql): basic segment interface by @mikeldking in #696
- feat: dimension details route by @mikeldking in #665
- docs: sync docs to main 05-09-2023 by @mikeldking in #699
- feat(embeddings): raise dataset sample size max to 10000 by @mikeldking in #705
Full Changelog: v0.0.18...v0.0.18rc1
v0.0.18
What's Changed
- chore: add conda-forge badge by @mikeldking in #648
- chore: fix phoenix URL by @mikeldking in #649
- docs: llm notebooks by @axiomofjoy in #651
- feat: show cluster count in tab header by @mikeldking in #653
- feat: show human-friendly names for datasets in the UI by @mikeldking in #654
Full Changelog: v0.0.17...v0.0.18