v. 2.0.0
========
Support for arbitrary 'database' and 'admin' headers throughout the object chain
Reworked the options system:
- an APIOptions object inherited at each "spawn" operation, with overrides
- environment-dependent defaults if nothing supplied
- "spawn" operations expose top-level aliases for selected settings
- per-method overrides. For timeouts, `max_time_ms` becomes an alias for more specific timeout types
- collection and database's `info` method gets timeout parameters
- (not user-facing) classes in the hierarchy other than DataAPIClient have breaking changes in their constructor (now options-first and keyword-arg-only)
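The inheritance-with-overrides idea in the options rework can be pictured with a minimal, stdlib-only sketch. `SketchOptions` and `SketchNode` are hypothetical stand-ins invented for this illustration; they are not astrapy's `APIOptions` or any of its real classes, and the setting names are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class SketchOptions:
    """Hypothetical stand-in for an options object (not astrapy's APIOptions)."""

    keyspace: Optional[str] = None
    request_timeout_ms: Optional[int] = None

    def with_overrides(self, **overrides: Any) -> SketchOptions:
        # Settings passed as None are ignored, so the inherited value is kept.
        supplied = {key: value for key, value in overrides.items() if value is not None}
        return replace(self, **supplied)


class SketchNode:
    """Hypothetical object in the chain (think client, database, collection)."""

    def __init__(self, options: SketchOptions) -> None:
        self.options = options

    def spawn(
        self,
        *,
        keyspace: Optional[str] = None,
        request_timeout_ms: Optional[int] = None,
    ) -> SketchNode:
        # The child starts from the parent's options and applies its overrides.
        child_options = self.options.with_overrides(
            keyspace=keyspace,
            request_timeout_ms=request_timeout_ms,
        )
        return SketchNode(child_options)


client = SketchNode(SketchOptions(request_timeout_ms=10_000))
database = client.spawn(keyspace="default_keyspace")
collection = database.spawn(request_timeout_ms=30_000)
```

In this sketch the parent is never mutated: each spawn produces a new frozen options object, which is one plausible way to keep overrides local to the spawned object.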
Major restructuring of the codebase into directories
- BaseCursor not exported by astrapy.cursors anymore
- some imports (supposedly not used externally) changed slightly
- adapted and limited scope of test_imports.py
Removed AstraDBDatabaseAdmin's `from_api_endpoint` static method (reason: unused)
Renamed main branch from 'master' to `main`
Cursors' state described by the `CursorState` enum
Removal of several deprecated modules/features from previous versions:
- 'core' (i.e. pre-1.0) library
- 'collection.bulk_write' and the associated result and exception classes
- 'vector=', 'vectorize=' and 'vectors=' parameters from collection methods
- 'retrieved' (=> `consumed`) and 'collection' (=> `data_source`) for cursors
- 'set_caller' method of `DataAPIClient`, `AstraDBAdmin`, `DataAPIDatabaseAdmin`, `AstraDBDatabaseAdmin`, `[Async]Database`, `[Async]Collection`
- 'caller_name' and 'caller_version' parameters where `callers` is now expected
- 'id' and 'region' parameters of DataAPIClient's 'get_database' (and its async version). Use `api_endpoint`, which is now the only positional parameter.
- 'region' parameter of AstraDBDatabaseAdmin.get[_async]_database (was ignored already in the method)
- 'max_time_ms' parameter in AstraDBDatabaseAdmin's constructor and its `from_astra_db_admin` method (was already ignored in the body)
- The 'max_time_ms' parameter in DataAPIClient's 'get_database' method.
- Accordingly, the syntax `client[api_endpoint]` also does not accept a database ID anymore.
- 'namespace' parameter of several methods of: DataAPIClient, admin objects, Database and Collection (use `keyspace`)
- 'namespace' property of CollectionInfo, DatabaseInfo, CollectionNotFoundException, CollectionAlreadyExistsException (use `keyspace`)
- 'namespace' property of Databases and Collections (switch to `keyspace`)
- 'update_db_namespace' parameter for admin keyspace CRD methods (use `update_db_keyspace`)
- 'use_namespace' for Databases (switch to `use_keyspace`)
- 'delete_all' method of Collection and AsyncCollection (use `delete_many({})`)
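For code bases that still pass the removed keyword names, a small adapter can help during migration. The helper below is a hypothetical sketch, not part of astrapy; it covers only the renames spelled out in the list above (`namespace`, `update_db_namespace`, and the `caller_name`/`caller_version` pair).

```python
from typing import Any, Dict

# Old keyword name -> replacement, as stated in the removals above.
RENAMED_KWARGS = {
    "namespace": "keyspace",
    "update_db_namespace": "update_db_keyspace",
}


def migrate_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `kwargs` using the 2.0 spellings (hypothetical helper)."""
    migrated: Dict[str, Any] = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in migrated:
            raise ValueError(f"both old and new spelling supplied for {new_name!r}")
        migrated[new_name] = value
    # caller_name / caller_version collapse into a one-item `callers` pair list.
    if "caller_name" in migrated or "caller_version" in migrated:
        if "callers" in migrated:
            raise ValueError("pass either callers or caller_name/caller_version")
        pair = (migrated.pop("caller_name", None), migrated.pop("caller_version", None))
        migrated["callers"] = [pair]
    return migrated
```

Raising on conflicting spellings, rather than silently preferring one, is a design choice of this sketch.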
Removal of unused imports from toplevel __init__.py (ids, constants, cursors)
v. 1.5.2
========
Bugfix: `Database.get_collection` uses callers inheritance (same for async)
v. 1.5.1
========
Switching to endpoint as the only/primary way of specifying databases:
- AstraDBClient tolerates (deprecated, removal in 2.0) id[/region] in get_database
- (internal-use constructors and utilities only accept API Endpoint)
- AstraDBAdmin is the only place where id[/region] will remain an allowed path in 2.0
- all tests adapted to reflect this simplification
Admins: resilience against DevOps responses omitting 'keyspace'/'keyspaces'
AstraDBAdmin: added filters and automatic pagination to [async_]list_databases
Consistent handling of deletedCount=-1 from the API (always returned as-is)
Cursors: alignment and rework
- states are an enum; state names reworked for clarity (better cursor `__repr__`)
- _copy and _to_sync methods always return a clean pristine cursor
- "retrieved" property deprecated (removal 2.0). Use `consumed`.
- "collection" property deprecated (removal 2.0). Use `data_source`.
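As an illustration of a cursor whose state is an enum and which exposes a `consumed` counter, here is a stdlib-only sketch. The class names and the three state names are assumptions made for this example; the changelog does not list the actual state names.

```python
from enum import Enum
from typing import Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")


class SketchCursorState(Enum):
    # Hypothetical state names, chosen for the sketch only.
    IDLE = "idle"
    STARTED = "started"
    CLOSED = "closed"


class SketchCursor(Generic[T]):
    """Hypothetical cursor over an in-memory iterable."""

    def __init__(self, items: Iterable[T]) -> None:
        self._iterator: Iterator[T] = iter(items)
        self.state = SketchCursorState.IDLE
        self.consumed = 0  # number of items handed out so far

    def __iter__(self) -> "SketchCursor[T]":
        return self

    def __next__(self) -> T:
        if self.state is SketchCursorState.CLOSED:
            raise StopIteration
        self.state = SketchCursorState.STARTED
        try:
            item = next(self._iterator)
        except StopIteration:
            # Exhaustion closes the cursor for good.
            self.state = SketchCursorState.CLOSED
            raise
        self.consumed += 1
        return item

    def __repr__(self) -> str:
        return f"SketchCursor(state={self.state.value}, consumed={self.consumed})"
```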
Deprecation of all `set_caller` (=> to be set at constructor-time) (removal in 2.0)
Callers and user-agent string:
- remove RAGStack automatic detection
- Deprecate caller_name/caller_version parameters in favour of "callers" pair list
- (minor) breaking change: passing only one of caller_name/caller_version to _copy/with_options will override the whole one-item callers pair list
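To show how a "callers" pair list might end up in a user-agent string, here is a short sketch. The function name is hypothetical and the `name/version` token format follows the common HTTP user-agent convention; the changelog does not state the exact format astrapy produces.

```python
from typing import List, Optional, Sequence, Tuple

# A caller is a (name, version) pair; either element may be missing.
CallerPair = Tuple[Optional[str], Optional[str]]


def compose_user_agent(callers: Sequence[CallerPair], library: CallerPair) -> str:
    """Join caller pairs, then the library's own pair, into one string."""
    tokens: List[str] = []
    for name, version in [*callers, library]:
        if name is None:
            # A pair without a name contributes nothing.
            continue
        tokens.append(name if version is None else f"{name}/{version}")
    return " ".join(tokens)
```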
Repo housekeeping
- using ruff for imports and formatting (instead of isort+black) by @cbornet
- add ruff rules UP(pyupgrade) by @cbornet
- remove `cassio` unused dependency
v. 1.5.0
========
Deprecation of "namespace-" terminology, replaced by "keyspace-" (removal in 2.0)
- deprecation of all *namespace* method names
- deprecation of the `namespace=` named argument to all methods
- deprecation of the `update_db_namespace` parameter to create_*space
Deprecation of collection bulk_write method (removal in 2.0)
APICommander logs warnings received from the Data API
Full removal of "core library" from the current API:
- DevOps API accessed through APICommander everywhere
- Admin objects use APICommander consistently
- [Async]Database and [Async]Collection directly use APICommander
- Cursor library uses APICommander directly
- Core library imports trigger a submodule-wide deprecation warning
- (simplification of the vector/vectorize deprecator utility)
Widened exception hierarchy with:
- DevOpsAPIHttpException
- DevOpsAPITimeoutException
- DevOpsAPIFaultyResponseException
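A wider hierarchy lets calling code choose how specific its error handling is. The sketch below defines local stand-in classes that reuse the three names above; the shared base class and the `describe_failure` helper are assumptions of this example, and none of this is astrapy's actual implementation.

```python
class DevOpsAPIException(Exception):
    """Assumed common base for the stand-ins below."""


class DevOpsAPIHttpException(DevOpsAPIException):
    """Stand-in: the request got an HTTP error status."""


class DevOpsAPITimeoutException(DevOpsAPIException):
    """Stand-in: the request timed out."""


class DevOpsAPIFaultyResponseException(DevOpsAPIException):
    """Stand-in: the response body was not in the expected shape."""


def describe_failure(exc: Exception) -> str:
    # Most specific checks first, then the generic base, then everything else.
    if isinstance(exc, DevOpsAPITimeoutException):
        return "retry later"
    if isinstance(exc, DevOpsAPIException):
        return "devops api problem"
    return "unrelated error"
```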
Rearrangement into separate modules for:
- constants, strings, magic numbers and settings
- request low-level tools
- payload/response transformations
- (sometimes with temporary duplication to avoid depending on 'core')
Testing:
- testing on HCD targets Data API v 1.0.16
- added tests for APICommander
- improved tests for admin classes
Logging of API requests made more uniform and easier to read
Replaced collections.abc.Iterator => typing.Iterator for Python 3.8 compatibility
v. 1.4.2
========
Method 'update_one' of [Async]Collection: now invokes the corresponding API command.
Better URL-parsing error messages for the API endpoint (with guidance on expected format)
Improved __repr__ for: token/auth-related items, Database/Client classes, response+info objects
DataAPIErrorDescriptor can parse 'extended error' information in the responses
Introduced DataAPIHttpException (subclass of both httpx.HTTPStatusError and DataAPIException)
Testing on HCD:
- DockerCompose tweaked to invoke `docker compose`
- HCD 1.0.0 and Data API 1.0.15 as test targets
Relaxed dependency on "uuid6" to most recent releases
Core:
- prefetched find iterators: fix second-thread hangups in some cases (by @cbornet)
- added 'options' parameter to [Async]AstraDBCollection.update_one
v. 1.4.1
========
FindEmbeddingProvidersResult and descendant dataclasses:
- add handling of optional 'hint' and 'displayName' fields for parameters
- knowledge of optional-as-null vs optional-as-possibly-absent ancillary fields
Replace bson dependency with pymongo (#297, by @caseyclements)
v. 1.4.0
========
DatabaseAdmin classes retain a reference to the [Async]Database instance that spawned them, if any
- introduced a spawner_database parameter to database admin constructors
- database admin can retroactively set the db's working namespace upon creating it
- Idiom `database = client.get_database(...); database.get_database_admin().create_namespace("the_namespace", update_db_namespace=True)`
Database (and AsyncDatabase) classes admit null namespace:
- default to "default_keyspace" only for Astra, otherwise null
- as long as null, most operations are unavailable and error out
- a `use_namespace` method to (mutably) set the working namespace on a database instance
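The null-namespace behaviour described above can be sketched as follows. `SketchDatabase`, its environment strings and its error type are hypothetical; only the `use_namespace` name and the "default_keyspace" default for Astra come from the entries above.

```python
from typing import List, Optional


class SketchDatabase:
    """Hypothetical stand-in showing a database with a nullable namespace."""

    def __init__(self, environment: str, namespace: Optional[str] = None) -> None:
        # Only the Astra environment gets a default; elsewhere it stays None.
        if namespace is None and environment == "astra":
            namespace = "default_keyspace"
        self.namespace = namespace

    def use_namespace(self, namespace: str) -> None:
        # Mutates this instance instead of returning a copy.
        self.namespace = namespace

    def list_collection_names(self) -> List[str]:
        if self.namespace is None:
            raise ValueError("no working namespace set; call use_namespace first")
        return []  # placeholder result: the sketch performs no API call
```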
AstraDBDatabaseAdmin class is fully region-aware:
- can be instantiated with an endpoint (also `id` parameter aliased to `api_endpoint`)
- requires a region to be specified with an ID, unless auto-guess can be done
VectorizeOps: support for find_embedding_providers Database method
Support for multiple-header embedding api keys:
- `EmbeddingHeadersProvider` classes for `embedding_api_key` parameter
- AWS header provider in addition to the regular one-header one
- adapt CI to cover this setup
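The provider abstraction for embedding credentials can be pictured with a small sketch: one provider yields a single header, another yields two. All class names here are hypothetical, and the header names are assumptions used for illustration rather than values confirmed by this changelog.

```python
from abc import ABC, abstractmethod
from typing import Dict


class SketchHeadersProvider(ABC):
    """Hypothetical base: anything able to produce authentication headers."""

    @abstractmethod
    def get_headers(self) -> Dict[str, str]:
        """Return the HTTP headers to attach to each request."""


class SketchSingleKeyProvider(SketchHeadersProvider):
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def get_headers(self) -> Dict[str, str]:
        # Header name assumed for the sketch.
        return {"x-embedding-api-key": self.api_key}


class SketchTwoKeyProvider(SketchHeadersProvider):
    def __init__(self, access_id: str, secret_id: str) -> None:
        self.access_id = access_id
        self.secret_id = secret_id

    def get_headers(self) -> Dict[str, str]:
        # Header names assumed for the sketch.
        return {
            "x-embedding-access-id": self.access_id,
            "x-embedding-secret-id": self.secret_id,
        }


def build_request_headers(
    base: Dict[str, str], provider: SketchHeadersProvider
) -> Dict[str, str]:
    # Provider headers are merged over the base headers without mutating them.
    return {**base, **provider.get_headers()}
```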
Testing:
- restructure CI to fully support HCD alongside Astra DB
- add details for testing new embedding providers
v. 1.3.1
========
Fixed bug in parsing endpoint domain names containing hyphens (#287), by @bradfordcp
Added isort for source code formatting
Updated abstractions diagram in README for non-Astra environments
v. 1.3.0
========
Integration testing covers Astra and non-Astra smoothly:
- idiomatic library
- vectorize_idiomatic
- non-Astra admin, i.e. namespace CRUD
Add the TokenProvider abstract class, along with StaticTokenProvider and UsernamePasswordTokenProvider
Introduce CHANGES file.
Add __eq__ and _copy methods to APICommander class
Allow delete_many({}) with empty filter
Implement include_sort_vector option to Collection.find and get_sort_vector to cursors
Add Content-Type header to all API requests
Added HCD and CASSANDRA Environment values (besides the other non-Astra DSE and OTHER)
Clearer string repr of cursors ('retrieved' => 'yielded so far')
Deprecation of collection delete_all method in favour of delete_many(filter={})
Introduction of a custom deprecation decorator for async method removal tests
Deprecation of vector, vectors and vectorize params from collections and Operations
Remove several long-deprecated methods from **core API** (i.e. internal changes):
- AstraDBCollection.delete => delete_one
- AstraDBCollection.upsert => upsert_one
- AsyncAstraDBCollection.upsert => upsert_one
- AstraDB.truncate_collection => AstraDBCollection.clear
- AsyncAstraDB.truncate_collection => AsyncAstraDBCollection.clear
Add support for null tokens in the core library
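The token-provider classes introduced in this release can be pictured with the sketch below. The `Sketch*` classes are hypothetical stand-ins, and the "Cassandra:" plus base64 encoding of username and password is an assumption of this example, not something this changelog specifies.

```python
import base64
from abc import ABC, abstractmethod
from typing import Optional


class SketchTokenProvider(ABC):
    """Hypothetical base: something able to produce a token on demand."""

    @abstractmethod
    def get_token(self) -> Optional[str]:
        """Return the token, or None when no authentication is configured."""


class SketchStaticTokenProvider(SketchTokenProvider):
    def __init__(self, token: Optional[str]) -> None:
        self.token = token

    def get_token(self) -> Optional[str]:
        return self.token


class SketchUsernamePasswordTokenProvider(SketchTokenProvider):
    def __init__(self, username: str, password: str) -> None:
        self.username = username
        self.password = password

    @staticmethod
    def _b64(text: str) -> str:
        return base64.b64encode(text.encode("utf-8")).decode("ascii")

    def get_token(self) -> str:
        # Encoding scheme assumed for the sketch.
        return f"Cassandra:{self._b64(self.username)}:{self._b64(self.password)}"
```

A null token, as mentioned in the last entry of this release, maps to `SketchStaticTokenProvider(None)` in this sketch.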
v. 1.2.1
========
Raise default chunk size for insert_many to 50
Improvements in docstrings, testing, support for latest responses from vectorize.
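To make the chunk-size entry concrete, here is a sketch of splitting a document list into chunks of at most 50 items. The helper name is hypothetical and this is not the library's own chunking code.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

DEFAULT_CHUNK_SIZE = 50  # the default named in the entry above


def chunked(
    documents: Sequence[T], chunk_size: int = DEFAULT_CHUNK_SIZE
) -> Iterator[List[T]]:
    """Yield consecutive slices of `documents`, each at most `chunk_size` long."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(documents), chunk_size):
        yield list(documents[start : start + chunk_size])
```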
v. 1.2.0
========
Non-Astra environment awareness:
- astrapy.constants.Environment enum for the "environment" parameter to client, etc
- flexibility and adaptive defaults for data api url
- environment knowledge trickles throughout all classes (client, admins, databases)
- DataAPIDatabaseAdmin class (i.e. for namespace CRUD)
- (internal) astrapy.api_commander.APICommander
- (internal) astrapy.api_options.{BaseAPIOptions, CollectionAPIOptions}
$vectorize support:
- "service options" for creating/retrieving collections, covering $vectorize needs
- embedding_api_key parameter to collection (for "header" usage)
collection-level timeout parameter (overridable in single method calls)
expand "projection" type to include slice projections
client.get_database can accept an API endpoint directly
insert_many and bulk_write default to ordered=False
v. 1.1.0
========
Adds estimated_document_count method to Collection class.
v. 1.0.0
========
(Introduction of the "idiomatic API", relegating "core" to become non-user-facing.)
Split classes, modules, tests to keep the "idiomatic" layer well separate and not touch the "astrapy" layer
DDL and some DML methods to m1
Passing secondary keyspace to action workflows
Fix bug in copy and to_[a]sync when set_caller is later used
Full cross-namespace management in DDL
More DDL methods and signature adjustments for Database
Sl overridable copy methods
Cursor/AsyncCursor, find and distinct
Remove all 'unsupported' clutter; implement find_one (a/sync)
DML for idiomatic + necessary changes around
Bulk_write method
More management methods
Sl collateral commands
Commandcursor (+async), used in list_collections
Sl collection options + cap-aware count_documents
Fix #245: Delete CHANGES.md
Collection.drop + database/collection info and related methods/properties (metadata)
Adapt to latest choices in API semantics
More syntax changes as discussed + all docstrings
Full support for dotted key names in distinct
Refactor to feature 'idiomatic' first, with full back-compat with 'astrapy'
Add sorting in hashing for distinct and factor it away
Exception management with a hierarchy of Exception classes
Full timeout support
Sl api refinements
ObjectIDs and UUIDs handled throughout (+ tests)
Admin interfaces and classes
Docstrings for all the admin/client parts + minor improvements to docstrings around
Readme overhaul
Exporting logger for back-compatibility
Sl adjustments
Custom payload serialization for httpx to block NaNs
Improved docstrings (minor stuff)
Methods repr/str to all objects for graceful display
Full test suite on client/admin classes
Abstract DatabaseAdmin, admin standard utility conversion/methods + tests thereof
Collection options is a dataclass and not a dict anymore
Pdoc annotations to control auto-docs
Logging, create_database signature
Added async support for admin, as alternate methods on original classes
Use MultiCallTimeoutManager in create_collection methods
V1.0.0 ("pm convergence m1") gets to master
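The entry above about custom payload serialization to block NaNs can be illustrated with the standard library alone. This is a sketch of the general technique, with a hypothetical function name; how astrapy wires its serializer into httpx is not described in this file.

```python
import json
from typing import Any


def serialize_payload(payload: Any) -> bytes:
    """Encode `payload` as compact JSON, refusing NaN and infinite floats."""
    # allow_nan=False makes json.dumps raise ValueError on NaN/Infinity
    # instead of emitting tokens that are not valid JSON.
    text = json.dumps(payload, allow_nan=False, separators=(",", ":"))
    return text.encode("utf-8")
```

The resulting bytes could then be passed as a raw request body, so that an invalid float fails on the client side before any request is sent.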
(prior to 1.0.0)
================
What is now dubbed the "core API" and is not supposed to be used directly at all.