Merge pull request #63 from seapagan/add-list-etc
seapagan authored Jan 28, 2025
2 parents 13ed6fd + 105bfbb commit a23821d
Showing 17 changed files with 900 additions and 90 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -12,7 +12,7 @@ repos:
- id: end-of-file-fixer

- repo: https://github.com/renovatebot/pre-commit-hooks
rev: 39.133.3
rev: 39.137.2
hooks:
- id: renovate-config-validator
files: ^renovate\.json$
11 changes: 3 additions & 8 deletions README.md
@@ -19,8 +19,7 @@ time).
The ideal use case is more for Python CLI tools that need to store data in a
database-like format without needing to learn SQL or use a full ORM.

Full documentation is available on the [Documentation
Website](https://sqliter.grantramsay.dev)
Full documentation is available on the [Website](https://sqliter.grantramsay.dev)

> [!CAUTION]
> This project is still in the early stages of development and is lacking some
@@ -29,11 +28,6 @@ Website](https://sqliter.grantramsay.dev)
> minimum and the releases and documentation will be very clear about any
> breaking changes.
>
> Also, structures like `list`, `dict`, `set` etc are not supported **at this
> time** as field types, since SQLite does not have a native column type for
> these. This is the **next planned enhancement**. These will need to be
> `pickled` first then stored as a BLOB in the database.
>
> See the [TODO](TODO.md) for planned features and improvements.
- [Features](#features)
@@ -46,7 +40,8 @@ Website](https://sqliter.grantramsay.dev)
## Features

- Table creation based on Pydantic models
- Supports `date` and `datetime` fields. List/Dict/Set fields are planned.
- Supports `date` and `datetime` fields
- Support for complex data types (`list`, `dict`, `set`, `tuple`) stored as BLOBs
- Automatic primary key generation
- User defined indexes on any field
- Set any field as UNIQUE
17 changes: 10 additions & 7 deletions TODO.md
@@ -13,13 +13,16 @@ Items marked with :fire: are high priority.
data.
- add more tests where 'auto_commit' is set to False to ensure that commit is
not called automatically.
- :fire: support structures like, `list`, `dict`, `set`, `tuple` etc. in the
model. These will need to be `pickled` first then stored as a BLOB in the
database
- :fire: similarly - perhaps add a `JSON` field type to allow storing JSON data
in a field, and an `Object` field type to allow storing arbitrary Python
objects? Perhaps a `Binary` field type to allow storing arbitrary binary data?
(just uses the existing `bytes` mapping but more explicit)
- :fire: perhaps add a `JSON` field type to allow storing JSON data in a field,
and an `Object` field type to allow storing arbitrary Python objects? Perhaps
a `Binary` field type to allow storing arbitrary binary data? (just uses the
existing `bytes` mapping but more explicit)
- Consider performance optimizations for field validation:
- Benchmark shows ~50% overhead for field assignments with validation
- Potential solutions:
- Add a "fast mode" configuration option
- Create bulk update methods that temporarily disable validation
- Optimize validation for specific field types
- on update, check if the model has actually changed before sending the update
to the database. This will prevent unnecessary updates and leave the
`updated_at` correct. However, this will always require a query to the
8 changes: 8 additions & 0 deletions demo.py
@@ -25,6 +25,8 @@ class UserModel(BaseDBModel):
name: str
content: Optional[str]
admin: bool = False
list_of_str: list[str]
a_set: set[str]

class Meta:
"""Override the table name for the UserModel."""
@@ -49,16 +51,22 @@ def main() -> None:
name="John Doe",
content="This is information about John Doe.",
admin=True,
list_of_str=["a", "b", "c"],
a_set={"x", "y", "z"},
)
user2 = UserModel(
slug="jdoe2",
name="Jane Doe",
content="This is information about Jane Doe.",
list_of_str=["x", "y", "z"],
a_set={"linux", "mac", "windows"},
)
user3 = UserModel(
slug="jb",
name="Yogie Bear",
content=None,
list_of_str=[],
a_set={"apple", "banana", "cherry"},
)
try:
db.insert(user1)
42 changes: 42 additions & 0 deletions docs/guide/fields.md
@@ -53,3 +53,45 @@ This is exactly the same as using the `fields()` method with a single field, but
very specific and obvious. **There is NO equivalent argument to this in the
`select()` method**. An exception **WILL** be raised if you try to use this method
with more than one field.

## Complex Data Types

SQLiter supports storing complex Python data types in the database. The following types are supported:

- `list[T]`: Lists of any type T
- `dict[K, V]`: Dictionaries with keys of type K and values of type V
- `set[T]`: Sets of any type T
- `tuple[T, ...]`: Tuples of any type T

These types are automatically serialized and stored as BLOBs in the database. Here's an example of using complex types:

```python
from typing import Any

from sqliter import SqliterDB
from sqliter.model import BaseDBModel

class UserPreferences(BaseDBModel):
    tags: list[str] = []  # List of string tags
    metadata: dict[str, Any] = {}  # Dictionary with string keys and any value type
    friends: set[int] = set()  # Set of user IDs
    coordinates: tuple[float, float] = (0.0, 0.0)  # Tuple of two floats

db = SqliterDB("preferences.db")
db.create_table(UserPreferences)

# Create and insert an instance
prefs = UserPreferences(
    tags=["python", "sqlite", "orm"],
    metadata={"theme": "dark", "notifications": True},
    friends={1, 2, 3},
    coordinates=(51.5074, -0.1278),
)
saved_prefs = db.insert(prefs)

# Query and use the complex types
loaded_prefs = db.get(UserPreferences, saved_prefs.pk)
print(loaded_prefs.tags)  # ['python', 'sqlite', 'orm']
print(loaded_prefs.metadata["theme"])  # 'dark'
print(1 in loaded_prefs.friends)  # True
print(loaded_prefs.coordinates)  # (51.5074, -0.1278)
```

The complex types are automatically validated using Pydantic's type system, ensuring that only values of the correct type can be stored. When querying, the values are automatically deserialized back into their original Python types.

Note that since these types are stored as BLOBs, you cannot perform SQL operations on their contents (like searching or filtering). If you need to search or filter based on these values, you should consider storing them in a different format or in separate tables.
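
If searching or filtering on those values matters, one pattern worth sketching is to break the values out into their own model so each value lives in a plain, indexable column. The sketch below is illustrative only: the `User`/`Tag` models are hypothetical, and the `SqliterDB(...)` constructor and `create_table()` call are assumed from the wider library rather than shown in this PR.

```python
from sqliter import SqliterDB  # constructor arguments here are an assumption
from sqliter.model import BaseDBModel


class User(BaseDBModel):
    name: str


class Tag(BaseDBModel):
    # Hypothetical companion model: one row per tag instead of a pickled
    # list[str] field on User, so `name` is an ordinary TEXT column.
    user_pk: int  # pk of the owning User row
    name: str


db = SqliterDB("tags_example.db")
db.create_table(User)
db.create_table(Tag)

user = db.insert(User(name="John Doe"))
for tag in ["python", "sqlite", "orm"]:
    db.insert(Tag(user_pk=user.pk, name=tag))

# Tag.name can now be indexed, searched, and filtered with normal queries,
# which is not possible for values pickled into a BLOB.
```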
13 changes: 10 additions & 3 deletions docs/guide/models.md
@@ -37,6 +37,8 @@ the table.

The following field types are currently supported:

Basic Types:

- `str`
- `int`
- `float`
@@ -45,9 +47,14 @@ The following field types are currently supported:
- `datetime`
- `bytes`

More field types are planned for the near future, since I have the
serialization/ deserialization locked in. This will include `list`, `dict`,
`set`, and possibly `JSON` and `Object` fields.
Complex Types:

- `list[T]` - Lists of any type T
- `dict[K, V]` - Dictionaries with keys of type K and values of type V
- `set[T]` - Sets of any type T
- `tuple[T, ...]` - Tuples of any type T

Complex types are automatically serialized and stored as BLOBs in the database. For more details on using complex types, see the [Fields Guide](fields.md#complex-data-types).
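
As a quick illustration of declaring such fields (mirroring the `list_of_str` and `a_set` fields added to `demo.py` in this PR; the `sqliter.model` import path is assumed), a minimal model might look like this:

```python
from typing import Any

from sqliter.model import BaseDBModel  # import path assumed


class SettingsModel(BaseDBModel):
    name: str                      # ordinary TEXT column
    tags: list[str]                # pickled and stored as a BLOB
    options: dict[str, Any] = {}   # pickled and stored as a BLOB
    flags: set[str] = set()        # pickled and stored as a BLOB
```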

### Adding Indexes

5 changes: 0 additions & 5 deletions docs/index.md
@@ -27,11 +27,6 @@ database-like format without needing to learn SQL or use a full ORM.
> minimum and the releases and documentation will be very clear about any
> breaking changes.
>
> Also, structures like `list`, `dict`, `set` etc are not supported **at this
> time** as field types, since SQLite does not have a native column type for
> these. This is the **next planned enhancement**. These will need to be
> `pickled` first then stored as a BLOB in the database.
>
> See the [TODO](todo/index.md) for planned features and improvements.
## Features
1 change: 1 addition & 0 deletions pyproject.toml
@@ -105,6 +105,7 @@ lint.ignore = [
"FBT002",
"FBT003",
"B006",
"S301", # in this library we use 'pickle' for saving and loading list etc
] # These rules are too strict even for us 😝
lint.extend-ignore = [
"COM812",
4 changes: 2 additions & 2 deletions requirements-dev.txt
@@ -36,7 +36,7 @@ mkdocs-material==9.5.49
mkdocs-material-extensions==1.3.1
mkdocs-minify-plugin==0.8.0
mock==5.1.0
mypy==1.14.0
mypy==1.14.1
mypy-extensions==1.0.0
nodeenv==1.9.1
packaging==24.2
@@ -71,7 +71,7 @@ regex==2024.11.6
requests==2.32.3
rich==13.9.4
rtoml==0.12.0
ruff==0.8.4
ruff==0.9.3
shellingham==1.5.4
simple-toml-settings==0.8.0
six==1.17.0
4 changes: 4 additions & 0 deletions sqliter/constants.py
@@ -38,4 +38,8 @@
bytes: "BLOB",
datetime.datetime: "INTEGER", # Store as Unix timestamp
datetime.date: "INTEGER", # Store as Unix timestamp
list: "BLOB",
dict: "BLOB",
set: "BLOB",
tuple: "BLOB",
}
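
Because the mapping is keyed on bare container types, a parameterised annotation such as `list[str]` has to be reduced to its origin (`list`) before lookup — the same `get_origin()` trick used in `deserialize_field()` below. A minimal sketch of that resolution, with `TYPE_MAP` standing in for the real constant (whose name and full contents are not shown in this hunk):

```python
import datetime
from typing import Any, get_origin

# Illustrative stand-in for the mapping above; only the rows visible in
# this hunk are reproduced here.
TYPE_MAP: dict[type, str] = {
    bytes: "BLOB",
    datetime.datetime: "INTEGER",  # stored as Unix timestamp
    datetime.date: "INTEGER",  # stored as Unix timestamp
    list: "BLOB",
    dict: "BLOB",
    set: "BLOB",
    tuple: "BLOB",
}


def sqlite_type_for(annotation: Any) -> str:
    """Reduce a (possibly generic) annotation to a SQLite column type."""
    origin = get_origin(annotation) or annotation
    return TYPE_MAP.get(origin, "BLOB")  # fallback choice is an assumption


print(sqlite_type_for(list[str]))       # BLOB
print(sqlite_type_for(dict[str, int]))  # BLOB
print(sqlite_type_for(datetime.date))   # INTEGER
```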
34 changes: 28 additions & 6 deletions sqliter/model/model.py
@@ -10,6 +10,7 @@
from __future__ import annotations

import datetime
import pickle
import re
from typing import (
Any,
@@ -58,7 +59,7 @@ class BaseDBModel(BaseModel):
model_config = ConfigDict(
extra="ignore",
populate_by_name=True,
validate_assignment=False,
validate_assignment=True,
from_attributes=True,
)

@@ -181,7 +182,9 @@ def serialize_field(cls, value: SerializableField) -> SerializableField:
"""
if isinstance(value, (datetime.datetime, datetime.date)):
return to_unix_timestamp(value)
return value # Return value as-is for non-datetime fields
if isinstance(value, (list, dict, set, tuple)):
return pickle.dumps(value)
return value # Return value as-is for other fields

# Deserialization after fetching from the database

@@ -205,12 +208,31 @@ def deserialize_field(
A datetime or date object if the field type is datetime or date,
otherwise returns the value as-is.
"""
field_type = cls.__annotations__.get(field_name)
if value is None:
return None

if field_type in (datetime.datetime, datetime.date) and isinstance(
value, int
# Get field type if it exists in model_fields
field_info = cls.model_fields.get(field_name)
if field_info is None:
# If field doesn't exist in model, return value as-is
return value

field_type = field_info.annotation

if (
isinstance(field_type, type)
and issubclass(field_type, (datetime.datetime, datetime.date))
and isinstance(value, int)
):
return from_unix_timestamp(
value, field_type, localize=return_local_time
)
return value # Return value as-is for non-datetime fields

origin_type = get_origin(field_type) or field_type
if origin_type in (list, dict, set, tuple) and isinstance(value, bytes):
try:
return pickle.loads(value)
except pickle.UnpicklingError:
return value

return value
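
The two hooks above amount to a pickle round trip for container fields: values go into the database as BLOB bytes and come back as the original Python objects. A minimal sketch of just that mechanism (using `pickle` directly, rather than calling the Pydantic serializer methods themselves):

```python
import pickle

# What serialize_field() does for a list/dict/set/tuple value:
stored = pickle.dumps(["a", "b", "c"])
assert isinstance(stored, bytes)  # written to a BLOB column

# What deserialize_field() does when the field's origin type is one of
# (list, dict, set, tuple) and the raw database value is bytes:
restored = pickle.loads(stored)
assert restored == ["a", "b", "c"]
```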
20 changes: 18 additions & 2 deletions sqliter/sqliter.py
@@ -503,7 +503,13 @@ def insert(
raise RecordInsertionError(table_name) from exc
else:
data.pop("pk", None)
return model_class(pk=cursor.lastrowid, **data)
# Deserialize each field before creating the model instance
deserialized_data = {}
for field_name, value in data.items():
deserialized_data[field_name] = model_class.deserialize_field(
field_name, value, return_local_time=self.return_local_time
)
return model_class(pk=cursor.lastrowid, **deserialized_data)

def get(
self, model_class: type[BaseDBModel], primary_key_value: int
@@ -540,7 +546,17 @@ def get(
field: result[idx]
for idx, field in enumerate(model_class.model_fields)
}
return model_class(**result_dict)
# Deserialize each field before creating the model instance
deserialized_data = {}
for field_name, value in result_dict.items():
deserialized_data[field_name] = (
model_class.deserialize_field(
field_name,
value,
return_local_time=self.return_local_time,
)
)
return model_class(**deserialized_data)
except sqlite3.Error as exc:
raise RecordFetchError(table_name) from exc
else:
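
End to end, the extra deserialization pass means both `insert()` and `get()` hand back models whose container fields are already real Python objects. A rough usage sketch reusing a trimmed copy of the `UserModel` from `demo.py` above (the `SqliterDB(memory=True)` constructor and `create_table()` call are assumptions, not shown in this diff):

```python
from typing import Optional

from sqliter import SqliterDB
from sqliter.model import BaseDBModel


class UserModel(BaseDBModel):
    # Trimmed copy of the demo.py model above, for a self-contained sketch.
    slug: str
    name: str
    content: Optional[str]
    list_of_str: list[str]
    a_set: set[str]


db = SqliterDB(memory=True)  # in-memory flag is assumed for illustration
db.create_table(UserModel)   # create_table() is assumed, not shown in this diff

saved = db.insert(
    UserModel(
        slug="jdoe",
        name="John Doe",
        content=None,
        list_of_str=["a", "b", "c"],
        a_set={"x", "y"},
    )
)

# insert() and get() now run deserialize_field() on every column, so the
# returned instances hold real Python containers, not raw pickled BLOBs.
fetched = db.get(UserModel, saved.pk)
assert fetched.list_of_str == ["a", "b", "c"]
assert fetched.a_set == {"x", "y"}
```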