Added ExposedModel to produce predictions #24

Merged 13 commits on Apr 16, 2024

Conversation

@MatsMoll (Owner) commented Apr 2, 2024

The main feature of this PR is a way to define exposed models.

This makes it easier to find out where a model contract can be used, and how to use the model. The main purpose is therefore to answer the following questions:

  • Where are our models located? An API, a model registry, etc.?
  • Which format do we need to provide in order to use the model? Column-wise JSON, row-wise JSON, protobuf?
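To make the column-wise vs. row-wise distinction concrete, here is a plain-Python sketch. The payload shapes and feature names are illustrative, not aligned's actual wire format:

```python
# Two common JSON payload shapes for sending features to a model server.
# Column-wise: one list per feature column.
column_wise = {
    "trip_distance": [1.2, 3.4],
    "passenger_count": [1, 2],
}

# Row-wise: one dict per record.
row_wise = [
    {"trip_distance": 1.2, "passenger_count": 1},
    {"trip_distance": 3.4, "passenger_count": 2},
]

def columns_to_rows(columns: dict) -> list:
    """Convert a column-wise payload into a row-wise one."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]

assert columns_to_rows(column_wise) == row_wise
```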
@model_contract(
    input_features=[...],
    exposed_model=ExposedModel.in_memory_mlflow(
        model_name="taxi_eta",
        model_alias="Champion",

        prediction_column="predicted_eta",
        predicted_at_column="predicted_at",
        model_version_column="model_version",
        model_contract_version_tag=None  # No contract versions exist yet, so the tag is left unset
    )
)
class TaxiRegressionModel:
    trip_id = String().as_entity()
    predicted_eta = Float()
    predicted_at = EventTimestamp()
    model_version = String().as_model_version()

This also means that we can use these models with the following:

store = await FeatureStore.from_dir()
preds = await store.model("taxi_eta").predict_over({
    "trip_id": [...],
}).to_polars()
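Conceptually, predict_over glues together the entity lookup, the model call, and the metadata columns declared in the contract. A simplified pure-Python sketch of that flow, where the helper names and stand-in model are illustrative rather than the actual implementation:

```python
from datetime import datetime, timezone

def predict_over(entities, feature_lookup, model, model_version):
    """Sketch: fetch features per entity, run the model, and attach
    the prediction metadata columns declared in the contract."""
    rows = []
    for trip_id in entities["trip_id"]:
        features = feature_lookup(trip_id)  # e.g. read from the feature store
        rows.append({
            "trip_id": trip_id,
            "predicted_eta": model(features),  # the exposed model call
            "predicted_at": datetime.now(timezone.utc),
            "model_version": model_version,
        })
    return rows

# Tiny stand-ins for the real store and model:
preds = predict_over(
    {"trip_id": ["a", "b"]},
    feature_lookup=lambda trip_id: {"distance": 2.0},
    model=lambda features: features["distance"] * 3.0,
    model_version="Champion",
)
```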

Or we can store them in the output source directly:

store = await FeatureStore.from_dir()
await store.model("taxi_eta").predict_over({
    "trip_id": [...],
}).upsert_into_output_source()
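An upsert into the output source implies writes keyed on the entity columns: existing rows for the same trip_id get replaced, new ones get appended. A minimal dict-based sketch of that semantic, where the list of dicts stands in for the real output source:

```python
def upsert(existing, new_rows, key):
    """Replace rows that share an entity key, append the rest."""
    by_key = {row[key]: row for row in existing}
    for row in new_rows:
        by_key[row[key]] = row
    return list(by_key.values())

source = [{"trip_id": "a", "predicted_eta": 5.0}]
updated = upsert(
    source,
    [{"trip_id": "a", "predicted_eta": 6.0},
     {"trip_id": "b", "predicted_eta": 7.0}],
    key="trip_id",
)
# "a" is overwritten, "b" is appended.
```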

This also led to the development of some generated Ollama contracts:

from aligned.exposed_models.ollama import ollama_classification_contract, ollama_embedding_contract

Classification = ollama_classification_contract(
    contract_name="some_classification",
    inputs=[features.x, features.y, features.z],
    prompt_template="You are ... you have {x}, {y}, and {z}, is it true or false?",
    ground_truth=features.a,
    entities=[features.row_id],
    output_source=FileSource.parquet(...),
    endpoint="ollama_endpoint",
)
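The prompt_template uses {x}-style placeholders that get filled in with each row's feature values before the prompt is sent to the Ollama endpoint. A sketch of that substitution step using Python's str.format, with a made-up template and row for illustration:

```python
# A template in the same {placeholder} style as prompt_template above.
template = "You are a classifier. You have {x}, {y}, and {z}. Is it true or false?"

# One row of input features, keyed by feature name.
row = {"x": 1.5, "y": "blue", "z": 42}

# Render the prompt for this row; the result is what goes to the endpoint.
prompt = template.format(**row)
```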

Other improvements:

  • Added a loaded_columns option so sources do not load more data than needed.
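The idea behind loaded_columns is plain column pruning: a source only materializes the columns a request actually needs. As a pure-Python illustration of the concept (not aligned's internals):

```python
def load(source, loaded_columns=None):
    """Return only the requested columns; None means load everything."""
    if loaded_columns is None:
        return source
    return {name: values for name, values in source.items() if name in loaded_columns}

data = {"trip_id": ["a"], "predicted_eta": [5.0], "debug_blob": ["..."]}
subset = load(data, loaded_columns=["trip_id", "predicted_eta"])
# "debug_blob" is never materialized for this request.
```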

@MatsMoll MatsMoll merged commit bc88a2a into main Apr 16, 2024
1 check passed
@MatsMoll MatsMoll deleted the matsei/exposed-models branch April 16, 2024 19:39