Commit 065baf8

MarleneKress79789, tkilias, and ckunki authored
#167: Added Exasol Text-AI Installer Wrapper (#178)
* #168: Installed Span-UDFs in transformers_extension_wrapper
* Change poetry version in actions to 2.0.1
* Split slow checks into SaaS and Text-AI checks with larger runners

---------

Co-authored-by: Torsten Kilias <tkilias@users.noreply.github.com>
Co-authored-by: Christoph Kuhnke <github@kuhnke.net>
1 parent ed8b0e8 commit 065baf8

13 files changed, +528 -77 lines changed

.github/workflows/build-and-publish.yml

Lines changed: 3 additions & 1 deletion
@@ -17,7 +17,9 @@ jobs:
         uses: actions/checkout@v4
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
 
       - name: Build Artifacts
         run: poetry build

.github/workflows/check-release-tag.yml

Lines changed: 3 additions & 1 deletion
@@ -14,7 +14,9 @@ jobs:
         uses: actions/checkout@v4
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
 
       - name: Check Tag Version
         # make sure the pushed/created tag matched the project version

.github/workflows/checks.yml

Lines changed: 14 additions & 6 deletions
@@ -19,7 +19,9 @@ jobs:
           fetch-depth: 0
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
 
       - name: Check Version(s)
         run: poetry run version-check version.py
@@ -34,7 +36,9 @@ jobs:
         uses: actions/checkout@v4
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
 
       - name: Build Documentation
         run: |
@@ -54,8 +58,9 @@ jobs:
         uses: actions/checkout@v4
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
         with:
+          poetry-version: 2.0.1
           python-version: ${{ matrix.python-version }}
 
       - name: Run lint
@@ -82,8 +87,9 @@ jobs:
         uses: actions/checkout@v4
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
         with:
+          poetry-version: 2.0.1
           python-version: ${{ matrix.python-version }}
 
       - name: Run type-check
@@ -98,8 +104,9 @@ jobs:
       - name: SCM Checkout
         uses: actions/checkout@v4
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
         with:
+          poetry-version: 2.0.1
           python-version: ${{ matrix.python-version }}
       - name: Run security linter
         run: poetry run nox -s lint:security
@@ -144,8 +151,9 @@ jobs:
           sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
         with:
+          poetry-version: 2.0.1
           python-version: ${{ matrix.python-version }}
 
       - name: Calculate Test Coverage

.github/workflows/gh-pages.yml

Lines changed: 3 additions & 1 deletion
@@ -16,7 +16,9 @@ jobs:
           fetch-depth: 0
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
 
       - name: Build Documentation
         run: |

.github/workflows/merge-gate.yml

Lines changed: 46 additions & 9 deletions
@@ -12,16 +12,52 @@ jobs:
     name: Fast
     uses: ./.github/workflows/checks.yml
 
-  slow-checks:
-    name: SaaS Tests
+  slow-checks-approval:
+    name: Slow Check Approval
     runs-on: ubuntu-24.04
     environment: manual-approval
-
     # Replace the steps below with the required actions
     # and/or add additional jobs if required
     # Note:
     # If you add additional jobs, make sure they are added as a requirement
     # to the approve-merge job's input requirements (needs).
+    steps:
+      - run: echo "Manual Approval"
+
+  saas-tests:
+    name: SaaS Tests
+    runs-on: ubuntu-24.04
+    needs: [ slow-checks-approval ]
+
+    steps:
+      - name: SCM Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Setup Python & Poetry Environment
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
+
+      - name: Allow unprivileged user namespaces
+        run: |
+          sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
+
+      - name: Tests
+        env:
+          SAAS_HOST: ${{ secrets.INTEGRATION_TEAM_SAAS_STAGING_HOST }}
+          SAAS_ACCOUNT_ID: ${{ secrets.INTEGRATION_TEAM_SAAS_STAGING_ACCOUNT_ID }}
+          SAAS_PAT: ${{ secrets.INTEGRATION_TEAM_SAAS_STAGING_PAT }}
+        run: poetry run pytest -rA --setup-show --backend=saas test/integration/test_cloud_storage.py
+
+  large-runner-tests:
+    name: Text AI Tests
+    runs-on:
+      labels: int-linux-x64-4core-ubuntu24.04-1
+    environment: text-ai-prerelease
+    needs: [ slow-checks-approval ]
+
     steps:
       - name: SCM Checkout
         uses: actions/checkout@v4
@@ -44,25 +80,26 @@ jobs:
           sudo rm -rf /opt/ghc
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
 
       - name: Allow unprivileged user namespaces
         run: |
           sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
 
       - name: Tests
         env:
-          SAAS_HOST: ${{ secrets.INTEGRATION_TEAM_SAAS_STAGING_HOST }}
-          SAAS_ACCOUNT_ID: ${{ secrets.INTEGRATION_TEAM_SAAS_STAGING_ACCOUNT_ID }}
-          SAAS_PAT: ${{ secrets.INTEGRATION_TEAM_SAAS_STAGING_PAT }}
-        run: poetry run pytest --backend=saas test/integration
+          TXAIE_PRE_RELEASE_URL: ${{ vars.ZIP_URL }}
+          TXAIE_PRE_RELEASE_PASSWORD: ${{ secrets.ZIP_PASSWORD }}
+        run: poetry run pytest -rA --setup-show test/integration/test_text_ai_extension_wrapper.py
 
   # This job ensures inputs have been executed successfully.
   approve-merge:
     name: Allow Merge
     runs-on: ubuntu-latest
     # If you need additional jobs to be part of the merge gate, add them below
-    needs: [ fast-checks, slow-checks ]
+    needs: [ fast-checks, saas-tests, large-runner-tests ]
 
     # Each job requires a step, so we added this dummy step.
     steps:
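
The job dependencies that this workflow change sets up can be read as a small dependency graph. The sketch below is an illustration only and not part of the commit: it transcribes the `needs` entries visible in the diff into a Python dict (the `NEEDS` mapping and the `execution_order` helper are hypothetical names, and `fast-checks` is assumed to have no prerequisites because none are shown) and uses the standard-library `graphlib` module (Python 3.9+) to derive one valid run order. The real scheduling is done by GitHub Actions.

```python
from graphlib import TopologicalSorter

# Job -> set of jobs it "needs", transcribed from the diff of merge-gate.yml.
# Assumption: fast-checks has no prerequisites (none are visible in the diff).
NEEDS = {
    "fast-checks": set(),
    "slow-checks-approval": set(),
    "saas-tests": {"slow-checks-approval"},
    "large-runner-tests": {"slow-checks-approval"},
    "approve-merge": {"fast-checks", "saas-tests", "large-runner-tests"},
}


def execution_order(needs: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which the jobs could run."""
    return list(TopologicalSorter(needs).static_order())


order = execution_order(NEEDS)
print(order)
```

Both slow jobs sit behind the single `slow-checks-approval` job, so one manual approval unlocks the SaaS tests and the Text AI tests, and `approve-merge` can only run after all three test jobs have finished.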

.github/workflows/report.yml

Lines changed: 3 additions & 1 deletion
@@ -21,7 +21,9 @@ jobs:
           fetch-depth: 0
 
       - name: Setup Python & Poetry Environment
-        uses: exasol/python-toolbox/.github/actions/python-environment@0.18.0
+        uses: exasol/python-toolbox/.github/actions/python-environment@0.20.0
+        with:
+          poetry-version: 2.0.1
 
       - name: Download Artifacts
         uses: actions/download-artifact@v4.1.8

doc/changes/unreleased.md

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@
 * #146: Add interface for text_ai_extension_wrapper
 * #169: Added a function that creates a bucket-fs PathLike object
 * #173: Added an option to upload an SLC from file.
+* #167: Added partial implementation for text_ai_extension_wrapper
+* #168: Installed Span-UDFs in transformers_extension_wrapper
 * #175: Added function to download and decrypt Text-AI pre-release.
 
 ## Refactoring

exasol/nb_connector/ai_lab_config.py

Lines changed: 4 additions & 0 deletions
@@ -35,6 +35,10 @@ class AILabConfig(Enum):
     te_models_bfs_dir = auto()
     te_hf_connection = auto()
     te_models_cache_dir = auto()
+    txaie_bfs_connection = auto()
+    txaie_models_bfs_dir = auto()
+    txaie_models_cache_dir = auto()
+    txaie_slc_file_local_path = auto()
     sme_aws_bucket = auto()
     sme_aws_role = auto()
     sme_aws_connection = auto()
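
Because the new `txaie_*` members are inserted in the middle of the enum, the `auto()` values of the members that follow them change. The sketch below illustrates this standard `enum` behaviour with two shortened, hypothetical enums (`Before` and `After` are not part of the code base). It matters only if something persists the numeric value; whether the secret store keys entries by member name is an assumption not confirmed by this diff.

```python
from enum import Enum, auto


class Before(Enum):
    """Shortened stand-in for the enum before the commit."""
    te_models_cache_dir = auto()
    sme_aws_bucket = auto()


class After(Enum):
    """Shortened stand-in for the enum after the commit."""
    te_models_cache_dir = auto()
    txaie_bfs_connection = auto()  # newly inserted member
    sme_aws_bucket = auto()


# The numeric value of the later member shifts by one ...
print(Before.sme_aws_bucket.value, After.sme_aws_bucket.value)
# ... while its name stays the same.
print(Before.sme_aws_bucket.name == After.sme_aws_bucket.name)
```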

exasol/nb_connector/text_ai_extension_wrapper.py

Lines changed: 108 additions & 6 deletions
@@ -1,3 +1,7 @@
+from os import PathLike
+
+from exasol.nb_connector.extension_wrapper_common import deploy_language_container, encapsulate_bucketfs_credentials
+from exasol.nb_connector.language_container_activation import ACTIVATION_KEY_PREFIX
 from typing import Optional, Generator
 from contextlib import contextmanager
 from pathlib import Path
@@ -8,10 +12,45 @@
 from exasol.nb_connector.secret_store import Secrets
 from exasol.nb_connector.ai_lab_config import AILabConfig as CKey
 
+# Models will be uploaded into directory BFS_MODELS_DIR in BucketFS.
+#
+# Models downloaded from the Huggingface archive to a local drive will be
+# cached in directory MODELS_CACHE_DIR.
+#
+# TXAIE uses the same directories as TE (see function initialize_te_extension)
+# as both extensions are using Huggingface Models. This also avoids confusion,
+# and ensures backwards compatibility.
+from exasol.nb_connector.transformers_extension_wrapper import BFS_MODELS_DIR, MODELS_CACHE_DIR
+
+
+PATH_IN_BUCKET = "TXAIE"
+""" Location in BucketFS bucket to upload data for TXAIE, e.g. its language container. """
+
 LANGUAGE_ALIAS = "PYTHON3_TXAIE"
 
 LATEST_KNOWN_VERSION = "???"
 
+ACTIVATION_KEY = ACTIVATION_KEY_PREFIX + "txaie"
+"""
+Activation SQL for the Text AI Extension will be saved in the secret store
+with this key.
+
+TXAIE brings its own Script Language Container (SLC) which needs to be
+activated by a dedicated SQL statement `ALTER SESSION SET SCRIPT_LANGUAGES`. Applications
+can store the language definition in the configuration store (SCS) from the Notebook
+Connector's class `exasol.nb_connector.secret_store.Secrets`.
+
+Using `ACTIVATION_KEY` as defined key, TXAIE can provide convenient interfaces
+accepting only the SCS and retrieving all further data from there.
+"""
+
+BFS_CONNECTION_PREFIX = "TXAIE_BFS"
+"""
+Prefix for Exasol CONNECTION objects containing a BucketFS location and
+credentials.
+"""
+
+
 
 @contextmanager
 def download_pre_release(conf: Secrets) -> Generator[tuple[Path, Path], None, None]:
@@ -49,6 +88,7 @@ def download_pre_release(conf: Secrets) -> Generator[tuple[Path, Path], None, No
         # Find and return the project wheel and the SLC
         project_wheel = next(tmp_path.glob("*.whl"))
         slc_tar_gz = next(tmp_path.glob("*.tar.gz"))
+        conf.save(CKey.txaie_slc_file_local_path, str(slc_tar_gz))
         yield project_wheel, slc_tar_gz
 
 
@@ -67,16 +107,18 @@ def deploy_licence(conf: Secrets,
         Optional. Content of a licence given as a string.
 
     """
-    pass
+    raise NotImplementedError('Currently this is not implemented, '
+                              'will be changed once the licensing process is finalized.')
+
 
 
 def initialize_text_ai_extension(conf: Secrets,
                                  container_file: Optional[Path] = None,
-                                 version: Optional[str] = LATEST_KNOWN_VERSION,
+                                 version: Optional[str] = None,
                                  language_alias: str = LANGUAGE_ALIAS,
                                  run_deploy_container: bool = True,
-                                 run_deploy_scripts: bool = True,
-                                 run_upload_models: bool = True,
+                                 run_deploy_scripts: bool = False,
+                                 run_upload_models: bool = False,
                                  run_encapsulate_bfs_credentials: bool = True,
                                  allow_override: bool = True) -> None:
     """
@@ -89,7 +131,8 @@ def initialize_text_ai_extension(conf: Secrets,
 
     If given a container_file path instead, installs the given container in the Bucketfs.
 
-    If neither is given, attempts to install the latest version from ???.
+    If neither is given, checks if txaie_slc_file_local_path is set and installs this SLC if found,
+    otherwise attempts to install the latest version from t.b.d.
 
     This function doesn't activate the language container. Instead, it gets the
     activation SQL using the same API and writes it to the secret store. The name
@@ -122,4 +165,63 @@ def initialize_text_ai_extension(conf: Secrets,
     allow_override:
         If True allows overriding the language definition.
     """
-    pass
+
+    # Create the name of the Exasol connection object
+    db_user = str(conf.get(CKey.db_user))
+    bfs_conn_name = "_".join([BFS_CONNECTION_PREFIX, db_user])
+    # As soon as the official release of TXAIE is available, the hard-coded value for
+    # container_name can be replaced by TXAIELanguageContainerDeployer.SLC_NAME,
+    # see https://github.com/exasol/notebook-connector/issues/179.
+    container_name = "exasol_text_ai_extension_container_release.tar.gz"
+
+
+    def from_ai_lab_config(key: CKey) -> Path | None:
+        entry = conf.get(key)
+        return Path(entry) if entry else None
+
+    if run_deploy_container:
+        if version:
+            install_text_ai_extension(version)
+            # Can run_upload_models, run_deploy_scripts,
+            # run_encapsulate_bfs_credentials, etc. be ignored here?
+            return
+
+        container_file = container_file or from_ai_lab_config(CKey.txaie_slc_file_local_path)
+        if not container_file:
+            install_text_ai_extension(LATEST_KNOWN_VERSION)
+        else:
+            deploy_language_container(
+                conf=conf,
+                path_in_bucket=PATH_IN_BUCKET,
+                language_alias=language_alias,
+                activation_key=ACTIVATION_KEY,
+                container_file=container_file,
+                container_name=container_name,
+                allow_override=allow_override,
+            )
+
+
+
+    if run_upload_models:
+        # Install default Hugging Face models into the Bucketfs using
+        # Transformers Extensions upload model functionality.
+        raise NotImplementedError('Implementation is waiting for TE release.')
+
+
+    if run_deploy_scripts:
+        raise NotImplementedError('Currently there are no Text-AI specific scripts to deploy.')
+
+
+    if run_encapsulate_bfs_credentials:
+        encapsulate_bucketfs_credentials(
+            conf, path_in_bucket=PATH_IN_BUCKET, connection_name=bfs_conn_name
+        )
+
+    # Update secret store
+    conf.save(CKey.txaie_bfs_connection, bfs_conn_name)
+    conf.save(CKey.txaie_models_bfs_dir, BFS_MODELS_DIR)
+    conf.save(CKey.txaie_models_cache_dir, MODELS_CACHE_DIR)
+
+
+def install_text_ai_extension(version: str) -> None:
+    raise NotImplementedError('Implementation is waiting for decision on where the releases will be hosted.')
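
The order in which `initialize_text_ai_extension` picks a container source when `run_deploy_container` is true can be hard to read from the diff alone. The following is a minimal, self-contained sketch of that precedence, not the wrapper's actual code: `resolve_container_source` is a hypothetical helper, and a plain dict stands in for the secret store. It returns which branch would be taken instead of deploying anything.

```python
from pathlib import Path
from typing import Optional, Union

LATEST_KNOWN_VERSION = "???"  # placeholder value, copied from the diff


def resolve_container_source(
    store: dict[str, str],
    container_file: Optional[Path] = None,
    version: Optional[str] = None,
) -> tuple[str, Union[str, Path]]:
    """Mirror the precedence used when run_deploy_container is True.

    `store` is a plain dict standing in for the secret store (an assumption
    made to keep this sketch self-contained).
    """
    # 1. An explicit version wins over everything else.
    if version:
        return ("version", version)
    # 2. Otherwise use the given file, or fall back to the path that
    #    download_pre_release() saved under txaie_slc_file_local_path.
    saved = store.get("txaie_slc_file_local_path")
    container_file = container_file or (Path(saved) if saved else None)
    if container_file:
        return ("file", container_file)
    # 3. With nothing else available, fall back to the latest known version.
    return ("version", LATEST_KNOWN_VERSION)


store = {"txaie_slc_file_local_path": "slc.tar.gz"}
print(resolve_container_source(store))
print(resolve_container_source(store, version="1.0.0"))
print(resolve_container_source({}))
```

In the commit itself, both "version" outcomes end in `install_text_ai_extension`, which currently raises `NotImplementedError`, so only the "file" outcome leads to an actual deployment. The explicit-version branch also returns early, before BucketFS credentials are encapsulated, which the in-code comment flags as an open question.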

exasol/nb_connector/transformers_extension_wrapper.py

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ def deploy_scripts(conf: Secrets,
         activation_sql = get_activation_sql(conf)
         conn.execute(activation_sql)
 
-        scripts_deployer = ScriptsDeployer(language_alias, conf.get(CKey.db_schema), conn)
+        scripts_deployer = ScriptsDeployer(language_alias, conf.get(CKey.db_schema), conn, install_all_scripts=True)
         scripts_deployer.deploy_scripts()
 
 