Improve CI run time #2215

Status: Open. Wants to merge 23 commits into base branch master.

Commits:
0088b0c  Test CI compile time improvements (G-D-Petrov, Feb 26, 2025)
215f1a7  Refactor tests for improved readability and maintainability; skip uni… (G-D-Petrov, Feb 27, 2025)
52dbe15  Update conftest.py (G-D-Petrov, Feb 27, 2025)
dcae9d4  Enhance test configurations and improve fixture definitions for bette… (G-D-Petrov, Feb 27, 2025)
ab7102d  Update pytest xdist mode to use automatic distribution for improved t… (G-D-Petrov, Feb 28, 2025)
1104558  test with bigger win machine (G-D-Petrov, Feb 28, 2025)
ea7ca50  Fix build workflow: update Windows distro naming and simplify pytest … (G-D-Petrov, Feb 28, 2025)
2d9a07b  Update Windows distro naming in build workflow for consistency (G-D-Petrov, Feb 28, 2025)
1dc715d  Refactor tests to use arctic_library_v1 for consistency and improved … (G-D-Petrov, Feb 28, 2025)
b9aa11d  Test with session fixtures (G-D-Petrov, Feb 28, 2025)
dbf63ac  Refactor test fixtures to use session scope for improved performance … (G-D-Petrov, Feb 28, 2025)
31d4657  Fix test skip reasons, improve library creation in tests, and mark sl… (G-D-Petrov, Mar 3, 2025)
6ff176b  Refactor storage fixtures and tests for improved performance and cons… (G-D-Petrov, Mar 3, 2025)
3a5396b  Enhance MongoDB and S3 storage fixtures with improved heartbeat setti… (G-D-Petrov, Mar 3, 2025)
28ff538  Refactor GitHub Actions workflow by removing EC2 runner jobs and simp… (G-D-Petrov, Mar 3, 2025)
2d21519  Refactor S3 and MongoDB storage fixtures for improved client handling… (G-D-Petrov, Mar 4, 2025)
b0686fa  Refactor random integer generation and add pytest environment check; … (G-D-Petrov, Mar 4, 2025)
b4c3cca  Test with worksteal (G-D-Petrov, Mar 4, 2025)
981573b  Remove 'worksteal' distribution mode from pytest xdist configuration … (G-D-Petrov, Mar 4, 2025)
53e323e  Enhance storage fixture management and add dynamic library fixture; s… (G-D-Petrov, Mar 11, 2025)
79df70a  Handle S3 bucket cleanup error for Windows 3.7; prevent unnecessary e… (G-D-Petrov, Mar 11, 2025)
ae1abf6  Increase the memory threshold for test_mem_leak_read_all_arctic_lib (G-D-Petrov, Mar 11, 2025)
fc0f6b1  Improve memory leak tracking by filtering out unknown frame information (G-D-Petrov, Mar 12, 2025)
3 changes: 2 additions & 1 deletion .github/workflows/build.yml
@@ -190,7 +190,7 @@ jobs:
include:
- python_deps_ids: [""]
matrix_override: ${{fromJson(needs.common_config.outputs.linux_matrix)}}
pytest_xdist_mode: "--dist worksteal"
pytest_xdist_mode: "-n auto"
- python3: 8
python_deps_ids: ["", -compat38]
matrix_override:
@@ -246,6 +246,7 @@ jobs:
cmake_preset_type: ${{needs.common_config.outputs.cmake_preset_type_resolved}}
matrix: ${{needs.common_config.outputs.windows_matrix}}
persistent_storage: ${{ inputs.persistent_storage }}
pytest_xdist_mode: "-n auto"
pytest_args: ${{inputs.pytest_args}}

persistent_storage_verify_linux:
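The change above swaps `--dist worksteal` for `-n auto` in the pytest-xdist arguments. As a rough sketch of what `-n auto` means, pytest-xdist resolves it to one worker process per CPU; `resolve_numprocesses` below is a hypothetical helper illustrating that resolution, not xdist's actual implementation:

```python
import os

def resolve_numprocesses(spec: str) -> int:
    """Resolve a pytest-xdist "-n" value: "auto" maps to one worker per CPU."""
    if spec == "auto":
        # os.cpu_count() may return None on unusual platforms; fall back to 1
        return os.cpu_count() or 1
    return int(spec)

print(resolve_numprocesses("auto"))  # worker count on this machine
```

With `-n auto`, worker count scales with the runner size, which pairs with the PR's move to bigger Windows machines.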
46 changes: 12 additions & 34 deletions .github/workflows/build_steps.yml
@@ -5,7 +5,6 @@ on:
job_type: {required: true, type: string, description: Selects the steps to enable}
cmake_preset_type: {required: true, type: string, description: release/debug}
matrix: {required: true, type: string, description: JSON string to feed into the matrix}
compile-override: {required: false, type: string, description: Parameter to override the agent that is used for compiling e.g. compile-on-ec2}
cibw_image_tag: {required: false, type: string, description: Linux only. As built by cibw_docker_image.yml workflow}
cibw_version: {required: false, type: string, description: build-python-wheels only. Must match the cibw_image_tag}
python_deps_ids: {default: '[""]', type: string, description: build-python-wheels test matrix parameter. JSON string.}
@@ -14,18 +13,7 @@ on:
pytest_xdist_mode: {default: "", type: string, description: additional argument to pass for pytest-xdist}
pytest_args: {default: "", type: string, description: a way to rewrite the pytest args to change what tests are being run}
jobs:
start_ec2_runner:
if: inputs.compile-override == 'compile-on-ec2'
uses: ./.github/workflows/ec2_runner_jobs.yml
secrets: inherit
with:
job_type: start

compile:
needs: [start_ec2_runner]
if: |
always() &&
!cancelled()
strategy:
matrix:
# Declaring the dummy fields here to aid the Github Actions linting in VSCode and to provide documentation
@@ -45,10 +33,18 @@ jobs:
include:
- ${{fromJSON(inputs.matrix)[0]}} # The items after 0 are for tests only

runs-on: ${{ needs.start_ec2_runner.status != 'failure' && needs.start_ec2_runner.outputs.label || matrix.distro}}
runs-on: ${{ matrix.distro}}
container: ${{ (matrix.os == 'linux' && inputs.job_type != 'build-python-wheels') && matrix.container || null}}
env:
SCCACHE_GHA_VERSION: ${{vars.SCCACHE_GHA_VERSION || 1}} # Setting this env var enables the caching
# 0 - uses the S3 cache, 1 - uses the GHA cache
# this way external PRs (which lack the S3 secrets) can still use the GHA cache
SCCACHE_GHA_VERSION: ${{secrets.AWS_S3_ACCESS_KEY && 0 || 1}}
SCCACHE_BUCKET: arcticdb-ci-sccache-bucket
SCCACHE_ENDPOINT: http://s3.eu-west-1.amazonaws.com
SCCACHE_REGION: eu-west-1
SCCACHE_S3_USE_SSL: false
AWS_ACCESS_KEY_ID: ${{secrets.AWS_S3_ACCESS_KEY}}
AWS_SECRET_ACCESS_KEY: ${{secrets.AWS_S3_SECRET_KEY}}
VCPKG_NUGET_USER: ${{secrets.VCPKG_NUGET_USER || github.repository_owner}}
VCPKG_NUGET_TOKEN: ${{secrets.VCPKG_NUGET_TOKEN || secrets.GITHUB_TOKEN}}
VCPKG_MAN_NUGET_USER: ${{secrets.VCPKG_MAN_NUGET_USER}} # For forks to download pre-compiled dependencies from the Man repo
@@ -63,6 +59,7 @@ jobs:
VCPKG_BINARY_SOURCES VCPKG_NUGET_USER VCPKG_NUGET_TOKEN VCPKG_MAN_NUGET_USER VCPKG_MAN_NUGET_TOKEN
CMAKE_C_COMPILER_LAUNCHER CMAKE_CXX_COMPILER_LAUNCHER CMAKE_BUILD_PARALLEL_LEVEL ARCTIC_CMAKE_PRESET
ARCTICDB_BUILD_DIR TEST_OUTPUT_DIR ARCTICDB_VCPKG_INSTALLED_DIR ARCTICDB_VCPKG_PACKAGES_DIR
SCCACHE_BUCKET SCCACHE_ENDPOINT AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY SCCACHE_REGION SCCACHE_S3_USE_SSL
ARCTICDB_DEBUG_FIND_PYTHON: ${{vars.ARCTICDB_DEBUG_FIND_PYTHON}}
python_impl_name: ${{inputs.python3 > 0 && format('cp3{0}', inputs.python3) || 'default'}}
CIBW_BUILD: ${{format('cp3{0}-{1}', inputs.python3, matrix.cibw_build_suffix)}}
@@ -101,16 +98,12 @@ jobs:
CMAKE_BUILD_PARALLEL_LEVEL: ${{vars.CMAKE_BUILD_PARALLEL_LEVEL}}

# ========================= Leader steps =========================
- name: Remove GitHub default packages (baseimage) # To save space
if: inputs.job_type == 'build-python-wheels' && matrix.os == 'linux'
uses: jlumbroso/free-disk-space@main

- name: Remove GitHub default packages (manylinux) # To save space
if: inputs.job_type == 'cpp-tests' && matrix.os == 'linux'
run: |
du -m /mnt/usr/local/lib/ | sort -n | tail -n 50
nohup rm -rf /mnt/usr/local/lib/android &

- name: Find and remove ccache # See PR: #945
if: matrix.os == 'windows'
run: rm $(which ccache) || true
@@ -257,23 +250,8 @@ jobs:
test_dirs: ${{env.test_dirs}}
cibw_build: ${{env.CIBW_BUILD}}

stop-ec2-runner:
needs: [start_ec2_runner, compile]
if: |
always() &&
inputs.compile-override == 'compile-on-ec2'
uses: ./.github/workflows/ec2_runner_jobs.yml
secrets: inherit
with:
job_type: stop
label: ${{ needs.start_ec2_runner.outputs.label }}
ec2-instance-id: ${{ needs.start_ec2_runner.outputs.ec2-instance-id }}

python_tests:
if: |
always() &&
!failure() &&
!cancelled() &&
inputs.job_type == 'build-python-wheels'
needs: [compile]
strategy:
1 change: 0 additions & 1 deletion cpp/CMakePresets.json
@@ -63,7 +63,6 @@
"generator": "Ninja",
"environment": { "cmakepreset_expected_host_system": "Windows" },
"cacheVariables": {
"ARCTICDB_USE_PCH": "ON",
Review comment (Collaborator), on removing `ARCTICDB_USE_PCH`:
> Why have you removed this? Might make @vasil-pashov sad

Reply (Author):
> sccache doesn't work with precompiled headers, so this was making the build much slower on Windows. AFAIK this is only used in CI; am I correct, @vasil-pashov?
"VCPKG_OVERLAY_TRIPLETS": "custom-triplets",
"VCPKG_TARGET_TRIPLET": "x64-windows-static-msvc"
}
14 changes: 10 additions & 4 deletions python/arcticdb/storage_fixtures/api.py
@@ -132,7 +132,14 @@ def slow_cleanup(self, failure_consequence=""):
self.libs_from_factory.clear()

arctic = self.create_arctic()
for lib in self.libs_names_from_arctic[:]:
libs = set(self.libs_names_from_arctic[:])
with handle_cleanup_exception(self, arctic, consequence=failure_consequence):
# There are some tests that add libraries without using the factory directly (e.g. test_move_storage)
# so make sure that we capture all of the libraries
for lib in arctic.list_libraries():
libs.add(lib)

for lib in libs:
with handle_cleanup_exception(self, lib, consequence=failure_consequence):
arctic.delete_library(lib)

@@ -155,7 +162,7 @@ def replace_uri_field(cls, uri: str, field: ArcticUriFields, replacement: str, s
regex = cls._FIELD_REGEX[field]
match = regex.search(uri)
assert match, f"{uri} does not have {field}"
return f"{uri[:match.start(start)]}{replacement}{uri[match.end(end):]}"
return f"{uri[: match.start(start)]}{replacement}{uri[match.end(end) :]}"


class StorageFixtureFactory(_SaferContextManager):
@@ -188,5 +195,4 @@ def enforcing_permissions_context(self, set_to=True):
self.enforcing_permissions = saved

@abstractmethod
def create_fixture(self) -> StorageFixture:
...
def create_fixture(self) -> StorageFixture: ...
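The `slow_cleanup` change above deletes the union of the library names the fixture tracked and whatever the Arctic client actually lists, because some tests create libraries without going through the factory. A minimal sketch of that union logic (the helper name is hypothetical):

```python
def libraries_to_delete(tracked_names, listed_names):
    # Tests such as test_move_storage create libraries directly via the
    # Arctic client, so union the tracked set with a live listing to make
    # sure every library gets cleaned up.
    return set(tracked_names) | set(listed_names)

print(sorted(libraries_to_delete(["lib_a"], ["lib_a", "lib_b"])))
```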
23 changes: 15 additions & 8 deletions python/arcticdb/storage_fixtures/mongo.py
@@ -17,6 +17,7 @@
from arcticc.pb2.storage_pb2 import EnvironmentConfigsMap
from arcticdb.version_store.helper import add_mongo_library_to_env
from arcticdb.adapters.prefixing_library_adapter_decorator import PrefixingLibraryAdapterDecorator
from arcticdb.util.test import is_pytest_running

# All storage client libraries to be imported on-demand to speed up start-up of ad-hoc test runs
if TYPE_CHECKING:
@@ -28,11 +29,20 @@
log_level = os.getenv("ARCTICDB_mongo_test_fixture_loglevel")
if log_level:
log_level = log_level.upper()
assert log_level in {"DEBUG", "INFO", "WARN", "ERROR"}, \
"Log level must be one of DEBUG, INFO, WARN, ERROR"
assert log_level in {"DEBUG", "INFO", "WARN", "ERROR"}, "Log level must be one of DEBUG, INFO, WARN, ERROR"
logger.setLevel(getattr(logging, log_level))


def get_mongo_client(mongo_uri: str) -> "MongoClient":
from pymongo import MongoClient

if is_pytest_running():
Review comment (Collaborator):
> Confused, why can't we just always supply the heartbeat?

Reply (Author):
> I am greatly reducing the heartbeat frequency because we don't really care about it in testing. It is still needed for actual usage of Mongo, so it is better to keep the default value there.
# Avoids some issues with the PyMongo driver during tests cleanup
return MongoClient(mongo_uri, heartbeatFrequencyMS=86400000)
else:
return MongoClient(mongo_uri)


class MongoDatabase(StorageFixture):
"""Each fixture is backed by its own Mongo database, to make clean up easier."""

@@ -50,11 +60,9 @@ def __init__(self, mongo_uri: str, name: Optional[str] = None, client: Optional[
Optionally reusing a client already connected to ``mongo_uri``.
"""
super().__init__()
from pymongo import MongoClient

assert mongo_uri.startswith("mongodb://")
self.mongo_uri = mongo_uri
self.client = client or MongoClient(mongo_uri)
self.client = client or get_mongo_client(mongo_uri)
if not name:
while True:
logger.debug("Searching for new name")
@@ -114,13 +122,11 @@ def __init__(self, data_dir: Optional[str] = None, port=0, executable="mongod"):
self._client = None

def _safe_enter(self):
from pymongo import MongoClient

cmd = [self._executable, "--port", str(self._port), "--dbpath", self._data_dir]
self._p = GracefulProcessUtils.start(cmd)
self.mongo_uri = f"mongodb://localhost:{self._port}"
wait_for_server_to_come_up(f"http://localhost:{self._port}", "mongod", self._p)
self._client = MongoClient(self.mongo_uri)
self._client = get_mongo_client(self.mongo_uri)

def __exit__(self, exc_type, exc_value, traceback):
if self._client:
@@ -142,6 +148,7 @@ def __str__(self):

def is_mongo_host_running(host):
import requests

try:
res = requests.get(f"http://{host}")
except requests.exceptions.ConnectionError:
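The fixture above only overrides PyMongo's heartbeat when running under pytest. A sketch of that kwargs selection: 86,400,000 ms is 24 hours, effectively silencing the driver's server monitor during tests, while production keeps the driver default (10 seconds, an assumption worth checking against your PyMongo version). The helper name is hypothetical:

```python
def mongo_client_kwargs(in_tests: bool) -> dict:
    """Pick MongoClient keyword arguments depending on the runtime context."""
    if in_tests:
        # One heartbeat per 24h: avoids background monitor churn and
        # teardown races in short-lived test fixtures.
        return {"heartbeatFrequencyMS": 86_400_000}
    # Real usage keeps the default so topology changes are noticed promptly.
    return {}
```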
42 changes: 25 additions & 17 deletions python/arcticdb/storage_fixtures/s3.py
@@ -25,6 +25,7 @@

import werkzeug
from moto.moto_server.werkzeug_app import DomainDispatcherApplication, create_backend_app
import botocore.exceptions

from .api import *
from .utils import (
@@ -202,12 +203,11 @@ def create_test_cfg(self, lib_name: str) -> EnvironmentConfigsMap:


class GcpS3Bucket(S3Bucket):

def __init__(
self,
factory: "BaseS3StorageFixtureFactory",
bucket: str,
native_config: Optional[NativeVariantStorage] = None,
self,
factory: "BaseS3StorageFixtureFactory",
bucket: str,
native_config: Optional[NativeVariantStorage] = None,
):
super().__init__(factory, bucket, native_config=native_config)
self.arctic_uri = self.arctic_uri.replace("s3", "gcpxml", 1)
@@ -527,8 +527,8 @@ def __call__(self, environ, start_response):
with self.lock:
# Mock ec2 imds responses for testing
if path_info in (
"/latest/dynamic/instance-identity/document",
b"/latest/dynamic/instance-identity/document",
"/latest/dynamic/instance-identity/document",
b"/latest/dynamic/instance-identity/document",
):
start_response("200 OK", [("Content-Type", "text/plain")])
return [b"Something to prove imds is reachable"]
@@ -563,11 +563,11 @@ def __call__(self, environ, start_response):
if environ["REQUEST_METHOD"] == "POST" and environ["QUERY_STRING"] == "delete":
response_body = (
b'<?xml version="1.0" encoding="UTF-8"?>'
b'<Error>'
b'<Code>NotImplemented</Code>'
b'<Message>A header or query you provided requested a function that is not implemented.</Message>'
b'<Details>POST ?delete is not implemented for objects.</Details>'
b'</Error>'
b"<Error>"
b"<Code>NotImplemented</Code>"
b"<Message>A header or query you provided requested a function that is not implemented.</Message>"
b"<Details>POST ?delete is not implemented for objects.</Details>"
b"</Error>"
)
start_response(
"501 Not Implemented", [("Content-Type", "text/xml"), ("Content-Length", str(len(response_body)))]
@@ -633,7 +633,8 @@ def __init__(
self.use_internal_client_wrapper_for_testing = use_internal_client_wrapper_for_testing
# This is needed because we might have multiple factories in the same test
# and we need to make sure the bucket names are unique
self.unique_id = "".join(random.choices(string.ascii_letters + string.digits, k=5))
# set the unique_id to the current UNIX timestamp to avoid conflicts
self.unique_id = str(int(time.time()))

def _start_server(self):
port = self.port = get_ephemeral_port(2)
@@ -737,7 +738,14 @@ def cleanup_bucket(self, b: S3Bucket):
self._live_buckets.remove(b)
if len(self._live_buckets):
b.slow_cleanup(failure_consequence="The following delete bucket call will also fail. ")
self._s3_admin.delete_bucket(Bucket=b.bucket)
try:
self._s3_admin.delete_bucket(Bucket=b.bucket)
except botocore.exceptions.ClientError as e:
# There is a problem with xdist on Windows 3.7
# where we try to clean up the bucket but it's already gone
is_win_37 = platform.system() == "Windows" and sys.version_info[:2] == (3, 7)
if e.response["Error"]["Code"] != "NoSuchBucket" and not is_win_37:
raise e
else:
requests.post(
self._iam_endpoint + "/moto-api/reset", verify=False
@@ -750,7 +758,7 @@ def cleanup_bucket(self, b: S3Bucket):

class MotoNfsBackedS3StorageFixtureFactory(MotoS3StorageFixtureFactory):
def create_fixture(self) -> NfsS3Bucket:
bucket = f"test_bucket_{self._bucket_id}"
bucket = f"test_bucket_{self.unique_id}_{self._bucket_id}"
self._s3_admin.create_bucket(Bucket=bucket)
self._bucket_id += 1
out = NfsS3Bucket(self, bucket)
@@ -766,7 +774,7 @@ def _start_server(self):
self._iam_endpoint = f"{self.http_protocol}://localhost:{port}"

self.ssl = (
self.http_protocol == "https"
self.http_protocol == "https"
) # In real world, using https protocol doesn't necessarily mean ssl will be verified
if self.ssl_test_support:
self.ca, self.key_file, self.cert_file, self.client_cert_file = get_ca_cert_for_testing(self.working_dir)
@@ -792,7 +800,7 @@ def _start_server(self):
wait_for_server_to_come_up(self.endpoint, "moto", self._p)

def create_fixture(self) -> GcpS3Bucket:
bucket = f"test_bucket_{self._bucket_id}"
bucket = f"test_bucket_{self.unique_id}_{self._bucket_id}"
self._s3_admin.create_bucket(Bucket=bucket)
self._bucket_id += 1
out = GcpS3Bucket(self, bucket)
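The factory change above switches the per-factory `unique_id` from five random characters to a UNIX timestamp and embeds it in every bucket name, so concurrent factories stop colliding. A sketch of the naming scheme (the helper name is hypothetical); note that two factories constructed within the same second would still share a `unique_id`:

```python
import time

def make_bucket_name(unique_id: str, bucket_counter: int) -> str:
    """Compose a test bucket name from the factory id and a running counter."""
    return f"test_bucket_{unique_id}_{bucket_counter}"

uid = str(int(time.time()))  # one id per factory instance
print(make_bucket_name(uid, 0))
```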
19 changes: 11 additions & 8 deletions python/arcticdb/util/test.py
@@ -397,12 +397,7 @@ def random_integers(size, dtype, min_value: int = None, max_value: int = None):
min_value = max(iinfo.min, platform_int_info.min)
if max_value is None:
max_value = min(iinfo.max, platform_int_info.max)
return np.random.randint(
min_value,
max_value,
size=size,
dtype=dtype
)
return np.random.randint(min_value, max_value, size=size, dtype=dtype)


def get_wide_dataframe(size=10000, seed=0):
@@ -880,7 +875,9 @@ def generic_resample_test(
resample_args["offset"] = offset

if PANDAS_VERSION >= Version("1.1.0"):
expected = original_data.resample(rule, closed=closed, label=label, **resample_args).agg(None, **pandas_aggregations)
expected = original_data.resample(rule, closed=closed, label=label, **resample_args).agg(
None, **pandas_aggregations
)
else:
expected = original_data.resample(rule, closed=closed, label=label).agg(None, **pandas_aggregations)
if drop_empty_buckets_for:
@@ -902,6 +899,7 @@ def generic_resample_test(
else:
assert_frame_equal(expected, received, check_dtype=False)


def equals(x, y):
if isinstance(x, tuple) or isinstance(x, list):
assert len(x) == len(y)
@@ -916,4 +914,9 @@ def equals(x, y):
assert isinstance(y, np.ndarray)
assert np.allclose(x, y)
else:
assert x == y
assert x == y


def is_pytest_running():
"""Check if code is currently running as part of a pytest test."""
return "PYTEST_CURRENT_TEST" in os.environ
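`is_pytest_running` relies on pytest exporting `PYTEST_CURRENT_TEST` (a string of the form `"<nodeid> (<phase>)"`) for the duration of each test. A quick self-contained demonstration that simulates the variable rather than running under pytest:

```python
import os

def is_pytest_running() -> bool:
    """Check if code is currently running as part of a pytest test."""
    return "PYTEST_CURRENT_TEST" in os.environ

# Outside pytest the variable is absent
os.environ.pop("PYTEST_CURRENT_TEST", None)
assert not is_pytest_running()

# Simulate pytest having set it around a test call
os.environ["PYTEST_CURRENT_TEST"] = "test_mod.py::test_x (call)"
assert is_pytest_running()
```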