Make CPU SHM pool importable without NumPy and modernize core deps/CI/telemetry #2

Open
nanocubit wants to merge 1 commit into master from codex-95ym08

Conversation

@nanocubit
Owner

Motivation

  • Allow basic CPU shared-memory smoke checks (allocate/free/coalescing) to run in constrained environments where installing NumPy or heavy GPU deps is not possible.
  • Reduce heavy imports at module-import time and split expensive GPU/Ray dependencies into optional profiles to make core installs and CI CPU jobs lighter.
  • Improve observability and robustness of runtime/server/worker paths with structured logging and telemetry hooks for allocation/lease events and failures.

Description

  • Made NumPy optional in zerolink/core/cpu/shm_pool.py by wrapping the import in try/except ImportError, added a runtime guard in get_numpy_array that raises a clear RuntimeError if NumPy is absent, and replaced the np.dtype type annotation with Any to avoid import-time failures.
  • Reworked dependency layout: moved heavy deps out of base install, added dependency profile files (requirements-gpu.txt, requirements-ray.txt, requirements-cgpu.txt, requirements-dev.txt), added numpy>=1.24.0 to core requirements.txt, and updated pyproject.toml and setup.py to support extras (gpu, ray, cgpu, dev, full) and to allow installation without Torch/CUDA toolchain.
  • Updated CI (.github/workflows/ci.yml) to split the workflow into test-core, test-gpu-profile, and test-ray-profile jobs, pin newer Python versions, and run CPU-only tests with --noconftest where appropriate.
  • Introduced structured logging and telemetry improvements: renamed pynexus metrics to zerolink_*, added new metrics and helper functions (observe_runtime_latency, record_alloc_failure, record_lease_event) in zerolink/monitoring/telemetry.py, and wired telemetry/logging calls into runtime, server and worker code paths (zerolink/runtime/unified.py, zerolink/server/main_server.py, zerolink/workers/gpu_worker.py).
  • Hardened runtime/server/worker behavior: safer SHM/socket defaults (/tmp/zerolink.sock, zerolink_cpu_pool), JSON-structured log messages, better MainServer/MainIPCLeaseManager2P lifecycle and lease bookkeeping, safer handling of missing GPU components, and improved graceful shutdown logic.
  • Cleanups and minor refactors: small docs/README updates, name replacements (pynexus -> zerolink), defensive parsing of requirements in setup.py, an optional CUDA extension build when Torch is available, and adjustments to the tests' conftest so that it is CPU-safe.
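
The optional-NumPy change in the first bullet above follows a common pattern, sketched here under assumptions. The signature of get_numpy_array shown below (buffer, dtype, count) is hypothetical and is not taken from zerolink/core/cpu/shm_pool.py; only the try/except ImportError import, the RuntimeError guard, and the Any annotation come from the description.

```python
from typing import Any

try:
    import numpy as np  # optional: the module must still import without it
except ImportError:
    np = None


def get_numpy_array(buffer: memoryview, dtype: Any, count: int) -> Any:
    """Hypothetical signature: view `count` items of `dtype` over `buffer`.

    `dtype` is annotated as Any rather than np.dtype so that the annotation
    never touches NumPy when the module is imported.
    """
    if np is None:
        # Runtime guard: fail with a clear message only when arrays are requested.
        raise RuntimeError(
            "NumPy is required for get_numpy_array(); install numpy to use array views"
        )
    return np.frombuffer(buffer, dtype=dtype, count=count)
```

With this layout, allocate/free paths that never call get_numpy_array stay usable in environments without NumPy, which is what the smoke checks in the Motivation section rely on.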

Testing

  • Compiled the modified module with python -m py_compile zerolink/core/cpu/shm_pool.py, which succeeded.
  • Ran a CPU SHM coalescing smoke script using SharedMemoryPool (create pool, allocate two blocks, free both, assert coalescing, cleanup), which printed shm coalescing ok and returned successfully.
  • Executed protocol unit tests with pytest tests/test_protocol.py -q and again with pytest --noconftest tests/test_protocol.py -q; both runs passed (3 passed).
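
For readers unfamiliar with the coalescing check, the shape of such a smoke script can be illustrated as follows. This is a minimal sketch using a toy in-process allocator, ToyPool, which is a hypothetical stand-in and not zerolink's SharedMemoryPool (whose constructor and method names are not shown in this PR). It mirrors the steps listed above: allocate two blocks, free both, and assert that the free list merges back into one region.

```python
class ToyPool:
    """Toy first-fit allocator over an offset range (illustration only)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.free_blocks = [(0, size)]  # sorted list of (offset, length)
        self.allocated = {}  # offset -> length

    def allocate(self, length: int) -> int:
        for i, (off, free_len) in enumerate(self.free_blocks):
            if free_len >= length:
                if free_len == length:
                    del self.free_blocks[i]
                else:
                    # Carve the request off the front of the free block.
                    self.free_blocks[i] = (off + length, free_len - length)
                self.allocated[off] = length
                return off
        raise MemoryError("no free block large enough")

    def free(self, offset: int) -> None:
        length = self.allocated.pop(offset)
        self.free_blocks.append((offset, length))
        self.free_blocks.sort()
        # Coalesce: merge every free block that starts where the previous ends.
        merged = [self.free_blocks[0]]
        for off, ln in self.free_blocks[1:]:
            last_off, last_len = merged[-1]
            if last_off + last_len == off:
                merged[-1] = (last_off, last_len + ln)
            else:
                merged.append((off, ln))
        self.free_blocks = merged


pool = ToyPool(1024)
a = pool.allocate(256)
b = pool.allocate(256)
pool.free(a)
pool.free(b)
assert pool.free_blocks == [(0, 1024)]  # one region again, not three fragments
print("shm coalescing ok")
```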

Codex Task

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 58ab5b38aa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

from torch.utils.cpp_extension import BuildExtension, CUDAExtension


def parse_requirements(filename: str) -> list[str]:

P2: Fix Python 3.8 annotation that breaks setup import

The new return annotation list[str] requires PEP 585 (Python ≥3.9). Function annotations are evaluated when the def statement runs, and setup.py runs during installation, so on Python 3.8 this raises TypeError: 'type' object is not subscriptable, even though python_requires='>=3.8' still advertises 3.8 support. Use typing.List[str], or postpone evaluation with from __future__ import annotations, to keep 3.8 installs working.
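
A minimal sketch of the typing.List variant is shown below. Only the function name and the filename: str parameter come from the diff; the body is an assumption about what "defensive parsing" might look like (tolerating a missing file, skipping comments and pip option lines) and is not the code from this PR.

```python
import os
from typing import List


def parse_requirements(filename: str) -> List[str]:
    """Return requirement specifiers from `filename`, or [] if it is missing.

    typing.List[str] is subscriptable on Python 3.8, unlike the builtin list.
    """
    if not os.path.exists(filename):
        # Assumed behavior: a missing profile file yields no requirements.
        return []
    with open(filename, encoding="utf-8") as fh:
        lines = (line.strip() for line in fh)
        # Skip blanks, comments, and pip options such as "-r other.txt".
        return [ln for ln in lines if ln and not ln.startswith(("#", "-"))]
```

The alternative, adding from __future__ import annotations as the first statement of setup.py, keeps the list[str] spelling but requires that nothing evaluates the annotations at runtime.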

Useful? React with 👍 / 👎.

@nanocubit self-assigned this on Feb 8, 2026