[Bug]: Cannot connect to milvus running on a k8s cluster - TypeError: no default reduce due to non-trivial cinit #2697

andreab67 · 2025-03-16T13:25:40Z

Is there an existing issue for this?

I have searched the existing issues

Describe the bug

Milvus is deployed on my home Kubernetes cluster, which has 4 nodes:

F:\Test>kubectl get nodes
NAME            STATUS   ROLES           AGE   VERSION
k8scontroller   Ready    control-plane   40h   v1.32.3
k8snode1        Ready    worker          40h   v1.32.3
k8snode2        Ready    worker          40h   v1.32.3
k8snode3        Ready    worker          40h   v1.32.3

F:\Test>kubectl get pods -n milvus
NAME                                                   READY   STATUS      RESTARTS        AGE
milvus-1742067666-minio-provisioning-rhnnc             0/1     Completed   0               17h
milvus-1742070012-attu-69f6ff669b-vgs9w                1/1     Running     0               16h
milvus-1742070012-data-coordinator-6568d8cf6c-bt7cl    1/1     Running     0               16h
milvus-1742070012-data-node-65f79c54bd-5gkcv           1/1     Running     0               16h
milvus-1742070012-etcd-0                               1/1     Running     0               16h
milvus-1742070012-etcd-1                               1/1     Running     0               16h
milvus-1742070012-etcd-2                               1/1     Running     0               16h
milvus-1742070012-etcd-pre-upgrade-d54cf               0/1     Completed   0               16h
milvus-1742070012-index-coordinator-6df4c47b99-kdwn4   1/1     Running     0               16h
milvus-1742070012-index-node-759fdf54b6-2s4m8          1/1     Running     0               16h
milvus-1742070012-kafka-controller-0                   1/1     Running     2 (3h30m ago)   16h
milvus-1742070012-minio-57768f94cc-7xb82               1/1     Running     0               16h
milvus-1742070012-minio-provisioning-4mxgh             0/1     Completed   0               16h
milvus-1742070012-proxy-76bc496d8c-phr28               1/1     Running     0               16h
milvus-1742070012-query-coordinator-767bbfb68c-gc5d7   1/1     Running     0               16h
milvus-1742070012-query-node-5f967fccd6-jxzzm          1/1     Running     0               16h
milvus-1742070012-root-coordinator-6dbf7695c5-bp7ph    1/1     Running     0               16h

I can successfully connect to Attu and see my database from a browser on the same network.

I extracted the CA certificate of the k8s cluster and put it into a directory called CA:

 Directory of C:\CA

03/16/2025  07:07 AM    <DIR>          .
03/15/2025  04:57 PM             2,100 andrea-ca.crt
03/15/2025  07:02 PM             2,278 client.crt
03/15/2025  07:02 PM             3,292 client.key
03/16/2025  07:06 AM               576 k8s-ca.crt

As you can see, I tried several other certificates in addition to k8s-ca.crt.

I am not able to connect to Milvus from Python.

This is my code prototype:

import PyPDF2
from ebooklib import epub, ITEM_DOCUMENT
from bs4 import BeautifulSoup

import os
import sys
import grpc
from pymilvus import MilvusClient, connections, Collection, CollectionSchema, FieldSchema, DataType, utility
from sentence_transformers import SentenceTransformer

with open(r"C:\CA\k8s-ca.crt", "rb") as f:
    trusted_certs = f.read()

credentials = grpc.ssl_channel_credentials(root_certificates=trusted_certs)

connections.connect(
    alias="default",
    uri="https://milvus.andrea-house.com",  # Note: host and port only.
    token="admin:Milvus",
    db_name="default",
    channel_credentials=credentials
)

client = MilvusClient("default")

def read_pdf(file_path):
    """Extract text from a PDF file."""
    text = ""
    with open(file_path, "rb") as file:
        pdf_reader = PyPDF2.PdfReader(file)
        for page in pdf_reader.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text + "\n"
    return text

def read_epub(file_path):
    """Extract text from an EPUB file."""
    book = epub.read_epub(file_path)
    text = ""
    for item in book.get_items():
        # Only process document items
        if item.get_type() == ITEM_DOCUMENT:
            soup = BeautifulSoup(item.get_content(), features="html.parser")
            text += soup.get_text() + "\n"
    return text

def chunk_text(text, max_chunk_size=500):
    """
    Split the text into chunks each with a maximum number of words.
    Adjust max_chunk_size as needed.
    """
    words = text.split()
    chunks = []
    for i in range(0, len(words), max_chunk_size):
        chunk = " ".join(words[i:i + max_chunk_size])
        chunks.append(chunk)
    return chunks

def create_milvus_collection(collection_name, dim):
    """
    Create a Milvus collection with a given dimension if it doesn't exist.
    The collection includes:
      - An auto-generated primary key "id"
      - A "embedding" field to store vector embeddings
      - A "text" field to store the corresponding text chunk
    """
    if utility.has_collection(collection_name):
        collection = Collection(collection_name)
        print(f"Collection '{collection_name}' already exists.")
        return collection
    else:
        fields = [
            FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
            FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim),
            FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=65535)
        ]
        schema = CollectionSchema(fields, description="Book chunks and embeddings")
        collection = Collection(name=collection_name, schema=schema)
        print(f"Created collection '{collection_name}'.")
        return collection

def main(file_path):
    # Determine file extension and extract text accordingly
    ext = os.path.splitext(file_path)[1].lower()
    if ext == ".pdf":
        book_text = read_pdf(file_path)
    elif ext == ".epub":
        book_text = read_epub(file_path)
    else:
        print("Unsupported file format. Please provide a PDF or EPUB file.")
        return

    if not book_text.strip():
        print("No text was extracted from the file.")
        return

    # Chunk the book text for manageable embedding generation
    chunks = chunk_text(book_text, max_chunk_size=500)
    print(f"Extracted {len(chunks)} text chunks from the book.")

    # Load the sentence transformer model to generate embeddings
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(chunks, convert_to_numpy=True)

    # Connect to Milvus (update host/port if necessary)
    connections.connect("default", host="milvus.andrea-house.com", port="443")

    # Create (or get) a collection named "python" with the proper embedding dimension
    dim = embeddings.shape[1]
    collection = create_milvus_collection("python", dim)

    # Prepare data for insertion; note that the auto_id field ("id") is skipped
    data = [
        embeddings.tolist(),  # embedding field
        chunks                # text field
    ]

    # Insert data into the collection and flush to persist
    insert_result = collection.insert(data)
    collection.flush()
    print("Data inserted into Milvus successfully.")
    print(f"Insert result: {insert_result}")

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python load_book.py <path_to_book>")
        sys.exit(1)
    file_path = sys.argv[1]
    main(file_path)

This is my error:

Traceback (most recent call last):
  File "F:\Test\load_book.py", line 16, in <module>
    connections.connect(
  File "C:\Python311\Lib\site-packages\pymilvus\orm\connections.py", line 390, in connect
    kwargs_copy = copy.deepcopy(kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
                             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\copy.py", line 271, in _reconstruct
    state = deepcopy(state, memo)
            ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
                             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\copy.py", line 161, in deepcopy
    rv = reductor(4)
         ^^^^^^^^^^^
  File "<stringsource>", line 2, in grpc._cython.cygrpc.SSLChannelCredentials.__reduce_cython__
TypeError: no default __reduce__ due to non-trivial __cinit__
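
The failure can be reproduced without gRPC or a Milvus server. The following is a minimal sketch, assuming (based on the traceback above) that the credentials object wraps a Cython object that refuses to be reduced. `FakeCythonCredentials`, `FakeChannelCredentials`, `try_deepcopy`, and the example URI are hypothetical stand-ins, not real gRPC or pymilvus names.

```python
import copy


class FakeCythonCredentials:
    """Hypothetical stand-in for the Cython credentials type in the traceback."""

    def __reduce_ex__(self, protocol):
        # Mimic the error raised by the generated __reduce_cython__ method.
        raise TypeError("no default __reduce__ due to non-trivial __cinit__")


class FakeChannelCredentials:
    """Hypothetical stand-in for the Python wrapper holding the Cython object."""

    def __init__(self, credentials):
        self._credentials = credentials


def try_deepcopy(kwargs):
    """Return the error message if kwargs cannot be deep-copied, else None."""
    try:
        copy.deepcopy(kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Same shape as the keyword arguments passed to connections.connect() above.
kwargs = {
    "uri": "https://milvus.example.invalid",
    "channel_credentials": FakeChannelCredentials(FakeCythonCredentials()),
}
error = try_deepcopy(kwargs)
print(error)
```

Under this assumption, any keyword argument holding such an object makes the `copy.deepcopy(kwargs)` call in `connect` fail before a connection is attempted, while arguments that are plain strings copy without trouble.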

I have been looking into encapsulating the connection and trying to disable deep copy.

No luck...

Thanks for looking at this.

Regards

Andrea

Expected Behavior

I should be able to connect to Milvus from Python, as I do from my browser.

Steps/Code To Reproduce behavior

Try to connect to a Milvus instance running inside a Kubernetes cluster in an air-gapped environment, passing channel_credentials to connections.connect() as in the code above.

Environment details

- Hardware/Software conditions (OS, CPU, GPU, Memory): Kubernetes cluster for Milvus, air-gapped
- Method of installation (Docker, or from source): Helm, using the "milvus" chart repository (https://milvus.io/docs/install_cluster-helm.md)
- Milvus version (v0.3.1, or v0.4.0): 2.5.6
- Milvus configuration (Settings you made in `server_config.yaml`): I swapped the ingress controller certificate with one I generated from andrea-ca.crt. I have added andrea-ca.crt to the trusted root authorities on my Windows system.

Anything else?

I am using private CAs generated with Cloudflare CFSSL - https://github.com/cloudflare/cfssl



XuanYang-cn · 2025-03-19T03:43:04Z

@andreab67 Please refer to this doc https://milvus.io/docs/tls.md

If you're connecting to a Milvus server with TLS, you don't need to read the certificate files yourself. Just pass the corresponding paths to connect:

# One way TLS
connections.connect(
    ...
    secure=True,
    server_pem_path="path_to/server.pem",
    server_name="localhost"
)

# Two way TLS
connections.connect(
    ...
    client_pem_path="path_to/client.pem",
    client_key_path="path_to/client.key",
    ...
)
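
Applied to the setup in this issue, the one-way TLS suggestion could look roughly like the sketch below. It is a sketch under assumptions, not a confirmed fix: `build_one_way_tls_kwargs` is a hypothetical helper, and passing `C:\CA\andrea-ca.crt` as `server_pem_path` assumes that the CA which signed the ingress certificate is the one to trust. The `connections.connect` call is left as a comment so the block runs without a server.

```python
import copy


def build_one_way_tls_kwargs(uri, token, server_pem_path, server_name):
    """Hypothetical helper: collect path-based one-way TLS arguments.

    Only plain strings and booleans are used, so the resulting dict can be
    deep-copied, unlike a dict that carries a gRPC credentials object.
    """
    return {
        "alias": "default",
        "uri": uri,
        "token": token,
        "db_name": "default",
        "secure": True,
        "server_pem_path": server_pem_path,  # a file path, not the file contents
        "server_name": server_name,
    }


kwargs = build_one_way_tls_kwargs(
    uri="https://milvus.andrea-house.com",
    token="admin:Milvus",
    server_pem_path=r"C:\CA\andrea-ca.crt",  # assumption: CA that signed the ingress cert
    server_name="milvus.andrea-house.com",
)

# The arguments survive the deep copy that failed in the traceback.
assert copy.deepcopy(kwargs) == kwargs

# With pymilvus installed and the server reachable, the call would be:
# from pymilvus import connections
# connections.connect(**kwargs)
```

The point of the design is that no `channel_credentials` object is passed at all. Only file paths are handed over, and the certificate files are read on the pymilvus side.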

andreab67 added the kind/bug Something isn't working label Mar 16, 2025

andreab67 changed the title ~~[Bug]: Cannot connect to milvus running on a k8s cluster - not on the cloud~~ [Bug]: Cannot connect to milvus running on a k8s cluster - TypeError: no default reduce due to non-trivial cinit Mar 16, 2025
