SkyhookDM-Arrow Docker

Docker image containing SkyhookDM built on top of Arrow along with C++ and Python API clients.

SkyhookDM-Arrow: skyhook image pulls

SkyhookDM-Arrow-Benchmark: skyhook benchmark image pulls

Triggering release builds

./release_skyhook.sh [tag]
./release_skyhook_benchmark.sh [tag]

Deploying SkyhookDM-Arrow on a Rook cluster

  • Change the Ceph image tag in the Rook CRD to the image built from this directory (or use uccross/skyhookdm-arrow:vX.Y.Z directly as the image tag) to switch your Rook Ceph cluster to version vX.Y.Z of SkyhookDM-Arrow.

  • After the cluster is updated, deploy a Pod with the PyArrow library (including the SkyhookFileFormat API) installed to start interacting with the cluster. To do so, follow these steps:

    1. Update the ConfigMap with the configuration options needed to load the Arrow CLS plugins.
    kubectl apply -f cls.yaml
    1. Create a Pod with PyArrow pre-installed for connecting to the cluster and running queries.
    kubectl apply -f client.yaml
    1. Create a CephFS on the Rook cluster.
    kubectl create -f filesystem.yaml
    1. Copy the Ceph configuration and keyring from any OSD/MON Pod to the playground Pod.
    # copy the ceph config
    kubectl -n [namespace] cp [any-osd/mon-pod]:/var/lib/rook/[namespace]/[namespace].config ceph.conf
    kubectl -n [namespace] cp ceph.conf rook-ceph-playground:/etc/ceph/ceph.conf
    
    # copy the keyring
    kubectl -n [namespace] cp [any-osd/mon-pod]:/var/lib/rook/[namespace]/client.admin.keyring keyring
    kubectl -n [namespace] cp keyring rook-ceph-playground:/etc/ceph/keyring

    NOTE: You need to manually change the keyring path in the copied Ceph config to /etc/ceph/keyring.

    1. Check the connection to the cluster from the client Pod.
    # get a shell into the client pod
    kubectl -n [namespace] exec -it rook-ceph-playground -- bash
    
    # check the connection status
    ceph -s
    1. Install ceph-fuse and use it to mount CephFS at a path in the client Pod. (In a later release, ceph-fuse will come pre-installed in the SkyhookDM image.)
    yum install ceph-fuse
    mkdir -p /mnt/cephfs
    ceph-fuse --client_fs cephfs /mnt/cephfs 

    NOTE: The client_fs name may differ. Check the filesystem.yaml file for the name of the filesystem you are using.

    1. Download an example dataset into the CephFS mount point. For example:
    cd /mnt/cephfs
    wget https://raw.githubusercontent.com/JayjeetAtGithub/zips/main/nyc.zip
    unzip nyc.zip
    1. Modify the example Python script according to your needs and run it.
    python3 example.py
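
The keyring note in the steps above asks for a manual edit of the copied Ceph config. As a minimal sketch, that edit could be scripted in Python instead. It assumes the config contains a line of the form `keyring = <path>`; that layout is an assumption, so check your own file first. The helper names `set_keyring_path` and `update_conf_file` and the sample config values are hypothetical and are not part of SkyhookDM or Ceph.

```python
import re

# Matches a `keyring = <value>` line. Group 1 keeps the key, spacing, and `=`
# so that only the value is replaced.
_KEYRING_LINE = re.compile(r"^([ \t]*keyring[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_keyring_path(conf_text, keyring_path="/etc/ceph/keyring"):
    """Return conf_text with every `keyring = ...` entry set to keyring_path."""
    # A function replacement avoids backslash escapes being interpreted in the path.
    return _KEYRING_LINE.sub(lambda m: m.group(1) + keyring_path, conf_text)


def update_conf_file(conf_path="/etc/ceph/ceph.conf", keyring_path="/etc/ceph/keyring"):
    """Rewrite the keyring entry of the config file at conf_path in place."""
    with open(conf_path) as f:
        text = f.read()
    with open(conf_path, "w") as f:
        f.write(set_keyring_path(text, keyring_path))


# Illustrative config with made-up values (assumed layout, not taken from a real cluster).
sample = """[global]
mon_host = 10.0.0.1:6789

[client.admin]
keyring = /var/lib/rook/rook-ceph/client.admin.keyring
"""

print(set_keyring_path(sample))
```

Inside the client Pod, calling `update_conf_file()` with its defaults would apply the same change to /etc/ceph/ceph.conf. Text without a `keyring` line is returned unchanged, so the function does not add a missing entry.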