Replies: 14 comments 21 replies
-
**Background info**

First, let's go through Engula's components again. There is a detailed design doc (engula/design.md at main · engula/engula (github.com)) worth reading. Engula can be used both as a Library and as a Service: as a library, users can freely combine components (e.g. run all components in one standalone process), but this thread focuses on "as a Service" in k8s, so the components we care about are:

Because it's "as a Service", we should simply split them into different processes as needed (avoid thinking about things like deploying journal/storage in the same process); maybe this image is the most common pattern for mapping components to processes.

**Engine**

The Engine Service provides the final Engula service to users; for example, a "KV Engine" provides kv operations. Depending on the use case, users can call the Engine Service via remote call or embedded local call. It will:

so it can have multiple copies and holds leader-or-lease in-memory state.

**Kernel Service**

The Kernel Service maintains cluster metadata. It will:

so it can have multiple copies and holds leader persistent state.

**Journal Service**

Journal provides the journal service and stores the rocksdb-like WAL. It will:

so it has multiple replicas and uses local disk for persistent state.

**Storage Service**

Storage provides the storage service and stores the rocksdb-like SSTs. It will:

so it has persistent state on local disk and may also write to an external service like s3.

**Background Service**

Background is a job-like service that handles work like compaction and cleaning up obsolete files. It will:

so it has no state (though it may need some deduplication), can be triggered as needed and, depending on the job implementation, can also be partitioned and run across parallel processes/nodes.

**In Summary**

The Engine accepts user requests, finds storage/journal through the Kernel Service and uses them directly; the Kernel Service maintains and provides metadata for storage/journal as well as the engine; Background runs as needed and does background jobs; journal/storage just blindly handle requests from the Engine. Engine, Kernel, Journal (and Storage?) need replicas; Engine and Kernel need to elect a leader (journal/storage maybe or maybe not); Engine / Journal / Storage could be performance bottlenecks and should scale up as needed.
-
Package & Distribute imageThe first step for deploying Engula service to k8s should be how to package & distribute binary. Package as imageOption 1: Single image with different argsafter #148, current Engula cli command is: /bin/engula journal start [opts]
/bin/engula storage start [opts]
/bin/engula kernel start [opts] all commands start with so maybe we can follow current command-subcommand solution and release docker image with for example, journal services can be containers:
- args:
- journal
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/engula:0.3
name: engula-journal-serivce
... storage services can be containers:
- args:
- storage
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/engula:0.3
name: engula-storage-serivce
... and kernel also can be run as the same way....background maybe can more flexible but it will bind to engine implement, it seems also be use in same way containers:
- args:
- engine
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/luna:0.3
name: engula-hashengine-serivce
- args:
- background
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/luna:0.3
name: engula-hashengine-background-serivce
... cons:
**Alternatives**
**Distribute image**

Maybe we can publish & release images to hub.docker.com (owned by @huachaohuang) for each version.

For distributing images, we may also need to handle cross-platform problems. Maybe we can publish amd64 first (it seems apple-m1 can also run amd64 images for dev tests).
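As a sketch of how publishing could be wired up with Docker Buildx in CI (assuming GitHub Actions; the workflow file name, tags, and secret names below are hypothetical):

```yaml
# .github/workflows/release-image.yml (hypothetical)
name: release-image
on:
  push:
    tags: ["v*"]
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # QEMU + Buildx enable building non-native platforms later
      - uses: docker/setup-qemu-action@v1
      - uses: docker/setup-buildx-action@v1
      - uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v2
        with:
          platforms: linux/amd64   # arm64 could be added to this list later
          push: true
          tags: engula/engula:0.3
```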
-
**Deploy topology & deploy flow**

This is maybe the most complex and controversial issue... Now that we have runnable images for each component, how do we set up & run them in k8s? After some learning, the operator pattern seems to be a popular way to handle service deployment.
Let's assume first that the operator pattern is what we need.

**Operator Service**

If we choose the operator pattern, we could naturally choose to supply CRDs for each component:
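For illustration, a custom resource for one component (say the journal) might look like the sketch below; the API group, version, and spec fields are hypothetical, not a settled schema:

```yaml
apiVersion: engula.io/v1alpha1   # hypothetical group/version
kind: Journal
metadata:
  name: engula-journal
spec:
  replicas: 3
  image: engula/engula:0.3
  storageClassName: local-ssd    # where the journal WAL data lands
```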
and we need to implement the "reconcile logic" that watches the CRDs and keeps the real deployment as the CRDs want. So the first question is: which component is suitable for this job?

**Option 1: Reuse the kernel service process**

The Kernel Service has the richest knowledge about the cluster, it already has the role of monitoring & scaling the other components up/down (engine, journal, storage, background), and it would be the only component that needs to communicate with the deploy vendor API (e.g. k8s), so it is natural for it to take care of the CRD reconcile logic?

Pros:
Cons:
**Option 2: Start another Operator Process**

We could also choose to start another "operator process", like `./bin/engula operator start [opt]`, package it in the engula/engula docker image and start it via container args. It can simply run as a k8s Deployment with replicas = 1 (a minimal Deployment sketch is given after the pros and cons below). After it starts, it will:
It's a CRD watcher + proxy component; all change decisions are still made by the kernel-server. Before deploying the engula cluster, we would need to "install engula-operator" first... (maybe with a command like …)

Pros:
Cons:
and there are still many questions to resolve even if we select one of those ways:
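A minimal sketch of such an operator Deployment, reusing the `engula operator start` subcommand proposed above (names and image tag are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: engula-operator
spec:
  replicas: 1            # single instance: it only watches CRDs and proxies decisions
  selector:
    matchLabels:
      app: engula-operator
  template:
    metadata:
      labels:
        app: engula-operator
    spec:
      serviceAccountName: engula-operator   # needs RBAC to manage pods/statefulsets
      containers:
        - name: engula-operator
          image: engula/engula:0.3
          command: ["/bin/engula"]
          args: ["operator", "start"]
```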
**Option 3: welcome to help us and give more advice ~**

**Run components as Pods**

After we have a process acting as the operator, the next thing is to run the components as k8s pods. For each component:

**KernelService, Journal, Engine, Storage**

Depending on the real implementation, a component may or may not have persistent state, but it should have a leader, so it should be something stateful-like (pods identifiable by pod id), with more fine-grained control when adding or removing nodes (maybe something like etcd-operator: not only add a new pod but also call ctl to add membership?). This is tied to the component implementations, and maybe we need to investigate more how other projects handle this :)

**Background Service**

Background can be a long-running service pod or something like a k8s daemonset/job, probably depending on the background implementation, but it doesn't have state or membership. It also needs more investigation.

**How to prepare and specify PVs?**

Components like Journal, Engine, Storage need to mount local disks, so we need to pass in a storage-class at cluster bootstrap (we look at the bootstrap problem in a later section).

**Access Pod**

Maybe we need something like a headless service for the statefulset to help access pods (but we'd need a self-defined statefulset); for example, the engine needs to access pod1 directly by …
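To make the last two points concrete, here is a sketch of a headless Service plus StatefulSet for the journal component; the names, label key, and port are hypothetical. With a headless Service, each pod gets a stable DNS name of the form `<pod>.<service>.<namespace>.svc.cluster.local` (e.g. `engula-journal-0.engula-journal`), and `volumeClaimTemplates` is where the bootstrap-time storage class would be plugged in:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: engula-journal
spec:
  clusterIP: None          # headless: gives per-pod DNS like engula-journal-0.engula-journal
  selector:
    engula.io/component: journal
  ports:
    - port: 14268          # hypothetical journal port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: engula-journal
spec:
  serviceName: engula-journal
  replicas: 3
  selector:
    matchLabels:
      engula.io/component: journal
  template:
    metadata:
      labels:
        engula.io/component: journal
    spec:
      containers:
        - name: journal
          image: engula/engula:0.3
          command: ["/bin/engula"]
          args: ["journal", "start"]
          volumeMounts:
            - name: data
              mountPath: /var/lib/engula
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-ssd   # passed in at cluster bootstrap
        resources:
          requests:
            storage: 100Gi
```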
**Label Pods**

To help index pods, we'd better attach some metadata as pod labels (or annotations?); maybe we need labels like the sketch below:
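A sketch with hypothetical label keys:

```yaml
metadata:
  labels:
    engula.io/cluster: my-cluster    # which Engula cluster the pod belongs to
    engula.io/component: journal     # kernel / engine / journal / storage / background
    engula.io/version: "0.3"         # binary version, useful for rolling upgrades
```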
**Deploy Flow**

Another controversial point is the deploy flow; maybe we can separate it into:
**Bootstrap**

For an empty k8s cluster, we can imagine how to install an engula cluster.

The install step should do a number of things:
Alternatives: "start kernel then let kernel start the others" is an idea that was mentioned in another offline discussion, but is it really needed? It makes engula's operator a little complex and different from others. Maybe we can also directly apply a …

**Scale up & down**

In engula, only storage / journal / engine need to scale up/down; the KernelService monitors the whole cluster and calls the operator to add/remove nodes.
As indicated previously, scale up & down is maybe more complex than what a plain statefulset offers; this part needs more investigation.

**Restart**

The Kernel needs to recognize that it's a restart and not a bootstrap (maybe by persisted info, or by querying existing k8s resources?). The Engine needs to start recovery after the kernel service is ready.

**In Summary**

That's a lot of questions for this part:
-
**Kube Service maybe needed for Engula**

Because we like the poll pattern in the KernelService (the kernel polls storage / journal / background instead of storage / journal / background uploading info to the kernel service), the kernelService needs to discover all components' nodes, so Engula needs the operator to provide:
The first two items have been covered in previous replies; they just create & modify CRs. This reply will focus on the last item. The simple query API is logically:

```rust
fn lookup(&self, component_type: ComponentType) -> Vec<String>
```

It will look up resources by …
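Assuming the lookup is implemented by listing pods (or CRs) by a component label through the k8s API, the ServiceAccount doing the lookup would need read-only RBAC along these lines (the label key and CRD group reuse the hypothetical sketches above):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: engula-lookup
rules:
  - apiGroups: [""]            # core API group, for pods
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["engula.io"]   # hypothetical CRD group from the earlier sketches
    resources: ["journals", "storages", "engines"]
    verbs: ["get", "list", "watch"]
```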
Alternatives: the operator could instead expose a watch API, with the in-memory index cache maintained in the Kernel Client :)
-
**Other**

**Which builder tools to choose?**

kubebuilder is maybe the best choice for building the operator, except that we'd need to write a Go project. The other choice is kube-rs: it uses derive macros to do some of the CRD logic like kubebuilder, and it also has watcher/reflector/controller helpers for writing controllers. Maybe it's not as mature as kubebuilder and still needs some scaffolding code to build a controller, but after some testing it may also be an option (its code is very rusty :).

**How to upgrade or destroy?**

Not very clear now, WIP.

**Security problems?**

Previously we discussed some problems about k8s-api privileges, but there are more things to take care of, like encrypted communication or certificate authentication for multi-tenants.

**Configuration management?**

Not very clear now, WIP.
-
@zojw Thanks for starting this discussion! It helps to explain some internal details that haven't been covered in other documents. I think we have a lot of work to do here.
-
My two cents here.
If the kernel service is deployed per Engula deployment, then it may be better to have a separate process to deal with it, because the operator is cluster-scoped. Kubebuilder is preferred over kube-rs since you have multiple CRDs: kube-rs does not have a dynamically typed client, so managing multiple CRDs in one operator is hard.
I think my concern here is whether we need such an operator in the long term (or what we can get from an operator). We do not need an operator if a helm chart works well for us.
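For comparison, the helm-only route might boil down to a chart whose values expose per-component settings; a hypothetical values.yaml sketch:

```yaml
# values.yaml (hypothetical chart values)
image:
  repository: engula/engula
  tag: "0.3"
kernel:
  replicas: 3
journal:
  replicas: 3
  storageClassName: local-ssd
storage:
  replicas: 3
engine:
  replicas: 2
background:
  enabled: true
```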
-
BTW, it seems we can simply use an upper-level CRD (EngulaCluster) to manage 5 statefulsets, one per component (kernel/engine/journal/storage/background)~ Maybe we only need to define per-component CRDs when statefulsets are unable to meet the requirements~? After some investigation, we need statefulset to support:
The last one seems to be a problem in the current k8s version, but it can be supported by kubernetes/enhancements#2255 in beta, as mentioned by @gaocegege in discord. Maybe we can develop against the beta first (and hope it's ready when the engula hosted service is ready :), or we can choose to use or implement some extension like https://github.com/pingcap/advanced-statefulset. After that, maybe we can use helm (OperatorSDK?) or kubebuilder to handle EngulaCluster directly. For the 0.3 version, maybe we can support it?
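A sketch of what such an upper-level resource could look like, with one section per component (group, version, and fields are hypothetical, matching the earlier sketches):

```yaml
apiVersion: engula.io/v1alpha1   # hypothetical group/version
kind: EngulaCluster
metadata:
  name: demo
spec:
  version: "0.3"
  kernel:
    replicas: 3
  engine:
    replicas: 2
  journal:
    replicas: 3
    storageClassName: local-ssd
  storage:
    replicas: 3
  background:
    replicas: 1
```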
Both engula and engula's k8s deployment will need a long time to stabilize; maybe we can choose a simple implementation first and refine (or rewrite) it later.
-
In my personal opinion, I think it's not a great time to implement a full-function operator, for the following reason: the Engula project is not stable yet. In other words, we don't yet have stable abstractions for the data flow, the user circumstances, and so on. So if we choose to implement the K8S operator based on the complicated design right now (yes, I think the design is too complicated to implement), I think it's a bad choice. My suggestion is maybe we can follow these steps:
-
I joined the PingCAP hackathon 2021 recently, and I am glad to work on this operator as our topic in the hackathon. Thus I'm trying to learn more details here. I'd appreciate it if anyone could help me. Here are the questions:
Thanks 🥂 🍻
-
Wow, I see some very interesting discussions, thanks to everyone here. I don't know much about helm and operators yet, so I'll simply provide some information that may be useful. As of v0.2, Engula provides the following functionalities:
For v0.3, Engula plans to focus on the following two things:
For a long-term plan, my ideas so far are as follows (I may start another discussion about them, so just provide some unorganized information here):
-
Just FYI, https://github.com/cow-on-board/engula-operator/ We will hack on this project during the hackathon (from 2022.01.01 to 2022.01.07). We want to use this opportunity to learn Rust, thus kube-rs is used.
-
Updated: Based on #214 (comment), I tried to build another demo using kubebuilder: https://github.com/zojw/engula-operator (with the zojw/engula branch); more details can be seen in its README.md. It can deploy v0.2 with some dirty modifications, but it's still a long way from a real k8s deployment. I found that engula itself needs to make more things clear before further work:
For the current stage, it's better to temporarily focus on v0.3 functionality and come back to the operator work later~ But help reviewing the demo (it's the first time I've tried to write an operator) or advice on the k8s deployment is always welcome 😄
-
Hi, we are currently preparing to start the design and development of the "Deploy Engula on K8S" work. This is an issue that involves many aspects of both k8s and Engula, so let's start the threads as a root discussion.
To give 🧱s, maybe we can seed the thread with multiple replies to facilitate discussion: