Replies: 14 comments 21 replies
-
**Background info**

First, let's go through Engula's components again. There is a detailed design doc (engula/design.md at main · engula/engula (github.com)) worth reading. Engula can be used both as a Library and as a Service: as a library, users can freely combine components (e.g. run all components in one standalone process), but this thread focuses on "as a Service" in k8s, so the components we care about are:

Because it's "as a Service", we should simply split them into different processes as needed (avoid thinking about things like deploying journal/storage in the same process); maybe this image is the most common pattern for mapping components to processes.

**Engine**

The Engine Service provides the final Engula service to users; for example, a "KV Engine" provides kv operations. Depending on the use case, users can call the Engine Service via remote call or embedded local call. It will:

so it can have multiple copies and holds leader-or-lease in-memory state.

**Kernel Service**

The Kernel Service maintains cluster metadata. It will:

so it can have multiple copies and holds leader persistent state.

**Journal Service**

Journal provides the journal service and stores the rocksdb-like WAL. It will:

so it has multiple replicas and uses local disk for persistent state.

**Storage Service**

Storage provides the storage service and stores the rocksdb-like SSTs. It will:

so it has persistent state on local disk and may also write to an external service like s3.

**Background Service**

Background is a job-like service that handles work like compaction and cleaning up obsolete files. It will:

so it has no state (though it may need some deduplication), can be triggered as needed and, depending on the job implementation, can also be partitioned and run across parallel processes/nodes.

**In Summary**

The Engine accepts user requests, finds storage/journal through the Kernel Service and uses them directly; the Kernel Service maintains and provides metadata for storage/journal as well as the engine; Background runs as needed and does background jobs; journal/storage just blindly handle requests from the Engine. Engine, Kernel, Journal (and Storage?) need replicas; Engine and Kernel need to elect a leader (journal/storage maybe or maybe not); Engine / Journal / Storage could be performance bottlenecks and should scale up as needed.
-
Package & Distribute imageThe first step for deploying Engula service to k8s should be how to package & distribute binary. Package as imageOption 1: Single image with different argsafter #148, current Engula cli command is: /bin/engula journal start [opts]
/bin/engula storage start [opts]
/bin/engula kernel start [opts] all commands start with so maybe we can follow current command-subcommand solution and release docker image with for example, journal services can be containers:
- args:
- journal
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/engula:0.3
name: engula-journal-serivce
... storage services can be containers:
- args:
- storage
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/engula:0.3
name: engula-storage-serivce
... and kernel also can be run as the same way....background maybe can more flexible but it will bind to engine implement, it seems also be use in same way containers:
- args:
- engine
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/luna:0.3
name: engula-hashengine-serivce
- args:
- background
- start
command:
- /bin/engula
image: uhub.service.ucloud.cn/engula/luna:0.3
name: engula-hashengine-background-serivce
... cons:
**Alternatives**
**Distribute image**

Maybe we can publish & release images to hub.docker.com (owned by @huachaohuang) for each version.

For distributing images, we may also need to handle cross-platform problems. Maybe we can publish amd64 first (it seems apple-m1 can also run amd64 images for dev tests).
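As a sketch of how publishing could be wired up with Docker Buildx in CI (assuming GitHub Actions; the workflow file name, tags, and secret names below are hypothetical):

```yaml
# .github/workflows/release-image.yml (hypothetical)
name: release-image
on:
  push:
    tags: ["v*"]
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # QEMU + Buildx enable building non-native platforms later
      - uses: docker/setup-qemu-action@v1
      - uses: docker/setup-buildx-action@v1
      - uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v2
        with:
          platforms: linux/amd64   # arm64 could be added to this list later
          push: true
          tags: engula/engula:0.3
```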
-
**Deploy topology & deploy flow**

This is maybe the most complex and controversial issue... Now that we have runnable images for each component, how do we set up & run them in k8s? After some learning, the operator pattern seems to be a popular way to handle service deployment.
Let's assume first that the operator pattern is what we need.

**Operator Service**

If we choose the operator pattern, we could naturally choose to supply CRDs for each component:
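For illustration, a custom resource for one component (say the journal) might look like the sketch below; the API group, version, and spec fields are hypothetical, not a settled schema:

```yaml
apiVersion: engula.io/v1alpha1   # hypothetical group/version
kind: Journal
metadata:
  name: engula-journal
spec:
  replicas: 3
  image: engula/engula:0.3
  storageClassName: local-ssd    # where the journal WAL data lands
```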
and we need to implement the "reconcile logic" that watches the CRDs and keeps the real deployment as the CRDs want. So the first question is: which component is suitable for this job?

**Option 1: Reuse the kernel service process**

The Kernel Service has the richest knowledge about the cluster, it already has the role of monitoring & scaling the other components up/down (engine, journal, storage, background), and it would be the only component that needs to communicate with the deploy vendor API (e.g. k8s), so it is natural for it to take care of the CRD reconcile logic?

Pros:
Cons:
**Option 2: Start another Operator Process**

We could also choose to start another "operator process", like `./bin/engula operator start [opt]`, package it in the engula/engula docker image and start it via container args. It can simply run as a k8s Deployment with replicas = 1 (a minimal Deployment sketch is given after the pros and cons below). After it starts, it will:
It's a CRD watcher + proxy component; all change decisions are still made by the kernel-server. Before deploying the engula cluster, we would need to "install engula-operator" first... (maybe with a command like …)

Pros:
Cons:
and there are still many questions to resolve even if we select one of those ways:
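A minimal sketch of such an operator Deployment, reusing the `engula operator start` subcommand proposed above (names and image tag are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: engula-operator
spec:
  replicas: 1            # single instance: it only watches CRDs and proxies decisions
  selector:
    matchLabels:
      app: engula-operator
  template:
    metadata:
      labels:
        app: engula-operator
    spec:
      serviceAccountName: engula-operator   # needs RBAC to manage pods/statefulsets
      containers:
        - name: engula-operator
          image: engula/engula:0.3
          command: ["/bin/engula"]
          args: ["operator", "start"]
```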
**Option 3: welcome to help us and give more advice ~**

**Run components as Pods**

After we have a process acting as the operator, the next thing is to run the components as k8s pods. For each component:

**KernelService, Journal, Engine, Storage**

Depending on the real implementation, a component may or may not have persistent state, but it should have a leader, so it should be something stateful-like (pods identifiable by pod id), with more fine-grained control when adding or removing nodes (maybe something like etcd-operator: not only add a new pod but also call ctl to add membership?). This is tied to the component implementations, and maybe we need to investigate more how other projects handle this :)

**Background Service**

Background can be a long-running service pod or something like a k8s daemonset/job, probably depending on the background implementation, but it doesn't have state or membership. It also needs more investigation.

**How to prepare and specify PVs?**

Components like Journal, Engine, Storage need to mount local disks, so we need to pass in a storage-class at cluster bootstrap (we look at the bootstrap problem in a later section).

**Access Pod**

Maybe we need something like a headless service for the statefulset to help access pods (but we'd need a self-defined statefulset); for example, the engine needs to access pod1 directly by …
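To make the last two points concrete, here is a sketch of a headless Service plus StatefulSet for the journal component; the names, label key, and port are hypothetical. With a headless Service, each pod gets a stable DNS name of the form `<pod>.<service>.<namespace>.svc.cluster.local` (e.g. `engula-journal-0.engula-journal`), and `volumeClaimTemplates` is where the bootstrap-time storage class would be plugged in:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: engula-journal
spec:
  clusterIP: None          # headless: gives per-pod DNS like engula-journal-0.engula-journal
  selector:
    engula.io/component: journal
  ports:
    - port: 14268          # hypothetical journal port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: engula-journal
spec:
  serviceName: engula-journal
  replicas: 3
  selector:
    matchLabels:
      engula.io/component: journal
  template:
    metadata:
      labels:
        engula.io/component: journal
    spec:
      containers:
        - name: journal
          image: engula/engula:0.3
          command: ["/bin/engula"]
          args: ["journal", "start"]
          volumeMounts:
            - name: data
              mountPath: /var/lib/engula
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-ssd   # passed in at cluster bootstrap
        resources:
          requests:
            storage: 100Gi
```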
**Label Pods**

To help index pods, we'd better attach some metadata as pod labels (or annotations?); maybe we need labels like the sketch below:
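A sketch with hypothetical label keys:

```yaml
metadata:
  labels:
    engula.io/cluster: my-cluster    # which Engula cluster the pod belongs to
    engula.io/component: journal     # kernel / engine / journal / storage / background
    engula.io/version: "0.3"         # binary version, useful for rolling upgrades
```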
**Deploy Flow**

Another controversial point is the deploy flow; maybe we can separate it into:
**Bootstrap**

For an empty k8s cluster, we can imagine how to install an engula cluster.

The install step should do a number of things:
Alternatives: "start kernel then let kernel start the others" is an idea that was mentioned in another offline discussion, but is it really needed? It makes engula's operator a little complex and different from others. Maybe we can also directly apply a …

**Scale up & down**

In engula, only storage / journal / engine need to scale up/down; the KernelService monitors the whole cluster and calls the operator to add/remove nodes.
As indicated previously, scale up & down is maybe more complex than what a plain statefulset offers; this part needs more investigation.

**Restart**

The Kernel needs to recognize that it's a restart and not a bootstrap (maybe by persisted info, or by querying existing k8s resources?). The Engine needs to start recovery after the kernel service is ready.

**In Summary**

That's a lot of questions for this part:
-
**Kube Service maybe needed for Engula**

Because we like the poll pattern in the KernelService (the kernel polls storage / journal / background instead of storage / journal / background uploading info to the kernel service), the kernelService needs to discover all components' nodes, so Engula needs the operator to provide:
The first two items have been covered in previous replies; they just create & modify CRs. This reply will focus on the last item. The simple query API is logically:

```rust
fn lookup(&self, component_type: ComponentType) -> Vec<String>
```

It will look up resources by …
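Assuming the lookup is implemented by listing pods (or CRs) by a component label through the k8s API, the ServiceAccount doing the lookup would need read-only RBAC along these lines (the label key and CRD group reuse the hypothetical sketches above):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: engula-lookup
rules:
  - apiGroups: [""]            # core API group, for pods
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["engula.io"]   # hypothetical CRD group from the earlier sketches
    resources: ["journals", "storages", "engines"]
    verbs: ["get", "list", "watch"]
```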
Alternatives: the operator could instead expose a watch API, with the in-memory index cache maintained in the Kernel Client :)
-
**Other**

**Which builder tools to choose?**

kubebuilder is maybe the best choice for building the operator, except that we'd need to write a Go project. The other choice is kube-rs: it uses derive macros to do some of the CRD logic like kubebuilder, and it also has watcher/reflector/controller helpers for writing controllers. Maybe it's not as mature as kubebuilder and still needs some scaffolding code to build a controller, but after some testing it may also be an option (its code is very rusty :).

**How to upgrade or destroy?**

Not very clear now, WIP.

**Security problems?**

Previously we discussed some problems about k8s-api privileges, but there are more things to take care of, like encrypted communication or certificate authentication for multi-tenants.

**Configuration management?**

Not very clear now, WIP.
-
@zojw Thanks for starting this discussion! It helps to explain some internal details that haven't been covered in other documents. I think we have a lot of work to do here.
-
My two cents here.
If the kernel service is deployed per Engula deployment, then it may be better to have a separate process to deal with it, because the operator is cluster-scoped. Kubebuilder is preferred over kube-rs since you have multiple CRDs: kube-rs does not have a dynamically typed client, so managing multiple CRDs in one operator is hard.
I think my concern here is whether we need such an operator in the long term (or what we can get from an operator). We do not need an operator if a helm chart works well for us.
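For comparison, the helm-only route might boil down to a chart whose values expose per-component settings; a hypothetical values.yaml sketch:

```yaml
# values.yaml (hypothetical chart values)
image:
  repository: engula/engula
  tag: "0.3"
kernel:
  replicas: 3
journal:
  replicas: 3
  storageClassName: local-ssd
storage:
  replicas: 3
engine:
  replicas: 2
background:
  enabled: true
```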
-
BTW, it seems we can simply use an upper-level CRD (EngulaCluster) to manage 5 statefulsets, one per component (kernel/engine/journal/storage/background)~ Maybe we only need to define per-component CRDs when statefulsets are unable to meet the requirements~? After some investigation, we need statefulset to support:
The last one seems to be a problem in the current k8s version, but it can be supported by kubernetes/enhancements#2255 in beta, as mentioned by @gaocegege in discord. Maybe we can develop against the beta first (and hope it's ready when the engula hosted service is ready :), or we can choose to use or implement some extension like https://github.com/pingcap/advanced-statefulset. After that, maybe we can use helm (OperatorSDK?) or kubebuilder to handle EngulaCluster directly. For the 0.3 version, maybe we can support it?
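A sketch of what such an upper-level resource could look like, with one section per component (group, version, and fields are hypothetical, matching the earlier sketches):

```yaml
apiVersion: engula.io/v1alpha1   # hypothetical group/version
kind: EngulaCluster
metadata:
  name: demo
spec:
  version: "0.3"
  kernel:
    replicas: 3
  engine:
    replicas: 2
  journal:
    replicas: 3
    storageClassName: local-ssd
  storage:
    replicas: 3
  background:
    replicas: 1
```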
Both engula and engula's k8s deployment will need a long time to stabilize; maybe we can choose a simple implementation first and refine (or rewrite) it later.
-
In my personal opinion, I think it's not a great time to implement a full-function operator, for the following reason: the Engula project is not stable yet. In other words, we don't yet have stable abstractions for the data flow, the user circumstances, and so on. So if we choose to implement the K8S operator based on the complicated design right now (yes, I think the design is too complicated to implement), I think it's a bad choice. My suggestion is maybe we can follow these steps:
-
I joined the PingCAP hackathon 2021 recently, and I am glad to work on this operator as our topic in the hackathon. Thus I'm trying to learn more details here. I'd appreciate it if anyone could help me. Here are the questions:
Thanks 🥂 🍻
-
Wow, I see some very interesting discussions, thanks to everyone here. I don't know much about helm and operators yet, so I'll simply provide some information that may be useful. As of v0.2, Engula provides the following functionalities:
For v0.3, Engula plans to focus on the following two things:
For a long-term plan, my ideas so far are as follows (I may start another discussion about them, so just provide some unorganized information here):
-
Just FYI, https://github.com/cow-on-board/engula-operator/ We will hack on this project during the hackathon (from 2022.01.01 to 2022.01.07). We want to use this opportunity to learn Rust, thus kube-rs is used.
-
Updated: Based on #214 (comment), I tried to build another demo using kubebuilder: https://github.com/zojw/engula-operator (with the zojw/engula branch); more details can be seen in its README.md. It can deploy v0.2 with some dirty modifications, but it's still a long way from a real k8s deployment. I found that engula itself needs to make more things clear before further work:
For the current stage, it's better to temporarily focus on v0.3 functionality and come back to the operator work later~ But help reviewing the demo (it's the first time I've tried to write an operator) or advice on the k8s deployment is always welcome 😄
-
Hi, we are currently preparing to start the design and development of the "Deploy Engula on K8S" work. This is an issue that involves many aspects of both k8s and Engula, so let's start the threads as a root discussion.
To give 🧱s, maybe we can seed the thread with multiple replies to facilitate discussion: