|
| sigwinch28 wrote:
| I find myself conflicted between two approaches at work:
|
| 1. Write a provider/extension/whatever for a tool like Terraform
| or Pulumi. I live in a world where the infrastructure doesn't
| move underneath my feet. I am the source of truth. I feel like I
| only need to reconcile changes when _I_ make changes to my IaC
| repositories.
|
| 2. I could write something that exists in a control plane, like
| Kubernetes operators or Crossplane. I live in a world where I
| look at the world, find the delta between current state and
| desired state, then try to reconcile. This is an endless loop.
|
| I feel like these are different approaches with the same goal.
| Why should I decide either way beyond tossing a coin?
|
| Some use cases:
|
| - an internal enterprise DNS system which is not standards-
| compliant with the world at large
|
| - an internal certificate authority and certificate issuing
| system.
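A minimal sketch of the "find the delta, then reconcile" idea from option 2, assuming the managed resources can be modeled as a set of records. The `Record` type and the function names are hypothetical, and nothing here talks to a real DNS system or to Kubernetes. The same `reconcile` function would serve either approach; what differs is the trigger (a repository change for option 1, a timer or watch event for option 2).

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Record:
    """A hypothetical DNS record; stands in for any managed resource."""
    name: str
    target: str


def diff(desired: Set[Record], current: Set[Record]) -> Tuple[Set[Record], Set[Record]]:
    """Return (to_create, to_delete): the delta between desired and current state."""
    return desired - current, current - desired


def reconcile(desired: Set[Record], current: Set[Record]) -> Set[Record]:
    """One pass of the loop: compute the delta and apply it.

    A real controller would call the external system's API here; this
    sketch only returns the state the world should be in after the pass.
    """
    to_create, to_delete = diff(desired, current)
    return (current | to_create) - to_delete
```

Running `reconcile` again on its own output yields an empty delta, which is what makes it safe to call in an endless loop.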
| debarshri wrote:
| A better way to write an operator these days is to use
| kubebuilder [1].
|
| My complaint is that I have seen orgs write operators for random
| stuff, often reinventing the wheel. A lot of operators in orgs are
| the result of resume-driven development. Having said that, they
| often come in handy for complex orchestration.
|
| [1] https://github.com/kubernetes-sigs/kubebuilder
| casperc wrote:
| What would be a good example where an operator would make
| sense?
| EdwardDiego wrote:
| I worked on an operator that manages Kafka in K8s. If you
| want to upgrade the brokers in a Kafka cluster, you generally
| do a rolling upgrade to ensure availability.
|
| The operator will do this for you, you just update the
| version of the broker in the CR spec, it notices, and then
| applies the change.
|
| Likewise, some configuration options can be applied at
| runtime, some need the broker to be restarted to be applied,
| the operator knows which are which, and will again manage the
| process of a rolling restart if needed to apply the change.
|
| You can also define topics and users as custom resources, so
| you have a nice GitOps approach to declaring resources.
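The rolling upgrade and the "runtime vs. restart" distinction described above can be sketched roughly as follows. This is an illustration only, not the logic of any real Kafka operator: the setting name in `DYNAMIC_KEYS`, the `is_healthy` callback, and the version strings are all hypothetical placeholders.

```python
from typing import Callable, Dict, List, Set

# Hypothetical set of settings that can be applied without a restart; a real
# operator would carry the actual list for the broker version it manages.
DYNAMIC_KEYS = {"example.dynamic.key"}


def needs_restart(changed_keys: Set[str]) -> bool:
    """True if any changed setting cannot be applied at runtime."""
    return bool(changed_keys - DYNAMIC_KEYS)


def rolling_upgrade(brokers: Dict[str, str], target: str,
                    is_healthy: Callable[[str], bool]) -> List[str]:
    """Upgrade brokers one at a time, stopping if one does not come back healthy.

    `brokers` maps broker name -> running version and is updated in place.
    Returns the names that were upgraded and passed the health check, in order.
    """
    upgraded = []
    for name in sorted(brokers):
        if brokers[name] == target:
            continue  # already at the desired version; nothing to do
        brokers[name] = target  # stands in for restarting the pod on the new image
        if not is_healthy(name):
            break  # leave the remaining brokers untouched to keep the cluster available
        upgraded.append(name)
    return upgraded
```

Stopping at the first unhealthy broker is the design choice that protects availability: at most one broker is ever in a bad state.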
| debarshri wrote:
| There is a whole list of public operators that you can find on
| OperatorHub [1].
|
| [1] https://operatorhub.io/
| spenczar5 wrote:
| Operators make sense when you need to automatically modify
| resources in response to changes in the cluster's state.
|
| An example that has come up for me is an operator for a Kafka
| Schema Registry. This is a service that needs some
| credentials in a somewhat obscure format so it can
| communicate very directly with a Kafka broker. If the
| broker's certificates (or CA) are modified, then the Schema
| Registry needs to have new credentials generated, and needs
| to be restarted. But the registry shouldn't (obviously) have
| direct access to the broker's certificates. Instead, there's
| a more-privileged subsystem which orchestrates that dance;
| that's the operator.
| sleepybrett wrote:
| kubernetes itself is a collection of controllers/operators.
| It takes manifests like pods and uses that information to
| create the workload in your container runtime on a node with
| the resources it needs.
| debarshri wrote:
| A good example from my perspective is when you are delivering
| an application as a third-party vendor and you wish to automate
| a lot of operational work: backups, scaling based on events,
| other automation driven by cluster events. It starts becoming
| very valuable. I am sure there are many more use cases.
| jrockway wrote:
| I would not write an operator to do any of these things. To
| me an "operator" strongly implies the existence of a CRD
| and the need to manage it. So for autoscaling, HPA/VPA are
| built into k8s. Backups should be an application-level
| feature; when the "take a backup" RPC or time arrives, take
| a backup and dump it in configured object storage.
| Automating stuff based on cluster events also doesn't
| require an operator; call client.V1().Whatever().Watch and
| do what you need to do.
|
| The only moderately justifiable operator I've ever seen is
| cert-manager. Even then, one wonders what it would be like
| if it just updated a Secret every 3 months based on a hard-
| coded config passed to it, and skipped the CRDs.
| jhoelzel wrote:
| - creating databases for your app on the fly.
|
| - scaling applications up and down based on time instead of
| demand, or based on non-metric triggers
|
| - Extending kubernetes to understand your workload
|
| - Automating configuration and management of complex
| applications
|
| - Managing legacy applications that cannot be easily
| containerized or migrated to the cloud.
|
| if you love k8s you'll love operators
|
| the list is endless!
| dilyevsky wrote:
| With respect, being "in love" with a technology is not a
| good way to go about it - it leads to tunnel vision
| remram wrote:
| An operator operates something, i.e. it actively makes
| changes. If you want to deploy an application, a Helm Chart
| is the correct way. It will allow you to have deterministic
| deployment, that you can duplicate multiple times in your
| cluster, and you can dry-run it and see the generated
| manifests.
|
| An operator is needed when you can't just deploy and forget
| about it. An example is the Prometheus operator, which will
| track annotations created by users to configure the scraping
| configuration of your Prometheus instances. Another example
| is cert-manager, which gets certificates into secrets based
| on Certificate and Ingress objects, renews them automatically
| before expiry, and does that by creating ingresses picked up
| by your ingress controller.
|
| The advantage of an operator is that it will react to stuff
| happening in the cluster. The drawback is that it reacts to
| stuff happening, potentially doing unexpected things because
| changes happen at any time and you can't dry-run them.
| Another drawback is that they are usually global, so you
| can't run multiple versions at the same time for different
| namespaces (mainly because custom resource definitions are
| global).
|
| Unfortunately many people think packaging an application =
| creating an operator, and that operator does nothing a chart
| couldn't do.
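The "renews them automatically before expiry" behaviour mentioned above is the kind of ongoing decision a one-shot chart cannot make. A minimal sketch of that check, assuming a fixed 30-day renewal window (an arbitrary value chosen for illustration, not cert-manager's actual default or logic):

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Assumed renewal window; purely illustrative.
RENEW_BEFORE = timedelta(days=30)


def needs_renewal(not_after: datetime, now: datetime,
                  renew_before: timedelta = RENEW_BEFORE) -> bool:
    """True once `now` falls inside the renewal window before expiry."""
    return now >= not_after - renew_before


def due_for_renewal(expiries: Dict[str, datetime], now: datetime) -> List[str]:
    """Names of certificates a controller pass should renew, sorted for stable output."""
    return sorted(name for name, not_after in expiries.items()
                  if needs_renewal(not_after, now))
```

A controller would run `due_for_renewal` on every pass, which is why it has to keep running after the initial deployment.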
| stasmo wrote:
| The CockroachDB example in the article is a perfect
| example of an unnecessary CRD. Acquiring certificates
| within a Kubernetes cluster is a common requirement for
| lots of applications and there are lots of solutions out
| there. Is it really necessary to spend time writing your
| own operator? Now you have a second helm chart and an
| operator to maintain. Now you have to explain to people
| which chart to use. You could get rid of the non-operator
| chart but now I have operators within the cluster acquiring
| certificates in 5 or 6 different ways. Do I have to
| configure the credentials for 6 operators so they can make
| Route53 DNS challenge records?
|
| Edit: maybe we could shift left and ask the app developers
| to add certificate acquisition directly into the app
| source.
| outworlder wrote:
| > Do I have to configure the credentials for 6 operators
| so they can make Route53 DNS challenge records?
|
| A certificate for service to service communication does
| not have to correspond to a public endpoint.
| mdaniel wrote:
| > that operator does nothing a chart couldn't do.
|
| Or it can be _actively harmful_ when it doesn't do any
| error checking whatsoever, making it less accurate than
| `helm template` would be. Relatedly, it's also one more
| thing to monitor, because it can decide to start vomiting
| errors for whatever random reason.
| dpkirchner wrote:
| Neither of those cases really needs an operator --
| Prometheus and cert-manager both have code that watches for
| changes on ingresses/services/custom resources and reacts
| to changes (using permissions granted via RBAC). I've used
| both without an operator and still use Prometheus without
| one.
| cacois wrote:
| I've found operator-sdk [1] (which uses kubebuilder under the
| hood) to be a better starting point for operator development.
|
| [1] https://github.com/operator-framework/operator-sdk
| MuffinFlavored wrote:
| Can you give me an example use case you've run into where you
| needed to write a custom k8s operator/API?
| [deleted]
| darren0 wrote:
| I'm not sure why this is a top post. The definitions of
| controller and operator are completely wrong. The example code is
| for creating a custom api server which is only done in the most
| advanced of advanced use cases. The implementation of the
| apiserver is too naive to demonstrate they have any understanding
| of the complexity that supporting watch will cause.
| mfer wrote:
| The article gets the description of an operator wrong. The
| definition of an operator originally was...
|
| > An Operator is an application-specific controller that
| extends the Kubernetes API to create, configure, and manage
| instances of complex stateful applications on behalf of a
| Kubernetes user. It builds upon the basic Kubernetes resource
| and controller concepts but includes domain or application-
| specific knowledge to automate common tasks.
|
| This is the original definition of an operator [1]. People now
| use them for stateless things, and domain-specific work has
| taken off.
|
| You can look at the Kubernetes docs [2] to see refinements on
| it...
|
| > Kubernetes' operator pattern concept lets you extend the
| cluster's behaviour without modifying the code of Kubernetes
| itself by linking controllers to one or more custom resources.
| Operators are clients of the Kubernetes API that act as
| controllers for a Custom Resource.
|
| [1]
| https://web.archive.org/web/20190113035722/https://coreos.co...
|
| [2] https://kubernetes.io/docs/concepts/extend-
| kubernetes/operat...
| richardwhiuk wrote:
| You don't need to implement a custom API server to implement
| an operator - you can just watch a CR.
| jhoelzel wrote:
| for an operator you do, what you mean is a controller =)
| [deleted]
| timelapse wrote:
| > The definitions of controller and operator are completely
| wrong.
|
| mind clarifying?
| devkulkarni wrote:
| We have an FAQ about Operators here: https://github.com/cloud-
| ark/kubeplus/blob/master/Operator-F...
|
| It should be helpful if you are new to the Operator concept.
|
| Operators are generally useful for handling domain-specific
| actions - for example, performing database backups, installing
| plugins on Moodle/Wordpress, etc. If you are looking for
| application deployment then a Helm chart should be sufficient.
| kimbernator wrote:
| I didn't really enjoy my experience with the few operators I've
| worked with, mainly because they require the maintainer to build
| in some sort of access to basic kubernetes functionality. I see
| the benefit of operators, but I hated that in order to do
| something as simple as define memory/CPU limits to certain
| containers I would need to open a PR to the repo and wait weeks,
| sometimes months, for a new release.
|
| It's frustrating to be a kubernetes admin but not have access to
| basic configuration options because the maintainers of even some
| very high-profile operators (looking at you, AWX) neglected to
| build in access to basic functionality.
| evancordell wrote:
| This is a common frustration of mine as well!
|
| In the latest release of the spicedb-operator[0], I added a
| feature that allows users to specify arbitrary patches over
| operator-managed resources directly in the API (examples in the
| link).
|
| There are some other projects like Kyverno and Gatekeeper that
| try to do this generically with mutating webhooks, but
| embedding a `patches` API into the operator itself gives the
| operator a chance to ensure the changes are within some
| reasonable guardrails.
|
| [0]: https://github.com/authzed/spicedb-
| operator/releases/tag/v1....
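A rough sketch of what "patches with guardrails" could look like, assuming resources are plain nested dictionaries and a patch is a `(path, value)` pair. This is not the spicedb-operator's actual `patches` API; the function name, the patch shape, and the protected paths are all hypothetical.

```python
import copy
from typing import Any, Dict, List, Tuple

Path = Tuple[str, ...]

# Hypothetical guardrail: paths the operator refuses to let users override.
PROTECTED: Tuple[Path, ...] = (
    ("metadata", "name"),
    ("spec", "selector"),
)


def _overlaps(path: Path, protected: Path) -> bool:
    """True if the patch targets a protected path, its parent, or its child."""
    n = min(len(path), len(protected))
    return path[:n] == protected[:n]


def apply_patches(resource: Dict[str, Any],
                  patches: List[Tuple[Path, Any]]) -> Dict[str, Any]:
    """Return a copy of `resource` with each (path, value) patch applied.

    Raises ValueError for a patch that touches a protected path, so the
    operator can reject it instead of producing a broken workload.
    """
    patched = copy.deepcopy(resource)
    for path, value in patches:
        if any(_overlaps(path, p) for p in PROTECTED):
            raise ValueError("patch to " + ".".join(path) + " is not allowed")
        node = patched
        for key in path[:-1]:
            node = node.setdefault(key, {})  # create intermediate maps as needed
        node[path[-1]] = value
    return patched
```

With this shape, a user could set a memory limit on an operator-managed Deployment without a new operator release, while the operator still rejects changes to fields it depends on.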
| remram wrote:
| The SpiceDB operator looks like a prime example of something
| that should have been a Helm Chart. Migrations can be run in
| the containers.
|
| Operators are just the non-containerized daemons of the
| Kubernetes OS. We did all this work to run everything in
| neatly encapsulated containers, and then everyone wants to
| run stuff globally on the whole cluster. What's the point? Do
| we just containerize clusters and start over?
| xyzzy_plugh wrote:
| I'm not sure what you're on about. Operators don't need to
| run in cluster at all. And even then, they can absolutely
| run as containers. And as far as permissions go, that's up
| to you. They're just regular service accounts.
| evancordell wrote:
| I get the sentiment. We held off on building an operator
| until we felt there was actually value in doing so (for the
| most part, Deployments cover the operational needs pretty
| well).
|
| Migrations can be run in containers (and they are, even
| with the operator), but it's actually a lot of work to run
| them at the right time, only once, with the right flags, in
| the right order, waiting for SpiceDB to reach a specific
| spot in phased migrations, etc.
|
| Moving from v1.13.0 to v1.14.0 of SpiceDB requires a multi-
| phase migration to avoid downtime[0], as could any phased
| migration for any stateful workload. The operator will walk
| you through them correctly, without intervention. Users who
| aren't running on Kubernetes or aren't using the operator
| often have problems running these steps correctly.
|
| The value is in this automation, but also in the API
| interface itself. RDS is just some automation and an API on
| top of EC2, and I think RDS has value over running postgres
| on EC2 myself directly.
|
| As for helm charts, this is just my opinion, but I don't
| think they're a good way to distribute software to end
| users. The interface for a helm chart becomes polluted over
| time in the same way that most operator APIs become
| polluted over time, as more and more configuration is
| pulled up to the top. I think helm is better suited to
| managing configuration you write yourself to deploy on your
| own clusters (I realize I'm in the minority here).
|
| [0]:
| https://github.com/authzed/spicedb/releases/tag/v1.14.0
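The "right time, only once, in the right order" requirement for phased migrations can be sketched as a small resumable loop. The phase names and the `run` callback below are hypothetical and do not reflect SpiceDB's real migration phases.

```python
from typing import Callable, List


def run_phases(phases: List[str], completed: List[str],
               run: Callable[[str], bool]) -> List[str]:
    """Run not-yet-completed phases in order, each at most once.

    `completed` is the persisted record of finished phases (an operator
    might keep this in the custom resource's status). `run(phase)` executes
    one phase and returns True once the system is ready for the next one.
    Stops at the first phase that is not ready, so a later pass can resume.
    """
    done = list(completed)
    for phase in phases:
        if phase in done:
            continue  # never re-run a finished phase
        if not run(phase):
            break  # wait; the next reconcile pass picks up from here
        done.append(phase)
    return done
```

Because progress is returned rather than held in memory, the loop survives operator restarts as long as the caller persists the result.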
| ojhughes wrote:
| Adding the patch api is neat! I've solved this in the past by
| embedding the entire PodSpec etc into the CRD
| remram wrote:
| Did you call your CRD "Deployment"?
| sklarsa wrote:
| I might have to borrow that! Very clever
| hintymad wrote:
| > I would need to open a PR to the repo and wait weeks,
| sometimes months, for a new release.
|
| Just curious, is this a limitation of the Operators framework,
| or that of your system's implementation? My knee-jerk reaction
| is that any implementation should absolutely not require
| opening a ticket. After all, Amazon's API mandate happened 20
| years ago, and Netflix followed suit to achieve phenomenal
| productivity for their engineers. I have a hard time imagining
| why any engineer would think that gatekeeping configuration
| with a PR is a good idea (a UI with proper automation and an
| approval process that hides the generated PR for specific use
| cases is a different matter).
| IceWreck wrote:
| Not a kubernetes expert, but my understanding is that
| operators are regular programs that run in a kubernetes
| container and interact with the kubernetes API to
| launch/manage other containers and custom kubernetes
| resources.
|
| An operator (or its custom resource) can be configured by
| Kubernetes YAML/API and it's up to the creator of the operator
| to specify the kind of configuration. If the operator creator
| did not specify options to set cpu/memory limits on the pods
| managed by the operator, then you can't do anything. You have
| to add that feature into the operator and then make a pull
| request and wait for it to be upstreamed.
|
| Or fork it instead. Same thing for helm charts (except
| forking and patching them is easier than forking an
| operator).
| fedreg wrote:
| Here's another example of a custom rust operator,
| https://github.com/mach-kernel/databricks-kube-operator
|
| Written by a co-worker to help manage our databricks projects
| across clusters. Works wonderfully!!
| alexott wrote:
| But why such complexity? Is it easier to maintain than
| terraform code?
| EdwardDiego wrote:
| Yes. Terraform doesn't actively manage resources, operators
| do.
| jhoelzel wrote:
| Oh, I love operators! They usually tie the entire cluster together
| and lead to amazing things. Think of Kubernetes as an advanced
| API server that can be extended endlessly, and operators are the
| way to do it.
|
| There really is no magic, it's all there, and with Go the images are
| usually what? Like 10 MB?
|
| It's essential to have a solid understanding of Kubernetes
| architecture, concepts such as custom resources and controllers,
| and the tools and APIs available for working with Operators.
|
| Don't use Rust though, use an SDK like the Operator SDK or
| kubebuilder. It's native to k8s and you will have a much easier
| time too.
| Thaxll wrote:
| Using Rust for that is a bad idea, just use the official and
| native SDKs ( in Go ). Rust does not have any equivalent to
| https://sdk.operatorframework.io/
| jzelinskie wrote:
| Since Go got generics, working with the Kubernetes API could
| become far more ergonomic. It's been pulling teeth until now. I'm
| eager to see how the upstream APIs change over time.
|
| In the meantime, one of the creators of the Operator
| Framework[0] built a bunch of useful patterns using generics that
| we used to build the SpiceDB Operator[1] called controller-
| idioms[2].
|
| Does anyone know of other efforts to improve the status quo?
|
| [0]: https://operatorframework.io
|
| [1]: https://github.com/authzed/spicedb-operator
|
| [2]: https://github.com/authzed/controller-idioms
| crabbone wrote:
| I've written (well, participated in development of) two
| Kubernetes operators, and support about a dozen of them (in our
| own deployment of Kubernetes): Jupyter, PostgreSQL, a bunch of
| Prometheus operators and a handful of proprietary ones.
|
| In my years of working with Kubernetes I cannot shake the feeling
| that it's, basically, an MLM. It carefully obscures its
| functionality by hiding behind opaque definitions. It doesn't
| really work, when push comes to shove. And, most importantly, it
| survives in a parasitic kind of way: by piggybacking on those who
| develop all kinds of extensions, be it operators, custom
| networking or storage plugins, authentication and so on.
|
| My problem is I cannot find who stands at the top of the pyramid.
| There's the Cloud Native Computing Foundation, but all it does is
| sell certifications nobody really needs... so, that cannot possibly
| be it. No big name really benefits from this in an obvious
| way...
|
| So... anyways, when I hear people argue about how to implement
| this or another extension of Kubernetes, it rings the same as
| when people argue about styles of agile, or code readability etc.
| nonsense. There isn't a good way. There are no acceptance
| criteria. The whole system is flawed to no end.
| _muff1nman_ wrote:
| This article is mistaken from the get-go as an operator is not
| the same as an apiservice. Rather an operator is a wider term for
| something that includes a controller. See
| https://kubernetes.io/docs/concepts/extend-kubernetes/operat...
|
| Also it's important for people reading this article - an
| apiservice (which this article talks about) is very rarely
| something that should be done. An operator is more appropriate
| for nearly all cases except for when you truly need your state
| stored outside of the internal Kubernetes etcd datastore.
| reedjosh wrote:
| Custom Resource + Controller = Operator. Good call!
|
| > Operators are clients of the Kubernetes API that act as
| controllers for a Custom Resource.
| jhoelzel wrote:
| exactly! controlling refers to directing or regulating the
| behavior of something, while operating refers to the actual
| execution or manipulation.
| tenac23 wrote:
| After reading the comments we updated the article
| rdtsc wrote:
| You have a problem: orchestrating some thing in kube, so you
| write some custom operator logic running alongside your main
| product; but now you have two problems to worry about.
|
| I've seen just as many issues, if not more, with debugging the
| operator logic itself as with the main pods/deployments it was
| trying to manage.
|
| So just from a practical point of view, I think it should be a
| last resort after everything else fails (helm charts, etc).
___________________________________________________________________
(page generated 2023-03-09 23:01 UTC) |