|
| lamontcg wrote:
| Containers should really be viewed as an extension of packages
| (like RPM) with a bit of extra sauce with the layered filesystem,
| a chroot/jail and cgroups for some isolation between different
| software running on the same server.
|
| Back in 2003 or so we tried doing this with microservices that
| didn't need an entire server: multiple software teams ran apps
| on the same physical image to avoid giving whole servers to
| teams that would only use a few percent of the metal. This
| failed pretty quickly, as software bugs would blow up the whole
| image and the different software teams got really grouchy at
| each other. With containerization, the chroot means that the
| software carries along all its own deps and the underlying
| server/metal image can be managed separately, and the cgroups
| mean that software groups are less likely to stomp on each
| other due to bugs.
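|
| As a rough sketch of the cgroup side of this (a hypothetical Go
| example, assuming cgroup v2 mounted at /sys/fs/cgroup and root
| privileges):
|
|     // memlimit.go - cap this process's memory via cgroup v2.
|     package main
|
|     import (
|         "fmt"
|         "os"
|         "path/filepath"
|     )
|
|     func main() {
|         cg := "/sys/fs/cgroup/demo" // hypothetical group name
|         if err := os.MkdirAll(cg, 0o755); err != nil {
|             panic(err)
|         }
|         // Limit the group to 64 MiB; the kernel enforces it.
|         limit := filepath.Join(cg, "memory.max")
|         if err := os.WriteFile(limit, []byte("67108864"), 0o644); err != nil {
|             panic(err)
|         }
|         // Move the current process into the group.
|         procs := filepath.Join(cg, "cgroup.procs")
|         pid := []byte(fmt.Sprint(os.Getpid()))
|         if err := os.WriteFile(procs, pid, 0o644); err != nil {
|             panic(err)
|         }
|         fmt.Println("running under a 64 MiB memory.max limit")
|     }
|
| A buggy app in that group gets OOM-killed on its own rather than
| taking the neighbors down with it.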
|
| This isn't a cloud model of course, it was all on-prem. I don't
| know how kubernetes works in the cloud where you can conceivably
| be running containers on metal sharing with other customers. I
| would tend to assume that under the covers those cloud vendors
| are using Containers on VMs on Metal to provide better security
| guarantees than just containers can offer.
|
| Containers really shouldn't be viewed as competing with VMs in a
| strict XOR sense.
| nikokrock wrote:
|   I don't remember where I read it, but as far as I know, when
|   using Fargate to run containers (with k8s or ECS), AWS will
|   just allocate an EC2 instance for you. Your container will
|   never run on the same VM as another customer's. That, I think,
|   explains the latency you can see when starting a container. To
|   improve it you need to manage your own EC2 cluster with an
|   autoscaling group.
| djhaskin987 wrote:
| Not surprising that VMs running unikernels are as nimble as
| containers, but not all that useful either, at least in general.
| It's much easier to just use a stock Docker image.
| ricardobeat wrote:
| How does LightVM compare to Firecracker VMs? Could it be used for
| on-demand cloud VMs?
| [deleted]
| r3mc0 wrote:
| Containers and VMs are not the same thing at all. They serve
| completely different purposes: multiple containers can be
| combined to create an application/service, while VMs always run
| a complete OS, etc. Anyway, the internet is full of writeups on
| the true purpose of containers; they were never meant to be used
| as a "VM". And about security... meh, everything is insecure
| until proven otherwise.
| wongarsu wrote:
| VMs can have private networks between each other just as
| containers do. That's pretty much what EC2 is about.
| nijave wrote:
| VMs don't need a full OS. You can run a single process directly
| from the kernel with no init system or other userland
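|
|   As a sketch (paths hypothetical), a static Go binary can be
|   the entire userland, booted directly as PID 1:
|
|       // init.go - one static binary as PID 1, no init system.
|       // Build: CGO_ENABLED=0 go build -o init init.go
|       // Boot:  qemu-system-x86_64 -kernel bzImage \
|       //          -initrd fs.cpio -append "rdinit=/init"
|       package main
|
|       import (
|           "fmt"
|           "syscall"
|       )
|
|       func main() {
|           // Mount /proc so basic tooling works; everything
|           // else is optional.
|           _ = syscall.Mount("proc", "/proc", "proc", 0, "")
|           fmt.Println("hello from PID 1, no other userland")
|           // PID 1 must never exit, or the kernel panics.
|           select {}
|       }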
| fnord123 wrote:
| Title is kinda clickbaity (wha-? how can a VM be lighter than a
| container). It's about unikernels.
| JeanSebTr wrote:
| Exactly, unikernels are great for performance and isolation,
| but that can't be compared to a full application stack running
| in a container or VM.
| throwaway894345 wrote:
| > how can a VM be lighter than a container
|
| It's still clickbaity, but the title implies a comparison
| between a very lightweight VM and a heavy-weight container
| (presumably a container based on a full Linux distro). You
| could imagine an analogous article about a tiny house titled
| "my house is smaller than your apartment".
| marcosdumay wrote:
|     It is still lighter in memory only. CPU usage is also a
|     relevant dimension on which to compare them.
| turkishmonky wrote:
|       Not to mention, in the paper, LightVM only had an
|       advantage on boot times. Memory usage was marginally worse
|       than Docker's, even with the unikernel, and Debian on
|       LightVM was drastically worse for CPU usage than Docker
|       (the unikernel's CPU usage was neck and neck with the
|       Debian Docker container's).
|
|       I could see it being an improvement over other VM control
|       planes, but Docker still wins in performance for any
|       equivalent comparison.
| nailer wrote:
| Firecracker VMs are considered lighter than a container and are
| pretty old at this point.
| sidkshatriya wrote:
| I would say that firecracker VMs are _not_ more lightweight
| than Linux containers.
|
|     Linux containers are essentially the separation of Linux
|     processes via various namespaces, e.g. mount, cgroup,
|     process, network, etc. Because this separation is done by
|     Linux internally, there is not much overhead.
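|
|     A sketch of that separation from Go (assuming Linux and
|     sufficient privileges) - the child below gets fresh mount,
|     PID, and network namespaces from plain clone(2) flags, with
|     no hypervisor involved:
|
|         // ns.go - run a shell in new namespaces.
|         package main
|
|         import (
|             "os"
|             "os/exec"
|             "syscall"
|         )
|
|         func main() {
|             cmd := exec.Command("/bin/sh")
|             cmd.Stdin = os.Stdin
|             cmd.Stdout = os.Stdout
|             cmd.Stderr = os.Stderr
|             cmd.SysProcAttr = &syscall.SysProcAttr{
|                 Cloneflags: syscall.CLONE_NEWNS | // own mounts
|                     syscall.CLONE_NEWPID | // sees itself as PID 1
|                     syscall.CLONE_NEWNET, // empty network ns
|             }
|             if err := cmd.Run(); err != nil {
|                 panic(err)
|             }
|         }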
|
|     VMs provide a different kind of separation, one that is
|     arguably more secure because it is backed by hardware --
|     each VM thinks it has the whole machine to itself. When you
|     switch between the VM and the host there is quite a
|     heavyweight context switch (VMEXIT/VMENTER in Intel
|     parlance). It can take a long time compared to the usual
|     context switch from one Linux container (process) to a host
|     process or another Linux container (process).
|
|     But coming back to your point: no, Firecracker VMs are not
|     lighter than a Linux container. They are actually quite
|     heavyweight. The Firecracker VMM, though, is probably the
|     most nimble of all VMMs.
| [deleted]
| kasperni wrote:
| [2017]
| GekkePrutser wrote:
| Sometimes the less strict separation is a feature, not a bug.
|
| Docker, for example, would be pretty useless without folder
| sharing.
| 1MachineElf wrote:
| While a flawed comparison, WSL does use a VM in conjunction
| with the 9p protocol to achieve folder sharing.
| liftm wrote:
| 9p-based folder sharing is (used to be?) possible with qemu,
| too.
| ksbrooksjr wrote:
| It looks like it still is supported [1]. I noticed while
| reading the Lima documentation that they're planning on
| switching from SSHFS to 9P [2].
|
| [1] https://wiki.qemu.org/Documentation/9psetup
|
| [2] https://github.com/lima-
| vm/lima/blob/3401b97e602083cfc55b34e...
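|
|       For reference, a sketch of wiring up a 9p share from Go
|       (flags per the qemu docs above; image and paths are
|       hypothetical):
|
|           // share9p.go - boot a qemu guest with a 9p-shared dir.
|           package main
|
|           import (
|               "os"
|               "os/exec"
|           )
|
|           func main() {
|               cmd := exec.Command("qemu-system-x86_64",
|                   "-m", "1024",
|                   "-drive", "file=guest.img,format=qcow2",
|                   "-virtfs", "local,path=/home/me/shared,"+
|                       "mount_tag=hostshare,security_model=mapped-xattr",
|               )
|               cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
|               if err := cmd.Run(); err != nil {
|                   panic(err)
|               }
|           }
|
|       Inside the guest the share then mounts with
|       "mount -t 9p -o trans=virtio hostshare /mnt".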
| gavinray wrote:
| The issue with unikernels and things like Firecracker is that
| you can't run them on already-virtualized platforms.
|
| I researched Firecracker when I was looking for an alternative to
| Docker for deploying FaaS functions on an OpenFaaS-like clone I
| was building
|
| It would have worked great if the target deployment were bare
| metal, but if you're asking a user to deploy on, e.g., EC2 or
| Fargate or whatnot, you can't use these things, so all points
| are moot.
|
| This is relevant if you're self-hosting or you ARE a service
| provider I guess.
|
| (Yes, I know about Firecracker-in-Docker, but I mean real
| production use)
| eyberg wrote:
|   This is a very common misunderstanding of how these actually
|   get deployed in real life.
|
|   Disclosure: I work with the OPS/Nanos toolchain, so I work
|   with people who deploy unikernels in production.
|
|   When we deploy them to AWS/GCP/Azure/etc. we are _not_
|   managing the networking/storage/etc. the way a k8s would - we
|   push all that responsibility back onto the cloud layer itself.
|   So when you spin up a Nanos instance it spins up as its own
|   EC2 instance with only your application - no Linux, no k8s,
|   nothing. The networking used is the networking provided by the
|   VPC. You can configure it all you want but you aren't managing
|   it. Now, if you have your own infrastructure - knock
|   yourselves out - but for those already in the public clouds
|   this is the preferred route. We essentially treat the VM as
|   the application and the cloud as the operating system.
|
|   This allows you to get much better performance/security and it
|   removes a ton of devops/sysadmin work.
| gamegoblin wrote:
| This is a limitation of whatever virtualized instance you're
| running on, not Firecracker itself. Firecracker depends on KVM,
| and AWS EC2 virtualized instances don't enable KVM. But not all
| virtualized instance services disable KVM.
|
| Obviously, Firecracker being developed by AWS and AWS disabling
| KVM is not ideal :)
|
| Google Cloud, for instance, allows nested virtualization, IIRC.
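|
|   A sketch of checking this from inside an instance: /dev/kvm
|   only exists where the (possibly nested) hypervisor exposes
|   virtualization, and on Intel hosts a module parameter shows
|   whether guests of this kernel would get it in turn:
|
|       // kvmcheck.go - can this host run KVM guests?
|       package main
|
|       import (
|           "fmt"
|           "os"
|           "strings"
|       )
|
|       func main() {
|           if _, err := os.Stat("/dev/kvm"); err != nil {
|               fmt.Println("no /dev/kvm: Firecracker can't run here")
|               return
|           }
|           fmt.Println("/dev/kvm present: KVM guests can run")
|           p := "/sys/module/kvm_intel/parameters/nested"
|           if b, err := os.ReadFile(p); err == nil {
|               fmt.Println("nested:", strings.TrimSpace(string(b)))
|           }
|       }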
| verdverm wrote:
|     I've used GCP nested virtualization. You pay for that
|     overhead in performance, so I wouldn't recommend it without
|     further investigation. We were trying to simulate LUKS with
|     physical key insertion/removal. We would have used it more
|     if we could have gotten GPU passthrough working.
| shepherdjerred wrote:
|     Azure and DigitalOcean allow nested virt as well!
| gavinray wrote:
|     Yeah, but imagine trying to convince people to use an OSS
|     tool where the catch is that you have to deploy it on
|     special instances, only on providers that support nested
|     virtualization.
|
|     Not a great DX, haha. I wound up using GraalVM's "Polyglot"
|     abilities alongside its WASM stuff.
| Sohcahtoa82 wrote:
| > We achieve lightweight VMs by using unikernels
|
| When I attended Infiltrate a few years ago, there was a talk
| about unikernels. The speaker showed off how incredibly insecure
| many of them were, not even offering support for basic modern
| security features like DEP and ASLR.
|
| Have they changed? Or did the speaker likely just cherry-pick
| some especially bad ones?
| eyberg wrote:
| You are probably talking about this:
| https://research.nccgroup.com/wp-content/uploads/2020/07/ncc...
|
|   In short: it's not a fundamental limitation, just that kernels
|   (even small ones) require a ton of work. Nanos, for instance,
|   has page protections, ASLR, virtio-rng (if on GCP), etc.
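|
|   On a stock Linux guest the equivalent ASLR knob is visible in
|   procfs; a quick sketch of checking it:
|
|       // aslrcheck.go - report Linux's ASLR setting.
|       package main
|
|       import (
|           "fmt"
|           "os"
|           "strings"
|       )
|
|       func main() {
|           b, err := os.ReadFile("/proc/sys/kernel/randomize_va_space")
|           if err != nil {
|               panic(err)
|           }
|           // 0 = off, 1 = stacks/mmap/vdso randomized,
|           // 2 = heap (brk) randomized as well.
|           fmt.Println("randomize_va_space =",
|               strings.TrimSpace(string(b)))
|       }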
| sieabah wrote:
| The headline reads like a reddit post so I'm going to assume
| the same still holds true.
| wyager wrote:
| It's not clear to me that VMs actually do offer better isolation
| than well-designed containers (i.e. not Docker).
|
| It's basically a question of: do you trust the safety of
| kernel-mode drivers (e.g. for PV network devices or emulated
| hardware) in VMs, or do you trust the safety of userland APIs
| plus the limited set of kernel APIs available to containers.
|
| On my FreeBSD server, I kind of trust jails with strict device
| rules (i.e. there are only like 5 things in /dev/) over a VM with
| virtualized graphics, networking, etc.
| nijave wrote:
|   I think it gets even more complicated with something like
|   Firecracker, where they recommend you run Firecracker in a
|   jail (and provide a utility to set that up).
| [deleted]
| dirkg wrote:
| Why is a 5-year-old article being posted now? If this were going
| to catch on, it would have by now. I just don't see it being
| used anywhere.
|
| Having a full Linux kernel available is a major benefit that you
| lose, right?
| faeriechangling wrote:
| What I see happening now in the cloud is containers from
| different companies and different security domains running on
| the same VM. I have to think this is fundamentally insecure and
| that VMs are underrated.
|
| When it comes to my client machine, I hear people advocate
| QubesOS, which is based on Xen, for security. They say my
| banking should be done in a different VM than my email, for
| instance. Well, if that's the case, why do we run many
| containers performing different security-sensitive functions on
| the same VM, when containers are not really considered a very
| good security boundary?
|
| From a security design perspective, hardware being exclusive to
| a person/organization, VMs being exclusive to some security
| function, and containers existing on top of that makes more
| sense to me, but we seem to play things more loosely on the
| server side.
| bgm1975 wrote:
|   Doesn't AWS use Firecracker with its Fargate container service
|   (and Lambda too)?
| jupp0r wrote:
| (2017)
| jjtheblunt wrote:
| "orders of magnitude":
|
| Why does anyone ever write "two orders of magnitude" when 100x is
| shorter?
|
| Of course, this presumes 10 as the base and the N orders to be
| the exponent, but I don't think I've ever, since the 90s, seen
| that stilted phrasing used for a base other than 10.
| IshKebab wrote:
| Because two orders of magnitude does not mean 100x. It means on
| the same order as 100x.
| jjtheblunt wrote:
| Do you mean folks using the phrase know big-O, big-omega,
| big-theta, and are thinking along those lines?
| IshKebab wrote:
| It's nothing to do with big-O; it's about logarithms. But
| really I think most people using it just think of it like:
| "which of these is it closest to? 10x, 100x or 1000x?"
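|
|       As a rough formalization (my convention, not anything from
|       the article):
|
|           x \text{ is } n \text{ orders of magnitude larger than } y
|           \iff \operatorname{round}\bigl(\log_{10}(x/y)\bigr) = n
|
|       So "two orders of magnitude" covers ratios from roughly
|       10^{1.5} (about 32x) up to 10^{2.5} (about 316x), not
|       exactly 100x.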
| xahrepap wrote:
| This reminds me: in 2015 I went to Dockercon and one booth that
| This reminds me: in 2015 I went to DockerCon and one booth that
| was fun was VMware's. Basically they had implemented the Docker
| APIs on top of VMware so that they could build and deploy VMs
| using Dockerfiles, etc.
|
| I've casually searched for it in the past and it seems to not
| exist anymore. For me, one of the best parts of Docker is
| building a Docker image (and sharing how it was done via git).
| It would be cool to be able to take the same Dockerfiles and
| pivot them to VMs easily.
| All4All wrote:
| Isn't that essentially what Vagrant and Vagrantfiles do?
| hinkley wrote:
| What is your theory for why Docker won and Vagrant didn't?
|
| Mine is that all of the previous options were too Turing
| Complete, while the Dockerfile format more closely follows
| the Principle of Least Power.
|
| Power users always complain about how their awesome tool gets
| ignored while 'lesser' tools become popular. And then they
| put so much energy into apologizing for problems with the
| tool or deflecting by denigrating the people who complain.
| Maybe the problem isn't with 'everyone'. Maybe Power Users
| have control issues, and pandering to them is not a
| successful strategy.
| duskwuff wrote:
| What turned me off from Vagrant was that Vagrant machines
| were never fully reproducible.
|
| Docker took the approach of specifying images in terms of
| how to create them from scratch. Vagrant, on the other
| hand, took the approach of specifying certain details about
| a machine, then trying to apply changes to an existing
| machine to get it into the desired state. Since the
| Vagrantfile didn't (and couldn't) specify everything about
| that state, you'd inevitably end up with some drift as you
| applied changes to a machine over time -- a development
| team using Vagrant could often end up in situations where
| code behaved differently on two developers' machines
| because their respective Vagrant machines had gotten into
| different states.
|
| It helped that Docker images can be used in production.
| Vagrant was only ever pitched as a solution for
| development; you'd be crazy to try to use it in production.
| mmcnl wrote:
|         Docker is not fully reproducible either. Try building a
|         Docker image on two different machines and then pushing
|         it to a registry. It will always overwrite.
| xahrepap wrote:
|   Yes, which is what I'm using now. But it doesn't use the
|   Docker APIs to allow you to (mostly) reuse a Dockerfile to
|   build a VM or a container.
|
|   Not sure if it would be better than Vagrant. But it was still
|   very interesting.
| verdverm wrote:
|   They might have built it into Google Anthos as part of their
|   partnership. I recall seeing a demo where you could deploy &
|   run any* VMware image on Kubernetes without any changes.
| mmcnl wrote:
| You are talking about declarative configuration of VMs. Vagrant
| offers that, right?
| P5fRxh5kUvp2th wrote:
|     eeeeeh.......
|
|     Yes, but then again... no.
|
|     I mean, yes, Vagrant does offer that, but in no way would I
|     consider Vagrant configuration anything approaching a
|     replacement for Docker configuration.
| JStanton617 wrote:
| This paper consistently mischaracterizes AWS Lambda as a
| "Container as a Service" technology, when in fact it is exactly
| the sort of lightweight VM that they are describing -
| https://aws.amazon.com/blogs/aws/firecracker-lightweight-vir...
| [deleted]
| Jtsummers wrote:
| In fairness to this paper, it was written and published before
| that Firecracker article (2017 vs 2018). From another paper on
| Firecracker providing a bit of history:
|
| > When we first built AWS Lambda, we chose to use Linux
| containers to isolate functions, and virtualization to isolate
| between customer accounts. In other words, multiple functions
| for the same customer would run inside a single VM, but
| workloads for different customers always run in different VMs.
| We were unsatisfied with this approach for several reasons,
| including the necessity of trading off between security and
| compatibility that containers represent, and the difficulties
| of efficiently packing workloads onto fixed-size VMs.
|
| And a bit about the timeline:
|
| > Firecracker has been used in production in Lambda since 2018,
| where it powers millions of workloads and trillions of requests
| per month.
|
| https://www.usenix.org/system/files/nsdi20-paper-agache.pdf
| runnerup wrote:
| Thank you for this detail!
| xani_ wrote:
|     AWS "just" runs Linux, but this is using unikernels though?
| Jtsummers wrote:
| No, it's using a modified version of the Xen hypervisor and
| the numbers they show are boot times and memory usage for
| both unikernels and pared down Linux systems (via tinyx).
| It's described in the abstract:
|
| > We achieve lightweight VMs by using unikernels for
| specialized applications and with Tinyx, a tool that enables
| creating tailor-made, trimmed-down Linux virtual machines.
| wodenokoto wrote:
| For what it's worth, Google's cloud functions are a container
| service. You can even download the final docker container.
| raggi wrote:
| KVM gVisor is a hybrid model in this context. It shares
| properties with both containers and lightweight VMs.
| oxfordmale wrote:
| Kubernetes says no...
|
| The article is light on detail. Containers and VMs have
| different use cases. If you self-host, lightweight VMs are
| likely the better path; however, once you're in the cloud, most
| managed services only provide support for containers.
| nailer wrote:
| > in the cloud most managed services only provide support for
| containers.
|
| Respectfully, comments like these are the reason for Kubernetes
| becoming a meme.
| oxfordmale wrote:
|     There is a huge difference between running on VMs that you
|     have zero access to and actually owning your own VM
|     infrastructure. Yes, AWS Lambda runs on Firecracker;
|     however, it could just as well be running on a FireCheese VM
|     platform and you would be none the wiser, unless AWS
|     published that somewhere.
|
|     I am also not running on Kubernetes, because Kubernetes. AWS
|     ECS and AWS Batch also only handle containerised
|     applications. Even when deploying on EC2 I tend to use
|     containers, as it ensures things keep working consistently
|     when you apply patches to your EC2 environment.
| lrvick wrote:
| You can also use a firecracker runner in k8s to wrap each
| container in a VM for high isolation and security.
| bongobingo1 wrote:
| I'm quite interested in seeing where slim VMs go. Personally I
| don't use Kubernetes; it just doesn't fit my client work, which
| is nearly all single-server, where it makes more sense to just
| run podman systemd units or docker-compose setups.
|
| So from that perspective, when I've peeked at Firecracker, Kata
| Containers, etc., the "small dev DX" isn't quite there yet, or
| maybe never will get there, since the players target other
| spaces (AWS, fly.io, etc.). Stuff like a way to share volumes
| isn't supported, etc. Personally I find Docker's architecture a
| bit distasteful and Podman's tooling isn't _quite_ there yet
| (but very close).
|
| Honestly I don't really care about containers vs VMs, except
| that the VM allegedly offers better security, which is nice, and
| I guess I like poking at things, but they were a little too
| rough for weekend poking.
|
| Is anyone doing "small scale" lightweight VM deployments - maybe
| just in your homelab or for toy projects? Have you found the
| experience better than containers?
| NorwegianDude wrote:
|   I've been using containers since 2007 for isolating workloads.
|   I don't really like Docker for production either, because of
|   the network overhead of the "Docker way" of doing things.
|
|   LXD is definitely my favorite container tool.
| pojzon wrote:
|     How does LXD manage isolation differently from Docker?
|
|     I suppose both create netns, bridges, and interfaces?
| lstodd wrote:
|       It's the same stuff - namespaces, etc. But it doesn't
|       shove greasy fingers into network config the way Docker
|       does. More a tooling question/approach than tech.
| antod wrote:
| LXC/LXD use the same kernel isolation/security features
| Docker does - namespaces, cgroups, capabilities etc.
|
|       After all, it is the kernel functionality that lets you
|       run something as a container. Docker and LXC/LXD are
|       different management / FS packaging layers on top of that.
| staticassertion wrote:
| I assume it's not using seccomp, which Docker uses,
| although seccomp is not Docker specific and you can go
| grab their policy.
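|
|         As a sketch of using seccomp outside Docker (assuming
|         the libseccomp-golang bindings and libseccomp are
|         installed):
|
|             // scmp.go - deny one syscall, no Docker involved.
|             package main
|
|             import (
|                 "fmt"
|                 "syscall"
|
|                 seccomp "github.com/seccomp/libseccomp-golang"
|             )
|
|             func main() {
|                 // Allow everything by default, then deny
|                 // mount(2) with EPERM.
|                 f, err := seccomp.NewFilter(seccomp.ActAllow)
|                 if err != nil {
|                     panic(err)
|                 }
|                 id, err := seccomp.GetSyscallFromName("mount")
|                 if err != nil {
|                     panic(err)
|                 }
|                 deny := seccomp.ActErrno.SetReturnCode(
|                     int16(syscall.EPERM))
|                 if err := f.AddRule(id, deny); err != nil {
|                     panic(err)
|                 }
|                 if err := f.Load(); err != nil {
|                     panic(err)
|                 }
|                 err = syscall.Mount("t", "/mnt", "tmpfs", 0, "")
|                 fmt.Println("mount after filter:", err) // EPERM
|             }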
| xani_ wrote:
|   They went nowhere because containers are more convenient to
|   use, and saving a few MBs of disk/memory is not something most
|   users care about.
|
|   The whole idea was to either use a custom kernel (which
|   inevitably comes with far less information on how to debug
|   anything in it) and redo all of the network and storage
|   plumbing that containers already get via the OS they run on,
|
|   OR use a very slim Linux, which at least people know how to
|   use, but which is STILL more complexity than "just a blob with
|   some namespaces in it" and STILL requires a bunch of config
|   and data juggling between hypervisor and VM just to share some
|   host files with the guest.
|
|   Either way, to get to the level of "just a slim layer of code
|   between hypervisor and your code" you need to do quite a lot
|   of deep plumbing, and when anything goes wrong, debugging is
|   harder. All to get some perceived security and no better
|   performance than just... running the binary in a container.
|
|   It did percolate into the "slim containers" idea, where the
|   container is just a statically compiled binary + a few
|   configs, and while that has the same problems with
|   debuggability, you _can_ just attach a sidecar to it.
|
| I guess next big hype will be "VM bUt YoU RuN WebAsSeMbLy In
| CuStOm KeRnEl"
| evol262 wrote:
| Virtualization is not just "perceived" security over
| containerization. From CPU rings on down, it offers
| dramatically more isolation for security than
| containerization does.
|
|     This isn't about 'what most users care about' either. Most
|     users don't really care about 99% of what container
|     orchestration platforms offer. The providers do absolutely
|     care that malicious users cannot punch out to get a shell on
|     an Azure AKS controller, or go digging around inside /proc
|     to figure out what other tenants are doing, unless the
|     provider is on top of their configuration and regularly
|     updates to match CVEs.
|
| "most users" will end up using one of the frameworks written
| by a "big boy" for their stuff, and they'll end up using
| what's convenient for cloud providers.
|
| The goal of microvms is ultimately to remove everything
| you're talking about from the equation. Kata and other
|     microvm frameworks aim to be basically just another CRI which
| removes the "deep plumbing" you're talking about. The onus is
| on them to make this work, but there's an enormous financial
| payoff, and you'll end up with this whether you think it's
| worthwhile or not.
| convolvatron wrote:
|       In a related vein, most of the distinctions being brought
|       up around containers vs VMs (pricing, debuggability,
|       tooling, overhead) are nothing fundamental at all. They
|       are both executable formats that cut at different layers,
|       and there is really no reason why features of one can't be
|       easily brought to the other.
|
|       Operating above these abstractions can save us time, but
|       please stop confusing the artifacts of implementation with
|       some kind of fundamental truth. It's really hindering our
|       progress.
| evol262 wrote:
| Bringing the features of one to the other is exactly what
| microvms means.
| pojzon wrote:
|         With eBPF there is really not much to argue about in the
|         security space.
|
|         You can do everything.
|
|         The new toolset for containers covers pretty much every
|         possible use case you could imagine.
|
|         The trend will continue in favor of containers and k8s.
| tptacek wrote:
|           It is pretty obviously not the case that eBPF means
|           shared-kernel containers are comparably secure to VMs;
|           there have been recent Linux kernel LPEs that no
|           syscall-scrubbing BPF code would have caught without
|           specifically knowing about the bug first.
| evol262 wrote:
| Let me know when eBPF can probe into ring-1 hypercalls
| into a different kernel other than generically watching
| timing from vm_enter and vm_exit.
|
| Yes, there is a difference between "eBPF can probe what
| is happening in L0 of the host kernel" and "you can probe
| what is happening in other kernels in privileged ring-1
| calls".
|
| No, this is not what you think it is.
| staticassertion wrote:
| I'm not sure what you mean with regards to eBPF but the
| difference between a container and a VM is massive with
| regards to security. Incidentally, my company just
| published a writeup about Firecracker:
| https://news.ycombinator.com/item?id=32767784
| depingus wrote:
|   > So from that perspective, when I've peeked at Firecracker,
|   Kata Containers, etc., the "small dev DX" isn't quite there
|   yet, or maybe never will get there, since the players target
|   other spaces (AWS, fly.io, etc.). Stuff like a way to share
|   volumes isn't supported, etc. Personally I find Docker's
|   architecture a bit distasteful and Podman's tooling isn't
|   quite there yet (but very close).
|
| This is pretty much me and my homelab. I haven't visited it in
| a while, but Weave Ignite might be of interest here.
| https://github.com/weaveworks/ignite
| opentokix wrote:
| Tell me you don't understand containers without telling me you
| don't understand containers.
| anthk wrote:
|   You don't understand VMs either. Ever used virtual network
|   interfaces?
| devmor wrote:
| Yes, when you custom-engineer a specific, complex solution for a
| specific use case, it is generally more performant than a simple
| general-purpose solution.