Yeah, currently we have ~450GB in filesystems, but _used_ is only 111GB + 149GB = 260GB, so there's a fair bit of reserve left.
Given the above, how much disk do you reckon we'd need to run local RAID1? And I assume it has to be two separate drives?
Yes -- your quoted Hetzner machine has 2x512GB, so it's good enough for a RAID1 (mirroring) and our current needs.
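For reference, the mirror itself is quick to set up with mdadm -- a rough sketch only, with hypothetical device names (Hetzner's installimage normally does this for us during provisioning anyway):

    # Sketch: mirror two partitions with mdadm (RAID1); device names are examples.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1p3 /dev/nvme1n1p3
    mkfs.ext4 /dev/md0            # or btrfs, depending on what we settle on
    mdadm --detail /dev/md0       # check that both members are active and syncing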
I'd suggest using Docker instead - according to our measurements, a VM layer costs 15-25% in performance, and having everything available in one filesystem makes backups much easier.
I can imagine if we have all the services Dockerized
Hmm, I personally wouldn't do that. All the basic stuff (GitLab, MailMan, postfix [instead of exim], git, sshd access, ...) is IMO easier to administrate in one OS -- even more so if it's something well-known like Debian stable.
With all services in Docker you'll have to check each of them individually for security updates and so on.
Most likely not, because you'll only be using popular software that has well-maintained official images.
then in principle, among other benefits, it should be easier to migrate to different physical hosts in the future -- just push and pull a Docker image. Well, not really -- there will be loads of data to go along with each Docker image and service, plus orchestration to be replicated on a new physical host. But I think those are solved problems, and overall made easier by containerizing stuff -- at least the local environment for each service isn't suddenly changing out from under it when moving to a new physical host with a newer or different Linux flavor.
Yeah, well, at my previous job we switched the OS from Ubuntu to Red Hat with barely an interruption to the service VMs:
http://web.archive.org/web/20160510131958/http://blogs.linbit.com/p/458/dist...
I'm not sure whether we want to have the complexity of an HA stack, though.
How about Erik H's stated goal of being able to partition and delegate maintenance of the different services to different volunteers? In the VM case, each VM would literally have one volunteer in charge of it. Could we do something similar with a Docker deployment?
I guess so.
In that case, I'd expect the volunteer in charge of each container to become familiar with (and practice doing) backup, migration, and restore of their particular container.
More or less guaranteeing that at least 1 out of 10 has some kind of bug in the backup scripts, so we might lose data in a catastrophic event.
I'd prefer having _one_ backup mechanism for everything -- I quite like doing simple rsyncs to another host and doing btrfs snapshots on both sides.
[[ The current btrfs snapshot sending is bandwidth-optimal -- my fear is that if badly broken data gets sent this way, _both_ sides' filesystems become unavailable. With rsync there's less coupling -- just the normal POSIX semantics. ]]
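Roughly what I have in mind, as an untested sketch (hostname and paths are made up, and the target directory has to be a btrfs subvolume for the snapshot to work):

    # On the backup host: plain rsync pull, then a read-only dated snapshot.
    rsync -aHAX --delete --numeric-ids root@cl-net:/srv/ /backup/cl-net/srv/
    btrfs subvolume snapshot -r /backup/cl-net /backup/snapshots/cl-net-$(date +%F)
    # A similar dated snapshot on the source side covers the "both sides" part.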
And for coordination of the several containers running our several services, do you think we could use something simple such as docker-compose, or would we be better off resorting to a "real" orchestration setup such as Kubernetes?
Ouch. No, please let's avoid the complexity.
For your case, K8s is a reasonably easy way to have HA without many issues. You can go very cheap with 3 Hetzner hosts x 50 EUR per month, and use Ubuntu with MicroK8s on the hosts.
And given the services we are aiming to offer, do you think we are talking about standard off-the-shelf Docker images (e.g. for GitLab, nginx, etc.), or are we looking at authoring and maintaining our own custom images (i.e. our own Dockerfiles)?
Yeah, exactly - the complexity starts creeping in.
I guess I wasn't clear enough in my previous mail.
_My_ idea (wish, recommendation, call it what you like) is a normal UNIX server with some system services[1] -- only in a few places would I vote for Docker, like Mailman 2 (which isn't needed anymore -- Erik already converted all the important mailing lists, right?!!), or for builds that require a different userspace (like compiling SBCL for FreeBSD, or Windows, or Ubuntu, or RedHat).
Ad 1: When using systemd, we can use cgroups to isolate the system services and limit their resource usage.
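For example, a per-service drop-in like this caps a single service (unit name and numbers are placeholders):

    # Sketch: resource caps for one system service via a systemd drop-in.
    mkdir -p /etc/systemd/system/example.service.d
    cat > /etc/systemd/system/example.service.d/limits.conf <<'EOF'
    [Service]
    MemoryMax=4G
    CPUQuota=200%
    TasksMax=512
    EOF
    systemctl daemon-reload
    systemctl restart example.service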
We might need VMs for cross-builds anyway, though, if we want to support that at some time. (Or at least qemu-static-<arch> - might be easier to handle)
I'm not sure what qemu-static-<arch> means.
https://github.com/multiarch/qemu-user-static -- "qemu-user-static is to enable an execution of different multi-architecture containers by QEMU and binfmt_misc"
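For example, roughly following that repo's README (untested here):

    # Register the binfmt_misc handlers once on the host ...
    docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
    # ... after which foreign-architecture containers run like native ones:
    docker run --rm -t arm64v8/debian uname -m    # should print aarch64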
My idea was that we might want to support building (and running tests for) SBCL for other architectures like eg. ARM64 on the machine as well (in the GitLab CI/CD pipelines).
In the meantime it occurred to me that it might be simpler (though having an external dependency!) to ask for access to the gcc compile farm for that stuff.
What platforms would you have in mind to support for cross-builds?
Well, http://sbcl.org/platform-table.html lists
X86, AMD64, PPC, PPC64, PPC64le, SPARC, MIPSbe, MIPSle, ARMel, ARMhf, ARM64, RISC-V32, RISC-V64
but that might be a bit much ;)
I assume each platform would add to the CPU/RAM/disk requirements. The question is whether to size for that eventuality now or later.
Well, with qemu-user-static this would run "just" as a normal process (tree), like any other CI/CD job (and in fact mostly started by GitLab pipelines) -- and like anything in the GitLab pipeline it would/should have RAM/CPU limits configured.
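The kind of limit I mean, shown here as a plain docker invocation (the gitlab-runner docker executor has equivalent settings in its config; numbers are placeholders):

    # Sketch: a resource-capped container, the way every CI job could be run.
    docker run --rm --memory=4g --cpus=2 debian:stable \
        /bin/sh -c 'echo "a capped CI job would run here"'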
Philip, could you go ahead and set up one document for this migration effort, in a place you find handy? If cryptpad will do what we need for now then let it be cryptpad. The cryptpad would be not so much for discussion (this mailing list can play that role for now) but more as a live to-do list with up-to-date status (I don't know if that's formatted like an org-mode list or what).
Ack. I'll get back to this thread with a link when I have one.
In addition to HA (High Availability)
HA adds complexity - and so adds to support load. I'm not sure we need it.
I'm not sure you can afford not to have some form of HA. common-lisp.net already has a reputation for poor reliability, and continuing on the same path doesn't seem like a very good idea. All the new Lispers, and many of the old ones, have moved to other services (mostly GitHub) for good reasons.
Perhaps the CL foundation wants to vote on it, though.
and external monitoring, I can imagine just a couple reasons we might want to maintain separate hosts:
1. To have a dedicated build host just for running gitlab-runner and executors for it -- so that heavy pipeline jobs wouldn't bog down our whole server.
An external host means additional dependencies - plus costs.
I'd rather have gitlab jobs run in a cgroup hierarchy that is restricted so that it _can't_ bog down the main services.
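A sketch of what I mean (unit/slice names and numbers are placeholders; note that with the docker executor the job containers live under dockerd's cgroup, so their caps would have to be set on the executor instead):

    # Put gitlab-runner (and the shell-executor jobs it spawns) into a
    # dedicated slice with a low CPU weight and a hard memory cap.
    cat > /etc/systemd/system/builds.slice <<'EOF'
    [Slice]
    CPUWeight=20
    MemoryMax=16G
    EOF
    mkdir -p /etc/systemd/system/gitlab-runner.service.d
    cat > /etc/systemd/system/gitlab-runner.service.d/slice.conf <<'EOF'
    [Service]
    Slice=builds.slice
    EOF
    systemctl daemon-reload
    systemctl restart gitlab-runner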
But a backup machine (which might also run monitoring of the main host) would be a requirement anyway.
That brings up the question: do we still support shell logins into the base OS (as we do now), for users as well as administrators?
I believe that is a nice feature - even more so if we provide a few foreign architecture filesystems that people can use for qemu-static emulation.
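Something like this, as a sketch (paths and suite are placeholders), would give people an arm64 Debian chroot to poke around in:

    # Foreign-architecture chroot via debootstrap + qemu-user-static.
    apt-get install qemu-user-static binfmt-support debootstrap
    debootstrap --arch=arm64 --foreign stable /srv/chroots/arm64 http://deb.debian.org/debian
    cp /usr/bin/qemu-aarch64-static /srv/chroots/arm64/usr/bin/
    chroot /srv/chroots/arm64 /debootstrap/debootstrap --second-stage
    chroot /srv/chroots/arm64 uname -m    # reports aarch64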
And do we still enable the "shell" executor for gitlab-runner, if everything is supposed to be dockerized and we're trying to avoid running things on the bare base OS?
Hmmm, that's a good question. Having _every_ gitlab job in a Docker container means that these become independent of our host OS -- so we can upgrade packages as needed (security fixes, feature updates!) without impacting any CI/CD pipeline.
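Registering such a runner would look roughly like this (flag names from memory, token is a placeholder -- worth checking against the current gitlab-runner docs, since the registration flow has changed across versions):

    # Sketch: a docker-executor runner, so every CI job runs in a container
    # instead of on the host OS.
    gitlab-runner register \
        --non-interactive \
        --url https://gitlab.common-lisp.net/ \
        --registration-token REDACTED \
        --executor docker \
        --docker-image debian:stable \
        --description "cl-net docker runner"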
2. Maybe for a dedicated gitlab host as well, because that program is so freaking heavy.
I suggest switching to a lighter-weight alternative like Gitea or, even better, bailing out of source hosting altogether. It takes a lot of work to run a capable service -- more than volunteers can realistically provide.
Cost?
3. And we might want hosts of different architectures, even Mac & Windows.
How about the GCC build farm?
My idea for building Windows binaries for SBCL would be to use WINE --
that way the job is just another Linux process hierarchy...
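i.e. something along these lines, just as a sketch (the real SBCL Windows build obviously needs more setup than this):

    # Under WINE, a Windows build/test step is just another Linux process tree.
    apt-get install wine wine64
    wine cmd /c 'echo hello from a Windows process on Linux'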
On that topic, we've had an offer of a donation of an Intel NUC to use as a permanent host (specs unknown). I also know of at least one Mac mini M1 which could be donated as a build host. The idea of collecting hardware sounds nice - but is there a viable way to co-locate, provision network resources for, and physically care for such hardware that may come into the Foundation's possession?
Yeah, that's a good question. Easier said than done!
If these machines get used to build public binaries we would have to provide physical security for them as well, which is non-trivial!