Yeah, currently we have ~450GB in filesystems, but _used_ is only 111GB + 149GB = 260GB, so there's a fair bit of reserve left.
Given the above, how much disk do you reckon we'd need to run local RAID1? And I assume it has to be two separate drives?
Yes -- your quoted Hetzner machine has 2x512GB, so it's good enough for a RAID1 (mirroring) and our current needs.
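For reference, the mirror itself is quick to set up with mdadm -- a rough sketch only, with hypothetical device names (Hetzner's installimage normally does this for us during provisioning anyway):

    # Sketch: mirror two partitions with mdadm (RAID1); device names are examples.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1p3 /dev/nvme1n1p3
    mkfs.ext4 /dev/md0            # or btrfs, depending on what we settle on
    mdadm --detail /dev/md0       # check that both members are active and syncing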
I'd suggest using Docker instead - according to our measurements, a VM layer costs 15-25% in performance, and having everything available in one filesystem makes backups much easier.
I can imagine if we have all the services Dockerized
Hmm, I personally wouldn't do that. All the basic stuff (GitLab, MailMan, postfix [instead of exim], git, sshd access, ...) is IMO easier to administrate in one OS -- even more so if it's something well-known like Debian stable.
With all services in Docker you'll have to check each of them individually for security updates and so on.
Most likely not, because you'll only be using popular software that has well-maintained official images.
then in principle, among other benefits, it should be easier to migrate to different physical hosts in the future -- just push and pull a Docker image. Well, not really -- there will be loads of data to go along with each Docker image and service, plus orchestration to be replicated on a new physical host. But I think those are solved problems, and overall made easier by containerizing stuff -- at least the local environment for each service isn't suddenly changing out from under it when moving to a new physical host with a newer or different Linux flavor.
Yeah, well, at my previous job we switched the OS from Ubuntu to Red Hat with barely an interruption to the service VMs:
http://web.archive.org/web/20160510131958/http://blogs.linbit.com/p/458/dist...
I'm not sure whether we want to have the complexity of an HA stack, though.
How about Erik H's stated goal of being able to partition and delegate maintenance of the different services to different volunteers? In the VM case, each VM would literally have one volunteer in charge of it. Could we do something similar with a Docker deployment?
I guess so.
In that case, I'd expect the volunteer in charge of each container to become familiar with (and practice doing) backup, migration, and restore of their particular container.
More or less guaranteeing that at least 1 out of 10 has some kind of bug in the backup scripts, so we might lose data in a catastrophic event.
I'd prefer having _one_ backup mechanism for everything -- I quite like doing simple rsyncs to another host and doing btrfs snapshots on both sides.
[[ The current btrfs snapshot sending is bandwidth-optimal -- my fear is that if badly broken data gets sent this way, _both_ sides' filesystems become unavailable. With rsync there's less coupling -- just the normal POSIX semantics. ]]
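Roughly what I have in mind, as an untested sketch (hostname and paths are made up, and the target directory has to be a btrfs subvolume for the snapshot to work):

    # On the backup host: plain rsync pull, then a read-only dated snapshot.
    rsync -aHAX --delete --numeric-ids root@cl-net:/srv/ /backup/cl-net/srv/
    btrfs subvolume snapshot -r /backup/cl-net /backup/snapshots/cl-net-$(date +%F)
    # A similar dated snapshot on the source side covers the "both sides" part.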
And for coordination of the several containers running our several services, do you think we could use something simple such as docker-compose, or would we be better off resorting to a "real" orchestration setup such as Kubernetes?
Ouch. No, please let's avoid the complexity.
For your case, K8s is a reasonably easy way to have HA without many issues. You can go very cheap with 3 Hetzner hosts x 50 EUR per month, and use Ubuntu with MicroK8s on the hosts.
And given the services we are aiming to offer, do you think we are talking about standard off-the-shelf Docker images (e.g. for GitLab, nginx, etc.), or are we looking at authoring and maintaining our own custom images (i.e. our own Dockerfiles)?
Yeah, exactly - the complexity starts creeping in.
I guess I wasn't clear enough in my previous mail.
_My_ idea (wish, recommendation, call it what you like) is a normal UNIX server with some system services[1] -- only in a few places would I vote for Docker, like Mailman 2 (which isn't needed anymore -- Erik already converted all the important mailing lists, right?!!), or for builds that require a different userspace (like compiling SBCL for FreeBSD, or Windows, or Ubuntu, or RedHat).
Ad 1: When using systemd, we can use cgroups to isolate the system services and limit their resource usage.
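For example, a per-service drop-in like this caps a single service (unit name and numbers are placeholders):

    # Sketch: resource caps for one system service via a systemd drop-in.
    mkdir -p /etc/systemd/system/example.service.d
    cat > /etc/systemd/system/example.service.d/limits.conf <<'EOF'
    [Service]
    MemoryMax=4G
    CPUQuota=200%
    TasksMax=512
    EOF
    systemctl daemon-reload
    systemctl restart example.service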
We might need VMs for cross-builds anyway, though, if we want to support that at some time. (Or at least qemu-static-<arch> - might be easier to handle)
I'm not sure what qemu-static-<arch> means.
https://github.com/multiarch/qemu-user-static -- "qemu-user-static is to enable an execution of different multi-architecture containers by QEMU and binfmt_misc"
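For example, roughly following that repo's README (untested here):

    # Register the binfmt_misc handlers once on the host ...
    docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
    # ... after which foreign-architecture containers run like native ones:
    docker run --rm -t arm64v8/debian uname -m    # should print aarch64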
My idea was that we might want to support building (and running tests for) SBCL for other architectures like eg. ARM64 on the machine as well (in the GitLab CI/CD pipelines).
In the meantime it occurred to me that it might be simpler (though having an external dependency!) to ask for access to the gcc compile farm for that stuff.
What platforms would you have in mind to support for cross-builds?
Well, http://sbcl.org/platform-table.html lists
X86, AMD64, PPC, PPC64, PPC64le, SPARC, MIPSbe, MIPSle, ARMel, ARMhf, ARM64, RISC-V32, RISC-V64
but that might be a bit much ;)
I assume each platform would add to the CPU/RAM/disk requirements. The question is whether to size for that eventuality now or later.
Well, with qemu-user-static this would run "just" as a normal process (tree), like any other CI/CD job (and in fact mostly started by GitLab pipelines) -- and like anything in the GitLab pipeline it would/should have RAM/CPU limits configured.
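The kind of limit I mean, shown here as a plain docker invocation (the gitlab-runner docker executor has equivalent settings in its config; numbers are placeholders):

    # Sketch: a resource-capped container, the way every CI job could be run.
    docker run --rm --memory=4g --cpus=2 debian:stable \
        /bin/sh -c 'echo "a capped CI job would run here"'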
Philip, could you go ahead and set up one document for this migration effort, in a place you find handy? If cryptpad will do what we need for now then let it be cryptpad. The cryptpad would be not so much for discussion (this mailing list can play that role for now) but more as a live to-do list with up-to-date status (I don't know if that's formatted like an org-mode list or what).
Ack. I'll get back to this thread with a link when I have one.
In addition to HA (High Availability)
HA adds complexity - and so adds to support load. I'm not sure we need it.
I'm not sure you can afford not to have some form of HA. common-lisp.net already has a reputation for poor reliability, and continuing on the same path doesn't seem like a very good idea. All the new Lispers, and many of the old ones, have moved to other services (mostly GitHub) for good reasons.
Perhaps the CL foundation wants to vote on it, though.
and external monitoring, I can imagine just a couple reasons we might want to maintain separate hosts:
1. To have a dedicated build host just for running gitlab-runner and executors for it -- so that heavy pipeline jobs wouldn't bog down our whole server.
An external host means additional dependencies - plus costs.
I'd rather have gitlab jobs run in a cgroup hierarchy that is restricted so that it _can't_ bog down the main services.
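A sketch of what I mean (unit/slice names and numbers are placeholders; note that with the docker executor the job containers live under dockerd's cgroup, so their caps would have to be set on the executor instead):

    # Put gitlab-runner (and the shell-executor jobs it spawns) into a
    # dedicated slice with a low CPU weight and a hard memory cap.
    cat > /etc/systemd/system/builds.slice <<'EOF'
    [Slice]
    CPUWeight=20
    MemoryMax=16G
    EOF
    mkdir -p /etc/systemd/system/gitlab-runner.service.d
    cat > /etc/systemd/system/gitlab-runner.service.d/slice.conf <<'EOF'
    [Service]
    Slice=builds.slice
    EOF
    systemctl daemon-reload
    systemctl restart gitlab-runner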
But a backup machine (which might also run monitoring of the main host) would be a requirement anyway.
That brings up the question: do we still support shell logins into the base OS (as we do now), for users as well as administrators?
I believe that is a nice feature - even more so if we provide a few foreign architecture filesystems that people can use for qemu-static emulation.
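Something like this, as a sketch (paths and suite are placeholders), would give people an arm64 Debian chroot to poke around in:

    # Foreign-architecture chroot via debootstrap + qemu-user-static.
    apt-get install qemu-user-static binfmt-support debootstrap
    debootstrap --arch=arm64 --foreign stable /srv/chroots/arm64 http://deb.debian.org/debian
    cp /usr/bin/qemu-aarch64-static /srv/chroots/arm64/usr/bin/
    chroot /srv/chroots/arm64 /debootstrap/debootstrap --second-stage
    chroot /srv/chroots/arm64 uname -m    # reports aarch64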
And do we still enable the "shell" executor for gitlab-runner, if everything is supposed to be dockerized and we're trying to avoid running things on the bare base OS?
Hmmm, that's a good question. Having _every_ gitlab job in a Docker container means that these become independent of our host OS -- so we can upgrade packages as needed (security fixes, feature updates!) without impacting any CI/CD pipeline.
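Registering such a runner would look roughly like this (flag names from memory, token is a placeholder -- worth checking against the current gitlab-runner docs, since the registration flow has changed across versions):

    # Sketch: a docker-executor runner, so every CI job runs in a container
    # instead of on the host OS.
    gitlab-runner register \
        --non-interactive \
        --url https://gitlab.common-lisp.net/ \
        --registration-token REDACTED \
        --executor docker \
        --docker-image debian:stable \
        --description "cl-net docker runner"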
2. Maybe for a dedicated gitlab host as well, because that program is so freaking heavy.
I suggest switching to a lighter-weight alternative like Gitea or, even better, bailing out of source hosting altogether. It takes a lot of work to run a capable service -- more than volunteers can realistically provide.
Cost?
3. And we might want hosts of different architectures, even Mac & Windows.
How about the GCC build farm?
My idea for building Windows binaries for SBCL would be to use WINE --
that way the job is just another Linux process hierarchy...
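i.e. something along these lines, just as a sketch (the real SBCL Windows build obviously needs more setup than this):

    # Under WINE, a Windows build/test step is just another Linux process tree.
    apt-get install wine wine64
    wine cmd /c 'echo hello from a Windows process on Linux'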
On that topic, we've had an offer of a donation of an Intel NUC to use as a permanent host (specs unknown). I also know of at least one Mac mini M1 which could be donated as a build host. The idea of collecting hardware sounds nice - but is there a viable way to co-locate, provision network resources for, and physically care for such hardware that may come into the Foundation's possession?
Yeah, that's a good question. Easier said than done!
If these machines get used to build public binaries we would have to provide physical security for them as well, which is non-trivial!