Title: Toward an automated tracking of OpenBSD ports contributions
Author: Solène
Date: 15 November 2020
Tags: openbsd automation
Description: 

Since my previous article about a continuous integration service to
track OpenBSD ports contributions, I made a simple proof of concept
that allowed me to track what works and what doesn't work.

## The continuous integration goal

A first step for the CI service would be to create a database of diffs
sent to ports. This would allow people to track what has been sent and
not yet committed, and what the state of each contribution is
(builds/doesn't build, applies/doesn't apply). I would proceed
following this logic:

* a mail arrives and is sent to the pipeline
* it's possible to find a pkgpath out of the file
* the diff applies
* distfiles can be fetched
* portcheck is happy

Step 1 is easy: mail could be dumped into a directory that gets
scanned every X minutes.

Step 2 is already done in my POC using a shell script. It's quite hard
and required tuning, because submitted diffs are made with diff(1), cvs
diff or git diff. The important part is to retrieve the pkgpath, like
"lang/php/7.4". This allows testing that the port exists.

Step 3 is important. I found three cases so far when applying a diff:

* it works, we can then register in the database that it can be used
to build
* it doesn't work, human investigation required
* the diff is already applied and patch(1) thinks you want to reverse
it. It's already committed!

Being able to check if a diff is applied is really useful. When
building the contributions database, a daily check of patches that are
known to apply can be done. If a reverse patch is detected, this means
it's committed and the entry can be deleted from the database. This
would be rather useful to keep the database clean automatically over
time.
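
The three cases can be told apart without modifying the tree by
running patch(1) in dry-run mode, first forward and then reversed. The
sketch below is one possible way to do it, not the POC itself:
`classify_diff` is a hypothetical name, the diff path must be absolute,
and it assumes the diff applies with `-p0` from the given directory.

```shell
# classify_diff DIR DIFF: print applies, committed or failed.
# Hypothetical helper; DIFF must be an absolute path.
classify_diff() {
    dir=$1
    diff=$2
    # -f: never ask questions, --dry-run: do not touch any file
    if (cd "$dir" && patch -p0 -f -s --dry-run -i "$diff") >/dev/null 2>&1; then
        echo applies
    # if only the reversed diff applies, the change is already in the tree
    elif (cd "$dir" && patch -p0 -f -s -R --dry-run -i "$diff") >/dev/null 2>&1; then
        echo committed
    else
        echo failed
    fi
}
```

A daily job could call this for every entry known to apply and drop
the ones that now report `committed`.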

Step 4 is an inexpensive extra check to be sure the distfiles can be
downloaded over the internet.

Step 5 is also an inexpensive check: running portcheck can report
easy-to-fix mistakes.
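
Steps 4 and 5 could be chained as in the sketch below. The function
name and the result strings are made up for illustration, the paths are
the usual defaults and may differ locally, and the sketch assumes that
portcheck signals problems through its exit status; if it does not,
its output would have to be inspected instead.

```shell
# Assumed locations; adjust to the local setup.
PORTSDIR=${PORTSDIR:-/usr/ports}
PORTCHECK=${PORTCHECK:-$PORTSDIR/infrastructure/bin/portcheck}

# cheap_checks PKGPATH: hypothetical helper covering steps 4 and 5.
# Prints ok, fetch-failed, portcheck-failed or no-such-port.
cheap_checks() (
    cd "$PORTSDIR/$1" 2>/dev/null || { echo no-such-port; exit 1; }
    # step 4: download the distfiles
    make fetch >/dev/null 2>&1 || { echo fetch-failed; exit 1; }
    # step 5: assumes portcheck reports problems through its exit status
    "$PORTCHECK" >/dev/null 2>&1 || { echo portcheck-failed; exit 1; }
    echo ok
)
```

The body runs in a subshell, so the caller's working directory is left
untouched.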

All the steps only require a ports tree. Only step 4 could be abused
by someone malicious, using a patch to make the system download huge
files or files with legal concerns, but that message would also appear
on the mailing list, so the risk is quite limited.

To go further in the automation, building the port is required, but it
must be done in a clean virtual machine. We could then report into the
database whether the diff produced a package correctly and, if not,
provide the compilation log.

## Automatic VM creation

Automatically creating an OpenBSD-current virtual machine was tricky
but I've been able to sort this out using vmm, rsync and upobsd.

The script downloads the latest sets using rsync; that directory is
served from a web server. I use upobsd to create an automatic
installation with a bsd.rd including my autoinstall file. Then it gets
tricky :)

The VM must be started with its storage disk AND the bsd.rd. As it's
an auto install, it will reboot after the install finishes and then
will install again and again.

I found that using the parameter "-B disk" would make the VM shut
down after installation for some reason. I can then wait for the VM to
stop and then start it without bsd.rd.

My vmm VM creation sequence:

```shell commands to generate an OpenBSD virtual machine
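# build a bsd.rd embedding the autoinstall response file, using the local mirror
upobsd -i autoinstall-vmm-openbsd -m http://localhost:8080/pub/OpenBSD/
# force-stop any previous instance and wait until it is down
vmctl stop -f -w integration
# boot the installer; with -B disk the VM was observed to stop after the install
vmctl start -B disk -m 1G -L -i 1 -d main.qcow2 -b autobuild_vm/bsd.rd integration
# block until the installation is over and the VM has stopped
vmctl wait integration
# boot the installed system from its disk, without bsd.rd
vmctl start -m 1G -L -i 1 -d main.qcow2 integration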
```

The whole process is long though. A derived qcow2 image could be used
after creation to try each port faster until we want to update the VM
again.
Multiple VMs could be used at once to test in parallel and make good
use of host resources.
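
A derived image can be created with the `-b` option of `vmctl create`.
The sketch below shows how a throwaway VM could be started on such an
image. The helper name is hypothetical, the VM options are simply
copied from the sequence above, and it assumes that main.qcow2 is the
freshly installed image and is left untouched while derived images
exist.

```shell
# start_test_vm NAME: boot a throwaway VM on an image derived from main.qcow2.
# Hypothetical helper; main.qcow2 must not be modified while derived
# images are in use.
start_test_vm() {
    vmctl create -b main.qcow2 "qcow2:$1.qcow2" &&
    vmctl start -m 1G -L -i 1 -d "$1.qcow2" "$1"
}
```

Calling it in a loop with different names, for example `test1`,
`test2` and `test3`, would give several VMs to test ports in parallel.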


## What's done so far

I'm currently able to deposit emails as files in a directory and run a
script that will extract the pkgpath, try to apply the patch, download
distfiles, run portcheck and run the build on the host using
PORTS_PRIVSEP. If the port compiles fine, the email file is deleted
and a proper diff is made from the port and moved into a staging
directory where I'll review the diffs known to work.
The script stops on a blocking error and writes a short text report
for each port. I intended to send this as a reply to the mailing list
at first, but maintaining a parallel website for people working on
ports seems a better idea.
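
The "stop on blocking error and write a report" behaviour could look
like the sketch below. It is not the actual script: `run_pipeline` and
the report format are assumptions, and each step is expected to be a
command or shell function that takes no argument and returns non-zero
on failure.

```shell
# run_pipeline REPORT STEP...: run each step in order and log its
# result; stop at the first failing step. Hypothetical helper.
run_pipeline() {
    report=$1
    shift
    : > "$report"                       # start with an empty report
    for step in "$@"; do
        if "$step" >/dev/null 2>&1; then
            echo "$step: ok" >> "$report"
        else
            echo "$step: FAILED" >> "$report"
            return 1                    # blocking error, skip the rest
        fi
    done
}
```

The report file then contains one line per step that ran, the last one
being the step that blocked, which is enough to publish a short status
for each port.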