Back in the days when I wrote the code, I didn't know about the
`require` package yet. Using `require.NoError()` makes the test code
more straight-forward.
No functional changes, except that when tests fail, they now fail
without panicking.
Add a new API operation to get the overall farm status. This is based on
the jobs and workers, and their status.
The statuses are:
- `active`: Actively working on jobs.
- `idle`: Farm could be active, but has no work to do.
- `waiting`: Work has been queued, but all workers are asleep.
- `asleep`: Farm is idle, and all workers are asleep.
- `inoperative`: Cannot work: no workers, or all are offline/error.
- `starting`: Farm is starting up.
- `unknown`: Unexpected configuration of worker and job statuses.
Introduce an "event bus"-like system. It's more like a fan-out
broadcaster for certain events. Instead of directly sending events to
SocketIO, they are now sent to the broker, which in turn sends it to any
registered "forwarder". Currently there is ony one forwarder, for
SocketIO.
This opens the door for a proper MQTT client that sends the same events
to an MQTT server.
When an API request comes in to delete a job, not only log the job's UUID,
but also include its database ID. This can help in figuring out database
issues, as when the job is deleted, it's unknown what UUID it had. Database
relations use the ID, and not the UUID.
Implement the API function to mass-mark jobs for deletion, based on
their 'updated_at' timestamp.
Note that the `last_updated_max` parameter is rounded up to entire
seconds. This may mark more jobs for deletion than you expect, if their
`updated_at` timestamps differ by less than a second.
Updating a tag without `description` field in the request body will keep
the tag's description as-is. Previously this caused it to become empty,
which is now still possible by using an explicit `description: ""`.
When the worker is started with `-restart-exit-code 47` or has
`restart_exit_code=47` in `flamenco-worker.yaml`, it's marked as
'restartable'. This will enable two worker actions 'Restart
(immediately)' and 'Restart (after task is finished)' in the Manager web
interface. When a worker is asked to restart, it will exit with exit
code `47`. Of course any positive exit code can be used here.
Change the package base name of the Go code, from
`git.blender.org/flamenco` to `projects.blender.org/studio/flamenco`.
The old location, `git.blender.org`, has no longer been use since the
[migration to Gitea][1]. The new package names now reflect the actual
location where Flamenco is hosted.
[1]: https://code.blender.org/2023/02/new-blender-development-infrastructure/
Fix an issue where a shared storage path on Linux, that maps via two-way
variables to a drive root on Windows, caused problems with the path
translation system.
Windows paths that consist only of a drive letter (`F:`) cannot just be
concatenated to a relative path, as that will result in `F:path\to\file`,
which is still a relative path of sorts. This is now handled correctly,
and should result in `F:\path\to\file`.
This fixes#104237.
Simplify the variable expansion code. Instead of using a separate goroutine
and two channels, use a struct + a simple function call.
No functional changes.
Simplify the code for the two-way variables' value-to-variable replacement.
Instead of using a goroutine and two channels, use a separate struct and
call a function on that directly.
No functional changes.
As it was decided that the name "tags" would be better for the clarity
of the feature, all files and code named "cluster" or "worker cluster"
have been removed and replaced with "tag" and "worker tag". This is only
a name change, no other features were touched.
This addresses part of #104204.
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104223
As a note to anyone who already ran a pre-release version of Flamenco
and configured some worker clusters, with the help of an SQLite client
you can migrate the clusters to tags. First build Flamenco Manager and
start it, to create the new database schema. Then run these SQL queries
via an sqlite commandline client:
```sql
insert into worker_tags
(id, created_at, updated_at, uuid, name, description)
select id, created_at, updated_at, uuid, name, description
from worker_clusters;
insert into worker_tag_membership (worker_tag_id, worker_id)
select worker_cluster_id, worker_id from worker_cluster_membership;
```
If the mock tests are run by root user then this specific test of
inaccessible path fails because root can write files to anywhere on the
filesystem. It is not clear that Flamenco Manager test
TestCheckSharedStoragePath is checking inaccessible file locations when
it fails and that it should be run by an unprivileged user.
Fix is to fail the permission test if the tests are run as a root user.
Before: `3.3-alpha0-v3.2-76-gdd34d538-dirty`
After : `3.3-alpha0 (v3.2-76-gdd34d538-dirty)`
Also include the new `git` property to always have the Git hash (the part
between parentheses).
Limit which worker statuses are remembered (when they go offline) to
those that we want to restore when they come back online. This is now
set to `awake` and `asleep`. This prevents workers from being told to go
to states that they cannot handle, such as `error` or `starting`.
Fix#99549: When sending Workers offline, remember their previous status
When the status of a worker goes offline, the Manager will now make the status of the worker to be remembered once it goes back online. So when the Worker makes this status change (so for example `X → offline`), Manager should immediately set `StatusRequested = "X" ` once it goes online.
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104217
- Add a little confirmation overlay before deleting a job. This overlay
also shows information about whether the Shaman checkout directory
will be deleted or not.
- Send job updates to the web frontend when jobs are marked for
deletion, and when they are actually deleted.
- Respond to those updates, and handle some corner cases where job info
is missing (because it just got deleted).
This closes T99401.
Show jobs that have been marked for deletion with a red strike-through
line in the jobs table, and show the deletion-request timestamp in the
job details.
Implement the `deleteJob` API endpoint. Calling this endpoint will mark
the job as "deletion requested", after which it's queued for actual
deletion. This makes the API response fast, even when there is a lot of
work to do in the background.
A new background service "job deleter" keeps track of the queue of such
jobs, and performs the actual deletion. It removes:
- Shaman checkout for the job (but see below)
- Manager-local files of the job (task logs, last-rendered images)
- The job itself
The removal is done in the above order, so the job is only removed from the
database if the rest of the removal was succesful.
Shaman checkouts are only removed if the job was submitted with Flamenco
version 3.2. Earlier versions did not record enough information to reliably
do this.
If Shaman is used to submit the job files, store the job's checkout ID
(i.e. the path relative to the checkout root) in the database. This will
make it possible in the future to remove the Shaman checkout along with
the job itself.
Add a timeout when fetching a job from the persistence layers.
It's my intention to add more timeouts, so this also introduces some code
to make it easier to test that a context has a deadline set.
Deduplicate API implementation code to fetch a job from the persistence
service.
Almost no functional changes. Checking that the requested job UUID is
actually a valid UUID is now consistently done on all fetches. This is
not a functional change in normal Flamenco operations, where only valid
UUIDs are used anyway.
The two-way variable replacement function changes the submitted job. To
clarify that this happens, pass the pointer `&submittedJob`.
Both pass-by-pointer and pass-by-value work, because the variable
replacement typically works on maps/slices, which are passed by reference
anyway. Better to be explicit about this, though.
No functional changes.
Fix an issue where workers would switch immediately on a state change
request, even if it was of the "after task is finished" kind.
The "may I keep running" endpoint wasn't checking the lazyness flag, and
thus any state change, lazy or otherwise, would interrupt the worker's
current task.
The priority of an existing can now be changed. It will be taken into
account when assigning tasks to workers, but it will not reassign tasks
that are already active.
Blender not being found can be reported via various errors (this should be
reworked in the 'blender finder API' at some point). `exec.ErrNotFound` is
returned when Blender cannot be found on `$PATH`, which is something that's
absolutely fine. This is now logged less dramatically.
Avoid the word "error" in logging when Blender cannot be found. Typically
these are warnings, and having the word "error" there makes people think
otherwise.
When a submitted job is refused because of a mismatched etag, there is
now a more explanatory error logged on the Manager. The website also has
an entry in the FAQ for this, as I expect more people to run into this
issue when they upgrade Flamenco.
Implement the `getSharedStorage` operation in the Manager, and use it in
the add-on to get the shared storage location in a way that makes sense
for the platform of the user.
Manifest task: T100196
Split "executable" from "its arguments" in blender & ffmpeg commands.
Use `{blenderArgs}` variable to hold the default Blender arguments,
instead of having both the executable and its arguments in `{blender}`.
The reason for this is to support backslashes in the Blender executable
path. These were interpreted as escape characters by the shell lexer.
The shell lexer based splitting is now only performed on the default
arguments, with the result that `C:\Program Files\Blender
Foundation\3.3\blender.exe` is now a valid value for `{blender}`.
This does mean that this is backward incompatible change, and that it
requires setting up Flamenco Manager again, and that older jobs will not
be able to be rerun.
It is recommended to remove `flamenco-manager.yaml`, restart Flamenco
Manager, and reconfigure via the setup assistant.