Flamenco

Author	SHA1	Message	Date
Sybren A. Stüvel	59655ea770	Manager: fix error in sleep scheduler when shutting down When the Manager was shutting down while the sleep scheduler was running, it could cause a null pointer dereference. This is now doubly solved: - `worker.Identifier()` is now nil-safe, as in, `worker` can be `nil` and it will still return a sensible string. - failure to apply the sleep schedule due to the context closing is not logged as error any more.	2022-09-27 12:27:18 +02:00
Sybren A. Stüvel	759a94e49b	Blender finder: also handle `exec.ErrNotFound` as "expected" Blender not being found can be reported via various errors (this should be reworked in the 'blender finder API' at some point). `exec.ErrNotFound` is returned when Blender cannot be found on `$PATH`, which is something that's absolutely fine. This is now logged less dramatically.	2022-09-22 12:39:40 +02:00
Sybren A. Stüvel	161a7f7cb3	Less dramatic logging when Blender cannot be found Avoid the word "error" in logging when Blender cannot be found. Typically these are warnings, and having the word "error" there makes people think otherwise.	2022-09-22 12:37:46 +02:00
Sybren A. Stüvel	b3b46f89b2	Fix T100757: error stating OpenEXR format is unknown format Fix T100757 by reducing the log level to "info" when Blender writes output to a file format the Worker cannot handle. Such cases are expected, and now no longer result in an error message.	2022-09-12 12:40:06 +02:00
Sybren A. Stüvel	1ffd56939a	Manager: match Windows paths in two-way variables also with slashes When doing two-way variable replacement, if the variable has a Windows path (i.e. backslashes) also do a match for the value with forward slashes. In other words, if a path `Y:/shared/...` comes in, and the variable value is (correctly) `Y:\shared\...`, it will be seen as a match.	2022-09-01 15:27:31 +02:00
Sybren A. Stüvel	8368feebac	Fix unit test The recent change in error message caused a test to fail, this is now fixed. No functional changes.	2022-09-01 15:17:04 +02:00
Sybren A. Stüvel	46792ee164	Clarify "job etag mismatch" situation When a submitted job is refused because of a mismatched etag, there is now a more explanatory error logged on the Manager. The website also has an entry in the FAQ for this, as I expect more people to run into this issue when they upgrade Flamenco.	2022-09-01 14:46:30 +02:00
Sybren A. Stüvel	780a9f9ef6	Refactor some tests to use `require.` instead of `assert.` + fail The `require.XXX` functions are exactly the same as `assert.XXX` functions + directly failing the test, so this refactor simplifies the code quite a bit. Can be done in more areas than this. No functional changes.	2022-08-31 17:28:19 +02:00
Sybren A. Stüvel	0afde53209	Simple Blender Render: no longer render to intermediate directory Simple Blender Render now no longer renders to an intermediate directory. This not only simplifies the script, but it also opens the door for selective re-running of individual tasks. In the old situation, where the intermediate directory was renamed to the desired name in the last task, rerunning tasks would fail because the directory they expect to exist no longer exists. This is now resolved.	2022-08-31 17:24:31 +02:00
Sybren A. Stüvel	f065cda830	Cleanup: remove some debug prints from Simple Blender Render script	2022-08-31 16:25:52 +02:00
Sybren A. Stüvel	2e1c0b83bf	Simple Blender Render: refuse to render videos The original idea behind this job type was that it would work equally well for videos as for images, but that was never really well tested. It's currently broken, so this commit removes video support altogether.	2022-08-31 16:25:23 +02:00
Sybren A. Stüvel	eb89984db8	Simple Blender Render: remove `blender_cmd` setting Remove the `blender_cmd` setting, and just hard-code it to `{blender}`. The Blender add-on was already passing this string, and it's very unlikely that people are already writing custom add-ons to pass something different. It provided flexibility that was untested, so it's better to simplify things.	2022-08-31 16:24:34 +02:00
Sybren A. Stüvel	25dd7b214b	Manager: remove superfluous "error compiling job: " prefix from message The wrapped error already mentioned it was about job compilation.	2022-08-31 16:23:10 +02:00
Sybren A. Stüvel	6e401f882f	Worker: fix typo 'FFmepg' -> 'FFmpeg' Just a logging message fix, no functional changes.	2022-08-31 15:34:42 +02:00
Sybren A. Stüvel	9da75eef04	Worker: fix issue running FFmpeg The `exeArg` command parameter was incorrectly always expected. It's now optional, as it should be.	2022-08-31 15:00:46 +02:00
Sybren A. Stüvel	31cf0a4ecc	Implement `getSharedStorage` operation & use it in the add-on Implement the `getSharedStorage` operation in the Manager, and use it in the add-on to get the shared storage location in a way that makes sense for the platform of the user. Manifest task: T100196	2022-08-31 11:44:37 +02:00
Sybren A. Stüvel	31769bcdf2	Manager: always set `config.currentGOOS` This variable is used in tests to mock the current OS, but wasn't set during normal operation of the Manager. This caused issues with the two-way variable system.	2022-08-31 11:43:28 +02:00
Sybren A. Stüvel	0a1e1efc41	OAPI: regenerate code	2022-08-31 11:42:46 +02:00
Sybren A. Stüvel	2eae682b9a	Manager: actually return the short version in the GetVersion operation	2022-08-31 08:58:59 +02:00
Sybren A. Stüvel	ab14c97b2e	Manager: fix tests on Windows Fix some tests that were failing because some parts of Flamenco now use native path separators instead of always-forward ones.	2022-08-30 15:44:14 +02:00
Sybren A. Stüvel	e5a20425c4	Separate variables for Blender executable and its arguments. Split "executable" from "its arguments" in blender & ffmpeg commands. Use `{blenderArgs}` variable to hold the default Blender arguments, instead of having both the executable and its arguments in `{blender}`. The reason for this is to support backslashes in the Blender executable path. These were interpreted as escape characters by the shell lexer. The shell lexer based splitting is now only performed on the default arguments, with the result that `C:\Program Files\Blender Foundation\3.3\blender.exe` is now a valid value for `{blender}`. This does mean that this is a backward-incompatible change: it requires setting up Flamenco Manager again, and older jobs cannot be rerun. It is recommended to remove `flamenco-manager.yaml`, restart Flamenco Manager, and reconfigure via the setup assistant.	2022-08-30 14:58:16 +02:00
Sybren A. Stüvel	87684a0d92	Worker: change "running command" to "running Flamenco command" in log There are Flamenco "commands" and CLI "commands", and it's nice to be explicit about which is which. I'm sure this is needed in some other areas as well.	2022-08-30 10:34:40 +02:00
Sybren A. Stüvel	afdbbcc1d8	Cleanup: explain a bit more in a comment	2022-08-30 10:34:05 +02:00
Sybren A. Stüvel	84cff6919a	Worker: also log job UUID when running a task Having both the job and task UUIDs in the log output helps when debugging.	2022-08-30 10:18:32 +02:00
Sybren A. Stüvel	c504e68d8e	Manager: store the `jobs` implicit variable in platform-native notation Don't change backslashes to forward slashes on Windows. Trying to use forward slashes everywhere was a mistake, and this is one of the steps to make it right.	2022-08-29 17:51:20 +02:00
Sybren A. Stüvel	20395e0e26	Manager: always start the variable lookup table with a fresh map If the loaded config doesn't define the default variables, the latter should not be found in the lookup table any more; this is now fixed.	2022-08-29 17:44:47 +02:00
Sybren A. Stüvel	4a201d47b4	Cleanup: add unit test for parsing backslashes in variable values Backslashes can be included in two ways, as-is (which works fine) and between double quotes (in which case they need escaping). This test checks for both.	2022-08-29 17:28:40 +02:00
Sybren A. Stüvel	0c91fe93d0	Manager: only do pathsep localisation on two-way variables By accident the Manager was performing slash localisation on all command parameters, causing some math expressions for FFmpeg to fail.	2022-08-25 15:02:56 +02:00
Sybren A. Stüvel	6b4b205c1c	Manager: allow backslashes in variables Windows machines should be able to simply use backslashes.	2022-08-25 13:59:02 +02:00
Sybren A. Stüvel	22aa041ec1	Allow relative render output root paths Add a new `abspath(path)` function to the add-on, for use in job type settings. With this, the "simple blender render" job can support relative paths for the "render output root" setting, and still have an absolute final "render output path".	2022-08-25 13:14:48 +02:00
Sybren A. Stüvel	63c60a5b15	Two-way variable replacement: change path separators to target platform Two-way variable replacement now also changes the path separators. Since the two-way replacement is made for paths, it makes sense to also clean up the path for the target platform.	2022-08-25 12:19:30 +02:00
Sybren A. Stüvel	1355ec5e1d	Worker: Change how the worker shuts down Instead of sending the current process an interrupt signal, use a dedicated channel to signal the wish to shut down. The main function responds to that channel closing by performing the shutdown. This solves an issue where the Worker would not cleanly shut down on Windows when `offline` state was requested by the Manager.	2022-08-12 11:15:19 -07:00
Sybren A. Stüvel	2a345a3d2c	API for deleting workers Workers can now be soft-deleted. Tasks assigned to the worker will remain associated with that Worker. Active tasks will be re-queued so other workers can pick them up.	2022-08-11 16:59:53 -07:00
Sybren A. Stüvel	458c33573e	OAPI: regenerate code	2022-08-11 16:58:05 -07:00
Sybren A. Stüvel	cbafe0ff34	Manager: when finding Blender, be less dramatic when it can't be found It's fine when Blender is not available on `$PATH`, so only log that at debug level.	2022-08-02 13:36:25 +02:00
Sybren A. Stüvel	cbc6bfaf02	Manager: also recognise `exec.ErrNotFound` as a "blender not found" error	2022-08-02 13:36:25 +02:00
Sybren A. Stüvel	11e5363d24	Manager: reject removal of empty list of blocklist entries A request to remove an empty list of blocklist entries now results in a 400 Bad Request.	2022-08-01 18:55:33 +02:00
Sybren A. Stüvel	3b978ceda0	Cleanup: manager, name variable correctly It was an old name from copy-pasted code, now it reflects the actual code. No functional changes.	2022-08-01 18:55:08 +02:00
Sybren A. Stüvel	1469345f3a	Manager: sort blocklist by worker name	2022-08-01 18:54:28 +02:00
Sybren A. Stüvel	f3aab8611c	Manager: include worker name when returning blocklist	2022-08-01 18:03:17 +02:00
Sybren A. Stüvel	fef3de28e1	Fix unit test Fix unit test broken in rF449c83b9. No functional changes.	2022-08-01 16:02:08 +02:00
Sybren A. Stüvel	642ef36778	Blender finder: fix compatibility with Windows Home For some reason, calling `AssocQueryStringW` on Windows Home returns error code 122, "The data area passed to a system call is too small", even when the data area is large enough. Furthermore, the API actually describes that in such cases `S_FALSE` is supposed to be returned, with `*pcchOut` set to the required size. Because of this apparent violation of the documentation, and because it just works, Flamenco now ignores this particular error and just returns the obtained string.	2022-08-01 16:00:49 +02:00
Sybren A. Stüvel	350f4f60cb	Worker: convert database interface to GORM Convert the database interface from the stdlib `database/sql` package to the GORM object relational mapper. GORM is also used by the Manager, and thus with this change both Worker and Manager have a uniform way of accessing their databases.	2022-08-01 14:29:14 +02:00
Sybren A. Stüvel	449c83b94a	Manager: broadcast worker update after assigning task The Manager now broadcasts a worker update to SocketIO clients when a worker gets a new task assigned. This ensures the "current task" shown in the worker details view is up to date.	2022-08-01 14:29:08 +02:00
Sybren A. Stüvel	a6c935a634	Fix T99421: Introducing an etag for job types The etag prevents job submissions with old settings, when the job compiler script has been edited. The etag is the SHA1 hash of the `JOB_TYPE` dictionary (as defined by the JavaScript file). The hash is computed in a way that's independent of the exact formatting in the JavaScript file. Also the actual JS code itself is irrelevant, just the `JOB_TYPE` dictionary is used.	2022-07-29 21:13:37 +02:00
Sybren A. Stüvel	48ca73f550	Refactor, manager: rename `compilerForJobType` to `compilerVMForJobType` The function returns a `*VM`, which contains a compiler, and allows you to run a compiler, but is not a compiler itself.	2022-07-29 14:26:54 +02:00
Sybren A. Stüvel	370f935f65	Simple-blender-render job: use absolute path for `render_output_path` Blender cannot be told to only allow absolute paths for an RNA property (of type `string`, subtype `dir_path`), so as a workaround the final `render_output_path` is now using `bpy.path.abspath()` to make the path absolute. This has the advantage that the render output path can be defined by artists as a blendfile-relative path, and that it'll be resolved when submitting the blend file.	2022-07-29 11:03:14 +02:00
Sybren A. Stüvel	be1ddaa4eb	Manager test: reduce timeout to practical value The timeout was increased to aid debugging, but shouldn't have been committed.	2022-07-29 09:59:54 +02:00
Sybren A. Stüvel	8c8855554e	Manager: remove `--factory-startup` from default Blender arguments Remove `--factory-startup` from the default Blender arguments. This makes it simpler to configure each Worker to use its own GPU, without having to inject Python code into the arguments. Users can always add this when they need, but I think it's friendlier to have Blender behave the same when they manually run it and when used by Flamenco Worker.	2022-07-29 09:54:29 +02:00
Sybren A. Stüvel	377583c9e2	Cleanup: worker, move FFmpeg-finding at startup into its own file Just a move of code from `main.go` to a dedicated file in the same package. No functional changes.	2022-07-29 09:47:30 +02:00
Sybren A. Stüvel	d4dfa2d071	Add release cycle to versioning of Flamenco Include `RELEASE_CYCLE` in the Makefile. This is mentioned at startup of Manager and Worker, and reflects in the software version they report. If `RELEASE_CYCLE == "release"`, Manager and Worker report their version as `ApplicationVersion`. If it's any other string, the Git hash will get appended.	2022-07-28 15:10:27 +02:00
Sybren A. Stüvel	8c86d4c1a9	Worker: Wait for subprocess even when it failed The Worker now always waits for subprocesses. When faced with multiple errors (like an I/O error reading from stdout and a returned error status from the process), it will return the most important one (in this case the exit status of the process). Subprocesses need to be waited for, even when they crashed, otherwise they will linger around as "defunct" processes. This caused out-of-memory errors, because several defunct Blenders were eating up the memory.	2022-07-28 14:36:01 +02:00
Sybren A. Stüvel	c79fe55068	Worker: Refactor the running of subprocesses Blender and FFmpeg were run in the same way, using copy-pasted code. This is now abstracted away into the CLI runner, which in turn is moved into its own subpackage. No functional changes.	2022-07-28 14:34:33 +02:00
Sybren A. Stüvel	c42665322b	Cleanup: add a comment Just a comment that explains why an error is ignored. No functional changes.	2022-07-28 14:28:02 +02:00
Sybren A. Stüvel	b26374d480	Manager: when worker goes to sleep, log in task log which worker When a worker's tasks get requeued because it goes to sleep, the task log will now mention the worker identification (name + UUID). This aids in figuring out what happened to tasks.	2022-07-28 14:27:44 +02:00
Sybren A. Stüvel	4cb0a6fb14	Blender Finder: allow passing the directory instead of the executable Blender Finder now understands that directory paths should be suffixed with `blender` (Linux, macOS) or `blender.exe` (Windows). Giving the Setup Assistant a path like `C:\Program files\Blender Foundation\Blender 3.2` will now just work. This is considerably simpler for many users, as copy-pasting a directory from a file explorer is simpler than obtaining/typing the path to the executable.	2022-07-26 18:18:02 +02:00
Sybren A. Stüvel	1e3a2b5480	Blender Finder: better reporting on timeout errors Instead of just `signal: killed`, report that it actually took too long.	2022-07-26 17:40:28 +02:00
Sybren A. Stüvel	fa79b81d5b	Blender Finder: support multi-line output of `blender --version` When compiled without OpenColorIO, Blender will first complain "Color management: Error could not find role data role." before showing the actual version number. This is now handled by looking for a "Blender " prefix instead of just returning the first line of output. This has as a side-effect that when no such line can be found, we know it's not Blender, and thus an error can be returned (instead of the version of whatever binary was being run).	2022-07-26 17:25:50 +02:00
Sybren A. Stüvel	cb6a3a5a88	Manager: test error with `errors.Is()` instead of `==` It's just a better way to test errors.	2022-07-26 17:25:50 +02:00
Sybren A. Stüvel	859a2e6eda	Manager: better logging when trying to find Blender	2022-07-26 17:25:50 +02:00
Sybren A. Stüvel	3f6dd9be8b	Blender Finder: add timeout to `blender --version` invocation Make sure that the command execution doesn't hang indefinitely.	2022-07-26 17:25:50 +02:00
Sybren A. Stüvel	f71bfdfafe	Manager: fix unit test Fix the unit test I broke in rF736ca103c3d7f37557ed541ca70117bc95bef932	2022-07-26 17:25:50 +02:00
Sybren A. Stüvel	736ca103c3	Manager: show current/last task in worker details The Task details component already linked to the Worker it was assigned to last, and now the Worker links back to the task. There's only one task shown in the Worker details. If the Worker is actively working on a task, that one's shown. Otherwise it's the last-updated task that was assigned to the worker.	2022-07-26 10:36:02 +02:00
Francesco Siddi	9948fdab71	Rename First Time Wizard to Setup Assistant This commit does not introduce functional changes, besides renaming every mention of 'wizard' to 'setup assistant'. In order to run the manager setup assistant use: ./flamenco-manager -setup-assistant The change was introduced to favor more neutral and descriptive wording for this functionality. Thanks to Sybren for helping to get this done!	2022-07-25 17:17:04 +02:00
Francesco Siddi	a2bd8a5615	OAPI: generate code	2022-07-25 17:16:53 +02:00
Sybren A. Stüvel	c1a728dc2f	Version updates via Makefile Flamenco now no longer uses the Git tags + hash for the application version, but an explicit `VERSION` variable in the `Makefile`. After changing the `VERSION` variable in the `Makefile`, run `make update-version`. Not every part of Flamenco looks at this variable, though. Most importantly: the Blender add-on needs special handling, because that doesn't just take a version string but a tuple of integers. Running `make update-version` updates the add-on's `bl_info` dict with the new version. If the version has any `-blabla` suffix (like `3.0-beta0`) it will also set the `warning` field to explain that it's not a stable release.	2022-07-25 16:08:07 +02:00
Sybren A. Stüvel	ab8ecc24cc	Cleanup: Add missing license specifiers Add license specifiers to Go files that were missing them: ``` // SPDX-License-Identifier: GPL-3.0-or-later ``` No functional changes.	2022-07-25 16:08:07 +02:00
Sybren A. Stüvel	0e6d61dd84	Remove the `{ffmpeg}` variable Remove the `{ffmpeg}` variable from the default configuration, and its use from the job compiler scripts. Now that the Worker can find its bundled FFmpeg, it's no longer needed to configure its location on the Manager.	2022-07-22 16:37:14 +02:00
Sybren A. Stüvel	09946c0894	Worker: use bundled FFmpeg if available Worker will now try one of the following paths, relative to the flamenco-worker executable, in order to find FFmpeg. If they cannot be found, `$PATH` is searched for FFmpeg. - `tools/ffmpeg-$GOOS-$GOARCH` - `tools/ffmpeg-$GOOS` - `tools/ffmpeg` On Windows these paths will have a `.exe` suffix appended. `$GOOS` is the operating system, like "linux", "darwin", "windows", etc. `$GOARCH` is the architecture, like "amd64", "386", etc.	2022-07-22 16:37:14 +02:00
Sybren A. Stüvel	a5940a24f0	Worker: load `flamenco-worker.yaml` from current directory By accident I made the worker load `flamenco-worker.yaml` from the "local files" directory (~/.local/share/flamenco on Linux) instead of the current directory. This was incorrect, as that file is meant to contain configuration that's shared between workers.	2022-07-22 16:37:14 +02:00
Pablo Vazquez	53598c3ee0	Manager: Rephrase wording on report for successfully writing to Shared Storage * Replace "OK!" with "successfully" Remove exclamation mark since there is no need to call for attention. Use "successfully" as it is more descriptive in this case than OK, which can have other meanings.	2022-07-22 14:57:12 +02:00
Francesco Siddi	08f52993ad	Setup Screen: Overall UI/UX tweaks - Added initial description and illustration - Swap "Check" button for fields with a debounced @input event - Turn Blender's list into a radio selector - Tweak wording when paths are not found - Add microtip library for tooltips - Make navigation steps clickable, according to the state	2022-07-22 14:57:11 +02:00
Sybren A. Stüvel	11a352968a	Fix T99434: Two-way Variables Two-way variable implementation in the job submission end-point. Where Flamenco v2 did the variable replacement in the add-on, this has now been moved to the Manager itself. The only thing the add-on needs to pass is its platform, so that the right values can be recognised. This also implements two-way replacement when tasks are handed out, such that the `{jobs}` value gets replaced to a value suitable for the Worker's platform as well.	2022-07-22 11:58:35 +02:00
Sybren A. Stüvel	585c886bd5	Fix Windows build errors	2022-07-21 20:59:10 +02:00
Sybren A. Stüvel	af0389efc6	Cleanup: correct function name in docstring	2022-07-21 16:29:23 +02:00
Sybren A. Stüvel	894058bc69	Cleanup: variable replacement, avoid hard-coded "workers" string Use `config.VariableAudienceWorkers` instead. No functional changes.	2022-07-21 16:29:05 +02:00
Sybren A. Stüvel	27602174ae	Variable replacement: fix issue replacing vars in nested lists An array-of-strings in Go can become an array-of-`interface{}` when converted to JSON and back again. Such cases are now handled properly.	2022-07-21 16:28:38 +02:00
Sybren A. Stüvel	48f081e03e	Sleep Scheduler: don't overwrite `error` status from Worker The Sleep Scheduler shouldn't push a Worker out of `error` status, as that could hide problematic situations.	2022-07-21 12:49:32 +02:00
Sybren A. Stüvel	d553ca5ab9	Worker: pass input frame rate to FFmpeg when converting frames to video FFmpeg needs the input frame rate as well, otherwise it'll default to 25 FPS, and mysteriously drop frames when rendering a 24 FPS shot.	2022-07-19 18:43:06 +02:00
Sybren A. Stüvel	de80a09223	Manager: include job UUID in "last-rendered image received" log entries This makes it possible to collect all "last-rendered image received" entries for a single job.	2022-07-19 18:40:22 +02:00
Sybren A. Stüvel	d929885b06	Manager: only log task status change if there is an actual change Don't log "changes" from, say, `active` -> `active`.	2022-07-19 17:47:43 +02:00
Sybren A. Stüvel	ac3236786b	Manager: add entry to task log whenever task changes status Add a line to the task log whenever task changes status. This only applies to directly-changed tasks, and not to mass-updates (like all tasks going from 'completed' to 'queued' on a job requeue).	2022-07-19 17:23:13 +02:00
Sybren A. Stüvel	696b97c553	Re-queue tasks of worker after changing to non-'awake' state When a Worker changes state from `awake` to something else, it cannot run tasks any more. This now triggers a requeue of its active task (should be one at most, if things are sane) so that another worker can pick it up.	2022-07-19 15:38:36 +02:00
Sybren A. Stüvel	ecfeaec4b2	Worker: store files on Windows in `Blender Foundation\Flamenco` On Windows, store files in `%LOCALAPPDATA%\Blender Foundation\Flamenco`. Previously the `Blender Foundation` part of the path was missing. Manifest Task: T99415	2022-07-19 12:13:34 +02:00
Sybren A. Stüvel	2f76df437b	T99415: Worker: change default location for writing local files Change the location where the Worker writes its local files so that it follows the XDG specification (instead of writing to the current working directory). - Linux: `$HOME/.local/share/flamenco` - Windows: `C:\Users\UserName\AppData\Local\Flamenco` - macOS: `$HOME/Library/Application Support/Flamenco` NOTE: The old files will not be loaded any more. This means that if nothing is done and the new worker is run as-is, it will reregister as a brand new worker. Move `flamenco-worker-credentials.yaml` and `flamenco-worker.sqlite` to the new location to avoid this.	2022-07-19 12:08:41 +02:00
Sybren A. Stüvel	fa600d6fc9	Cleanup: rename `mustHostname()` to `workerName()` The function determines the worker's name. The fact that it can use the hostname for this isn't that relevant.	2022-07-19 12:03:08 +02:00
Sybren A. Stüvel	0a5f87bc5a	Sleep Scheduler: perform first check at startup Instead of waiting for a minute, run the first sleep scheduler iteration at startup.	2022-07-18 19:30:38 +02:00
Sybren A. Stüvel	83467e4c60	Sleep schedule: store 'next check' timestamp in UTC SQLite doesn't parse the timezone info, so timestamps should always be in UTC.	2022-07-18 19:30:17 +02:00
Sybren A. Stüvel	3baac0a2d8	Manager: reduce log level when worker asks task but has wrong status This can happen quite often and it's fine, so it's not worth a warning.	2022-07-18 19:26:49 +02:00
Sybren A. Stüvel	24f921b0c8	Manager: add more logging when worker cannot be marked as 'seen' SQLite often errors out on this with only `interrupted (9)` as message. This logging should at least tell us whether it's our own "background context" timing out, or whether something else fishy is going on.	2022-07-18 19:04:15 +02:00
Sybren A. Stüvel	bfd6746f78	Manager: consult the sleep schedule on worker sign-on If there is no status change queued for the Worker, the sleep schedule should determine its initial status.	2022-07-18 18:25:24 +02:00
Sybren A. Stüvel	bc725ea7dc	Manager: mark worker as 'seen' when calling the `WorkerState` operation Fix workers timing out when they're `asleep`. When sleeping, the Worker will call the `WorkerState` operation to see if they have to wake up, but that didn't mark the workers as "seen". As a result, a sleeping worker would always time out.	2022-07-18 17:56:56 +02:00
Sybren A. Stüvel	47e517a3a5	Worker: cleanly sign off after flushing buffer When running the Worker with the `-flush` CLI argument, actually sign off from the Manager before shutting down.	2022-07-18 16:36:45 +02:00
Sybren A. Stüvel	0697f71b62	Manager: run some operations in a background context Run some API operations in a background context. This should prevent some of the SQLite "interrupted" errors, as those can occur when the context closes while a query is running. The API operations that Workers use are now mostly running in a separate background context, at least from the moment onward when they can run independently of the Worker connection.	2022-07-18 16:26:06 +02:00
Sybren A. Stüvel	43e8f3f623	Manager: improve the "my own URLs" construction Improve the "my own URLs" construction, such that: - IPv6 link-local addresses are always skipped. They require a "zone index" string, typically the interface name, so something like `[fe80::cafe:f00d%eth0]`. This is not supported by web browsers, so the URLs would be of limited use. Furthermore, they require the interface name of the side initiating the connection, whereas this code is used to answer the question "how can this machine be reached as a server?" - IPv4 addresses are sorted before IPv6 addresses. Even though I like IPv6 a lot, IPv4 is still more familiar to people. - Loopback addresses (::1, 127.0.0.1) are sorted last, so that the First-Time Wizard is most likely to use the bigger-scoped address.	2022-07-18 15:36:43 +02:00
Sybren A. Stüvel	e91623557a	Worker: log which URLs were tried when auto-discovery failed When the Worker cannot find any Manager, log which URLs were tried.	2022-07-18 14:14:02 +02:00
Sybren A. Stüvel	ad57070a2d	Manager: reduce log level of "loading configuration" message Every time the web interface starts, it queries the config to see whether it should be in first-time-wizard mode or not. This caused unnecessary info-level logging. In the future it would be better to load the config file just once, instead.	2022-07-18 14:11:22 +02:00
Sybren A. Stüvel	658a3d7a85	Worker Timeout: subject all but offline/error workers to timeout checks Workers that are in `starting`, `asleep`, or `testing` state should also be subject to the timeout check, not just workers in `awake` state.	2022-07-18 11:30:39 +02:00
Sybren A. Stüvel	a6ca3f7bdc	Sleep Scheduler: reduce check interval and log level Reduce the check interval and the log level of "nothing to do" messages, from "developer friendly" to "actually useful".	2022-07-17 17:31:51 +02:00
Sybren A. Stüvel	d7b164133a	Sleep Scheduler implementation for the Manager The Manager now has a sleep scheduler for Workers. The API and background service work, but there is no web interface yet. Manifest Task: T99397	2022-07-17 17:27:32 +02:00
Sybren A. Stüvel	627996525e	Manager: implement operations for getting & setting worker sleep schedule This is just the API, no web interface yet. Manifest Task: T99397	2022-07-16 16:00:25 +02:00
Sybren A. Stüvel	0e92004f2a	OAPI: regenerate code	2022-07-16 15:59:48 +02:00
Sybren A. Stüvel	726129446d	T99730: Allow access to full task log The web interface has a button that opens the task log in a new window. This might need some restyling ;-)	2022-07-16 12:55:41 +02:00
Sybren A. Stüvel	686295090b	Manager: implement endpoint for getting the full task log Previously only the log tail was available, which is fine for many cases, but for serious debugging the entire log is needed. Manifest task: T99730	2022-07-16 11:13:31 +02:00
Sybren A. Stüvel	e2434b44f2	OAPI: regenerate code	2022-07-16 11:11:34 +02:00
Sybren A. Stüvel	ca586bf3fe	Windows: Skip "inaccessible path" test For some reason, on Windows, creating a directory with zero permissions still allows creating a file in there. Just skip that part of the test. The Explorer's properties panel of the directory also shows "Read Only (only applies to files)", so at least that seems consistent.	2022-07-16 10:31:35 +02:00
Sybren A. Stüvel	859a261b05	Manager: on deletion of a worker, do not cascade to deletion of its tasks Fix an issue where deleting a Worker would also delete the tasks it was assigned to.	2022-07-15 17:00:25 +02:00
Sybren A. Stüvel	904b6c0d73	Stresser: stress the Manager by querying for tasks to execute	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	1fceae3604	Manager: more efficient database queries Be more selective in what's saved to the database to speed some things up. Most importantly, this avoids saving the entire job when a task status is updated or a task is assigned.	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	1055aabee2	Manager: optimise db.SaveActivity() query Use an explicit `Select()` GORM call to avoid saving related objects.	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	2e1a9c61b8	Manager: add SHA256 password hasher for worker auth Add a SHA256 password hasher for worker authentication. It's not used at the moment, but can be switched to for faster API queries. Note that switching will cause authentication errors on already-existing workers, which means they'll automatically re-register. This is mostly useful for debugging & profiling purposes.	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	0e4ed1c54d	Manager: move worker password hasher into a struct + interface Move the Worker password hashing/comparison functions into a struct, and use it via an interface. This will make it easier to switch to different hashing algorithms. Even with a low number of iterations, BCrypt is quite slow. That's good for security, but not for Flamenco Worker authentication -- the password is more a "nice check to avoid accidentally reusing the same ID" than something for security.	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	35fe0146d3	Add stress tester for task updates Build with `make stresser`. Run with: ./stresser -worker UUID -secret ABCXYZ The worker ID and secret can be obtained from `flamenco-worker-credentials.yaml`. If left empty, the stresser will register as a new worker, and log the credentials to be used on the next invocation.	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	6e28271c93	Manager: prevent saving related job & worker when "touching" task	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	62ecd09f5f	Don't return 500 Error when Blender cannot be found on $PATH In the first-time wizard, if Blender cannot be found on $PATH but it can be found via .blend file association, that should just be reported as a normal situation, and not as a `500 Internal Server Error`.	2022-07-14 18:50:34 +02:00
Sybren A. Stüvel	c0f4657be4	Wrap error message when finding Blender via file association fails	2022-07-14 18:49:37 +02:00
Sybren A. Stüvel	72337c55cd	Blender finder: fix Windows build error	2022-07-14 18:41:55 +02:00
Sybren A. Stüvel	86bccf3aa9	Blender finder: report only the first line of stdout	2022-07-14 18:41:50 +02:00
Sybren A. Stüvel	8b494dc448	Manager: Fix logic error detecting first-time run If the config file is missing, `true` should be returned.	2022-07-14 18:24:47 +02:00
Sybren A. Stüvel	8719103462	Manager: set default storage path to "" to trigger the first-time wizard Trigger the first-time wizard on first-time runs of Flamenco, by defaulting the storage path to the empty string. The wizard can always be triggered with the `-wizard` CLI argument. This is just for detection of first-time / unconfigured runs.	2022-07-14 18:24:47 +02:00
Sybren A. Stüvel	b35af5de9f	Manager: allow requesting shutdown multiple times It's fine to request a shutdown multiple times. This fixes a hard crash due to a panic.	2022-07-14 18:24:16 +02:00
Sybren A. Stüvel	38b8220476	Restart Flamenco Manager when the first-time wizard is complete	2022-07-14 17:52:38 +02:00
Sybren A. Stüvel	10f56148d4	Allow saving configuration from the first-time wizard This just updates the config and saves it to `flamenco-manager.yaml`. Saving the configuration doesn't restart the Manager yet, that's for another commit.	2022-07-14 17:27:17 +02:00
Sybren A. Stüvel	f9a3d3864a	OAPI: regenerate code	2022-07-14 17:26:26 +02:00
Sybren A. Stüvel	7204bb833a	Blender: run with `enable-autoexec` flag by default & shorten flags Run with `-b -y`, instead of `--background --enable-autoexec`, to shorten the default flags.	2022-07-14 15:52:57 +02:00
Sybren A. Stüvel	aec5ee49e0	First-Time Wizard: allow selecting Blender executables The wizard now finds Blender in various ways, and lets the user select which one to use. Doesn't save anything yet, though.	2022-07-14 12:22:56 +02:00
Sybren A. Stüvel	20f13257f7	Move "blender finder" from Worker-specific to common location Manager's first-time wizard will have to be able to find Blender as well.	2022-07-14 11:17:03 +02:00
Sybren A. Stüvel	aa9837b5f0	First incarnation of the first-time wizard This adds a `-wizard` CLI option to the Manager, which opens a web browser and shows the First-Time Wizard to aid in configuration of Flamenco. This is work in progress. The wizard is just one page, and doesn't save anything yet to the configuration.	2022-07-14 11:17:03 +02:00
Sybren A. Stüvel	e4a38f071c	OAPI: regenerate code	2022-07-14 11:16:59 +02:00
Sybren A. Stüvel	6b5f9317cb	Manager: clear job's blocklist when requeueing the job Requeueing a job means that the issues that caused workers to get blocked might be resolved, so it should be run with a clean slate.	2022-07-14 11:03:11 +02:00
Sybren A. Stüvel	3c290b1f6d	Manager: ensure the `{jobs}` implicit variable uses forward slashes Since the variable expansion is unaware of path semantics, using forward slashes is the safest way to go about things in a platform-independent way.	2022-07-13 12:45:55 +02:00
Sybren A. Stüvel	ce250a611e	Windows: fix error handling of syscall to AssocQueryStringW syscall.SyscallN returns a `uintptr` type alias, and thus has to be compared to `0`, not `nil`. Yeah, it's a bit weird.	2022-07-13 11:48:26 +02:00
Sybren A. Stüvel	0ff8ed7585	Manager: implement the `getVariables` OpenAPI operation	2022-07-08 11:36:00 +02:00
Sybren A. Stüvel	ae2cb281b4	OAPI: regenerate code	2022-07-08 11:35:57 +02:00
Sybren A. Stüvel	ac5bb5e378	Remove assumption `{jobs}` only exists when Shaman is enabled Manager always creates an implicit variable `{jobs}`. This used to be Shaman-dependent, but now it's always there (has been for a while). This is now reflected in an add-on comment, and in an extra unit test.	2022-07-05 18:19:49 +02:00
Sybren A. Stüvel	d4429d593c	Unify task log storage & manager-local storage The task logs storage system is refactored to use the `local_storage` package. Configuration options have also changed: - `task_logs_path` is renamed to `local_manager_storage_path`, to emphasise that only the Manager deals with those files, with default value `./flamenco-manager-storage`. - `storage_path` is renamed to `shared_storage_path`, to emphasise this is the storage shared between Manager and Workers, with default value `./flamenco-shared-storage`. Task logs are still stored in `${local_manager_storage_path}/job-{jobUUID[0:4]}/{jobUUID}/task-{taskUUID}.txt` Manifest task: T99409	2022-07-05 17:58:58 +02:00
Sybren A. Stüvel	9f9a278634	Manager: remove old commented-out config sections Various config sections were commented out, because they were brought in from Flamenco 2 but weren't implemented yet. These have now been removed, as the basic functionality is there, and new functionality will likely be different from Flamenco 2 anyway.	2022-07-05 17:23:31 +02:00
Sybren A. Stüvel	2965856aa3	Worker: add test flag to enable Blender-dependent test Add a `-withBlender` CLI argument for a unit test, to aid in debugging T99438. Run the test with `go test ./internal/worker/find_blender/ -args -withBlender` to actually fail when the file association with `.blend` files cannot be found. Note that this doesn't rely on Blender being runnable, but it does rely on _something_ being associated with .blend files.	2022-07-05 10:01:10 +02:00
Sybren A. Stüvel	60971722fc	Windows: add missing imports A recent refactor (rFfb89658530da25a77dc03fb329c394198bf6358f) performed on Linux didn't properly update a Windows-only file.	2022-07-05 10:01:10 +02:00
Sybren A. Stüvel	2c932ebad5	Show Worker's "last seen" timestamp in web interface & API responses	2022-07-04 12:49:56 +02:00
Sybren A. Stüvel	7d64d1bca4	Move SwaggerUI to `/api/v3/swagger-ui` Include the `v3` path component in the Swagger UI URL.	2022-07-04 12:21:18 +02:00
Sybren A. Stüvel	f2f8357df7	Bump thumbnail JPEG quality from 80 to 85 80 was a bit too low. 85 might still be too low, we'll have to see.	2022-07-01 17:44:26 +02:00
Sybren A. Stüvel	5fbdc388ad	Job compiler: tweak settings visibility of `simple-blender-render` In the `simple-blender-render` job type settings, hide the `chunk_size` setting from the web frontend, and show the `blendfile` setting instead. The actual blend file being rendered is important to know, whereas the chunk size can be inferred from the task names anyway.	2022-07-01 13:36:44 +02:00
Sybren A. Stüvel	d25151184d	Add a "Last Rendered" view Add a "Last Rendered" view to the webapp. The Manager now stores (in the database) which job was the last recipient of a rendered image, and serves that to the appropriate OpenAPI endpoint. A new SocketIO subscription + accompanying room makes it possible for the web interface to receive all rendered images (if they survive the queue, which discards images when it gets too full).	2022-07-01 12:34:40 +02:00
Sybren A. Stüvel	801fa20f12	OAPI: regenerate code	2022-07-01 12:32:42 +02:00
Sybren A. Stüvel	2457a63518	Manager: Show "nothing rendered yet" image in job details Show a "nothing rendered yet" image in the job details when there is no last-rendered image yet.	2022-06-30 19:20:19 +02:00
Sybren A. Stüvel	0fc5ba0bc6	Manager: broadcast last-rendered image info via SocketIO After processing an image in the "last-rendered" processor, a SocketIO object is sent to clients to indicate the last-rendered image needs to be (re)loaded. This also moves the previously existing "done callback" from a single function to a per-image callback, so that it can be called with the right information in there, and only when that particular image is actually done processing. The notification message sent via SocketIO also contains the necessary info to render the image, so that the web client doesn't have to call the `fetchJobLastRenderedInfo` operation.	2022-06-30 18:36:24 +02:00
Sybren A. Stüvel	6efd67b05c	Manager: implement `FetchJobLastRenderedInfo()` API operation Allow querying for the URL & available versions of a job's last-rendered image.	2022-06-28 17:08:00 +02:00
Sybren A. Stüvel	668e25fe95	OAPI: regenerate code	2022-06-28 17:07:08 +02:00
Sybren A. Stüvel	24344e9632	Cleanup: worker, simplify setting the manager URL The return value of `FileConfigWrangler.SetManagerURL()` was never used, so now the function doesn't return anything any more.	2022-06-28 11:42:47 +02:00
Sybren A. Stüvel	d6cfff4031	Worker: treat empty config file the same as a missing one EOF while parsing the config file is now handled as an indication that the default config should be used, rather than a fatal error.	2022-06-28 10:24:46 +02:00
Sybren A. Stüvel	fb89658530	Refactor: replace `os.IsNotExist()` with `errors.Is(err, fs.ErrNotExist)` `os.IsNotExist()` is from before `errors.Is()` existed. The latter is the recommended approach, as it also recognises wrapped errors. No functional changes, except for recognising more cases of "does not exist" errors as such.	2022-06-28 10:24:46 +02:00
Sybren A. Stüvel	64512c81ba	Manager: implement OAPI operations to fetch blocklist & delete items	2022-06-27 11:32:35 +02:00
Sybren A. Stüvel	1353d1df0f	OAPI: regenerate code	2022-06-27 11:32:12 +02:00
Sybren A. Stüvel	2d6c11e98b	Worker: send produced output to Manager Workers now send output produced by Blender (limited to PNG and JPEG images, currently) to Manager. This is done by converting to JPEG first, then sending the bytes via the Flamenco API to the Manager.	2022-06-27 11:30:37 +02:00
Sybren A. Stüvel	34f1cc076c	Cleanup: Worker, simplify Listener.Run() function No functional changes, except that now the "listener shutting down" message will also be logged in case of a panic.	2022-06-27 11:30:37 +02:00
Sybren A. Stüvel	f244355328	Worker: parse stdout of Blender to recognise saved files Prepare the Worker for submission of last-rendered images to Manager, by parsing `stdout` of Blender to see which files were saved. This needs more work, as now just an error "not implemented" is logged.	2022-06-27 11:30:37 +02:00
Sybren A. Stüvel	1f8c2df919	Worker: skip sometimes-hanging unit test The test can hang occasionally, and needs some love & attention. For now I've done some patching to make it slightly better, but still disabled it and added a `FIXME` note to it.	2022-06-27 11:30:35 +02:00
Sybren A. Stüvel	e6af6a708c	Manager: always close file when saving to JPEG Always close the output file; previously this was not done when the JPEG encoding would fail.	2022-06-26 13:24:37 +02:00
Sybren A. Stüvel	15ad890646	Unit test: properly close image file in test On Windows it's not allowed to erase a file while it's opened, which caused this error to surface. The file is now properly closed before the test file is erased.	2022-06-26 13:23:48 +02:00
Sybren A. Stüvel	e687c95e5d	Manager: add "last rendered image" processing pipeline Add a handler for the OpenAPI `taskOutputProduced` operation, and an image thumbnailing goroutine. The queue of images to process + the function to handle queued images is managed by `last_rendered.LastRenderedProcessor`. This queue currently simply allows 3 requests; this should be improved such that it keeps track of the job IDs as well, as with the current approach a spammy job can starve the updates from a more calm job.	2022-06-24 16:51:11 +02:00
Sybren A. Stüvel	167b2eaf45	OAPI: regenerate code	2022-06-24 16:39:50 +02:00
Sybren A. Stüvel	b53cd67eb4	Cleanup: rename `assertResponseEmpty()` → `assertResponseNoContent()` The function tests the HTTP response is `204 No Content`, and now the name reflects that better. No functional changes.	2022-06-24 16:09:46 +02:00
Sybren A. Stüvel	27a6dde708	Manager: add `local_storage` package for managing storage locations Add a `local_storage` package that finds a suitable place to put files. Currently it just looks at the location of the currently running executable; it can later do other things. It can be queried for directory to put job-specific files. It is intended to be used by the under-development "last rendered output" processing system, to store an image file per job. Later we should also refactor the task log handling system to use this.	2022-06-23 16:45:38 +02:00
Sybren A. Stüvel	b441f3f3de	Manager: load job compiler scripts from disk as well If there is a `scripts` directory next to the current executable, load scripts from that directory as well. It is still required to restart the Manager in order to pick up changes to those scripts (including new/removed files), PLUS a refresh in the add-on.	2022-06-21 17:59:20 +02:00
Sybren A. Stüvel	87f1959e26	Manager: use blocklist to actually block workers Actually use the blocklist in the task scheduler to block workers from doing blocked job types.	2022-06-21 17:59:20 +02:00
Sybren A. Stüvel	a0e8eebcb3	Manager: make access to job compilers script thread-safe When on-disk job compiler scripts are supported, they will be reloaded often, and it becomes more important to have the access to the map of loaded job compilers thread-safe.	2022-06-20 18:09:33 +02:00
Sybren A. Stüvel	defa5b0431	Refactor: extract 'get the embedded filesystem' to a separate function The global `scriptFS` variable was too easy to access, which caused an issue where the mandatory `"scripts"` subdirectory was not passed. Accessing via a getter function that hides this requirement prevents this.	2022-06-20 17:43:08 +02:00
Sybren A. Stüvel	201236cf46	Refactor: take some functions out of `job_compilers.Service` Take some functions out of the `Service` struct, as they are more or less standalone anyway. This will also make it easier later to make things thread-safe, as that'll become important when files can get live-reloaded.	2022-06-20 17:26:17 +02:00
Sybren A. Stüvel	d5c527209f	Cleanup: rename local var from `compiler` to `service` The `Load()` function returns a `*Service`, and it was confusing that the local variable is named `compiler` instead. Now it's called `service`. No functional changes.	2022-06-20 17:21:19 +02:00
Sybren A. Stüvel	89fdc45b45	Manager: ignore small JS files Empty (or almost-empty) JS files are ignored by the job compiler.	2022-06-20 17:14:06 +02:00
Sybren A. Stüvel	7a89c07fc9	Manager, refactor access to JS script files Refactor the JS script file loading code so that it's tied to the `fs.FS` interface for longer, and less to the specifics of our `embed.FS` instance. This should make it possible to use other filesystems, like a real on-disk one, to load scripts.	2022-06-20 17:06:46 +02:00
Sybren A. Stüvel	2d05e1c773	Fix unit test for recent scheduler change Fix unit test for rF1586c37b.	2022-06-20 16:05:36 +02:00
Sybren A. Stüvel	380d55b4f0	Cleanup: rename `job_compilers/path.go` to `js_path.go` Rename the file by adding `js_` prefix, to indicate it's for exposing a "path" object to JavaScript. No functional changes.	2022-06-20 15:57:03 +02:00
Sybren A. Stüvel	a7fbbf3313	Cleanup: rename `job_compilers/process.go` to `js_process.go` Rename the file by adding `js_` prefix, to indicate it's for exposing a "process" object to JavaScript. No functional changes.	2022-06-20 15:56:09 +02:00
Sybren A. Stüvel	1586c37b32	Manager: mark task as active as soon as it is assigned to a worker Move the task to 'active' status so that it won't be assigned to another worker. This also enables the task timeout monitoring.	2022-06-20 13:00:49 +02:00
Sybren A. Stüvel	2a4c9b2c13	Worker: enable SQLite foreign keys They're not used now, but enabling them is good default behaviour anyway.	2022-06-20 13:00:49 +02:00
Sybren A. Stüvel	de5d12362d	Manager: add `sleep_repeats` parameter to `echo-sleep-test` job type This makes it convenient to create an arbitrary number of tasks.	2022-06-20 11:44:41 +02:00
Sybren A. Stüvel	a2b667c043	Manager: log blocklist threshold	2022-06-17 17:15:23 +02:00
Sybren A. Stüvel	13bdb0ed73	Manager: remove outdated TODO	2022-06-17 17:15:13 +02:00
Sybren A. Stüvel	a368230afa	Manager: fix race condition in logging of worker name/UUID Instead of updating the logger in the context, just store a new logger in a new sub-context.	2022-06-17 17:13:32 +02:00
Sybren A. Stüvel	64c8fa851d	Show assigned worker in task details Show the worker assigned to the task in the task details view, as link to the worker itself.	2022-06-17 16:36:55 +02:00
Sybren A. Stüvel	7327896db9	Worker: allow overriding worker name from environment Allow overriding the worker name by setting the `FLAMENCO_WORKER_NAME` environment variable. This makes it easy to do from Docker configs, and, more importantly, from the scripts I use to run multiple workers on the same machine while developing Flamenco.	2022-06-17 16:24:03 +02:00
Sybren A. Stüvel	cdb7789f08	Refactor: Manager, move test code Move code that covers `worker_task_updates.go` into `worker_task_updates_test.go`. No functional changes.	2022-06-17 15:51:15 +02:00
Sybren A. Stüvel	046853932d	Manager: re-queue previously failed tasks of worker when blocklisting When a Worker is blocked from a job, re-queue its previously failed tasks so that other workers can give them a try.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	b95bed1f96	Refactor: rename `RequeueTasksOfWorker` to `RequeueActiveTasksOfWorker` Soon there will be another function to requeue tasks of workers by other criteria, so being clear in the name helps. No functional changes.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	fd31a85bcd	Manager: add blocking of workers when they fail certain tasks too much When a worker fails too many tasks, of the same task type, on the same job, it'll get blocked from doing those.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	56abc825a6	Refactor: Manager, refactor handling of task failures Split the handling of soft and hard failures into separate functions. No functional changes intended.	2022-06-17 15:01:52 +02:00
Sybren A. Stüvel	6feee74c54	Cleanup: Manager, move worker task update handling code into its own file Move the code related to task updates from workers to `worker_task_updates.go`. It's going to get more complex with the blocklisting in there; this prepares for that. No functional changes.	2022-06-17 11:46:07 +02:00
Sybren A. Stüvel	81f81d0e0a	Show task failure list in the web frontend Show the task failure list in the web frontend's `TaskDetails` component.	2022-06-17 11:37:56 +02:00
Sybren A. Stüvel	0b5140fc5f	Manager: clear task failure list on requeueing of jobs & tasks When a job or task gets requeued from the web interface, its task failure lists (i.e. the list of workers that previously failed this task) will be cleared. This clearing doesn't happen in other situations, e.g. when a worker signs off and its task gets requeued, the task's failure list will remain as-is.	2022-06-17 11:37:28 +02:00
Sybren A. Stüvel	e9fca8d993	Cleanup: typo fix in comment	2022-06-17 11:03:43 +02:00
Sybren A. Stüvel	b991e5f446	Cleanup: Manager, clarify some function names of the task state machine Rename functions `onTaskStatusX` to `updateJobOnTaskStatusX` to clarify their responsibility is to update the job in reaction to a task status change. No functional changes.	2022-06-17 11:01:41 +02:00
Sybren A. Stüvel	8764f8f7c1	Manager: task scheduler, don't schedule tasks the worker failed before When a worker asks for a task to perform, don't give it a task that it failed before.	2022-06-16 16:02:28 +02:00
Sybren A. Stüvel	ec10128f85	Worker: Sleep command, return error when sleep time is negative I need a way to reliably generate task errors, and having a more thorough check on the sleep duration parameter seemed a nice way to create those.	2022-06-16 15:46:03 +02:00
Sybren A. Stüvel	d5d0893b05	Worker: use explicit types for command parameter errors Introduce `ParameterMissingError` and `ParameterInvalidError` structs, to be returned from command executors. These replace free-form `fmt.Errorf()` style errors.	2022-06-16 15:45:09 +02:00
Sybren A. Stüvel	8af1b9d976	Worker: fix sync issue in TestUpstreamBufferManagerUnavailable unit test Fix synchronisation/goroutine issue in the "upstream buffer" test, where very occasionally the queue size was checked at the wrong time.	2022-06-16 15:43:20 +02:00
Sybren A. Stüvel	da1b42f9fa	Worker: fix sqlite connection issue in unit tests Fix sqlite issues in the "upstream buffer" test. The test used `:memory:` to have an in-memory DB to separate from other tests. The "flush at shutdown" code runs in a different goroutine, though, and creates a new DB connection. The SQLite separation was too strong, making that function not find any tables. This is now solved by having an in-memory database that's shared between all connections made from the same unit test.	2022-06-16 15:42:52 +02:00
Sybren A. Stüvel	7e28cfa69c	Worker: add task failures to the task log as well Task failures were only placed in the task's activity field, and are now added to the log as well.	2022-06-16 12:22:05 +02:00
Sybren A. Stüvel	e1309ad8fc	Worker: flush upstream buffer when shutting down When shutting down, the worker now tries to flush any buffered task updates before closing.	2022-06-16 12:21:17 +02:00
Sybren A. Stüvel	9ddf72fa37	Worker: sign off as last step of shutdown Within the shutdown procedure, signing off is now the last thing the worker does. This makes things more consistent from the Manager's point of view (like receiving last-second log entries while the Worker is still online).	2022-06-16 12:19:03 +02:00
Sybren A. Stüvel	5bc94101e8	Worker: Avoid sleep at shutdown Make the sleep between fetching tasks interruptible, so that a shutdown doesn't have to wait a few seconds.	2022-06-16 12:08:13 +02:00
Sybren A. Stüvel	9ab41984ac	Adjust Go code for Nickname -> Name change This fixes a bug where 'Worker undefined changed status' was logged in the web interface, as that was (back then incorrectly) `workerupdate.name`. Now that code is correct.	2022-06-16 11:03:18 +02:00
Sybren A. Stüvel	12f0a605a4	Manager: log configured worker timeout at startup	2022-06-16 10:51:17 +02:00
Sybren A. Stüvel	5f2712980e	Manager: task scheduler, check for requested worker status change first Before checking whether the Worker is allowed to do work (i.e. is in `awake` state), check any queued-up status changes. Those should be communicated, before saying "no work for you", so that the Worker can actually respond to it.	2022-06-16 10:48:38 +02:00
Sybren A. Stüvel	ee53373878	Cleanup: compare worker state to constant instead of hard-coded state Use the `requiredStatusToGetTask` constant to compare the worker status, and not just for logging. No functional changes, just better code.	2022-06-16 10:46:50 +02:00
Sybren A. Stüvel	40f711bf69	Fix two unit tests for the previous commit I pushed too soon :'(	2022-06-16 10:42:04 +02:00
Sybren A. Stüvel	be0b10400f	Manager: count workers as 'seen' even when there is no task Fix a bug where a worker would only be counted as 'seen' by the task scheduler if it actually got a task assigned.	2022-06-16 10:39:42 +02:00
Sybren A. Stüvel	7d7c2b1bd6	Cleanup: blacklist → blocklist Change "blacklist" to "blocklist", because that makes people happier. No functional changes.	2022-06-16 10:36:36 +02:00
Sybren A. Stüvel	6e12a2fb25	Manager: keep track of which worker failed which task When a Worker indicates a task failed, mark it as `soft-failed` until enough workers have tried & failed at the same task. This is the first step in a blocklisting system, where tasks of an often-failing worker will be requeued to be retried by others. NOTE: currently the failure list of a task is NOT reset whenever it is requeued! This will be implemented in a future commit, and is tracked in `FEATURES.md`.	2022-06-13 18:41:38 +02:00
Sybren A. Stüvel	c5debdeb70	Manager: add 'task failure list' to record workers failing tasks The persistence layer can now store which worker failed which task, as preparation for a blocklisting system. Such a system should be able to determine whether there are still any workers left to do the work.	2022-06-13 18:41:30 +02:00
Sybren A. Stüvel	e35911d106	Manager: add ability to delete jobs This is needed for a future unit test, and exposed the fact that SQLite didn't enforce foreign key constraints (and thus also didn't handle on-delete-cascade attributes). This has been fixed in the previous commit.	2022-06-13 18:41:19 +02:00
Sybren A. Stüvel	e5d0e987e1	Manager: enforce DB foreign key checks at startup SQLite disables foreign key checks by default, so Flamenco has to enable them explicitly.	2022-06-13 18:41:19 +02:00
Sybren A. Stüvel	6ec493d944	Manager, more efficiently create tasks When creating tasks the inter-task dependencies are saved as a 2nd pass, by updating the tasks in the database. This now only saves those dependencies, and no longer saves the entire task again.	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	02bc03ae2b	Manager: replace `gorm.Model` with our own `persistence.Model` struct `persistence.Model` contains the common database fields for most model structs. It is a copy of `gorm.Model`, but without the `DeletedAt` field (which triggers Gorm's soft deletion). Soft deletion is not used by Flamenco. If it ever becomes necessary to support soft-deletion, see https://gorm.io/docs/delete.html#Soft-Delete	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	ec5b3aac52	Manager: on getting task update from Worker, write log before status change When receiving a `TaskUpdate` from a Worker, write to the task log, before handling any task status change. If both log and task status change are sent, the log will likely contain the cause of the task state change. Any subsequent task logs, for example generated by the Manager in response to the status change, should be logged after that.	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	25d5b01b3c	Cleanup: test errors with `assert.NoError()` instead of `assert.Nil()` No functional changes, just nicer way to test.	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	6fc936d0a6	Revert accidental debug code Revert change in rF01c45afc20854918d1f18e6859b4154499d500b6 that made unit tests use an on-disk database.	2022-06-13 18:40:25 +02:00
Sybren A. Stüvel	b922722614	Manager: broadcast worker timeouts over SocketIO This way the web interface will also show timed-out workers.	2022-06-13 13:05:20 +02:00
Sybren A. Stüvel	75ca0e652e	Cleanup: timeout checker, improve readability of failed tests No functional changes	2022-06-13 12:50:27 +02:00
Sybren A. Stüvel	1de1e3a9a5	Manager: add 'canary' test to all timeout checker tests The canary test asserts that certain constants still have the expected value. Lowering those constants is good for testing the timeout stuff with the actual Flamenco Manager + Worker (without having to wait 5 minutes for it to kick in), but it's too easy to accidentally run the unit tests and get cryptic errors about everything failing horribly and miserably when you leave those constants low.	2022-06-13 12:50:02 +02:00
Sybren A. Stüvel	5dac3c2dc0	Manager: mark workers as 'seen' when they send updates Update the 'last seen at' timestamp of workers when they: - sign on - sign off - get a task assigned - send a task update - check whether they can keep running their task Note that this commit is necessary to not have the workers time out immediately ;-)	2022-06-13 12:47:07 +02:00
Sybren A. Stüvel	986b647967	Manager: re-queue tasks of timed-out workers Allow other workers to pick up the task(s) assigned to a timed-out worker.	2022-06-13 12:38:35 +02:00
Sybren A. Stüvel	7d5aae25b5	Manager: add timeout checks for workers	2022-06-13 12:33:22 +02:00
Sybren A. Stüvel	e8171fc597	Cleanup: Manager, reduce log level of task timeout checks	2022-06-13 12:33:16 +02:00
Sybren A. Stüvel	67562856d3	Manager: let Gorm create an index on `Task.LastTouchedAt` It's used in timeout queries, and there could be tens or hundreds of thousands of tasks in the database.	2022-06-13 12:33:05 +02:00
Sybren A. Stüvel	c3525c3b1a	Manager: move task requeueing to `TaskStateMachine` Requeueing the tasks of a specific worker is now done in the `TaskStateMachine`, such that it can be called from other services as well in future commits. This also makes the `LogStorage` service a dependency of the `TaskStateMachine`, as it needs to write "this task was requeued" kind of messages to the task logs.	2022-06-13 12:33:01 +02:00
Sybren A. Stüvel	e06bc484f4	Cleanup: manager, move task state machine interfaces to their own file No functional changes.	2022-06-13 12:32:18 +02:00
Sybren A. Stüvel	01c45afc20	Manager: explicitly store timestamps as UTC SQLite doesn't handle timezones by default, when you just use something like `date1 < date2`, for example. This makes GORM explicitly use UTC timestamps for the `CreatedAt`, `UpdatedAt`, and `DeletedAt` fields. Our own code should also use UTC when saving timestamps. That way all datetimes in the database are in the same timezone, and can be compared naively.	2022-06-13 12:10:11 +02:00
Sybren A. Stüvel	fe1627dd85	Cleanup: timeout checker, move task-specific code to `tasks.go` Just a cleanup to prepare for the addition of worker timeouts.	2022-06-10 14:58:44 +02:00
Sybren A. Stüvel	13307c5a24	Manager: add canary test to timeout checker unit test The `TestTaskTimeout()` unit test assumes specific durations for initial & subsequent sleeps of the timeout checker. The test will fail quite cryptically when that assumption doesn't hold, so just test for it at the start of the unit test.	2022-06-10 14:53:23 +02:00
Sybren A. Stüvel	09902d201c	Manager: fix task timeout check logging of assigned workers The task's worker wasn't fetched from the database, always causing "unknown worker" messages in the task log.	2022-06-10 14:52:03 +02:00
Sybren A. Stüvel	d90a8b987d	Manager: Task Timeout Checker Tasks that are in state `active` but haven't been 'touched' by a Worker for 10 minutes or longer will transition to state `failed`. In the future, it might be better to move the decision about which state is suitable to the Task State Machine service, so that it can be smarter and take the history of the task into account. Going to `soft-failed` first might be a nice touch.	2022-06-10 14:32:02 +02:00
Sybren A. Stüvel	295891a17a	Manager: ensure Gorm-generated timestamps are in UTC SQLite should store all timestamps in UTC, as the database is woefully unaware of timezones and will compare lexicographically.	2022-06-10 14:31:53 +02:00
Sybren A. Stüvel	24204084c1	Manager: move timestamping of log messages to `task_logs` package In the future different services will write to the task log, and thus it makes sense to move the responsibility of prepending the timestamps to the log storage service.	2022-06-09 17:00:38 +02:00
Sybren A. Stüvel	819cad1d18	Manager: move broadcasting of task logs via SocketIO to task log service To ensure all task logs also get broadcast via SocketIO, the responsibility has moved from the `api_impl` to the `task_logs` package.	2022-06-09 16:49:48 +02:00
Sybren A. Stüvel	04dd479248	Manager: protect task log writing with mutex A per-task mutex is used to protect the writing of task logs, so that multiple goroutines can safely write to the same task log.	2022-06-09 14:44:54 +02:00
Sybren A. Stüvel	92d6693871	Show Task's "last touched" in the web interface	2022-06-09 11:59:43 +02:00
Sybren A. Stüvel	354fd29f9e	Manager: Start timeout counting as soon as Worker gets task assigned Set the task's "last touched" field in the database to "now" as soon as the task is assigned to a worker.	2022-06-09 11:58:30 +02:00
Sybren A. Stüvel	87bce6be36	Manager: unify logging of task assignment and requeue-on-signoff The requeue-task-on-worker-signoff operation also needs to log a timestamp. The code for this, and the recently added code for timestamping the "task assigned to worker" message, are now unified.	2022-06-09 11:30:46 +02:00
Sybren A. Stüvel	75903a2da3	Manager: prepend timestamp to "task assigned to worker" task log entries Add a new `clock` service to the Flamenco struct, which allows us to mock the passing of time, and thus test for timestamps in a stable fashion.	2022-06-09 11:24:02 +02:00
Sybren A. Stüvel	b186ea1828	Manager: write to task log when assigning it to a worker	2022-06-09 10:59:44 +02:00
Sybren A. Stüvel	b4d2fc4231	Manager: keep track of when a Worker last worked on a task This will be used for keeping track of stuck tasks.	2022-06-03 16:33:50 +02:00
Sybren A. Stüvel	0be1ca30dd	Cleanup: manager, move api_impl interfaces to interfaces.go The number of interfaces declared by the `api_impl` package is getting large, so they deserve their own file. No functional changes.	2022-06-03 15:52:07 +02:00
Sybren A. Stüvel	8e7f1e2868	Manager: some extra unit tests for worker signoff behaviour	2022-06-02 16:37:29 +02:00
Sybren A. Stüvel	6cf82e5d43	Manager: cleanup, refactor Worker state change request persistence code Move the setting & clearing of worker state change requests into separate functions. No functional changes.	2022-06-02 16:36:06 +02:00
Sybren A. Stüvel	132ce8f2ec	Merge 'shutdown' and 'offline' states Move the 'shutdown' state code to the 'offline' state, to match the removal of the 'shutdown' state from the OpenAPI definition.	2022-06-02 16:35:07 +02:00
Sybren A. Stüvel	678308fb6d	Manager: allow cancelling worker state change requests A worker state change request can now be cancelled by requesting the worker to go to its current state. In other words, a previously requested change `A → B` can be cancelled by requesting that the worker go to state `A`. Previously this would simply overwrite the last request, resulting in a requested state change `A → A`. If that request was non-lazy, it would even interrupt the currently running task.	2022-06-02 12:43:16 +02:00
Sybren A. Stüvel	9ed6b6d931	Manager: adjust code for `WorkerStatusChangeRequest` extraction See preceding OpenAPI change.	2022-06-02 12:17:54 +02:00
Sybren A. Stüvel	ae6831ce6e	Manager: fix unit test rFcfb17b178da2055ef12b2aa2ad8f7f778a952bc3 changed the semantics of `SocketIOWorkerUpdate`, in the sense that any update that doesn't change the worker status can omit `previous_status`. This commit adjusts the unit test for this.	2022-06-02 12:13:25 +02:00

783 Commits