As described in the Nix manual, almost any <filename>*.drv</filename> store path in a derivation's attribute set will induce a dependency on that derivation.
<varname>mkDerivation</varname>, however, takes a few attributes intended to, between them, include all the dependencies of a package.
This is done both for structure and consistency, but also so that certain other setup can take place.
For example, certain dependencies need their bin directories added to the <envar>PATH</envar>.
That is built-in, but other setup is done via a pluggable mechanism that works in conjunction with these dependency attributes.
See <xreflinkend="ssec-setup-hooks"/> for details.
</para>
<para>
Dependencies can be broken down along three axes: their host and target platforms relative to the new derivation's, and whether they are propagated.
The platform distinctions are motivated by cross compilation; see <xreflinkend="chap-cross"/> for exactly what each platform means.
<footnote><para>
The build platform is ignored because it is a mere implementation detail of the package satisfying the dependency:
As a general programming principle, dependencies are always <emphasis>specified</emphasis> as interfaces, not concrete implementation.
</para></footnote>
But even if one is not cross compiling, the platforms imply whether or not the dependency is needed at run-time or build-time, a concept that makes perfect sense outside of cross compilation.
For now, the run-time/build-time distinction is just a hint for mental clarity, but in the future it perhaps could be enforced.
</para>
<para>
The extension of <envar>PATH</envar> with dependencies, alluded to above, proceeds according to the relative platforms alone.
The process is carried out only for dependencies whose host platform matches the new derivation's build platform–i.e. which run on the platform where the new derivation will be built.
<footnote><para>
Currently, that means for native builds all dependencies are put on the <envar>PATH</envar>.
But in the future that may not be the case for sake of matching cross:
the platforms would be assumed to be unique for native and cross builds alike, so only the <varname>depsBuild*</varname> and <varname>nativeBuildDependencies</varname> dependencies would affect the <envar>PATH</envar>.
</para></footnote>
For each dependency <replaceable>dep</replaceable> of those dependencies, <filename><replaceable>dep</replaceable>/bin</filename>, if present, is added to the <envar>PATH</envar> environment variable.
</para>
<para>
The dependency is propagated when it forces some of its other-transitive (non-immediate) downstream dependencies to also take it on as an immediate dependency.
Nix itself already takes a package's transitive dependencies into account, but this propagation ensures nixpkgs-specific infrastructure like setup hooks (mentioned above) also are run as if the propagated dependency.
</para>
<para>
It is important to note dependencies are not necessary propagated as the same sort of dependency that they were before, but rather as the corresponding sort so that the platform rules still line up.
The exact rules for dependency propagation can be given by assigning each sort of dependency two integers based one how it's host and target platforms are offset from the depending derivation's platforms.
Those offsets are given are given below in the descriptions of each dependency list attribute.
Algorithmically, we traverse propagated inputs, accumulating every propagated dep's propagated deps and adjusting them to account for the "shift in perspective" described by the current dep's platform offsets.
This results in sort a transitive closure of the dependency relation, with the offsets being approximately summed when two dependency links are combined.
We also prune transitive deps whose combined offsets go out-of-bounds, which can be viewed as a filter over that transitive closure removing dependencies that are blatantly absurd.
</para>
<para>
We can define the process precisely with <linkxlink:href="https://en.wikipedia.org/wiki/Natural_deduction">Natural Deduction</link> using the inference rules.
This probably seems a bit obtuse, but so is the bash code that actually implements it!
<footnote><para>
The <function>findInputs</function> function, currently residing in <filename>pkgs/stdenv/generic/setup.sh</filename>, implements the propagation logic.
</para></footnote>
They're confusing in very different ways so...hopefully if something doesn't make sense in one presentation, it does in the other!
<programlisting>
let mapOffset(h, t, i) = i + (if i <= 0 then h else t - 1)
let mapOffset(h, t, i) = i + (if i <= 0 then h else t - 1)
dep(h0, _, A, B)
propagated-dep(h1, t1, B, C)
h0 + h1 in {-1, 0, 1}
h0 + t1 in {-1, 0, -1}
-------------------------------------- Take immediate deps' propagated deps
propagated-dep(mapOffset(h0, t0, h1),
mapOffset(h0, t0, t1),
A, C)</programlisting>
<programlisting>
propagated-dep(h, t, A, B)
-------------------------------------- Propagated deps count as deps
dep(h, t, A, B)</programlisting>
Some explanation of this monstrosity is in order.
In the common case, the target offset of a dependency is the successor to the target offset: <literal>t = h + 1</literal>.
That means that:
<programlisting>
let f(h, t, i) = i + (if i <= 0 then h else t - 1)
let f(h, h + 1, i) = i + (if i <= 0 then h else (h + 1) - 1)
let f(h, h + 1, i) = i + (if i <= 0 then h else h)
let f(h, h + 1, i) = i + h
</programlisting>
This is where the "sum-like" comes from above:
We can just sum all the host offset to get the host offset of the transitive dependency.
The target offset is the transitive dep is simply the host offset + 1, just as it was with the dependencies composed to make this transitive one;
it can be ignored as it doesn't add any new information.
</para>
<para>
Because of the bounds checks, the uncommon cases are <literal>h = t</literal> and <literal>h + 2 = t</literal>.
In the former case, the motivation for <function>mapOffset</function> is that since its host and target platforms are the same, no transitive dep of it should be able to "discover" an offset greater than its reduced target offsets.
<function>mapOffset</function> effectively "squashes" all its transitive dependencies' offsets so that none will ever be greater than the target offset of the original <literal>h = t</literal> package.
In the other case, <literal>h + 1</literal> is skipped over between the host and target offsets.
Instead of squashing the offsets, we need to "rip" them apart so no transitive dependencies' offset is that one.
</para>
<para>
Overall, the unifying theme here is that propagation shouldn't be introducing transitive dependencies involving platforms the needing package is unaware of.
The offset bounds checking and definition of <function>mapOffset</function> together ensure that this is the case.
Discovering a new offset is discovering a new platform, and since those platforms weren't in the derivation "spec" of the needing package, they cannot be relevant.
From a capability perspective, we can imagine that the host and target platforms of a package are the capabilities a package requires, and the depending package must provide the capability to the dependency.
A list of dependencies whose host and target platforms are the new derivation's build platform.
This means a <literal>-1</literal> host and <literal>-1</literal> target offset from the new derivation's platforms.
They are programs/libraries used at build time that furthermore produce programs/libraries also used at build time.
If the dependency doesn't care about the target platform (i.e. isn't a compiler or similar tool), put it in <varname>nativeBuildInputs</varname>instead.
The most common use for this <literal>buildPackages.stdenv.cc</literal>, the default C compiler for this role.
That example crops up more than one might think in old commonly used C libraries.
</para>
<para>
Since these packages are able to be run at build time, that are always added to the <envar>PATH</envar>, as described above.
But since these packages are only guaranteed to be able to run then, they shouldn't persist as run-time dependencies.
This isn't currently enforced, but could be in the future.
A list of dependencies whose host platform is the new derivation's build platform, and target platform is the new derivation's host platform.
This means a <literal>-1</literal> host offset and <literal>0</literal> target offset from the new derivation's platforms.
They are programs/libraries used at build time that, if they are a compiler or similar tool, produce code to run at run time—i.e. tools used to build the new derivation.
If the dependency doesn't care about the target platform (i.e. isn't a compiler or similar tool), put it here, rather than in <varname>depsBuildBuild</varname> or <varname>depsBuildTarget</varname>.
This would be called <varname>depsBuildHost</varname> but for historical continuity.
</para>
<para>
Since these packages are able to be run at build time, that are added to the <envar>PATH</envar>, as described above.
But since these packages only are guaranteed to be able to run then, they shouldn't persist as run-time dependencies.
This isn't currently enforced, but could be in the future.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>depsBuildTarget</varname></term>
<listitem>
<para>
A list of dependencies whose host platform is the new derivation's build platform, and target platform is the new derivation's target platform.
This means a <literal>-1</literal> host offset and <literal>1</literal> target offset from the new derivation's platforms.
They are programs used at build time that produce code to run at run with code produced by the depending package.
Most commonly, these would tools used to build the runtime or standard library the currently-being-built compiler will inject into any code it compiles.
In many cases, the currently-being built compiler is itself employed for that task, but when that compiler won't run (i.e. its build and host platform differ) this is not possible.
Other times, the compiler relies on some other tool, like binutils, that is always built separately so the dependency is unconditional.
</para>
<para>
This is a somewhat confusing dependency to wrap ones head around, and for good reason.
As the only one where the platform offsets are not adjacent integers, it requires thinking of a bootstrapping stage <emphasis>two</emphasis> away from the current one.
It and it's use-case go hand in hand and are both considered poor form:
try not to need this sort dependency, and try not avoid building standard libraries / runtimes in the same derivation as the compiler produces code using them.
Instead strive to build those like a normal library, using the newly-built compiler just as a normal library would.
In short, do not use this attribute unless you are packaging a compiler and are sure it is needed.
</para>
<para>
Since these packages are able to be run at build time, that are added to the <envar>PATH</envar>, as described above.
But since these packages only are guaranteed to be able to run then, they shouldn't persist as run-time dependencies.
This isn't currently enforced, but could be in the future.
A list of dependencies whose host and target platforms match the new derivation's host platform.
This means a both <literal>0</literal> host offset and <literal>0</literal> target offset from the new derivation's host platform.
These are packages used at run-time to generate code also used at run-time.
In practice, that would usually be tools used by compilers for metaprogramming/macro systems, or libraries used by the macros/metaprogramming code itself.
It's always preferable to use a <varname>depsBuildBuild</varname> dependency in the derivation being built than a <varname>depsHostHost</varname> on the tool doing the building for this purpose.
A list of dependencies whose host platform and target platform match the new derivation's.
This means a <literal>0</literal> host offset and <literal>1</literal> target offset from the new derivation's host platform.
This would be called <varname>depsHostTarget</varname> but for historical continuity.
If the dependency doesn't care about the target platform (i.e. isn't a compiler or similar tool), put it here, rather than in <varname>depsBuildBuild</varname>.
</para>
<para>
These often are programs/libraries used by the new derivation at <emphasis>run</emphasis>-time, but that isn't always the case.
For example, the machine code in a statically linked library is only used at run time, but the derivation containing the library is only needed at build time.
Even in the dynamic case, the library may also be needed at build time to appease the linker.
A list of dependencies whose host platform matches the new derivation's target platform.
This means a <literal>1</literal> offset from the new derivation's platforms.
These are packages that run on the target platform, e.g. the standard library or run-time deps of standard library that a compiler insists on knowing about.
It's poor form in almost all cases for a package to depend on another from a future stage [future stage corresponding to positive offset].
Do not use this attribute unless you are packaging a compiler and are sure it is needed.
The propagated equivalent of <varname>nativeBuildInputs</varname>.
This would be called <varname>depsBuildHostPropagated</varname> but for historical continuity.
For example, if package <varname>Y</varname> has <literal>propagatedNativeBuildInputs = [X]</literal>, and package <varname>Z</varname> has <literal>buildInputs = [Y]</literal>, then package <varname>Z</varname> will be built as if it included package <varname>X</varname> in its <varname>nativeBuildInputs</varname>.
If instead, package <varname>Z</varname> has <literal>nativeBuildInputs = [Y]</literal>, then <varname>Z</varname> will be built as if it included <varname>X</varname> in the <varname>depsBuildBuild</varname> of package <varname>Z</varname>, because of the sum of the two <literal>-1</literal> host offsets.
A natural number indicating how much information to log.
If set to 1 or higher, <literal>stdenv</literal> will print moderate debug information during the build.
In particular, the <command>gcc</command> and <command>ld</command> wrapper scripts will print out the complete command line passed to the wrapped tools.
If set to 6 or higher, the <literal>stdenv</literal> setup script will be run with <literal>set -x</literal> tracing.
If set to 7 or higher, the <command>gcc</command> and <command>ld</command> wrapper scripts will also be run with <literal>set -x</literal> tracing.
<footnote><para>Eventually these will be passed when in native builds too, to improve determinism: build-time guessing, as is done today, is a risk of impurity.</para></footnote>
Nix itself considers a build-time dependency merely something that should previously be built and accessible at build time—packages themselves are on their own to perform any additional setup.
In most cases, that is fine, and the downstream derivation can deal with it's own dependencies.
But for a few common tasks, that would result in almost every package doing the same sort of setup work---depending not on the package itself, but entirely on which dependencies were used.
</para>
<para>
In order to alleviate this burden, the <firstterm>setup hook></firstterm>mechanism was written, where any package can include a shell script that [by convention rather than enforcement by Nix], any downstream reverse-dependency will source as part of its build process.
That allows the downstream dependency to merely specify its dependencies, and lets those dependencies effectively initialize themselves.
No boilerplate mirroring the list of dependencies is needed.
</para>
<para>
The Setup hook mechanism is a bit of a sledgehammer though: a powerful feature with a broad and indiscriminate area of effect.
The combination of its power and implicit use may be expedient, but isn't without costs.
Nix itself is unchanged, but the spirit of adding dependencies being effect-free is violated even if the letter isn't.
For example, if a derivation path is mentioned more than once, Nix itself doesn't care and simply makes sure the dependency derivation is already built just the same—depending is just needing something to exist, and needing is idempotent.
However, a dependency specified twice will have its setup hook run twice, and that could easily change the build environment (though a well-written setup hook will therefore strive to be idempotent so this is in fact not observable).
More broadly, setup hooks are anti-modular in that multiple dependencies, whether the same or different, should not interfere and yet their setup hooks may well do so.
</para>
<para>
The most typical use of the setup hook is actually to add other hooks which are then run (i.e. after all the setup hooks) on each dependency.
For example, the C compiler wrapper's setup hook feeds itself flags for each dependency that contains relevant libaries and headers.
This is done by defining a bash function, and appending its name to one of
<envar>envBuildBuildHooks</envar>`,
<envar>envBuildHostHooks</envar>`,
<envar>envBuildTargetHooks</envar>`,
<envar>envHostHostHooks</envar>`,
<envar>envHostTargetHooks</envar>`, or
<envar>envTargetTargetHooks</envar>`.
These 6 bash variables correspond to the 6 sorts of dependencies by platform (there's 12 total but we ignore the propagated/non-propagated axis).
</para>
<para>
Packages adding a hook should not hard code a specific hook, but rather choose a variable <emphasis>relative</emphasis> to how they are included.
Returning to the C compiler wrapper example, if it itself is an <literal>n</literal> dependency, then it only wants to accumulate flags from <literal>n + 1</literal> dependencies, as only those ones match the compiler's target platform.
The <envar>hostOffset</envar> variable is defined with the current dependency's host offset <envar>targetOffset</envar> with its target offset, before it's setup hook is sourced.
Additionally, since most environment hooks don't care about the target platform,
That means the setup hook can append to the right bash array by doing something like
<programlistinglanguage="bash">
addEnvHooks "$hostOffset" myBashFunction
</programlisting>
</para>
<para>
The <emphasis>existence</emphasis> of setups hooks has long been documented and packages inside Nixpkgs are free to use these mechanism.
Other packages, however, should not rely on these mechanisms not changing between Nixpkgs versions.
Because of the existing issues with this system, there's little benefit from mandating it be stable for any period of time.
</para>
<para>
Here are some packages that provide a setup hook.
Since the mechanism is modular, this probably isn't an exhaustive list.
Then again, since the mechanism is only to be used as a last resort, it might be.
Bintools Wrapper wraps the binary utilities for a bunch of miscellaneous purposes.
These are GNU Binutils when targetting Linux, and a mix of cctools and GNU binutils for Darwin.
[The "Bintools" name is supposed to be a compromise between "Binutils" and "cctools" not denoting any specific implementation.]
Specifically, the underlying bintools package, and a C standard library (glibc or Darwin's libSystem, just for the dynamic loader) are all fed in, and dependency finding, hardening (see below), and purity checks for each are handled by Bintools Wrapper.
Packages typically depend on CC Wrapper, which in turn (at run time) depends on Bintools Wrapper.
Bintools Wrapper was only just recently split off from CC Wrapper, so the division of labor is still being worked out.
For example, it shouldn't care about about the C standard library, but just take a derivation with the dynamic loader (which happens to be the glibc on linux).
Dependency finding however is a task both wrappers will continue to need to share, and probably the most important to understand.
It is currently accomplished by collecting directories of host-platform dependencies (i.e. <varname>buildInputs</varname> and <varname>nativeBuildInputs</varname>) in environment variables.
Bintools Wrapper's setup hook causes any <filename>lib</filename> and <filename>lib64</filename> subdirectories to be added to <envar>NIX_LDFLAGS</envar>.
Since CC Wrapper and Bintools Wrapper use the same strategy, most of the Bintools Wrapper code is sparsely commented and refers to CC Wrapper.
But CC Wrapper's code, by contrast, has quite lengthy comments.
Bintools Wrapper merely cites those, rather than repeating them, to avoid falling out of sync.
Firstly, this helps poorly-written packages, e.g. ones that look for just <command>gcc</command> when <envar>CC</envar> isn't defined yet <command>clang</command> is to be used.
Secondly, this helps packages not get confused when cross-compiling, in which case multiple Bintools Wrappers may simultaneously be in use.
<footnote><para>
Each wrapper targets a single platform, so if binaries for multiple platforms are needed, the underlying binaries must be wrapped multiple times.
As this is a property of the wrapper itself, the multiple wrappings are needed whether or not the same underlying binaries can target multiple platforms.
</para></footnote>
<envar>BUILD_</envar>- and <envar>TARGET_</envar>-prefixed versions of the normal environment variable are defined for the additional Bintools Wrappers, properly disambiguating them.
Most packages, however, firstly use the C compiler for linking, secondly use <envar>LD</envar> anyways, defining it as the C compiler, and thirdly, only so define <envar>LD</envar> when it is undefined as a fallback.
This triple-threat means Bintools Wrapper will break those packages, as LD is already defined as the actual linker which the package won't override yet doesn't want to use.
CC Wrapper wraps a C toolchain for a bunch of miscellaneous purposes.
Specifically, a C compiler (GCC or Clang), wrapped binary tools, and a C standard library (glibc or Darwin's libSystem, just for the dynamic loader) are all fed in, and dependency finding, hardening (see below), and purity checks for each are handled by CC Wrapper.
Packages typically depend on CC Wrapper, which in turn (at run time) depends on Bintools Wrapper.
</para>
<para>
Dependency finding is undoubtedly the main task of CC Wrapper.
This works just like Bintools Wrapper, except that any <filename>include</filename> subdirectory of any relevant dependency is added to <envar>NIX_CFLAGS_COMPILE</envar>.
The setup hook itself contains some lengthy comments describing the exact convoluted mechanism by which this is accomplished.
</para>
<para>
CC Wrapper also like Bintools Wrapper defines standard environment variables with the names of the tools it wraps, for the same reasons described above.
Importantly, while it includes a <command>cc</command> symlink to the c compiler for portability, the <envar>CC</envar> will be defined using the compiler's "real name" (i.e. <command>gcc</command> or <command>clang</command>).
This helps lousy build systems that inspect on the name of the compiler rather than run it.
Adds the <filename>lib/site_perl</filename> subdirectory of each build input to the <envar>PERL5LIB</envar> environment variable.
For instance, if <varname>buildInputs</varname> contains Perl, then the <filename>lib/site_perl</filename> subdirectory of each input is added to the <envar>PERL5LIB</envar> environment variable.
<para>This needs to be turned off or fixed for errors similar to:</para>
<programlisting>
/tmp/nix-build-zynaddsubfx-2.5.2.drv-0/zynaddsubfx-2.5.2/src/UI/guimain.cpp:571:28: error: format not a string literal and no format arguments [-Werror=format-security]