Improved performance in Nixpkgs
1 Avoiding subshells
A common complaint about Nixpkgs is that things become slow when you
have lots of dependencies. Build inputs are processed in Bash, which
tends to be pretty hard to make performant. Bash doesn’t give us any
way to loop through dependencies in parallel, so we end up with pretty
slow Bash. Luckily, someone has found some ways to speed this up with
some clever tricks in the setup.sh script.
1.1 Pull request
Albert Safin (@xzfc on GitHub) made an excellent PR that promises to
improve performance for all users of Nixpkgs. It is available as
PR #69131. The basic idea is to avoid invoking “subshells” in Bash.
Here, a subshell is basically anything that uses command substitution,
$(cmd ...). Each one requires forking a new process, which has a
roughly constant cost of about 2ms. This isn’t much in isolation, but
it adds up in big loops.
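To get a feel for that per-fork cost, here is a minimal sketch of a micro-benchmark. The function names and the iteration count are made up for illustration and are not taken from setup.sh, and the timings it prints will vary from machine to machine.

```bash
#!/usr/bin/env bash
# Hypothetical micro-benchmark: with_subshell, without_subshell and the
# iteration count are illustrative names, not part of setup.sh.

iterations=1000

# Variant A: each pass captures output with $(...), which forks a subshell.
with_subshell() {
    local i out
    for (( i = 0; i < iterations; i++ )); do
        out="$(echo "$i")"
    done
    last_a="$out"
}

# Variant B: same loop, but the value is assigned directly, with no fork.
without_subshell() {
    local i out
    for (( i = 0; i < iterations; i++ )); do
        out="$i"
    done
    last_b="$out"
}

# `time` is a Bash keyword, so it can time shell functions directly.
time with_subshell
time without_subshell
```

Both variants compute the same value; only the way the value travels back to the caller differs.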
Subshells are usually used in Bash because they are convenient and
easy to reason about. It’s easy to understand how a subshell works as
it’s just substituting the result of one command into another’s
arguments. We don’t usually care about the performance cost of
subshells. In the hot path of Nixpkgs’ setup.sh, however, it’s
pretty important to squeeze every bit of performance we can.
A few interesting changes were required to make this work. I’ll go through and document what they are. More information can be found in the Bash manual.
diff --git a/pkgs/stdenv/generic/setup.sh b/pkgs/stdenv/generic/setup.sh
index 326a60676a26..60067a4051de 100644
--- a/pkgs/stdenv/generic/setup.sh
+++ b/pkgs/stdenv/generic/setup.sh
@@ -98,7 +98,7 @@ _callImplicitHook() {
 # hooks exits the hook, not the caller. Also will only pass args if
 # command can take them
 _eval() {
-    if [ "$(type -t "$1")" = function ]; then
+    if declare -F "$1" > /dev/null 2>&1; then
         set +u
         "$@" # including args
     else
The first change is pretty easy to understand. It just replaces the
type call with a declare call, using the exit code in place of
stdout. Unfortunately, declare is a Bashism which is not available
in all POSIX shells. It’s been ill-defined whether Bashisms can be
used in Nixpkgs, but we will now require that Nixpkgs’ setup.sh be
sourced with Bash 4+.
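The difference between the two checks can be sketched in isolation. The helper names below are hypothetical; only type -t and declare -F are real Bash builtins.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: is_function_slow, is_function_fast and greet are
# illustrative names and do not appear in setup.sh.

# Old style: capture the output of `type -t` in a subshell, compare strings.
is_function_slow() {
    [ "$(type -t "$1")" = function ]
}

# New style: `declare -F name` exits 0 only when `name` is a defined
# function, so the exit status answers the question and nothing is forked.
is_function_fast() {
    declare -F "$1" > /dev/null 2>&1
}

greet() { echo "hello"; }

for name in greet cd no_such_thing; do
    if is_function_fast "$name"; then
        echo "$name: function"
    else
        echo "$name: not a function"
    fi
done
```

Only greet is reported as a function; cd is a builtin and no_such_thing is not defined at all.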
diff --git a/pkgs/stdenv/generic/setup.sh b/pkgs/stdenv/generic/setup.sh
index 60067a4051de..7e7f8739845b 100644
--- a/pkgs/stdenv/generic/setup.sh
+++ b/pkgs/stdenv/generic/setup.sh
@@ -403,6 +403,7 @@ findInputs() {
     # The current package's host and target offset together
     # provide a <=-preserving homomorphism from the relative
     # offsets to current offset
+    local -i mapOffsetResult
     function mapOffset() {
         local -ri inputOffset="$1"
         if (( "$inputOffset" <= 0 )); then
@@ -410,7 +411,7 @@ findInputs() {
         else
             local -ri outputOffset="$inputOffset - 1 + $targetOffset"
         fi
-        echo "$outputOffset"
+        mapOffsetResult="$outputOffset"
     }

     # Host offset relative to that of the package whose immediate
@@ -422,8 +423,8 @@ findInputs() {

     # Host offset relative to the package currently being
     # built---as absolute an offset as will be used.
-    local -i hostOffsetNext
-    hostOffsetNext="$(mapOffset relHostOffset)"
+    mapOffset relHostOffset
+    local -i hostOffsetNext="$mapOffsetResult"

     # Ensure we're in bounds relative to the package currently
     # being built.
@@ -441,8 +442,8 @@ findInputs() {

     # Target offset relative to the package currently being
     # built.
-    local -i targetOffsetNext
-    targetOffsetNext="$(mapOffset relTargetOffset)"
+    mapOffset relTargetOffset
+    local -i targetOffsetNext="$mapOffsetResult"

     # Once again, ensure we're in bounds relative to the
     # package currently being built.
Similarly, this change makes mapOffset store its result in
mapOffsetResult instead of printing it to stdout, avoiding the
subshell. Less functional, but more performant!
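The same pattern, reduced to a toy example. The names double_echo, double_var and doubleResult are hypothetical stand-ins for mapOffset and mapOffsetResult.

```bash
#!/usr/bin/env bash
# Hypothetical example: double_echo, double_var and doubleResult are
# illustrative names, not taken from setup.sh.

# Subshell style: the result is printed and the caller captures it with $(...).
double_echo() {
    echo $(( $1 * 2 ))
}

# Result-variable style: the function writes to a variable that the caller
# has agreed to read afterwards. No stdout capture, no fork.
declare -i doubleResult
double_var() {
    doubleResult=$(( $1 * 2 ))
}

a="$(double_echo 21)"   # forks a subshell
double_var 21           # runs in the current shell
b="$doubleResult"

echo "$a $b"   # prints "42 42"
```

The trade-off is that the result variable is shared state: the caller has to read it before the next call overwrites it.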
diff --git a/pkgs/stdenv/generic/setup.sh b/pkgs/stdenv/generic/setup.sh
index 7e7f8739845b..e25ea735a93c 100644
--- a/pkgs/stdenv/generic/setup.sh
+++ b/pkgs/stdenv/generic/setup.sh
@@ -73,21 +73,18 @@ _callImplicitHook() {
     set -u
     local def="$1"
     local hookName="$2"
-    case "$(type -t "$hookName")" in
-        (function|alias|builtin)
-            set +u
-            "$hookName";;
-        (file)
-            set +u
-            source "$hookName";;
-        (keyword) :;;
-        (*) if [ -z "${!hookName:-}" ]; then
-                return "$def";
-            else
-                set +u
-                eval "${!hookName}"
-            fi;;
-    esac
+    if declare -F "$hookName" > /dev/null; then
+        set +u
+        "$hookName"
+    elif type -p "$hookName" > /dev/null; then
+        set +u
+        source "$hookName"
+    elif [ -n "${!hookName:-}" ]; then
+        set +u
+        eval "${!hookName}"
+    else
+        return "$def"
+    fi
     # `_eval` expects hook to need nounset disable and leave it
     # disabled anyways, so Ok to to delegate. The alternative of a
     # return trap is no good because it would affect nested returns.
This change replaces the type -t command with calls to specific Bash
builtins. declare -F tells us if the hook is a function, type -p
tells us if hookName is a file (an executable found on PATH), and
otherwise [ -n ... ] tells us if the variable named by hookName is
non-empty. Again, this introduces a Bashism.
In the worst case, this does replace one case statement with multiple
if branches. But since most hooks are functions, most of the time this
ends up being a single if.
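A stripped-down sketch of that dispatch order follows. The function and hook names are hypothetical, and the nounset handling from the real code is left out to keep the example short.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run_hook, myFunctionHook, myStringHook and hookLog
# are illustrative names, not part of setup.sh.

run_hook() {
    local hookName="$1"
    if declare -F "$hookName" > /dev/null; then
        "$hookName"                 # a shell function: call it
    elif type -p "$hookName" > /dev/null; then
        source "$hookName"          # a file found by `type -p`: source it
    elif [ -n "${!hookName:-}" ]; then
        eval "${!hookName}"         # a non-empty variable: evaluate its contents
    else
        return 1                    # nothing to run
    fi
}

myFunctionHook() { hookLog+="function "; }
myStringHook='hookLog+="string "'

hookLog=""
run_hook myFunctionHook
run_hook myStringHook
run_hook undefinedHook || hookLog+="none"
echo "$hookLog"   # prints "function string none"
```

Each branch is decided by an exit status alone, so none of the checks needs to capture output.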
diff --git a/pkgs/stdenv/generic/setup.sh b/pkgs/stdenv/generic/setup.sh
index e25ea735a93c..ea550a6d534b 100644
--- a/pkgs/stdenv/generic/setup.sh
+++ b/pkgs/stdenv/generic/setup.sh
@@ -449,7 +449,8 @@ findInputs() {
             [[ -f "$pkg/nix-support/$file" ]] || continue

             local pkgNext
-            for pkgNext in $(< "$pkg/nix-support/$file"); do
+            read -r -d '' pkgNext < "$pkg/nix-support/$file" || true
+            for pkgNext in $pkgNext; do
                 findInputs "$pkgNext" "$hostOffsetNext" "$targetOffsetNext"
             done
         done
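This change replaces the $(< ) call with a read call. This is a
little surprising since read is using an empty delimiter '' instead
of a newline. With an empty delimiter, read keeps going until it sees
a NUL byte, so it reads the whole file in one go; it then returns
non-zero at end of file, which is why || true is needed. This replaces
one Bashism, $(< ), with another, -d. As a result, it gets rid of one
of the remaining subshell usages.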
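Here is a small sketch of the read -r -d '' idiom on its own. The temporary file stands in for a file under nix-support, and its contents are invented for the example. (The mktemp call still uses a command substitution, but only to set up the demo.)

```bash
#!/usr/bin/env bash
# Hypothetical example: the temporary file and the store paths in it are
# made up; they only stand in for a real nix-support file.

tmp="$(mktemp)"
printf '%s\n' '/nix/store/aaa-foo /nix/store/bbb-bar' '/nix/store/ccc-baz' > "$tmp"

# With -d '' the delimiter is the NUL byte, so `read` consumes the whole
# file. It reaches end-of-file before seeing a NUL and returns non-zero,
# which is why `|| true` is needed when `set -e` is active.
contents=""
read -r -d '' contents < "$tmp" || true

# Unquoted expansion splits on whitespace, giving one word per path.
count=0
for path in $contents; do
    count=$(( count + 1 ))
    echo "$path"
done
echo "$count paths"   # prints "3 paths"

rm -f "$tmp"
```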
diff --git a/pkgs/build-support/bintools-wrapper/setup-hook.sh b/pkgs/build-support/bintools-wrapper/setup-hook.sh
index f65b792485a0..27d3e6ad5120 100644
--- a/pkgs/build-support/bintools-wrapper/setup-hook.sh
+++ b/pkgs/build-support/bintools-wrapper/setup-hook.sh
@@ -61,9 +61,8 @@ do
     if
         PATH=$_PATH type -p "@targetPrefix@${cmd}" > /dev/null
     then
-        upper_case="$(echo "$cmd" | tr "[:lower:]" "[:upper:]")"
-        export "${role_pre}${upper_case}=@targetPrefix@${cmd}";
-        export "${upper_case}${role_post}=@targetPrefix@${cmd}";
+        export "${role_pre}${cmd^^}=@targetPrefix@${cmd}";
+        export "${cmd^^}${role_post}=@targetPrefix@${cmd}";
     fi
 done
This replaces a call to tr with the ^^ operator.
${parameter^^pattern} is a Bash 4 feature that lets you upper-case a
string without calling out to tr.
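Side by side, the two approaches look like this. It is a minimal sketch that assumes Bash 4 or newer; the variable names are illustrative.

```bash
#!/usr/bin/env bash
# Requires Bash 4 or newer. upper_tr and upper_exp are illustrative names.

cmd="objdump"

# Old style: pipe through tr inside a command substitution, which forks.
upper_tr="$(echo "$cmd" | tr '[:lower:]' '[:upper:]')"

# New style: case-modifying parameter expansion, done inside the shell.
upper_exp="${cmd^^}"

echo "$upper_tr $upper_exp"   # prints "OBJDUMP OBJDUMP"
```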
diff --git a/pkgs/build-support/bintools-wrapper/setup-hook.sh b/pkgs/build-support/bintools-wrapper/setup-hook.sh
index 27d3e6ad5120..2e15fa95c794 100644
--- a/pkgs/build-support/bintools-wrapper/setup-hook.sh
+++ b/pkgs/build-support/bintools-wrapper/setup-hook.sh
@@ -24,7 +24,8 @@ bintoolsWrapper_addLDVars () {
         # Python and Haskell packages often only have directories like $out/lib/ghc-8.4.3/ or
         # $out/lib/python3.6/, so having them in LDFLAGS just makes the linker search unnecessary
         # directories and bloats the size of the environment variable space.
-        if [[ -n "$(echo $1/lib/lib*)" ]]; then
+        local -a glob=( $1/lib/lib* )
+        if [ "${#glob[*]}" -gt 0 ]; then
             export NIX_${role_pre}LDFLAGS+=" -L$1/lib"
         fi
     fi
Here, we are checking whether any files match $1/lib/lib* using a
glob. It originally used a subshell to check if the result was empty,
but this change expands the glob into an array and replaces the check
with the Bash ${#name[*]} operation, which gives the number of
elements in the array.
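A sketch of the array-based check is below. One assumption to call out: it only works when the nullglob option is enabled, because otherwise an unmatched pattern stays in the array as a literal string and the count is 1. The sketch enables nullglob explicitly; the helper name and directories are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical example: has_libs and the temporary directories are
# illustrative. Assumption: nullglob is on, so a pattern with no matches
# expands to nothing instead of remaining as the literal pattern.
shopt -s nullglob

has_libs() {
    local -a glob=( "$1"/lib/lib* )
    [ "${#glob[@]}" -gt 0 ]
}

dir="$(mktemp -d)"
mkdir -p "$dir/empty/lib" "$dir/full/lib"
touch "$dir/full/lib/libfoo.so"

if has_libs "$dir/full";  then full=yes;  else full=no;  fi
if has_libs "$dir/empty"; then empty=yes; else empty=no; fi
echo "full=$full empty=$empty"   # prints "full=yes empty=no"

rm -rf "$dir"
```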
diff --git a/pkgs/stdenv/generic/setup.sh b/pkgs/stdenv/generic/setup.sh
index 311292169ecd..326a60676a26 100644
--- a/pkgs/stdenv/generic/setup.sh
+++ b/pkgs/stdenv/generic/setup.sh
@@ -17,7 +17,8 @@ fi
 # code). The hooks for <hookName> are the shell function or variable
 # <hookName>, and the values of the shell array ‘<hookName>Hooks’.
 runHook() {
-    local oldOpts="$(shopt -po nounset)"
+    local oldOpts="-u"
+    shopt -qo nounset || oldOpts="+u"
     set -u # May be called from elsewhere, so do `set -u`.

     local hookName="$1"
@@ -32,7 +33,7 @@ runHook() {
         set -u # To balance `_eval`
     done

-    eval "${oldOpts}"
+    set "$oldOpts"
     return 0
 }

@@ -40,7 +41,8 @@ runHook() {
 # Run all hooks with the specified name, until one succeeds (returns a
 # zero exit code). If none succeed, return a non-zero exit code.
 runOneHook() {
-    local oldOpts="$(shopt -po nounset)"
+    local oldOpts="-u"
+    shopt -qo nounset || oldOpts="+u"
     set -u # May be called from elsewhere, so do `set -u`.

     local hookName="$1"
@@ -57,7 +59,7 @@ runOneHook() {
         set -u # To balance `_eval`
     done

-    eval "${oldOpts}"
+    set "$oldOpts"
     return "$ret"
 }

@@ -500,10 +502,11 @@ activatePackage() {
     (( "$hostOffset" <= "$targetOffset" )) || exit -1

     if [ -f "$pkg" ]; then
-        local oldOpts="$(shopt -po nounset)"
+        local oldOpts="-u"
+        shopt -qo nounset || oldOpts="+u"
         set +u
         source "$pkg"
-        eval "$oldOpts"
+        set "$oldOpts"
     fi

     # Only dependencies whose host platform is guaranteed to match the
@@ -522,10 +525,11 @@ activatePackage() {
     fi

     if [[ -f "$pkg/nix-support/setup-hook" ]]; then
-        local oldOpts="$(shopt -po nounset)"
+        local oldOpts="-u"
+        shopt -qo nounset || oldOpts="+u"
         set +u
         source "$pkg/nix-support/setup-hook"
-        eval "$oldOpts"
+        set "$oldOpts"
     fi
 }

@@ -1273,17 +1277,19 @@ showPhaseHeader() {

 genericBuild() {
     if [ -f "${buildCommandPath:-}" ]; then
-        local oldOpts="$(shopt -po nounset)"
+        local oldOpts="-u"
+        shopt -qo nounset || oldOpts="+u"
         set +u
         source "$buildCommandPath"
-        eval "$oldOpts"
+        set "$oldOpts"
         return
     fi
     if [ -n "${buildCommand:-}" ]; then
-        local oldOpts="$(shopt -po nounset)"
+        local oldOpts="-u"
+        shopt -qo nounset || oldOpts="+u"
         set +u
         eval "$buildCommand"
-        eval "$oldOpts"
+        set "$oldOpts"
         return
     fi

@@ -1313,10 +1319,11 @@ genericBuild() {

         # Evaluate the variable named $curPhase if it exists, otherwise the
         # function named $curPhase.
-        local oldOpts="$(shopt -po nounset)"
+        local oldOpts="-u"
+        shopt -qo nounset || oldOpts="+u"
         set +u
         eval "${!curPhase:-$curPhase}"
-        eval "$oldOpts"
+        set "$oldOpts"

         if [ "$curPhase" = unpackPhase ]; then
             cd "${sourceRoot:-.}"
This last change is maybe the trickiest. $(shopt -po nounset) is
used to get the old value of nounset. The nounset setting tells
Bash to treat unset variables as an error. It is toggled temporarily
while phases and hooks run, and is reset to its previous value after
we finish evaling the current phase or hook. To avoid the subshell
here, the stdout provided by shopt -po is replaced with the exit code
provided by shopt -qo nounset. If shopt -qo nounset fails, we set
oldOpts to +u; otherwise it is assumed to be -u.
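The save-and-restore pattern can be sketched as a small wrapper. The wrapper and the helper it calls are hypothetical; setup.sh inlines the same few lines at each call site instead.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: with_nounset_off and touch_unset are illustrative
# names; setup.sh repeats these lines inline rather than using a helper.

with_nounset_off() {
    # `shopt -qo nounset` prints nothing; its exit status is 0 when
    # nounset is on, so the old state is recovered without a subshell.
    local oldOpts="-u"
    shopt -qo nounset || oldOpts="+u"

    set +u
    "$@"
    set "$oldOpts"    # restore whatever was in effect before
}

# Reads an unset variable, which would abort the script under `set -u`.
touch_unset() { seen="${undefined_variable}ok"; }

set -u
with_nounset_off touch_unset
shopt -qo nounset && after_on="still on"

set +u
with_nounset_off touch_unset
shopt -qo nounset || after_off="still off"

echo "$seen / $after_on / $after_off"   # prints "ok / still on / still off"
```

Passing a lone -u or +u to set changes only that option, so the positional parameters of the enclosing function are left alone.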
This commit was first merged on September 20, but it takes a while for changes to hit master. Today (October 13), it was finally merged into master in 4e6826a, so we can finally see the benefits from it!
1.2 Benchmarking
Hyperfine makes it easy to compare differences in timings. You can install it locally with:
$ nix-env -iA nixpkgs.hyperfine
Here are some of the results:
$ hyperfine --warmup 3 \
    'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p stdenv --run :' \
    'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p stdenv --run :'
Benchmark #1: nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p stdenv --run :
  Time (mean ± σ):     436.4 ms ±   8.5 ms    [User: 324.7 ms, System: 107.8 ms]
  Range (min … max):   430.8 ms … 459.6 ms    10 runs

Benchmark #2: nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p stdenv --run :
  Time (mean ± σ):     244.5 ms ±   2.3 ms    [User: 190.7 ms, System: 34.2 ms]
  Range (min … max):   241.8 ms … 248.3 ms    12 runs

Summary
  'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p stdenv --run :' ran
    1.79 ± 0.04 times faster than 'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p stdenv --run :'

$ hyperfine --warmup 3 \
    'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p i3.buildInputs --run :' \
    'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p i3.buildInputs --run :'
Benchmark #1: nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p i3.buildInputs --run :
  Time (mean ± σ):      3.428 s ±  0.015 s    [User: 2.489 s, System: 1.081 s]
  Range (min … max):    3.404 s …  3.453 s    10 runs

Benchmark #2: nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p i3.buildInputs --run :
  Time (mean ± σ):     873.4 ms ±  12.2 ms    [User: 714.7 ms, System: 89.3 ms]
  Range (min … max):   861.5 ms … 906.4 ms    10 runs

Summary
  'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p i3.buildInputs --run :' ran
    3.92 ± 0.06 times faster than 'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p i3.buildInputs --run :'

$ hyperfine --warmup 3 \
    'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p inkscape.buildInputs --run :' \
    'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p inkscape.buildInputs --run :'
Benchmark #1: nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p inkscape.buildInputs --run :
  Time (mean ± σ):      4.380 s ±  0.024 s    [User: 3.155 s, System: 1.443 s]
  Range (min … max):    4.339 s …  4.409 s    10 runs

Benchmark #2: nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p inkscape.buildInputs --run :
  Time (mean ± σ):      1.007 s ±  0.011 s    [User: 826.7 ms, System: 114.2 ms]
  Range (min … max):    0.995 s …  1.026 s    10 runs

Summary
  'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4e6826a.tar.gz -p inkscape.buildInputs --run :' ran
    4.35 ± 0.05 times faster than 'nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/33366cc.tar.gz -p inkscape.buildInputs --run :'
Try running these commands yourself, and compare the results.
1.3 Results
Avoiding subshells makes processing inputs up to about 4x faster than it used to be. That multiplier depends on precisely how many inputs we are processing. It’s a pretty impressive improvement, and it comes with no added cost. These kinds of easy wins in performance are pretty rare, and worth celebrating!