Inspired by this blog post from Tavis, I decided to document my own recent journey of reducing terminal (ZSH) startup lag. This post is far less interesting than the one from Tavis, who uses a debugger to patch applications on the fly, but it should still be interesting for some. It also shows how powerful Nix can be for some things.
For context, I have basically 3 systems where I use the terminal frequently:
- Thinkpad P14s Gen 1 running NixOS, with a reasonably fast CPU (AMD Ryzen 7 PRO 4750U) and disk (NVMe)
- MacBook Pro with a really fast CPU (M1 Pro) and disk, but of course running macOS. It is being phased out, since it is owned by my employer and I am changing jobs right now, but it should be replaced with another one soon™
- Chromebook Duet 3 running ChromeOS, with a slow CPU (Snapdragon 7c Gen 2) and disk (eMMC, really?)
My experience is similar to Tavis's: at around 300ms of startup time I don't care too much, but around 500ms+ is where I start to notice. I never had any issues with startup time on NixOS itself (I had issues on macOS before, but it was not actually the fault of macOS), but on the Chromebook it was awful: 600ms+ with a hot start, while a cold start could take multiple seconds.
We can check how long ZSH takes to start by using:
$ time zsh -ic exit
zsh -ic exit 0.04s user 0.10s system 100% cpu 0.143 total
The -i flag here is important, because we are interested in the interactive use of ZSH. Without this flag ZSH will ignore your ~/.zshrc file, and the results will be meaningless.
To do a more interesting benchmark, we can use hyperfine:
$ hyperfine "zsh -ic exit"
Benchmark 1: zsh -ic exit
Time (mean ± σ): 145.4 ms ± 4.2 ms [User: 49.8 ms, System: 97.3 ms]
Range (min … max): 138.6 ms … 155.3 ms 19 runs
Hyperfine will run the command multiple times and take care of things like shell startup time. A really great tool to have in your toolbox by the way, but I digress.
So let's do a little time travelling, going back to commit b12757f from nix-configs. Running hyperfine like above from my NixOS laptop, we have:
$ hyperfine "zsh -ic exit"
Benchmark 1: zsh -ic exit
Time (mean ± σ): 218.6 ms ± 5.1 ms [User: 70.6 ms, System: 151.5 ms]
Range (min … max): 210.3 ms … 227.0 ms 13 runs
This doesn't look that bad, but let's see the same commit in my Chromebook:
$ hyperfine "zsh -ic exit"
Benchmark 1: zsh -ic exit
Time (mean ± σ): 679.7 ms ± 40.2 ms [User: 230.8 ms, System: 448.5 ms]
Range (min … max): 607.3 ms … 737.0 ms 10 runs
Yikes, this is much worse. And those are the results after I retried the
benchmark (so it is a hot start). The cold start times were above 3s. So let's
investigate what is happening here. We can profile what is taking time during
the startup of ZSH using zprof.
You can add the following to your ~/.zshrc:
# At the top of your ~/.zshrc file
zmodload zsh/zprof
# ...
# At the end of your ~/.zshrc file
zprof
Or, if using Home-Manager, use the programs.zsh.zprof.enable option. Once we restart ZSH, we will have something like:
num calls time self name
-----------------------------------------------------------------------------------
1) 1 36.91 36.91 34.29% 30.47 30.47 28.31% (anon) [/home/thiagoko/.zsh/plugins/zim-completion/init.zsh:13]
2) 1 25.43 25.43 23.63% 25.43 25.43 23.63% (anon) [/home/thiagoko/.zsh/plugins/zim-ssh/init.zsh:6]
3) 1 22.00 22.00 20.45% 21.92 21.92 20.36% _zsh_highlight_load_highlighters
4) 1 12.32 12.32 11.45% 12.32 12.32 11.45% autopair-init
5) 1 6.44 6.44 5.98% 6.44 6.44 5.98% compinit
6) 1 3.56 3.56 3.31% 3.48 3.48 3.23% prompt_pure_state_setup
7) 2 3.79 1.89 3.52% 2.85 1.43 2.65% async
8) 1 0.93 0.93 0.87% 0.93 0.93 0.87% async_init
9) 6 0.93 0.15 0.86% 0.93 0.15 0.86% is-at-least
10) 6 0.67 0.11 0.63% 0.67 0.11 0.63% add-zle-hook-widget
11) 1 8.25 8.25 7.66% 0.61 0.61 0.57% prompt_pure_setup
12) 1 0.40 0.40 0.37% 0.40 0.40 0.37% (anon) [/nix/store/p1zqypy7600fvfyl1v571bljx2l8zhay-zsh-autosuggestions-0.7.0/share/zsh-autosuggestions/zsh-autosuggestions.zsh:458]
13) 5 0.31 0.06 0.29% 0.31 0.06 0.29% add-zsh-hook
14) 1 0.60 0.60 0.56% 0.29 0.29 0.27% (anon) [/home/thiagoko/.zsh/plugins/zim-input/init.zsh:5]
15) 1 0.21 0.21 0.20% 0.21 0.21 0.20% compdef
16) 1 0.10 0.10 0.09% 0.10 0.10 0.09% _zsh_highlight__function_is_autoload_stub_p
17) 1 0.26 0.26 0.24% 0.08 0.08 0.08% _zsh_highlight__function_callable_p
18) 1 0.08 0.08 0.08% 0.08 0.08 0.08% prompt_pure_is_inside_container
19) 1 0.07 0.07 0.07% 0.07 0.07 0.07% _zsh_highlight__is_function_p
20) 1 0.01 0.01 0.01% 0.01 0.01 0.01% __wezterm_install_bash_prexec
21) 1 0.00 0.00 0.00% 0.00 0.00 0.00% _zsh_highlight_bind_widgets
# ...
I omitted some output for brevity. The first 2 things that show up are from zimfw, the framework that I use to configure my ZSH (similar to Oh-My-Zsh). I actually don't use zimfw directly; instead I just load some modules that I find useful, like the zim-completion and zim-ssh that we can see above. By the way, Zim is generally really well optimised for startup time, but those 2 modules are kind of slow.
For zim-completion, after taking a look at it, there isn't much I could do. It seems that the reason zim-completion takes so long during startup is that it is trying to decide whether it needs to recompile the completions (and replacing it with just a naive autoload -U compinit && compinit is even worse for startup performance). I may eventually replace it with something else, but I really like what Zim brings here, so I decided not to touch it for now.
However, zim-ssh is another story. The only reason I used it is to start an ssh-agent and keep it between multiple ZSH sessions. It shouldn't have that much influence on startup time. So I took a look at the code (since it is small, I am reproducing it here):
#
# Set up ssh-agent
#
# Don't do anything unless we can actually use ssh-agent
(( ${+commands[ssh-agent]} )) && () {
  ssh-add -l &>/dev/null
  if (( ? == 2 )); then
    # Unable to contact the authentication agent
    # Load stored agent connection info
    local -r ssh_env=${HOME}/.ssh-agent
    if [[ -r ${ssh_env} ]] source ${ssh_env} >/dev/null
    ssh-add -l &>/dev/null
    if (( ? == 2 )); then
      # Start agent and store agent connection info
      (umask 066; ssh-agent >! ${ssh_env})
      source ${ssh_env} >/dev/null
    fi
  fi
  # Load identities
  ssh-add -l &>/dev/null
  if (( ? == 1 )); then
    local -a zssh_ids
    zstyle -a ':zim:ssh' ids 'zssh_ids'
    if (( ${#zssh_ids} )); then
      ssh-add ${HOME}/.ssh/${^zssh_ids} 2>/dev/null
    else
      ssh-add 2>/dev/null
    fi
  fi
}
Well, this is bad. Let's assume the common path, where the ssh-agent is already running but you open a new shell instance (that doesn't have the connection info yet, so it will need to load it). This will run ssh-add up to 4 times. How long does ssh-add take to run?
$ hyperfine -Ni "ssh-add -l"
Benchmark 1: ssh-add -l
Time (mean ± σ): 4.6 ms ± 1.1 ms [User: 2.0 ms, System: 2.0 ms]
Range (min … max): 3.4 ms … 8.7 ms 619 runs
Warning: Ignoring non-zero exit code.
For those curious, -N disables the intermediate shell, which works better when the command being tested is very fast.
On average we have 4x4ms=16ms of startup time. But keep in mind that the worst case can be much worse. The question is, how can we improve the situation here?
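To put numbers on that estimate, here is a small sketch that multiplies the number of calls by the per-call time reported by hyperfine above. The helper name `startup_cost` is made up for illustration, and 4 calls is the worst case of the code path discussed above.

```sh
# Estimate the time spent on repeated invocations of a command at startup.
# Usage: startup_cost <number of calls> <milliseconds per call>
startup_cost() {
  awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", n * ms }'
}

startup_cost 4 4.6   # mean of the benchmark above, prints 18.4
startup_cost 4 8.7   # slowest run of the benchmark above, prints 34.8
```

Using the unrounded mean gives a bit more than the 16ms above, and if every call hits the slowest observed time the cost roughly doubles.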
After taking a look, I decided to write my own code, based on some ideas stolen from the Oh-My-Zsh ssh-agent plugin. Here is the final version of my code:
zmodload zsh/net/socket
_check_agent() {
  if [[ -S "$SSH_AUTH_SOCK" ]] && zsocket "$SSH_AUTH_SOCK" 2>/dev/null; then
    return 0
  fi
  return 1
}
_start_agent() {
  # Test if $SSH_AUTH_SOCK is visible, in case we start e.g.: ssh-agent via
  # systemd service
  if _check_agent; then
    return 0
  fi
  # Get the filename to store/lookup the environment from
  local -r ssh_env_cache="$HOME/.ssh-agent"
  # Check if ssh-agent is already running
  if [[ -f "$ssh_env_cache" ]]; then
    source "$ssh_env_cache" > /dev/null
    # Test if $SSH_AUTH_SOCK is visible, e.g.: the ssh-agent is still alive
    if _check_agent; then
      return 0
    fi
  fi
  # start ssh-agent and setup environment
  (
    umask 066
    ssh-agent -s >! "$ssh_env_cache"
  )
  source "$ssh_env_cache" > /dev/null
}
_start_agent
unfunction _check_agent _start_agent
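One detail about `_check_agent` above, offered as a hedged aside rather than a fix: according to the zsh documentation for the `zsh/net/socket` module, a successful `zsocket` call stores the file descriptor of the new connection in `$REPLY` and leaves it open. The sketch below shows a variant that closes the probe connection again. The function name `agent_socket_alive` is made up for illustration, and outside zsh it falls back to only checking that the path is a socket, which is a weaker test.

```sh
# Sketch: succeed only if the given path looks like a usable agent socket.
# In zsh the socket is actually connected to and the probe fd is closed;
# in other shells only the file type is checked (a stale socket would pass).
agent_socket_alive() {
  [ -S "$1" ] || return 1
  if [ -n "${ZSH_VERSION:-}" ]; then
    zmodload zsh/net/socket || return 1
    zsocket "$1" 2>/dev/null || return 1
    # zsocket put the connection's file descriptor in $REPLY; close it.
    exec {REPLY}>&-
  fi
  return 0
}
```

It would be called as `agent_socket_alive "$SSH_AUTH_SOCK"`. Whether the extra open descriptor matters in practice depends on the setup; it is only one descriptor per shell.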
The idea here is simple: use the zsocket module from ZSH itself to check if the ssh-agent is working, instead of executing ssh-add -l. The only case where we run any program now is to start the agent itself, if needed. Let's run hyperfine again:
$ hyperfine "zsh -ic exit"
Benchmark 1: zsh -ic exit
Time (mean ± σ): 188.3 ms ± 8.2 ms [User: 61.1 ms, System: 130.0 ms]
Range (min … max): 170.9 ms … 198.4 ms 16 runs
Got a good improvement here already. Let's see zprof again:
num calls time self name
-----------------------------------------------------------------------------------
1) 1 41.23 41.23 48.66% 33.52 33.52 39.56% (anon) [/home/thiagoko/.zsh/plugins/zim-completion/init.zsh:13]
2) 1 22.23 22.23 26.24% 22.12 22.12 26.10% _zsh_highlight_load_highlighters
3) 1 8.90 8.90 10.51% 8.90 8.90 10.51% autopair-init
4) 1 7.71 7.71 9.10% 7.71 7.71 9.10% compinit
5) 1 5.74 5.74 6.77% 5.60 5.60 6.60% prompt_pure_state_setup
6) 6 1.19 0.20 1.41% 1.19 0.20 1.41% add-zle-hook-widget
7) 2 1.97 0.99 2.33% 1.14 0.57 1.34% async
8) 6 0.87 0.15 1.03% 0.87 0.15 1.03% is-at-least
9) 1 0.84 0.84 0.99% 0.84 0.84 0.99% async_init
10) 1 9.30 9.30 10.97% 0.72 0.72 0.84% prompt_pure_setup
11) 5 0.63 0.13 0.75% 0.63 0.13 0.75% add-zsh-hook
12) 1 0.41 0.41 0.48% 0.41 0.41 0.48% _start_agent
13) 1 0.31 0.31 0.37% 0.31 0.31 0.37% (anon) [/nix/store/p1zqypy7600fvfyl1v571bljx2l8zhay-zsh-autosuggestions-0.7.0/share/zsh-autosuggestions/zsh-autosuggestions.zsh:458]
14) 1 0.55 0.55 0.64% 0.24 0.24 0.28% (anon) [/home/thiagoko/.zsh/plugins/zim-input/init.zsh:5]
15) 1 0.14 0.14 0.16% 0.14 0.14 0.16% prompt_pure_is_inside_container
16) 1 0.14 0.14 0.16% 0.14 0.14 0.16% compdef
17) 1 0.09 0.09 0.11% 0.09 0.09 0.11% _zsh_highlight__function_is_autoload_stub_p
18) 1 0.25 0.25 0.29% 0.08 0.08 0.09% _zsh_highlight__function_callable_p
19) 1 0.07 0.07 0.09% 0.07 0.07 0.09% _zsh_highlight__is_function_p
20) 1 0.01 0.01 0.01% 0.01 0.01 0.01% __wezterm_install_bash_prexec
21) 1 0.01 0.01 0.01% 0.01 0.01 0.01% _zsh_highlight_bind_widgets
# ...
Well, there is nothing interesting here anymore. I mean, zim-completion is still the main culprit, but there is nothing to do about it for now. Instead of looking at zprof, let's take a look at my ~/.zshrc:
# ...
if [[ $options[zle] = on ]]; then
eval "$(/nix/store/sk6wsgp4h477baxypksz9rl8ldwwh9yg-fzf-0.54.0/bin/fzf --zsh)"
fi
# ...
/nix/store/x3yblr73r5x76dmaanjk3333mvzxc49r-any-nix-shell-1.2.1/bin/any-nix-shell zsh | source /dev/stdin
# ...
eval "$(/nix/store/330d6k81flfs6w46b44afmncxk57qggv-zoxide-0.9.4/bin/zoxide init zsh )"
# ...
eval "$(/nix/store/8l9j9kdv9m0z0s30lp4yvrc9s5bcbgmx-direnv-2.34.0/bin/direnv hook zsh)"
So you see, starting all those programs during ZSH startup can hurt the shell startup considerably. Not necessarily for fast commands like fzf (which is written in Go), but let's see any-nix-shell, which is written in shell script:
$ hyperfine "any-nix-shell zsh"
Benchmark 1: any-nix-shell zsh
Time (mean ± σ): 16.0 ms ± 1.8 ms [User: 5.6 ms, System: 10.5 ms]
Range (min … max): 11.3 ms … 20.3 ms 143 runs
This is bad, consistently bad actually. Even for commands that are fast, keep in mind that there is a difference between cold and hot starts again. For example, fzf:
$ hyperfine -N "fzf --zsh"
Benchmark 1: fzf --zsh
Time (mean ± σ): 2.9 ms ± 0.9 ms [User: 0.6 ms, System: 2.3 ms]
Range (min … max): 1.7 ms … 6.8 ms 1113 runs
See the range? While 1.7ms is probably difficult to notice, 6.8ms can be noticeable, especially if it accumulates with other slow-starting apps.
And the thing is, all those commands just generate a fixed output in the end, at least for the current version of the program. Can we pre-generate it instead? If using Nix, of course we can:
# You need to disable the default integration
programs.direnv.enableZshIntegration = false;
programs.fzf.enableZshIntegration = false;
programs.zoxide.enableZshIntegration = false;
programs.zsh.initExtra =
  # bash
  ''
    # any-nix-shell
    source ${
      pkgs.runCommand "any-nix-shell-zsh" { } ''
        ${lib.getExe pkgs.any-nix-shell} zsh > $out
      ''
    }
    # fzf
    source ${config.programs.fzf.package}/share/fzf/completion.zsh
    source ${config.programs.fzf.package}/share/fzf/key-bindings.zsh
    # zoxide
    source ${
      pkgs.runCommand "zoxide-init-zsh" { } ''
        ${lib.getExe config.programs.zoxide.package} init zsh > $out
      ''
    }
    # direnv
    source ${
      pkgs.runCommand "direnv-hook-zsh" { } ''
        ${lib.getExe config.programs.direnv.package} hook zsh > $out
      ''
    }
  '';
So we can use pkgs.runCommand to run those commands at build time and source the result. fzf actually doesn't need this, since the package already ships the generated files. I think this is one of those things that really shows the power of Nix: I wouldn't do something similar if I didn't use Nix because the risk of breaking something later is big (e.g.: forgetting to update the generated files), but Nix makes those things trivial.
Let's run hyperfine again:
$ hyperfine "zsh -ic exit"
Benchmark 1: zsh -ic exit
Time (mean ± σ): 162.3 ms ± 4.9 ms [User: 52.7 ms, System: 111.1 ms]
Range (min … max): 153.0 ms … 173.4 ms 19 runs
Another good improvement. The last change I made was switching from zsh-syntax-highlighting to zsh-fast-syntax-highlighting, which is supposed to be faster and have better highlighting too. I got the idea from _zsh_highlight_load_highlighters using 26% of the time in my zprof above.
And for the final hyperfine on my laptop:
$ hyperfine "zsh -ic exit"
Benchmark 1: zsh -ic exit
Time (mean ± σ): 138.3 ms ± 7.1 ms [User: 47.5 ms, System: 91.9 ms]
Range (min … max): 123.8 ms … 157.9 ms 21 runs
A ~36% improvement, not bad. Let's see how it fares in my Chromebook:
$ hyperfine "zsh -ic exit"
Benchmark 1: zsh -ic exit
Time (mean ± σ): 278.2 ms ± 46.9 ms [User: 88.0 ms, System: 184.8 ms]
Range (min … max): 204.7 ms … 368.5 ms 11 runs
An even more impressive ~59% improvement. And yes, the shell startup now feels much better.
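The percentages above come from comparing the mean startup time of the first and the last hyperfine runs on each machine. A small sketch of that arithmetic, with a helper name (`improvement`) made up for illustration:

```sh
# Relative improvement between two mean startup times, in percent.
# Usage: improvement <before in ms> <after in ms>
improvement() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

improvement 218.6 138.3   # NixOS laptop, prints 36.7
improvement 679.7 278.2   # Chromebook, prints 59.1
```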