<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>@bnjbvr - rust</title>
    <subtitle>Technical blog and random musings.</subtitle>
    <link rel="self" type="application/atom+xml" href="https://bouvier.cc/tags/rust/atom.xml"/>
    <link rel="alternate" type="text/html" href="https://bouvier.cc"/>
    <generator uri="https://www.getzola.org/">Zola</generator>
    <updated>2024-11-02T00:00:00+00:00</updated>
    <id>https://bouvier.cc/tags/rust/atom.xml</id>
    <entry xml:lang="en">
        <title>Publish a Zola blog with Gitlab CI, real fast</title>
        <published>2024-11-02T00:00:00+00:00</published>
        <updated>2024-11-02T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Benjamin Bouvier
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://bouvier.cc/tech/publish-zola-with-gitlab-ci/"/>
        <id>https://bouvier.cc/tech/publish-zola-with-gitlab-ci/</id>
        <content type="html" xml:base="https://bouvier.cc/tech/publish-zola-with-gitlab-ci/">&lt;p&gt;The two or three regular visitors of this blog (hi friends!) might have noticed I’ve changed the
design, as of this year. I’ve actually moved from using the &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;getpelican.com&#x2F;&quot;&gt;Pelican&lt;&#x2F;a&gt;
static blog generator to using &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.getzola.org&#x2F;&quot;&gt;Zola&lt;&#x2F;a&gt;. This article shows the continuous
integration I’ve set up to automatically build and push content to my server, Real Fast™.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;Was there anything wrong with &lt;code&gt;Pelican&lt;&#x2F;code&gt;? Not really. It has served me well over the years, and its
naming coming from an anagram of &lt;code&gt;calepin&lt;&#x2F;code&gt;, French for a “small notebook”, is still just genius.
Installing on a new machine was a bit of a pain as I needed to recall which Python &#x2F; virtualenv
&#x2F; etc. commands were required to install dependencies, but over the years I’ve made a Makefile to simplify that. It
didn’t happen that often anyway; at most I’d run such a command once every two years or so (which
is, arguably, already a lot if that’s the frequency at which one switches machines).&lt;&#x2F;p&gt;
&lt;p&gt;I mostly moved to Zola because I’m a zealot &#x2F; member of the &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.reddit.com&#x2F;r&#x2F;rustjerk&#x2F;comments&#x2F;av5pog&#x2F;higherres_rust_evangelism_strike_force_image&#x2F;&quot;&gt;Rust evangelism strike
force&lt;&#x2F;a&gt;
in my spare time, and want to support tools created with Rust, which are usually blazing 😎 fast 🚀
&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-2-1&quot;&gt;&lt;a href=&quot;#fn-2&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;, in addition to being super safe. And, having a statically linked binary is super convenient,
and notably super nice for continuous integration and deployment purposes. Indeed, it’s trivial to
cache a single binary in CI, and not have to worry about installing&#x2F;caching dependencies and so on.
In addition to that, Zola supports front-matter annotations with TOML &lt;em&gt;or&lt;&#x2F;em&gt; YAML, so it offered a
nice migration path from Pelican, which uses YAML for front matter.&lt;&#x2F;p&gt;
&lt;p&gt;I’ve been using &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;about.gitlab.com&#x2F;&quot;&gt;Gitlab&lt;&#x2F;a&gt;, notably the &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;framagit.org&#x2F;&quot;&gt;Framagit&lt;&#x2F;a&gt;
instance hosted by the good fellows at &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;soutenir.framasoft.org&quot;&gt;Framasoft&lt;&#x2F;a&gt; (give them
money to support them!), for the repository hosting the sources of my blog. So I wanted to
be able to push to the repository, and have the CI build the site and publish it to my website.&lt;&#x2F;p&gt;
&lt;p&gt;Now, let me explain how I did it. These instructions are valid for &lt;em&gt;GitLab Community Edition
v17.5.1&lt;&#x2F;em&gt;; some things may change in newer versions, so I can’t guarantee they’ll work forever.
There’s a first stage that builds the public website using a cached Zola binary if one is available,
and downloads it from the GitHub releases page otherwise.
The second stage&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-1-1&quot;&gt;&lt;a href=&quot;#fn-1&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; will upload it to the server, using &lt;code&gt;rsync&lt;&#x2F;code&gt;. I’d recommend creating a new user
just for this task, with limited SSH access to a single directory, that is, where the generated
HTML will live. With caching, each stage takes at most 10 seconds to run, which I find…
acceptable 😁.&lt;&#x2F;p&gt;
&lt;p&gt;Here’s the &lt;code&gt;.gitlab-ci.yml&lt;&#x2F;code&gt; file I’ve checked in, heavily commented for your (and my future self’s) best
understanding:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;yaml&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;default&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  image&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; debian:stable-slim&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;variables&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # The runner will be able to pull your Zola theme when the strategy is&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # set to &amp;quot;recursive&amp;quot;.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  GIT_SUBMODULE_STRATEGY&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;: &amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;recursive&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # If you don&amp;#39;t set a version here, your site will be built with the latest&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # version of Zola available in GitHub releases.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # Use the semver (x.y.z) format to specify a version. For example: &amp;quot;0.17.2&amp;quot; or &amp;quot;0.18.0&amp;quot;.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  ZOLA_VERSION&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;    description&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;: &amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;The version of Zola used to build the site.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;    value&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;: &amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;0.19.1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;build&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  stage&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; build&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # Cache the Zola binary based on its version, to avoid conflicts between different versions.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  cache&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;    key&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; $ZOLA_VERSION&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;    paths&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;     # $CI_PROJECT_DIR is the current working directory in subsequent steps.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;      -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; &amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;$CI_PROJECT_DIR&#x2F;zola&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  script&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;    -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; |&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;      if [ ! -e &amp;quot;$CI_PROJECT_DIR&#x2F;zola&amp;quot; ]; then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        echo &amp;quot;Downloading Zola…&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        # Download enough to use `wget`.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        apt-get update --assume-yes &amp;amp;&amp;amp; apt-get install --assume-yes --no-install-recommends wget ca-certificates&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        if [ $ZOLA_VERSION ]; then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;          zola_url=&amp;quot;https:&#x2F;&#x2F;github.com&#x2F;getzola&#x2F;zola&#x2F;releases&#x2F;download&#x2F;v$ZOLA_VERSION&#x2F;zola-v$ZOLA_VERSION-x86_64-unknown-linux-gnu.tar.gz&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;          if ! wget --quiet --spider $zola_url; then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;            echo &amp;quot;A Zola release with the specified version could not be found.&amp;quot;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;            exit 1;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;          fi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        else&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;          github_api_url=&amp;quot;https:&#x2F;&#x2F;api.github.com&#x2F;repos&#x2F;getzola&#x2F;zola&#x2F;releases&#x2F;latest&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;          zola_url=$(&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;            wget --output-document - $github_api_url |&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;            grep &amp;quot;browser_download_url.*linux-gnu.tar.gz&amp;quot; |&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;            cut --delimiter : --fields 2,3 |&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;            tr --delete &amp;quot;\&amp;quot; &amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;          )&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        fi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        wget $zola_url&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        tar -xzf *.tar.gz&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;      else&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;        echo &amp;quot;Reusing cached Zola…&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;      fi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;    -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; |&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;      $CI_PROJECT_DIR&#x2F;zola build&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # The built artifacts will be put in the `public&#x2F;` directory, and reused during the next stage.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  artifacts&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;    paths&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;      -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; public&#x2F;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;    expire_in&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; 1 day&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;deploy&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  stage&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; deploy&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  only&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; main&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  dependencies&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; build&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F7768E;&quot;&gt;  script&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # Install rsync and ssh, if need be.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; apt-get update -qq &amp;amp;&amp;amp; apt-get install -y -qq rsync&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;which ssh-agent || ( apt-get update -y &amp;amp;&amp;amp; apt-get install openssh-client -y )&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; eval $(ssh-agent -s)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # Set SSH private key, and define the right permissions to not trigger security errors.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; echo &amp;quot;$SSH_PRIVATE_KEY&amp;quot; | tr -d &amp;#39;\r&amp;#39; | ssh-add - &amp;gt; &#x2F;dev&#x2F;null&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; mkdir -p ~&#x2F;.ssh&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; chmod 700 ~&#x2F;.ssh&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # Set SSH known hosts, and define the right permissions to not trigger security errors.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; echo &amp;quot;$SSH_KNOWN_HOSTS&amp;quot; &amp;gt; ~&#x2F;.ssh&#x2F;known_hosts&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; chmod 644 ~&#x2F;.ssh&#x2F;known_hosts&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # Run rsync.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;  # Note: -avzh = archive &#x2F; verbose &#x2F; compress &#x2F; human-readable&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;  -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; rsync -avzh --delete public&#x2F;* -e &amp;quot;ssh -p $SSH_PORT&amp;quot; $SSH_USERNAME@$SSH_HOST:$SSH_TARGET_DIR&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;After you’ve set this up, you need to define a few secrets for your CI: go to &lt;code&gt;Settings&lt;&#x2F;code&gt;,
then &lt;code&gt;CI&#x2F;CD&lt;&#x2F;code&gt;, and add all the following variables under &lt;code&gt;Variables&lt;&#x2F;code&gt;. Make sure to create them
as &lt;code&gt;Masked&lt;&#x2F;code&gt; variables, if not &lt;code&gt;Masked and hidden&lt;&#x2F;code&gt;, unless you’d like the SSH private key to
leak 🤡.&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;SSH_HOST&lt;&#x2F;code&gt;: server’s host, e.g. &lt;code&gt;mysupersecretserver.bouvier.cc&lt;&#x2F;code&gt;&lt;&#x2F;li&gt;
&lt;li&gt;&lt;code&gt;SSH_PORT&lt;&#x2F;code&gt;: the port to be used for connecting with SSH, e.g. 22 by default.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;code&gt;SSH_KNOWN_HOSTS&lt;&#x2F;code&gt;: a copy of a simplified &lt;code&gt;~&#x2F;.ssh&#x2F;known_hosts&lt;&#x2F;code&gt; file, stripped down to only
contain lines related to the above &lt;code&gt;SSH_HOST&lt;&#x2F;code&gt;. This avoids the CI task having to confirm “Are
you sure you want to trust this host” when connecting to the server for the first time.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;code&gt;SSH_TARGET_DIR&lt;&#x2F;code&gt;: the final directory where the generated HTML should live, e.g. &lt;code&gt;&#x2F;var&#x2F;www&#x2F;myblog&lt;&#x2F;code&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;code&gt;SSH_USERNAME&lt;&#x2F;code&gt;: the username of the (Unix) user used in the CI task to upload generated artifacts
to your Web server…&lt;&#x2F;li&gt;
&lt;li&gt;&lt;code&gt;SSH_PRIVATE_KEY&lt;&#x2F;code&gt;: …and its private key.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Hope this was useful!&lt;&#x2F;p&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol class=&quot;footnotes-list&quot;&gt;
&lt;li id=&quot;fn-2&quot;&gt;
&lt;p&gt;Don’t pay attention to the emojis, they’re a meme at this point. &lt;a href=&quot;#fr-2-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-1&quot;&gt;
&lt;p&gt;Using two stages is inherited from my previous setup using Pelican, and might be overkill
since each stage takes at most 10 seconds to run, with a hot cache. &lt;a href=&quot;#fr-1-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;&#x2F;section&gt;
</content>
    </entry>
    <entry xml:lang="en">
        <title>cargo-machete, find unused dependencies quickly</title>
        <published>2022-04-27T23:17:23+00:00</published>
        <updated>2022-04-27T23:17:23+00:00</updated>
        
        <author>
          <name>
            
              Benjamin Bouvier
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://bouvier.cc/tech/cargo-machete/"/>
        <id>https://bouvier.cc/tech/cargo-machete/</id>
        <content type="html" xml:base="https://bouvier.cc/tech/cargo-machete/">&lt;p&gt;&lt;code&gt;cargo-machete&lt;&#x2F;code&gt; is a new Cargo tool that detects unused dependencies in Rust
projects, in a fast (yet imprecise) way. As of today you can install it with
&lt;code&gt;cargo install cargo-machete&lt;&#x2F;code&gt; and then run it with &lt;code&gt;cargo machete&lt;&#x2F;code&gt; from any
folder that contains a workspace or crate, to find if you have potentially
unused dependencies. Beware, it can report a few false positives!&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;&lt;h2 id=&quot;problem-statement&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#problem-statement&quot; aria-label=&quot;Anchor link for: problem-statement&quot;&gt;🔗&lt;&#x2F;a&gt;Problem statement&lt;&#x2F;h2&gt;
&lt;p&gt;When developers hack on code, it’s pretty common to reuse software that
already exists and has been written, optimized, and battle-tested by many
others. In fact, that’s a core idea of the open-source movement, and one
historical reason for its existence.&lt;&#x2F;p&gt;
&lt;p&gt;Zooming in on the case of the Rust programming language, my opinion is that it
is also a key reason why Rust has been so successful: having plenty of crates
doing everything you might need, already implemented for you and at hand’s
reach on &lt;code&gt;crates.io&lt;&#x2F;code&gt;. Plus, having the wonder of a one-does-it-all Cargo tool
that makes it very easy to use those crates as dependencies in your project.
&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-1-1&quot;&gt;&lt;a href=&quot;#fn-1&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;&lt;&#x2F;p&gt;
&lt;p&gt;However, this comes with a price: sometimes you add a dependency because it’s
useful at a particular point in time. Much later, it’s not useful anymore, but
you may have forgotten about it. And then, the dependency remains as a zombie
in your &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; file. Cargo will include it in the compilation graph,
despite the compilation artifacts not being used at all. The unused dependency
will just stay there, silently weeping, waiting for you to remember it exists.&lt;&#x2F;p&gt;
&lt;p&gt;Of course, the problem can even become worse: maybe you maintain several crates
that have unused dependencies. Or maybe you work with many crates as part of a
workspace, and each may have unused dependencies. Or simply you use many
dependencies yourself, and some may include unused dependencies. If you’ve
published your crates and others use those, then everyone could also compile
unused dependencies. At the scale of the entire Rust crates ecosystem, it can
have a huge impact on the compile times, produced heat and wasted energy.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;have-you-heard-about-our-lord-and-savior-cargo-udeps&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#have-you-heard-about-our-lord-and-savior-cargo-udeps&quot; aria-label=&quot;Anchor link for: have-you-heard-about-our-lord-and-savior-cargo-udeps&quot;&gt;🔗&lt;&#x2F;a&gt;Have you heard about our lord and savior, &lt;code&gt;cargo-udeps&lt;&#x2F;code&gt;&lt;&#x2F;h2&gt;
&lt;p&gt;There’s already a nice tool for this in the ecosystem:
&lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;est31&#x2F;cargo-udeps&quot;&gt;&lt;code&gt;cargo-udeps&lt;&#x2F;code&gt;&lt;&#x2F;a&gt;. It will compile your
crate (or workspace) and then infer from the compiled artifacts what
dependencies are used by your project, and thus show you which dependencies are
&lt;em&gt;unused&lt;&#x2F;em&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;That’s great, but the way it works forces a few tradeoffs:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;it requires compiling the whole crate with the &lt;code&gt;nightly&lt;&#x2F;code&gt; rustc compiler. For
me that means recompiling the whole project from scratch, most of the time,
since I’m mostly using stable rustc as my daily driver.&lt;&#x2F;li&gt;
&lt;li&gt;if you compile for multiple targets (i.e. different combinations of CPU
flavor, OS, environment, etc.), you’d need to run &lt;code&gt;cargo-udeps&lt;&#x2F;code&gt; on each of
those to find per-target unused dependencies. For instance, if a dependency
is only configured when compiling for x86_64 machines, then it may be flagged
as unused on every other configuration.&lt;&#x2F;li&gt;
&lt;li&gt;most of all, since it looks at compilation artifacts, it cannot know if a
specific dependency is directly used by your crate, or indirectly, leading to
somewhat mystifying results in the case of workspaces.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
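&lt;p&gt;To make the per-target case concrete, here is a minimal sketch of a dependency that only exists for x86_64 builds. The crate name &lt;code&gt;some-simd-helper&lt;&#x2F;code&gt; and its version are made up for illustration; only the target-specific dependencies table is Cargo’s own syntax:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;toml&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;# Hypothetical crate, only pulled in when compiling for x86_64.&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;[target.&amp;#39;cfg(target_arch = &amp;quot;x86_64&amp;quot;)&amp;#39;.dependencies]&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;some-simd-helper = &amp;quot;1.0&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;With a manifest like this, a run targeting anything other than x86_64 may flag the crate as unused, as described above, so you’d have to repeat the analysis for each target you care about.&lt;&#x2F;p&gt;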
&lt;p&gt;Let’s dive a bit deeper into the last item, which I’ll refer to as &lt;em&gt;the
transitively-used dependencies problem&lt;&#x2F;em&gt;. Say you have your project &lt;code&gt;AAA&lt;&#x2F;code&gt; that
declares a dependency on &lt;code&gt;serde&lt;&#x2F;code&gt; in its &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; file, even though it’s not
directly used by your code. In fact, if you did a text-search of &lt;code&gt;serde&lt;&#x2F;code&gt; in
&lt;code&gt;AAA&lt;&#x2F;code&gt;’s code with &lt;code&gt;grep&lt;&#x2F;code&gt;, you wouldn’t find a single match&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-2-1&quot;&gt;&lt;a href=&quot;#fn-2&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;. But now &lt;code&gt;AAA&lt;&#x2F;code&gt;
is using another crate, &lt;code&gt;AB&lt;&#x2F;code&gt;, that itself depends on &lt;code&gt;serde&lt;&#x2F;code&gt;. &lt;code&gt;cargo-udeps&lt;&#x2F;code&gt;
will see that &lt;code&gt;serde&lt;&#x2F;code&gt; is used &lt;em&gt;overall&lt;&#x2F;em&gt;, so it cannot let you know that &lt;code&gt;AAA&lt;&#x2F;code&gt;’s
&lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; file references an unused dependency on &lt;code&gt;serde&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
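&lt;p&gt;As a rough sketch of that situation, the two manifests could look like the following. The directory layout, the relative path and the version numbers are assumptions made for the example. First, the dependencies section of &lt;code&gt;AAA&lt;&#x2F;code&gt;’s &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt;:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;toml&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;[dependencies]&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;# Declared here, but never referenced by the code of AAA itself.&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;serde = &amp;quot;1.0&amp;quot;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;# Hypothetical path to the sibling crate in the same workspace.&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;AB = { path = &amp;quot;..&#x2F;AB&amp;quot; }&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;And the matching section of &lt;code&gt;AB&lt;&#x2F;code&gt;’s &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt;, where &lt;code&gt;serde&lt;&#x2F;code&gt; is genuinely needed:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;toml&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;[dependencies]&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;# Actually used by the code of AB.&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;serde = &amp;quot;1.0&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Since &lt;code&gt;serde&lt;&#x2F;code&gt; ends up in the compiled artifacts either way, a tool that only looks at those artifacts has no way to tell that the first declaration is dead weight.&lt;&#x2F;p&gt;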
&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;bouvier.cc&#x2F;tech&#x2F;cargo-machete&#x2F;unused.png&quot; alt=&quot;Graph of crates containing one unused crate&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;How is this a problem? After all, if the workspace uses &lt;code&gt;serde&lt;&#x2F;code&gt; even
indirectly, then we &lt;em&gt;will&lt;&#x2F;em&gt; have to compile it at some point, so it’s not like
it’s &lt;em&gt;really&lt;&#x2F;em&gt; unused.&lt;&#x2F;p&gt;
&lt;p&gt;First of all, the &lt;code&gt;AAA&lt;&#x2F;code&gt; crate might be using a different version of &lt;code&gt;serde&lt;&#x2F;code&gt;
than the &lt;code&gt;AB&lt;&#x2F;code&gt; crate, and this could result in different copies of the same
crate in your workspace. Note there are other nice tools that automatically
detect this kind of situation (hi there
&lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;EmbarkStudios&#x2F;cargo-deny&#x2F;&quot;&gt;cargo-deny&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;p&gt;Second, the order in which crates are compiled has an impact on compilation
parallelism, and having unused dependencies may add spurious synchronization
points in the compilation graph. When a Rust crate gets compiled by Cargo,
Cargo proceeds in two phases:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;first, it collects the information needed to unlock the compilation of other
crates further down the road that depend on this particular one (as far as I can
tell, this is the crate metadata that Cargo’s pipelined compilation relies on). I
don’t know precisely what it entails, but one can make educated guesses: parse the
code, analyze which items are &lt;code&gt;pub&lt;&#x2F;code&gt;lic, compute memory layouts for public
types, collect type information and so on and so forth.&lt;&#x2F;li&gt;
&lt;li&gt;then, it does the actual compilation: optimize and generate the actual
machine code for that particular crate, that will be later linked with other
artifacts to form the final executable program.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;The advantage of this two-phase scheme is that once Cargo is done with phase 1
for a particular crate, it can kick off the same process for other crates up
the dependency tree, while it runs phase 2 concurrently. With a multi-core
machine as is the norm on desktop computers, it’s almost certain that this will
bring speedups!&lt;&#x2F;p&gt;
&lt;p&gt;For instance, consider the following &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; file from our previous
example project:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;toml&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;dependencies&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;]&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;serde&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; = &amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;1.0&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Then a possible compilation graph could look like this:&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;bouvier.cc&#x2F;tech&#x2F;cargo-machete&#x2F;phases.png&quot; alt=&quot;Compilation graph showing phases&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;In this case, &lt;code&gt;ab&lt;&#x2F;code&gt; phase 1 can start as soon as &lt;code&gt;serde&lt;&#x2F;code&gt; phase 1 has finished,
while &lt;code&gt;serde&lt;&#x2F;code&gt;’s compilation phase 2 happens in the background.&lt;&#x2F;p&gt;
&lt;p&gt;If you’re interested in reducing the overall compile times of your Rust
project, I strongly suggest &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;doc.rust-lang.org&#x2F;nightly&#x2F;cargo&#x2F;reference&#x2F;unstable.html#timings&quot;&gt;reading Cargo’s documentation on
timings
visualization&lt;&#x2F;a&gt;.
Crates that spend a lot of time in the first phase (or more generally, in both
phases) are pipelining bottlenecks, so identifying them, then removing or working
around them, speeds up compilation overall.&lt;&#x2F;p&gt;
&lt;p&gt;Back to our small unused dependency problem: an unused dependency in your
&lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; may block the compilation of other crates up the dependency tree,
and thus may slow down the whole compilation process by creating useless
synchronization points.&lt;&#x2F;p&gt;
&lt;p&gt;Consider a crate &lt;code&gt;C&lt;&#x2F;code&gt; that depends on crates &lt;code&gt;A&lt;&#x2F;code&gt; and &lt;code&gt;B&lt;&#x2F;code&gt;, with &lt;code&gt;B&lt;&#x2F;code&gt; actually
unused:&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;bouvier.cc&#x2F;tech&#x2F;cargo-machete&#x2F;pipeline-stall.png&quot; alt=&quot;Pipeline stall&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Here, the compilation of the crate &lt;code&gt;C&lt;&#x2F;code&gt; could start way earlier, but it is
blocked waiting for the compilation of &lt;code&gt;B&lt;&#x2F;code&gt; to finish first, even though &lt;code&gt;B&lt;&#x2F;code&gt; is not
even used!&lt;&#x2F;p&gt;
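&lt;p&gt;To put numbers on this stall, here is a tiny sketch with made-up timings. It assumes the simplified model described above (a crate can start once phase 1 is done for every dependency it declares), and &lt;code&gt;earliest_start&lt;&#x2F;code&gt; is a hypothetical helper, not something Cargo exposes:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&#x2F;&#x2F; Made-up model: a crate may start compiling once phase 1 has finished for
&#x2F;&#x2F; every dependency declared in its Cargo.toml, used or not.
fn earliest_start(phase1_done: &amp;amp;[u32]) -&amp;gt; u32 {
    phase1_done.iter().copied().max().unwrap_or(0)
}

fn main() {
    &#x2F;&#x2F; Hypothetical timings, in seconds: A finishes phase 1 after 2s, B after 10s.
    let (a, b) = (2, 10);
    println!(&amp;quot;C declares A and B: starts at {}s&amp;quot;, earliest_start(&amp;amp;[a, b]));
    println!(&amp;quot;C declares only A: starts at {}s&amp;quot;, earliest_start(&amp;amp;[a]));
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;With these invented numbers, dropping the unused &lt;code&gt;B&lt;&#x2F;code&gt; from &lt;code&gt;C&lt;&#x2F;code&gt;’s &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; lets &lt;code&gt;C&lt;&#x2F;code&gt; start after 2 seconds instead of 10.&lt;&#x2F;p&gt;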
&lt;h2 id=&quot;solving-this-the-naive-way&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#solving-this-the-naive-way&quot; aria-label=&quot;Anchor link for: solving-this-the-naive-way&quot;&gt;🔗&lt;&#x2F;a&gt;Solving this, the naive way&lt;&#x2F;h2&gt;
&lt;p&gt;So when I was trying to confirm whether crates found by &lt;code&gt;cargo-udeps&lt;&#x2F;code&gt; were
actually used or not in my Rust projects, the thing I’d do would be to &lt;code&gt;grep&lt;&#x2F;code&gt;
(or better, use the blazingly fast Rust replacement
&lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;BurntSushi&#x2F;ripgrep&quot;&gt;ripgrep&lt;&#x2F;a&gt;) the crate’s name in the
project. After all, the crate’s name is in the source directory, if and only if
the crate is used, right?&lt;&#x2F;p&gt;
&lt;p&gt;The answer is… mostly, yes. If we exclude dynamic code loading via mechanisms
like &lt;code&gt;dlopen&lt;&#x2F;code&gt; or WebAssembly, then there aren’t so many ways to use other
crates &lt;em&gt;directly&lt;&#x2F;em&gt;, in Rust code. In fact, we can exhaustively enumerate all the
syntactic forms that refer to another crate in Rust:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;use&lt;&#x2F;span&gt;&lt;span&gt; my_crate&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;use&lt;&#x2F;span&gt;&lt;span&gt; your_crate &lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;as&lt;&#x2F;span&gt;&lt;span&gt; my_crate&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;use {&lt;&#x2F;span&gt;&lt;span&gt; your_crate &lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;as&lt;&#x2F;span&gt;&lt;span&gt; my_crate &lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;};&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;extern&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7DCFFF;&quot;&gt; crate&lt;&#x2F;span&gt;&lt;span&gt; my_crate&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;fn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; main&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;() {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;    my_crate&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;something&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;();&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;I’ve looked at a bit of Rust code now, and I haven’t seen other direct forms;
if I am missing any, please let me know! Now, these are the most &lt;em&gt;frequent&lt;&#x2F;em&gt;
ways to use a dependency, but there are in fact other ways:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;build.rs&lt;&#x2F;code&gt; scripts can generate code that could use other crates, and that
would not be visible through a text search in the &lt;code&gt;src&#x2F;&lt;&#x2F;code&gt; directory, as the
generated code lives somewhere inside the &lt;code&gt;target&#x2F;&lt;&#x2F;code&gt; directory (typically in the build script’s &lt;code&gt;OUT_DIR&lt;&#x2F;code&gt;).&lt;&#x2F;li&gt;
&lt;li&gt;macros (procedural or not) can expand to code that’s using other crates,
while the source code doesn’t &lt;em&gt;explicitly&lt;&#x2F;em&gt; mention them. For instance, the
&lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;Luthaf&#x2F;log-once&quot;&gt;&lt;code&gt;log_once&lt;&#x2F;code&gt;&lt;&#x2F;a&gt; crate uses the &lt;code&gt;log&lt;&#x2F;code&gt; macros
in its own macros, but &lt;code&gt;log_once&lt;&#x2F;code&gt; doesn’t depend on &lt;code&gt;log&lt;&#x2F;code&gt; explicitly. It’s a
bold and smart move: it breaks the coupling with the specific version of
&lt;code&gt;log&lt;&#x2F;code&gt;, and as long as the high-level API of &lt;code&gt;log&lt;&#x2F;code&gt; is stable (which is the
case), then &lt;code&gt;log_once&lt;&#x2F;code&gt; works with &lt;em&gt;any&lt;&#x2F;em&gt; version of &lt;code&gt;log&lt;&#x2F;code&gt;.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;And then, there’s still a bit of room for some false positives in the text search, i.e. matches that don’t correspond to an actual use:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;raw text submatches: e.g. if a crate is named &lt;code&gt;bar&lt;&#x2F;code&gt;, then &lt;code&gt;foobar::&lt;&#x2F;code&gt; would be
a match if we’re doing a raw &lt;code&gt;grep&lt;&#x2F;code&gt; search&lt;&#x2F;li&gt;
&lt;li&gt;text search isn’t syntactic analysis, and we wouldn’t know if a match is in a
comment (&lt;code&gt;&#x2F;&#x2F; use foo;&lt;&#x2F;code&gt;), or a string (&lt;code&gt;String::from(&quot;use foo;&quot;)&lt;&#x2F;code&gt;).&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;But that would do &lt;em&gt;most of the job&lt;&#x2F;em&gt;, wouldn’t it? In particular, compared to
&lt;code&gt;cargo-udeps&lt;&#x2F;code&gt;, this approach doesn’t suffer from the &lt;em&gt;transitively-used
dependencies&lt;&#x2F;em&gt; problem. If you look for a crate’s name in the &lt;code&gt;src&#x2F;&lt;&#x2F;code&gt; directory
and it’s not there, it’s likely not used by your crate. The End.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;a-tedious-process-calls-for-automation-so-i-made-a-tool&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#a-tedious-process-calls-for-automation-so-i-made-a-tool&quot; aria-label=&quot;Anchor link for: a-tedious-process-calls-for-automation-so-i-made-a-tool&quot;&gt;🔗&lt;&#x2F;a&gt;A tedious process calls for automation, so I made a tool&lt;&#x2F;h2&gt;
&lt;p&gt;And I’ve called it &lt;code&gt;cargo-machete&lt;&#x2F;code&gt;. Like a machete, it is very useful for
quickly weeding out things, but it is very imprecise and you shouldn’t trust it
100%.&lt;&#x2F;p&gt;
&lt;p&gt;The gist of it is:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;find directories that might contain Rust projects, as indicated by the
presence of a &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; file&lt;&#x2F;li&gt;
&lt;li&gt;for each dependency, create an absolutely ugly regular expression that
matches any of the syntactic forms presented above. The regular expression
does better than just raw text search; in particular, it doesn’t run into the
text submatch issue.&lt;&#x2F;li&gt;
&lt;li&gt;then, try to match the regular expression against each line of each source
file in the project, and stop at the first successful match (which means the
dependency &lt;em&gt;is&lt;&#x2F;em&gt; used)&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
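&lt;p&gt;The steps above can be sketched roughly as follows. This is not the tool’s actual code: &lt;code&gt;line_uses&lt;&#x2F;code&gt; and &lt;code&gt;unused_dependencies&lt;&#x2F;code&gt; are hypothetical names, plain string operations stand in for the regular expression, the sources are in-memory strings rather than files, and everything runs sequentially.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;fn is_ident_char(c: char) -&amp;gt; bool {
    c.is_alphanumeric() || c == '_'
}

&#x2F;&#x2F; Hypothetical stand-in for the tool's regular expression: a dependency counts
&#x2F;&#x2F; as used when its name appears as a whole identifier right after `use ` or
&#x2F;&#x2F; `extern crate `, or right before `::`. Other forms are left out for brevity.
fn line_uses(line: &amp;amp;str, krate: &amp;amp;str) -&amp;gt; bool {
    line.match_indices(krate).any(|(start, _)| {
        let before = &amp;amp;line[..start];
        let after = &amp;amp;line[start + krate.len()..];
        let whole_ident = !before.ends_with(is_ident_char) &amp;amp;&amp;amp; !after.starts_with(is_ident_char);
        let imported = before.ends_with(&amp;quot;use &amp;quot;) || before.ends_with(&amp;quot;extern crate &amp;quot;);
        whole_ident &amp;amp;&amp;amp; (imported || after.starts_with(&amp;quot;::&amp;quot;))
    })
}

&#x2F;&#x2F; Returns the dependencies that no line of any source file refers to.
fn unused_dependencies(deps: &amp;amp;[&amp;amp;str], sources: &amp;amp;[&amp;amp;str]) -&amp;gt; Vec&amp;lt;String&amp;gt; {
    deps.iter()
        .filter(|dep| {
            &#x2F;&#x2F; `any` stops at the first match: the dependency is used.
            !sources
                .iter()
                .any(|source| source.lines().any(|line| line_uses(line, dep)))
        })
        .map(|dep| dep.to_string())
        .collect()
}

fn main() {
    let sources = [&amp;quot;use serde::Deserialize;\nfn main() { foobar::run(); }&amp;quot;];
    let unused = unused_dependencies(&amp;amp;[&amp;quot;serde&amp;quot;, &amp;quot;bar&amp;quot;, &amp;quot;log&amp;quot;], &amp;amp;sources);
    println!(&amp;quot;{unused:?}&amp;quot;);
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Here &lt;code&gt;serde&lt;&#x2F;code&gt; is found through its &lt;code&gt;use&lt;&#x2F;code&gt; line, while &lt;code&gt;bar&lt;&#x2F;code&gt; is reported as unused even though &lt;code&gt;foobar::&lt;&#x2F;code&gt; contains its name, because the match has to fall on an identifier boundary.&lt;&#x2F;p&gt;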
&lt;p&gt;This tool is &lt;em&gt;fast&lt;&#x2F;em&gt;, because it combines the core library behind &lt;code&gt;ripgrep&lt;&#x2F;code&gt; for
matching regular expressions, with &lt;em&gt;rayon&lt;&#x2F;em&gt; for running it in parallel across
all the dependencies of a project. On my machine, the problem is CPU-bound,
because of the execution of the regular expression (and maybe thanks to my NVMe
storage too). That’s only one data point, but on this particular beefy desktop
I use, it scans the entirety of the &lt;code&gt;rust-lang&#x2F;rust&lt;&#x2F;code&gt; repository in 1.08
seconds, or all of &lt;code&gt;BytecodeAlliance&#x2F;wasmtime&lt;&#x2F;code&gt; in 0.58 seconds.&lt;&#x2F;p&gt;
&lt;p&gt;The tool is &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;bnjbvr&#x2F;cargo-machete&quot;&gt;&lt;em&gt;open source&lt;&#x2F;em&gt;&lt;&#x2F;a&gt;, of course.&lt;&#x2F;p&gt;
&lt;p&gt;As is the tradition for Cargo tools, it can be installed with:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;cargo install cargo&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;-&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;machete&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;and then can be used, from any directory that contains Rust code (be it a
workspace, a single project, or a directory on top of many Rust projects), with
the following line:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;cargo machete&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Here’s an output example:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt; cargo machete&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Looking&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; for crates in this directory and analyzing their dependencies...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;&#x2F;home&#x2F;ben&#x2F;code&#x2F;cargo-machete&#x2F;integration-tests&#x2F;with-bench&#x2F;Cargo.toml&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E0AF68;&quot;&gt; --&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; no package, must be a workspace&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;just-unused&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E0AF68;&quot;&gt; --&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; &#x2F;home&#x2F;ben&#x2F;code&#x2F;cargo-machete&#x2F;integration-tests&#x2F;just-unused&#x2F;Cargo.toml:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;	log&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;unused-transitive&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E0AF68;&quot;&gt; --&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt; &#x2F;home&#x2F;ben&#x2F;code&#x2F;cargo-machete&#x2F;integration-tests&#x2F;unused-transitive&#x2F;Cargo.toml:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;	lib1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Done!&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;There are &lt;em&gt;false positives&lt;&#x2F;em&gt; (crates flagged as unused although they are used): code generated via macros or build scripts isn’t
inspected, as it’s not in the &lt;code&gt;src&#x2F;&lt;&#x2F;code&gt; directory and &lt;code&gt;cargo-machete&lt;&#x2F;code&gt; doesn’t
run any compile step. For instance, if a project depends on &lt;code&gt;log&lt;&#x2F;code&gt;, but uses it
only through &lt;code&gt;log_once&lt;&#x2F;code&gt;, then &lt;code&gt;cargo-machete&lt;&#x2F;code&gt; will incorrectly flag &lt;code&gt;log&lt;&#x2F;code&gt; as an
unused dependency.&lt;&#x2F;p&gt;
&lt;p&gt;The good news is that, thanks to a contribution from &lt;code&gt;@daniel5151&lt;&#x2F;code&gt;, you can
specify &lt;em&gt;known false positives&lt;&#x2F;em&gt; in the &lt;code&gt;Cargo.toml&lt;&#x2F;code&gt; file of your crate,
allowing use of &lt;code&gt;cargo-machete&lt;&#x2F;code&gt; in CI setups:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;toml&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;package&lt;&#x2F;span&gt;&lt;span&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;metadata&lt;&#x2F;span&gt;&lt;span&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;cargo-machete&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;]&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ignored&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt; [&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ECE6A;&quot;&gt;log&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9ABDF5;&quot;&gt;]&lt;&#x2F;span&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt; # false positive, used by log_once! macro&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;As far as I know, the risk for &lt;em&gt;false negatives&lt;&#x2F;em&gt; (i.e. crates that are unused,
but the tool thinks they’re used) is pretty low. One such instance would be a
multi-line string containing one of the &lt;code&gt;use&lt;&#x2F;code&gt; forms, but that seems rather
unlikely to be present in most Rust projects.&lt;&#x2F;p&gt;
&lt;p&gt;The tool is still a bit rough, but it’s been already quite useful for some
projects I’ve been working on! In a particular work project, most unused
dependencies were transitively used and compiled, but the rejiggering of the
compilation graph led to a 5% compile time speedup overall. A good impact-to-effort
ratio.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;what-about-other-languages&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#what-about-other-languages&quot; aria-label=&quot;Anchor link for: what-about-other-languages&quot;&gt;🔗&lt;&#x2F;a&gt;What about other languages?&lt;&#x2F;h2&gt;
&lt;p&gt;What makes this possible in Rust, and could it be extended to other languages?&lt;&#x2F;p&gt;
&lt;p&gt;Dynamic languages by nature dynamically load code, but there are still ways to
try to automate detecting unused dependencies the same way &lt;code&gt;cargo-machete&lt;&#x2F;code&gt; does.
Consider JavaScript and its &lt;code&gt;require&lt;&#x2F;code&gt; function, which can dynamically evaluate a
string that’s a path to a file with code we want to import. Since there are
infinitely many ways to create a string, we can’t simply rely on finding
&lt;code&gt;require(&quot;abc&quot;)&lt;&#x2F;code&gt; and assume that if it’s not present, then &lt;code&gt;abc&lt;&#x2F;code&gt; isn’t used. Ditto
with &lt;code&gt;import&lt;&#x2F;code&gt; statements, which can evaluate dynamic sources. That being said,
if JS code is restricted to using &lt;code&gt;require&lt;&#x2F;code&gt; calls with only static strings
or static &lt;code&gt;import&lt;&#x2F;code&gt; statements, then this may work too! Although when
restricting to static &lt;code&gt;require&lt;&#x2F;code&gt;s, even just &lt;em&gt;loading&lt;&#x2F;em&gt; the code in NodeJS would
be sufficient to find unused dependencies with perfect accuracy.&lt;&#x2F;p&gt;
&lt;p&gt;Back to static languages, where I constrain the problem to non-dynamic
dependencies (that is, excluding those loaded via &lt;code&gt;dlopen&lt;&#x2F;code&gt; etc.). In a language like C or C++, there is
no unified module system or package description (yet! although &lt;code&gt;cmake&lt;&#x2F;code&gt; might
be a de-facto standard). We can still apply this to header files, and look for
their inclusion via &lt;code&gt;#include&lt;&#x2F;code&gt; directives. Macros and preprocessed code would
also throw a wrench in the process. Then some human intervention would still be
required to eliminate the .c files, but I haven’t thought about it too much.&lt;&#x2F;p&gt;
&lt;p&gt;Static analysis of compiled binaries might be simpler, for that matter. If we
consider the problem for WebAssembly, we can frame it as “which imported
functions are not used in the module”, potentially eliminating an entire range
of host functions. In the simplest case, we could just look at the code
section, through the function bodies, and see if there’s any reference to
indices of every single imported function in &lt;code&gt;call&lt;&#x2F;code&gt; opcodes. Then, there can be
function &lt;code&gt;Table&lt;&#x2F;code&gt;s referencing those, so we have to make sure no table elements
reference the function. And if any table is mutable and publicly exposed via an
export, then a user of the wasm module may reference any function declared in
the wasm module, including imported functions, so all bets are off. Note
dead-code elimination in wasm would be pretty similar and suffer from the same
limitations: after all, a function dependency is just another kind of function,
in wasm! Each format may have idiosyncrasies like that. Static analysis of
final binaries (as opposed to libraries) might be possible and reliable,
though.&lt;&#x2F;p&gt;
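&lt;p&gt;As a rough illustration of the wasm case, here is a sketch over a hypothetical, heavily simplified module model. The &lt;code&gt;Module&lt;&#x2F;code&gt; and &lt;code&gt;Instr&lt;&#x2F;code&gt; types and the &lt;code&gt;unused_imports&lt;&#x2F;code&gt; function are invented for the example; a real tool would decode the binary sections, and would also have to account for other ways of referencing a function (exports, the start function, and so on), which are ignored here.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&#x2F;&#x2F; Hypothetical, heavily simplified model of a wasm module.
enum Instr {
    Call(u32), &#x2F;&#x2F; direct call to a function index
    Other,     &#x2F;&#x2F; any other opcode
}

struct Module {
    imported_funcs: u32,         &#x2F;&#x2F; imports take the function indices 0..imported_funcs
    bodies: Vec&amp;lt;Vec&amp;lt;Instr&amp;gt;&amp;gt;,     &#x2F;&#x2F; function bodies from the code section
    table_elements: Vec&amp;lt;u32&amp;gt;,    &#x2F;&#x2F; function indices stored in tables
    exports_mutable_table: bool, &#x2F;&#x2F; a table the embedder could fill with anything
}

&#x2F;&#x2F; Returns the indices of imported functions that are never referenced, or
&#x2F;&#x2F; `None` when an exported table means all bets are off.
fn unused_imports(module: &amp;amp;Module) -&amp;gt; Option&amp;lt;Vec&amp;lt;u32&amp;gt;&amp;gt; {
    if module.exports_mutable_table {
        return None;
    }
    let mut used = vec![false; module.imported_funcs as usize];
    let called = module.bodies.iter().flatten().filter_map(|instr| match instr {
        Instr::Call(index) =&amp;gt; Some(*index),
        Instr::Other =&amp;gt; None,
    });
    for index in called.chain(module.table_elements.iter().copied()) {
        &#x2F;&#x2F; Indices past the imports belong to locally defined functions.
        if let Some(slot) = used.get_mut(index as usize) {
            *slot = true;
        }
    }
    Some((0..module.imported_funcs).filter(|i| !used[*i as usize]).collect())
}

fn main() {
    let module = Module {
        imported_funcs: 3,
        bodies: vec![vec![Instr::Call(0), Instr::Other, Instr::Call(4)]],
        table_elements: vec![2],
        exports_mutable_table: false,
    };
    println!(&amp;quot;{:?}&amp;quot;, unused_imports(&amp;amp;module));
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;In this made-up module, import 0 is called directly and import 2 sits in a table, so only import 1 is reported as unused.&lt;&#x2F;p&gt;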
&lt;h2 id=&quot;closing-thoughts&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#closing-thoughts&quot; aria-label=&quot;Anchor link for: closing-thoughts&quot;&gt;🔗&lt;&#x2F;a&gt;Closing thoughts&lt;&#x2F;h2&gt;
&lt;p&gt;For the sake of completeness, I should mention the existence of a rustc
crate-wide lint for this, &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;rust-lang&#x2F;rust&#x2F;pull&#x2F;72342&quot;&gt;since May 2020 or
so&lt;&#x2F;a&gt;:
&lt;code&gt;#![warn(unused_crate_dependencies)]&lt;&#x2F;code&gt;. This tells about unused crate
dependencies directly as a Rust warning, which in my opinion would be the ideal
end goal! Unfortunately, some GitHub comments suggest it suffers from having
too many false positives, and it still requires compiling the code.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;code&gt;@est31&lt;&#x2F;code&gt;, of &lt;code&gt;cargo-udeps&lt;&#x2F;code&gt; fame, has been working on a &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;rust-lang&#x2F;cargo&#x2F;pull&#x2F;8437&quot;&gt;better
solution&lt;&#x2F;a&gt;. It doesn’t seem to be too
far from completion, so between this and the Rust lint, I’m hopeful that there
could be a time where we have a solution that is perfectly precise, with
neither false positives nor false negatives.&lt;&#x2F;p&gt;
&lt;p&gt;In the meantime, I hope that &lt;code&gt;cargo-machete&lt;&#x2F;code&gt; can be useful to some of you, or
that it inspires others to make similar quick-and-dirty tools, in Rust or in
other languages. Thanks for reading this far, and please &lt;a href=&quot;mailto:benjamin+cargomachete@bouvier.cc&quot;&gt;get in
touch&lt;&#x2F;a&gt; if you have any thoughts about this!&lt;&#x2F;p&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol class=&quot;footnotes-list&quot;&gt;
&lt;li id=&quot;fn-1&quot;&gt;
&lt;p&gt;If you don’t know about the &lt;code&gt;cargo-edit&lt;&#x2F;code&gt; tool that allows you to add a
dependency in one line with &lt;code&gt;cargo add serde&lt;&#x2F;code&gt; to your project: &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;killercup&#x2F;cargo-edit&quot;&gt;now you
do&lt;&#x2F;a&gt;. &lt;a href=&quot;#fr-1-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-2&quot;&gt;
&lt;p&gt;Wait for it… &lt;a href=&quot;#fr-2-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;&#x2F;section&gt;
</content>
    </entry>
    <entry xml:lang="en">
        <title>A primer on code generation in Cranelift</title>
        <published>2021-02-17T19:00:42+00:00</published>
        <updated>2021-02-17T19:00:42+00:00</updated>
        
        <author>
          <name>
            
              Benjamin Bouvier
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://bouvier.cc/tech/cranelift-codegen-primer/"/>
        <id>https://bouvier.cc/tech/cranelift-codegen-primer/</id>
        <content type="html" xml:base="https://bouvier.cc/tech/cranelift-codegen-primer/">&lt;script src=&quot;https:&#x2F;&#x2F;cdn.jsdelivr.net&#x2F;npm&#x2F;mermaid&#x2F;dist&#x2F;mermaid.min.js&quot;&gt;&lt;&#x2F;script&gt;
&lt;script&gt;mermaid.initialize({startOnLoad:true});&lt;&#x2F;script&gt;
&lt;p&gt;&lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;bytecodealliance&#x2F;wasmtime&#x2F;tree&#x2F;main&#x2F;cranelift#cranelift-code-generator&quot;&gt;Cranelift&lt;&#x2F;a&gt; is a code generator written in the Rust programming language. It aims to generate code quickly, while outputting machine code that runs at reasonable speeds.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;The Cranelift compilation model consists in compiling functions one by one, holding extra information about external entities, like external functions, memory addresses, and so on. This model allows for concurrent and parallel compilation of individual functions, which supports the goal of fast compilation. It was designed this way to allow for just-in-time (JIT) compilation of WebAssembly binary code in Firefox, although its scope has broadened a bit. Nowadays it is used in a few different WebAssembly runtimes, including &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;bytecodealliance&#x2F;wasmtime#wasmtime&quot;&gt;Wasmtime&lt;&#x2F;a&gt; and &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;wasmer.io&#x2F;&quot;&gt;Wasmer&lt;&#x2F;a&gt;, but also as an alternative backend for Rust debug compilation, thanks to &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;bjorn3&#x2F;rustc_codegen_cranelift&quot;&gt;cg_clif&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;A classic compiler design usually includes running a parser to translate the source to some form of intermediate representation, then running optimization passes on it, then feeding the result to the machine code generator.&lt;&#x2F;p&gt;
&lt;p&gt;This blog post focuses on the final step, namely the concepts that are involved in code generation, and what they map to in Cranelift. To make things more concrete, we’ll take a specific instruction, and see how it’s translated, from its creation down to code generation. At each step of the process, I’ll provide a short (&lt;em&gt;ahem&lt;&#x2F;em&gt;) high-level explanation of the concepts involved, and I’ll show what they map to in Cranelift, using the example instruction. While this is not a tutorial detailing how to add new instructions in Cranelift, this should be an interesting read for anyone who’s interested in compilers, and this could be an entry point if you’re interested in hacking on the Cranelift &lt;code&gt;codegen&lt;&#x2F;code&gt; crate.&lt;&#x2F;p&gt;
&lt;p&gt;This is our plan for this blog post: each squared box represents data, each
rounded box is a process. We’re going to go through each of them below.&lt;&#x2F;p&gt;
&lt;div class=&quot;mermaid&quot;&gt;
graph TD;
    clif[Optimized CLIF];
    vcode[VCode];
    final_vcode[Final VCode];
    machine_code[Machine code artifacts];
    lowering([Lowering]);
    regalloc([Register allocation]);
    codegen([Machine code generation]);
    clif --&gt; lowering --&gt; vcode --&gt; regalloc --&gt; final_vcode --&gt; codegen --&gt; machine_code
&lt;&#x2F;div&gt;
&lt;h2 id=&quot;intermediate-representations&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#intermediate-representations&quot; aria-label=&quot;Anchor link for: intermediate-representations&quot;&gt;🔗&lt;&#x2F;a&gt;Intermediate representations&lt;&#x2F;h2&gt;
&lt;p&gt;Compilers use &lt;strong&gt;intermediate representations&lt;&#x2F;strong&gt; (&lt;em&gt;IR&lt;&#x2F;em&gt;) to represent source code. Here we’re interested in representations of the &lt;em&gt;data flow&lt;&#x2F;em&gt;, that is instructions themselves and only that. The IRs contain information about the instructions themselves, their operands, type specialization information, and any additional metadata that might be useful. IRs usually map to a certain level of abstraction, and as such, they are useful for solving different problems that require different levels of abstraction. Their shape (which data structures) and numbers often have a huge impact on the performance of the compiler itself (that is, how fast it is at compiling).&lt;&#x2F;p&gt;
&lt;p&gt;In general, most programming languages use IRs internally, and yet, these are invisible to the programmers. The reason is that source code is usually first &lt;em&gt;parsed&lt;&#x2F;em&gt; (tokenized, verified) and then translated into an IR. The &lt;em&gt;abstract syntax tree&lt;&#x2F;em&gt;, aka AST, is one such IR representing the source code itself, in a format that’s very close to the source code itself. Since the raison d’être of Cranelift is to be a code generator, having a text format is secondary, and only useful for testing and debugging purposes. That’s why embedders directly create and manipulate Cranelift’s IR.&lt;&#x2F;p&gt;
&lt;p&gt;At the time of writing, Cranelift has two IRs to represent the function’s code:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;one external, high-level intermediate representation, called &lt;strong&gt;CLIF&lt;&#x2F;strong&gt; (for &lt;em&gt;Cranelift IR format&lt;&#x2F;em&gt;),&lt;&#x2F;li&gt;
&lt;li&gt;one internal, low-level intermediate representation called &lt;strong&gt;VCode&lt;&#x2F;strong&gt; (for &lt;em&gt;virtual-registerized code&lt;&#x2F;em&gt;).&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h2 id=&quot;clif-ir&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#clif-ir&quot; aria-label=&quot;Anchor link for: clif-ir&quot;&gt;🔗&lt;&#x2F;a&gt;CLIF IR&lt;&#x2F;h2&gt;
&lt;p&gt;CLIF is the IR that Cranelift embedders create and manipulate. It consists of high-level typed operations that are convenient to use and&#x2F;or can be simply translated to machine code. It is in &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Static_single_assignment_form&quot;&gt;static single assignment (SSA) form&lt;&#x2F;a&gt;: each value referenced by an operation (SSA value) is defined only once, and may have as many uses as desired. CLIF is practical to use and manipulate for classic compiler optimization passes (e.g. &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Loop-invariant_code_motion&quot;&gt;LICM&lt;&#x2F;a&gt;), as it is generic over the target architecture which we’re compiling to.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; x&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; builder&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;ins&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;().&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;iconst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;types&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;I64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 42&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; y&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; builder&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;ins&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;().&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;iconst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;types&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;I64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 1337&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; sum&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; builder&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;ins&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;().&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;iadd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;x&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; y&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;An example of Rust code that would generate CLIF IR: using an IR builder, two constant 64-bit integer SSA values &lt;code&gt;x&lt;&#x2F;code&gt; and &lt;code&gt;y&lt;&#x2F;code&gt; are created, and then added together. The result is stored into the &lt;code&gt;sum&lt;&#x2F;code&gt; SSA value, which can then be consumed by other instructions.&lt;&#x2F;p&gt;
&lt;p&gt;The code for the IR builder we’re manipulating above is automatically generated by the &lt;code&gt;cranelift-codegen&lt;&#x2F;code&gt; build script. The build script uses a domain-specific &lt;em&gt;meta&lt;&#x2F;em&gt; language (DSL)&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-2-1&quot;&gt;&lt;a href=&quot;#fn-2&quot;&gt;1&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; that defines the instructions, their input and output operands, which input types are allowed, how the output type is inferred, etc. We won’t take a look at this &lt;em&gt;today&lt;&#x2F;em&gt;: it is a bit too far from code generation, but it could be material for another blog post.&lt;&#x2F;p&gt;
&lt;p&gt;As an example of a full-blown CLIF generator, there is &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;bytecodealliance&#x2F;wasmtime&#x2F;tree&#x2F;main&#x2F;cranelift&#x2F;wasm&quot;&gt;a crate&lt;&#x2F;a&gt; in the Cranelift project that allows translating from the WebAssembly binary format to CLIF. The Cranelift backend for Rustc uses its own CLIF generator that translates from one of the Rust compiler’s IRs.&lt;&#x2F;p&gt;
&lt;p&gt;Finally, it’s time to reveal what’s going to be our running example! The Chosen One is the &lt;code&gt;iadd&lt;&#x2F;code&gt; CLIF operation, which adds two integers of any width together, with wrapping semantics. It is both simple to understand and exhibits interesting behaviors on the two architectures we’re interested in. So, let’s continue down the pipeline!&lt;&#x2F;p&gt;
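&lt;p&gt;To make “wrapping semantics” concrete, here is a tiny sketch in plain Rust (this is not Cranelift code, and &lt;code&gt;wrapping_sum8&lt;&#x2F;code&gt; is a name made up for the illustration). The standard library’s &lt;code&gt;wrapping_add&lt;&#x2F;code&gt; behaves the same way: the result is reduced modulo 2^N for an N-bit integer.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;// Hypothetical helper: add two 8-bit integers with wrapping semantics.
fn wrapping_sum8(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    // 250 + 10 = 260, and 260 modulo 256 is 4.
    assert_eq!(wrapping_sum8(250, 10), 4);

    // The same holds for 64-bit values: the maximum value plus one wraps to zero.
    assert_eq!(u64::MAX.wrapping_add(1), 0);

    // In two's complement, signed addition wraps the same way.
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN);
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;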
&lt;h2 id=&quot;vcode-ir&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#vcode-ir&quot; aria-label=&quot;Anchor link for: vcode-ir&quot;&gt;🔗&lt;&#x2F;a&gt;VCode IR&lt;&#x2F;h2&gt;
&lt;p&gt;Later on, the CLIF intermediate representation is &lt;em&gt;lowered&lt;&#x2F;em&gt;, i.e. transformed from a high-level one into a lower-level one. Here, lower-level means a form more specialized for a machine architecture. This lower-level IR is called &lt;em&gt;VCode&lt;&#x2F;em&gt; in Cranelift. The values it references are called &lt;em&gt;virtual registers&lt;&#x2F;em&gt; (more on the &lt;em&gt;virtual&lt;&#x2F;em&gt; bit below). They’re not in SSA form anymore: each virtual register may be redefined as many times as we want. This IR is used to encode register allocation constraints and it guides machine code generation. As a matter of fact, since this information is tied to the machine code’s representation itself, this IR is also target-specific: there’s one flavor of VCode for each CPU architecture we’re compiling to.&lt;&#x2F;p&gt;
&lt;p&gt;Let’s get back to our example, which we’re going to compile for two instruction set architectures:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;ARM 64-bit (aka aarch64), which is used in most mobile devices and is becoming mainstream on laptops (Apple’s M1 Macs, some Chromebooks).&lt;&#x2F;li&gt;
&lt;li&gt;Intel’s x86 64-bit (aka x86_64, also abbreviated x64), which is used in most desktop and laptop machines.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;An integer addition machine instruction on aarch64 takes three operands: two input operands (one of which must be a register), and a third, output register operand. On the x86_64 architecture, by contrast, the equivalent instruction involves a total of two registers: one that is a read-only source register, and another that is an in-out modified register, serving as both the second source and the destination. We’ll get back to this.&lt;&#x2F;p&gt;
&lt;p&gt;So considering &lt;code&gt;iadd&lt;&#x2F;code&gt;, let’s look at (one of&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-4-1&quot;&gt;&lt;a href=&quot;#fn-4&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;) the VCode instructions used to represent integer additions on aarch64 (as defined in &lt;code&gt;cranelift&#x2F;codegen&#x2F;src&#x2F;isa&#x2F;aarch64&#x2F;inst&#x2F;mod.rs&lt;&#x2F;code&gt;):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F;&#x2F; An ALU operation with two register sources and a register destination.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;AluRRR&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    alu_op&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; ALUOp&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Writable&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;},&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Some details here:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;alu_op&lt;&#x2F;code&gt; defines the sub-opcode used in the ALU (Arithmetic Logic Unit). It will be &lt;code&gt;ALUOp::Add64&lt;&#x2F;code&gt; for a 64-bit integer addition.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;code&gt;rn&lt;&#x2F;code&gt; and &lt;code&gt;rm&lt;&#x2F;code&gt; are the conventional aarch64 names for the two input registers.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;code&gt;rd&lt;&#x2F;code&gt; is the destination register. See how it’s marked as &lt;code&gt;Writable&lt;&#x2F;code&gt;, while the two others are not? &lt;code&gt;Writable&lt;&#x2F;code&gt; is a plain Rust wrapper that makes sure that we &lt;em&gt;can&lt;&#x2F;em&gt; statically differentiate read-only registers from writable registers; a neat trick that allows us to catch more issues at compile-time.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
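&lt;p&gt;The &lt;code&gt;Writable&lt;&#x2F;code&gt; trick can be sketched with a plain newtype. What follows is a simplified stand-in, not Cranelift’s real definition: &lt;code&gt;Reg&lt;&#x2F;code&gt; is reduced to a register number, the wrapper is a non-generic &lt;code&gt;WritableReg&lt;&#x2F;code&gt;, and &lt;code&gt;describe_add&lt;&#x2F;code&gt; is a hypothetical helper invented for the example.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;// Hypothetical, simplified stand-ins: these are NOT Cranelift's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Reg(u8);

// Newtype marking a register that an instruction is allowed to write.
#[derive(Clone, Copy, Debug, PartialEq)]
struct WritableReg(Reg);

impl WritableReg {
    // Explicitly downgrade to a read-only view of the same register.
    fn to_reg(self) -> Reg {
        self.0
    }
}

// The signature demands a writable destination and read-only sources.
fn describe_add(rd: WritableReg, rn: Reg, rm: Reg) -> String {
    format!("add x{}, x{}, x{}", rd.to_reg().0, rn.0, rm.0)
}

fn main() {
    let rd = WritableReg(Reg(0));
    let (rn, rm) = (Reg(1), Reg(2));
    println!("{}", describe_add(rd, rn, rm));
    // Calling describe_add(rn, rn, rm) would be rejected at compile time,
    // because a plain Reg is not a WritableReg.
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Mixing up a source and a destination then becomes a type error instead of a miscompilation found at runtime.&lt;&#x2F;p&gt;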
&lt;p&gt;All this information is directly tied to the machine code representation of an addition instruction on aarch64: each field is later used to select some bytes that will be generated during code generation.&lt;&#x2F;p&gt;
&lt;p&gt;As said before, the VCode is specific to each architecture, so x86_64 has a different VCode representation for the same instruction (as defined in &lt;code&gt;cranelift&#x2F;codegen&#x2F;src&#x2F;isa&#x2F;x64&#x2F;inst&#x2F;mod.rs&lt;&#x2F;code&gt;):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F;&#x2F; Integer arithmetic&#x2F;bit-twiddling: (add sub and or xor mul adc? sbb?) (32 64) (reg addr imm) reg&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;AluRmiR&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    is_64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; bool&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    op&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; AluRmiROpcode&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    src&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; RegMemImm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Writable&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;},&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Here, the sub-opcode is defined as part of the &lt;code&gt;AluRmiROpcode&lt;&#x2F;code&gt; enum (the comment hints at which other x86 machine instructions are generated by this same VCode). See how there’s only one &lt;code&gt;src&lt;&#x2F;code&gt; (source) register (or memory or immediate operand), while the instruction conceptually takes two inputs? That’s because it’s expected that the &lt;code&gt;dst&lt;&#x2F;code&gt; (destination) register is &lt;em&gt;modified&lt;&#x2F;em&gt;, that is, both read (so it’s the second input operand) and written to (so it’s the result register). In equivalent C code, the x86’s add instruction doesn’t actually do &lt;code&gt;a = b + c&lt;&#x2F;code&gt;. What it does is &lt;code&gt;a += b&lt;&#x2F;code&gt;, that is, one of the sources is &lt;em&gt;consumed&lt;&#x2F;em&gt; by the instruction. This is an artifact inherited from the design of older x86 machines in the 1970s, when instructions were designed around an accumulator model (and efficiently representing three operands in a CISC architecture would make the encoding larger and more complex than it already is).&lt;&#x2F;p&gt;
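&lt;p&gt;To see what the two-operand form implies, here is a toy model (an illustration of the idea, not how Cranelift implements it) that computes the three-operand &lt;code&gt;a = b + c&lt;&#x2F;code&gt; using only a register copy and an x86-style &lt;code&gt;a += b&lt;&#x2F;code&gt;. The register file &lt;code&gt;Regs&lt;&#x2F;code&gt; and the helpers &lt;code&gt;mov&lt;&#x2F;code&gt;, &lt;code&gt;add2&lt;&#x2F;code&gt; and &lt;code&gt;add3&lt;&#x2F;code&gt; are all made up for the example.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;// Toy register file with four registers; purely illustrative.
type Regs = [i64; 4];

// x86-style two-operand add: regs[dst] += regs[src].
fn add2(mut regs: Regs, dst: usize, src: usize) -> Regs {
    regs[dst] = regs[dst].wrapping_add(regs[src]);
    regs
}

// Register copy: regs[dst] = regs[src].
fn mov(mut regs: Regs, dst: usize, src: usize) -> Regs {
    regs[dst] = regs[src];
    regs
}

// Compute regs[a] = regs[b] + regs[c] using only two-operand instructions.
fn add3(regs: Regs, a: usize, b: usize, c: usize) -> Regs {
    if a == b {
        add2(regs, a, c) // a += c
    } else if a == c {
        add2(regs, a, b) // addition commutes, so a += b
    } else {
        add2(mov(regs, a, b), a, c) // a = b; a += c
    }
}

fn main() {
    let regs: Regs = [0, 10, 32, 0];
    assert_eq!(add3(regs, 0, 1, 2), [42, 10, 32, 0]);
    assert_eq!(add3(regs, 1, 1, 2), [0, 42, 32, 0]);
    assert_eq!(add3(regs, 2, 1, 2), [0, 10, 42, 0]);
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The extra copy in the last case is the price of the two-operand encoding whenever the destination differs from both sources.&lt;&#x2F;p&gt;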
&lt;h2 id=&quot;instruction-selection-lowering&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#instruction-selection-lowering&quot; aria-label=&quot;Anchor link for: instruction-selection-lowering&quot;&gt;🔗&lt;&#x2F;a&gt;Instruction selection (lowering)&lt;&#x2F;h2&gt;
&lt;p&gt;As said before, converting from the high-level IR (CLIF) to the low-level IR (VCode) is called lowering. Since VCode is target-dependent, this process is also target-dependent. That’s where we consider which machine instructions eventually get used for a given CLIF opcode. There are many ways to achieve the same machine state results for given semantics, but some of these ways are faster than others, and&#x2F;or require fewer code bytes. The problem can be summed up like this: given some CLIF, which VCode can we create to generate the fastest and&#x2F;or smallest machine code that carries out the desired semantics? This is called &lt;em&gt;instruction selection&lt;&#x2F;em&gt;, because we’re selecting the VCode instructions among a set of different possible instructions.&lt;&#x2F;p&gt;
&lt;p&gt;How do these IRs map to each other? A given CLIF node may be lowered into 1 to N VCode instructions. A given VCode instruction may lead to the code generation of 1 to M machine instructions. There are no rules governing the maximum number of entities mapped. For instance, the integer addition CLIF opcode &lt;code&gt;iadd&lt;&#x2F;code&gt; on 64-bit inputs maps to a single VCode instruction on aarch64. The VCode instruction then causes a single machine instruction to be generated.&lt;&#x2F;p&gt;
&lt;p&gt;Other CLIF opcodes may eventually generate more than a single machine instruction. Consider the CLIF opcode for signed integer division, &lt;code&gt;sdiv&lt;&#x2F;code&gt;. Its semantics specify that it traps when the divisor is zero and in case of integer overflow&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-3-1&quot;&gt;&lt;a href=&quot;#fn-3&quot;&gt;3&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;. On aarch64, this is lowered into:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;one VCode instruction that checks whether the divisor is zero and traps if it is&lt;&#x2F;li&gt;
&lt;li&gt;two VCode instructions for comparing the input values against the minimal integer value and -1&lt;&#x2F;li&gt;
&lt;li&gt;one VCode instruction to trap if the two input values match what we checked against&lt;&#x2F;li&gt;
&lt;li&gt;and one VCode instruction that does the actual division operation.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Each of these VCode instructions then generates one or more machine code instructions, resulting in a slightly longer sequence.&lt;&#x2F;p&gt;
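&lt;p&gt;The checks performed by the signed division sequence above can be mirrored in a few lines of plain Rust. This is only a sketch of the semantics for 64-bit operands, not Cranelift’s lowering code; the &lt;code&gt;Outcome&lt;&#x2F;code&gt; enum and &lt;code&gt;signed_div64&lt;&#x2F;code&gt; are hypothetical names.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;// Hypothetical result type: either a value or one of the two traps.
#[derive(Debug, PartialEq)]
enum Outcome {
    Value(i64),
    TrapDivisionByZero,
    TrapIntegerOverflow,
}

fn signed_div64(lhs: i64, rhs: i64) -> Outcome {
    // Step 1: trap if the divisor is zero.
    if rhs == 0 {
        return Outcome::TrapDivisionByZero;
    }
    // Steps 2 and 3: compare the inputs against the minimal integer value
    // and -1, and trap if both match, since the quotient would not fit.
    if (lhs, rhs) == (i64::MIN, -1) {
        return Outcome::TrapIntegerOverflow;
    }
    // Step 4: the actual division.
    Outcome::Value(lhs / rhs)
}

fn main() {
    assert_eq!(signed_div64(7, 2), Outcome::Value(3));
    assert_eq!(signed_div64(1, 0), Outcome::TrapDivisionByZero);
    assert_eq!(signed_div64(i64::MIN, -1), Outcome::TrapIntegerOverflow);
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;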
&lt;p&gt;Let’s look at the lowering of &lt;code&gt;iadd&lt;&#x2F;code&gt; on aarch64 (in &lt;code&gt;cranelift&#x2F;codegen&#x2F;src&#x2F;isa&#x2F;aarch64&#x2F;lower_inst.rs&lt;&#x2F;code&gt;), edited and simplified for clarity. I’ve added comments in the code, explaining what each line does:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Opcode&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Iadd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&amp;gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Get the destination register.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; get_output_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; outputs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;]).&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;only_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;().&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;unwrap&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;();&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Get the controlling type of the addition (32-bits int or 64-bits int or&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; int vector, etc.).&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; ty&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; ty&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;unwrap&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;();&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Force one of the inputs into a register, not applying any signed- or&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; zero-extension.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; put_input_in_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; inputs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;],&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt; NarrowValueMode&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;None&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Try to see if we can encode the second operand as an immediate on&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; 12-bits, maybe by negating it;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Otherwise, put it into a register.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; negated&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;) =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; put_input_in_rse_imm12_maybe_negated&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;        ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;        inputs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;        ty_bits&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ty&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;),&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;        NarrowValueMode&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;None&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    );&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Select the ALU subopcode, based on possible negation and controlling&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; type.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; alu_op&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; if !&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;negated&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;        choose_32_64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ty&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt; ALUOp&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Add32&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt; ALUOp&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Add64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    }&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; else&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;        choose_32_64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ty&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt; ALUOp&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Sub32&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt; ALUOp&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Sub64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    };&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Emit the VCode instruction in the VCode stream.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;emit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;alu_inst_imm12&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;alu_op&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;));&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;In fact, the &lt;code&gt;alu_inst_imm12&lt;&#x2F;code&gt; wrapper can create one VCode instruction among a set of possible ones (since we’re trying to select &lt;em&gt;the best one&lt;&#x2F;em&gt;). For the sake of simplicity, we’ll assume that &lt;code&gt;AluRRR&lt;&#x2F;code&gt; is going to be generated, i.e. the selected instruction is the one using only register encodings for the input values.&lt;&#x2F;p&gt;
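&lt;p&gt;As a rough idea of the kind of choice made here, the sketch below picks an encoding for “add a constant”. It is a deliberate simplification and not the real helper: it only considers unshifted 12-bit immediates (0 to 4095), and the &lt;code&gt;Selected&lt;&#x2F;code&gt; enum and &lt;code&gt;select_add&lt;&#x2F;code&gt; function are invented for the illustration.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;// Hypothetical outcome of instruction selection for "rn + constant".
#[derive(Debug, PartialEq)]
enum Selected {
    // add rd, rn, #imm
    AddImm(u16),
    // sub rd, rn, #imm (the constant was negated to fit)
    SubImm(u16),
    // add rd, rn, rm (the constant has to live in a register)
    AddReg,
}

fn select_add(constant: i64) -> Selected {
    match constant {
        // Fits directly in an unshifted 12-bit immediate.
        0..=4095 => Selected::AddImm(constant as u16),
        // The negation fits: adding -n is the same as subtracting n.
        -4095..=-1 => Selected::SubImm((-constant) as u16),
        // Otherwise, fall back to the register form.
        _ => Selected::AddReg,
    }
}

fn main() {
    assert_eq!(select_add(42), Selected::AddImm(42));
    assert_eq!(select_add(-1), Selected::SubImm(1));
    assert_eq!(select_add(4096), Selected::AddReg);
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This mirrors the &lt;code&gt;negated&lt;&#x2F;code&gt; flag in the lowering code: when the negated constant is the one that fits, the addition is emitted as a subtraction.&lt;&#x2F;p&gt;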
&lt;h2 id=&quot;register-allocation&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#register-allocation&quot; aria-label=&quot;Anchor link for: register-allocation&quot;&gt;🔗&lt;&#x2F;a&gt;Register allocation&lt;&#x2F;h2&gt;
&lt;div class=&quot;mermaid&quot;&gt;
graph TD
    vcode_vreg[VCode with virtual registers]
    regalloc([Register allocation])
    vcode_rreg[VCode with real registers]
    codegen([Code generation])
    machine_code(Machine code)
    vcode_vreg --&gt; regalloc --&gt; vcode_rreg --&gt; codegen --&gt; machine_code
&lt;&#x2F;div&gt;
&lt;h3 id=&quot;vcode-registers-and-stack-slots&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#vcode-registers-and-stack-slots&quot; aria-label=&quot;Anchor link for: vcode-registers-and-stack-slots&quot;&gt;🔗&lt;&#x2F;a&gt;VCode, registers and stack slots&lt;&#x2F;h3&gt;
&lt;p&gt;Hey, ever wondered what the V in VCode meant? Back to the drawing board. While a program may reference a theoretically unlimited number of instructions, each referencing a theoretically unlimited number of values as inputs and outputs, the physical machine only has a fixed set of containers for those values:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;either they must live in machine &lt;strong&gt;registers&lt;&#x2F;strong&gt;: these are very fast to access, but they take up CPU real estate and are thus costly, so there are usually few of them.&lt;&#x2F;li&gt;
&lt;li&gt;or they must live in the process’ &lt;strong&gt;stack memory&lt;&#x2F;strong&gt;: it’s slower to access, but we can have virtually any amount of stack &lt;em&gt;slots&lt;&#x2F;em&gt;.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;asm&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;mov&lt;&#x2F;span&gt;&lt;span&gt; %&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;edi&lt;&#x2F;span&gt;&lt;span&gt;,-&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;0x4&lt;&#x2F;span&gt;&lt;span&gt;(%&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;rbp&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;mov&lt;&#x2F;span&gt;&lt;span&gt; %&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;rsi&lt;&#x2F;span&gt;&lt;span&gt;,-&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;0x10&lt;&#x2F;span&gt;&lt;span&gt;(%&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;rbp&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;mov&lt;&#x2F;span&gt;&lt;span&gt; -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;0x4&lt;&#x2F;span&gt;&lt;span&gt;(%&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;rbp&lt;&#x2F;span&gt;&lt;span&gt;),%&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;eax&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;em&gt;In this example of x86 machine code, %edi, %rsi, %rbp, %eax are all registers; stack slots are memory addresses computed as the frame pointer (%rbp) plus an offset value (which happens to be negative here). Note that, in general, stack slots may also be addressed relative to the stack pointer (%rsp).&lt;&#x2F;em&gt;&lt;&#x2F;p&gt;
&lt;h3 id=&quot;defining-the-register-allocation-problem&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#defining-the-register-allocation-problem&quot; aria-label=&quot;Anchor link for: defining-the-register-allocation-problem&quot;&gt;🔗&lt;&#x2F;a&gt;Defining the register allocation problem&lt;&#x2F;h3&gt;
&lt;p&gt;The problem of mapping the IR values (in VCode, these are the &lt;code&gt;Reg&lt;&#x2F;code&gt; values) to machine “containers” is called &lt;strong&gt;&lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Register_allocation&quot;&gt;register allocation&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt; (aka regalloc). Inputs to register allocation can be as numerous as we want them, and map to “virtual” values, hence we call them &lt;em&gt;virtual registers&lt;&#x2F;em&gt;. And… that’s where the V from VCode comes from: the instructions in VCode reference values that are &lt;em&gt;virtual&lt;&#x2F;em&gt; registers before register allocation, so we say the code is in &lt;em&gt;virtualized&lt;&#x2F;em&gt; register form. The output of register allocation is a set of new instructions, where the virtual registers have been replaced by &lt;em&gt;real registers&lt;&#x2F;em&gt; (the physical ones, limited in quantity) or stack slot references (and other additional metadata).&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; Before register allocation, with unlimited virtual registers:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v2 = v0 + v1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v3 = v2 * 2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v4 = v2 + 1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v5 = v4 + v3&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;return v5&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; One possible register allocation, on a machine that has 2 registers %r0, %r1:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r0 = %r0 + %r1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r1 = %r0 * 2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r0 = %r0 + 1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r1 = %r0 + %r1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;return %r1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;When all is well, the virtual registers don’t conceptually &lt;em&gt;live&lt;&#x2F;em&gt; at the same time, and they can be put into physical registers. Issues arise when there aren’t enough physical registers to contain all the virtual registers that live at the same time, which is the case for… a very large majority of programs. Then, register allocation must decide which values continue to live in registers at a given program point, and which should be &lt;strong&gt;spilled&lt;&#x2F;strong&gt; into a stack slot, effectively &lt;em&gt;storing&lt;&#x2F;em&gt; them onto the stack for later use. Reusing them later requires &lt;strong&gt;reloading&lt;&#x2F;strong&gt; them from the stack slot, using a &lt;em&gt;load&lt;&#x2F;em&gt; machine instruction. The complexity resides in choosing which values should be spilled, at which program points they should be spilled, and at which program points we should reload them, if we need to do so. Making good choices there has a large impact on the speed of the generated code, since memory accesses to the stack imply an additional runtime cost. For instance, a variable that’s frequently used in a hot loop should live in a register for the whole loop’s lifetime, and not be spilled&#x2F;reloaded in the middle of the loop.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; Before register allocation, with unlimited virtual registers:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v2 = v0 + v1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v3 = v0 + v2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v4 = v3 + v1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;return v4&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; One possible register allocation, on a machine that has 2 registers %r0, %r1.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; We need to spill one value, because there&amp;#39;s a point where 3 values are live at the same time!&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;spill %r1 --&amp;gt; stack_slot(0)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r1 = %r0 + %r1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r1 = %r0 + %r1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;reload stack_slot(0) --&amp;gt; %r0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r1 = %r1 + %r0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;return %r1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
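&lt;p&gt;To make “3 values are live at the same time” concrete, here is a minimal sketch (not the actual Cranelift or regalloc.rs code) that walks a straight-line program backwards and counts how many virtual registers are live at each point. The &lt;code&gt;ToyInst&lt;&#x2F;code&gt; tuple, the &lt;code&gt;NONE&lt;&#x2F;code&gt; sentinel and all function names are made up for this illustration, and it assumes there is no control flow at all.&lt;&#x2F;p&gt;

```rust
// Sentinel meaning "no register in this operand slot" (a convention made up for this sketch).
const NONE: usize = usize::MAX;
// Enough virtual registers for both toy programs below (v0 to v5).
const NUM_VREGS: usize = 6;

// A toy instruction: (defined vreg, first used vreg, second used vreg).
type ToyInst = (usize, usize, usize);

// Walks a straight-line program backwards while tracking which virtual
// registers are live, and returns the largest number of values that are
// live at the same program point.
fn max_live(program: [ToyInst; 5]) -> usize {
    let mut live = [false; NUM_VREGS];
    let mut peak = 0usize;
    for inst in program.iter().rev() {
        let (def, use_a, use_b) = *inst;
        // A value is not live before the instruction that defines it.
        if def != NONE {
            live[def] = false;
        }
        // A value is live right before any instruction that reads it.
        for used in [use_a, use_b] {
            if used != NONE {
                live[used] = true;
            }
        }
        let count = live.iter().filter(|is_live| **is_live).count();
        peak = peak.max(count);
    }
    peak
}

// First example: v2 = v0 + v1; v3 = v2 * 2; v4 = v2 + 1; v5 = v4 + v3; return v5
fn no_spill_example() -> [ToyInst; 5] {
    [(2, 0, 1), (3, 2, NONE), (4, 2, NONE), (5, 4, 3), (NONE, 5, NONE)]
}

// Second example: v2 = v0 + v1; v3 = v0 + v2; v4 = v3 + v1; return v4
// (padded with a trailing no-op so that both programs have the same length).
fn spill_example() -> [ToyInst; 5] {
    [(2, 0, 1), (3, 0, 2), (4, 3, 1), (NONE, 4, NONE), (NONE, NONE, NONE)]
}
```

&lt;p&gt;With these definitions, &lt;code&gt;max_live(no_spill_example())&lt;&#x2F;code&gt; evaluates to 2 and &lt;code&gt;max_live(spill_example())&lt;&#x2F;code&gt; evaluates to 3, which matches the two listings above: on a two-register machine, only the program that needs three live values at once requires a spill.&lt;&#x2F;p&gt;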
&lt;p&gt;And, since we like to have our cake and eat it too, the register allocator itself should be &lt;em&gt;fast&lt;&#x2F;em&gt;: it should not take an unbounded amount of time to make these allocation decisions. Register allocation has the good taste to be an &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;NP-completeness&quot;&gt;NP-complete&lt;&#x2F;a&gt; problem. Concretely, this means that implementations cannot realistically find the &lt;em&gt;best&lt;&#x2F;em&gt; solution for arbitrary inputs in a reasonable amount of time; instead, they estimate &lt;em&gt;good&lt;&#x2F;em&gt; solutions based on heuristics, in worst-case quadratic time over the size of the input. All of this makes it so that register allocation has its own whole research field, and has been extensively studied for some time now. It is a fascinating problem.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;register-allocation-in-cranelift&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#register-allocation-in-cranelift&quot; aria-label=&quot;Anchor link for: register-allocation-in-cranelift&quot;&gt;🔗&lt;&#x2F;a&gt;Register allocation in Cranelift&lt;&#x2F;h3&gt;
&lt;p&gt;Back to Cranelift. The register allocation contract is that if a value &lt;em&gt;must&lt;&#x2F;em&gt; live in a real register at a given program point, then it &lt;em&gt;does&lt;&#x2F;em&gt; live where it should (unless register allocation is impossible). At the start of code generation for a VCode instruction, we are guaranteed that the input values live in real registers, and that the output real register is available before the next VCode instruction.&lt;&#x2F;p&gt;
&lt;p&gt;You might have noticed that the VCode instructions only refer to registers, and not stack slots. But where are the stack slots, then? The trick is that the stack slots are &lt;em&gt;invisible&lt;&#x2F;em&gt; to VCode. Register allocation may create an arbitrary number of spills, reloads, and register moves&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-5-1&quot;&gt;&lt;a href=&quot;#fn-5&quot;&gt;4&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; around VCode instructions, to ensure that their register allocation constraints are met. This is why the output of register allocation is a new list of instructions, that includes not only the initial instructions filled with the actual registers, but also additional spill, reload and move (VCode) instructions added by regalloc.&lt;&#x2F;p&gt;
&lt;p&gt;As said before, this problem is sufficiently complex, involved and independent from the rest of the code (assuming the right set of interfaces!) that its code lives in a separate crate, &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;bytecodealliance&#x2F;regalloc.rs&quot;&gt;&lt;code&gt;regalloc.rs&lt;&#x2F;code&gt;&lt;&#x2F;a&gt;, with its own fuzzing and testing infrastructure. I hope to shed some light on it at some point too.&lt;&#x2F;p&gt;
&lt;p&gt;What’s interesting to us today is the register allocation &lt;em&gt;constraints&lt;&#x2F;em&gt;. Consider the aarch64 integer add instruction &lt;code&gt;add rd, rn, rm&lt;&#x2F;code&gt;: &lt;code&gt;rd&lt;&#x2F;code&gt; is the output virtual register that’s written to, while &lt;code&gt;rn&lt;&#x2F;code&gt; and &lt;code&gt;rm&lt;&#x2F;code&gt; are the inputs, thus read from. We need to inform the register allocation algorithm about these constraints. In regalloc jargon, “read from” is known as &lt;em&gt;used&lt;&#x2F;em&gt;, while “written to” is known as &lt;em&gt;defined&lt;&#x2F;em&gt;. Here, the aarch64 VCode instruction &lt;code&gt;AluRRR&lt;&#x2F;code&gt; does &lt;em&gt;use&lt;&#x2F;em&gt; &lt;code&gt;rn&lt;&#x2F;code&gt; and &lt;code&gt;rm&lt;&#x2F;code&gt;, and it &lt;em&gt;def&lt;&#x2F;em&gt;ines &lt;code&gt;rd&lt;&#x2F;code&gt;. This usage information is &lt;em&gt;collected&lt;&#x2F;em&gt; in the &lt;code&gt;aarch64_get_regs&lt;&#x2F;code&gt; function (&lt;code&gt;cranelift&#x2F;codegen&#x2F;src&#x2F;isa&#x2F;aarch64&#x2F;inst&#x2F;mod.rs&lt;&#x2F;code&gt;):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;fn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; aarch64_get_regs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;: &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; collector&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;: &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9D7CD8;font-style: italic;&quot;&gt;mut&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; RegUsageCollector&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;) {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    match&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;        &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;AluRRR&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;, .. } =&amp;gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;            collector&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;add_def&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;            collector&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;add_use&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;            collector&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;add_use&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;        }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;        &#x2F;&#x2F; etc.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Then, after register allocation has assigned the physical registers, we need to instruct it how to replace virtual register mentions by physical register mentions. This is done in the &lt;code&gt;aarch64_map_regs&lt;&#x2F;code&gt; function (same file as above):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;fn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; aarch64_map_regs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;RUM&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; RegUsageMapper&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;: &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9D7CD8;font-style: italic;&quot;&gt;mut&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; mapper&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;: &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;RUM&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;) {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; ...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    match&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;        &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9D7CD8;font-style: italic;&quot;&gt;mut&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;AluRRR&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;            ref&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9D7CD8;font-style: italic;&quot;&gt; mut&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;            ref&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9D7CD8;font-style: italic;&quot;&gt; mut&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;            ref&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9D7CD8;font-style: italic;&quot;&gt; mut&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;            ..&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;        } =&amp;gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;            map_def&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;mapper&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;            map_use&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;mapper&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;            map_use&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;mapper&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;        }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;        &#x2F;&#x2F; etc.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Note that this mirrors quite precisely what the usage collector did: we’re replacing the virtual register mention for the defined register &lt;code&gt;rd&lt;&#x2F;code&gt; with the information (which real register) provided by the &lt;code&gt;RegUsageMapper&lt;&#x2F;code&gt;. These two functions must stay in sync, otherwise here be dragons! (and bugs very hard to debug!)&lt;&#x2F;p&gt;
&lt;h3 id=&quot;register-allocation-on-x86&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#register-allocation-on-x86&quot; aria-label=&quot;Anchor link for: register-allocation-on-x86&quot;&gt;🔗&lt;&#x2F;a&gt;Register allocation on x86&lt;&#x2F;h3&gt;
&lt;p&gt;On Intel’s x86, register allocation may be a bit trickier: in some cases, the lowering needs to be carefully written so it satisfies some register allocation constraints that are very specific to this architecture. In particular, x86 has &lt;em&gt;fixed register constraints&lt;&#x2F;em&gt; as well as &lt;em&gt;tied operands&lt;&#x2F;em&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;For this specific part, we’ll look at the integer shift-left instruction, which is equivalent to C’s &lt;code&gt;x &amp;lt;&amp;lt; y&lt;&#x2F;code&gt;. Why this particular instruction? It exhibits both properties that we’re interested in studying here. The lowering of &lt;code&gt;iadd&lt;&#x2F;code&gt; is similar, albeit slightly simpler, as it &lt;em&gt;only&lt;&#x2F;em&gt; involves tied operands.&lt;&#x2F;p&gt;
&lt;h4 id=&quot;fixed-register-constraints&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#fixed-register-constraints&quot; aria-label=&quot;Anchor link for: fixed-register-constraints&quot;&gt;🔗&lt;&#x2F;a&gt;Fixed register constraints&lt;&#x2F;h4&gt;
&lt;p&gt;On the one hand, some instructions expect their inputs to be in &lt;em&gt;fixed&lt;&#x2F;em&gt; registers, that is, specific registers arbitrarily predefined by the architecture manual. For the example of the shift instruction, if the count is not statically known at compile time (it’s not a shift by a constant value), then the amount by which we’re shifting must be in the &lt;code&gt;rcx&lt;&#x2F;code&gt; register&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-8-1&quot;&gt;&lt;a href=&quot;#fn-8&quot;&gt;5&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Now, how do we make sure that the input value actually is in &lt;code&gt;rcx&lt;&#x2F;code&gt;? We can mark &lt;code&gt;rcx&lt;&#x2F;code&gt; as used in the &lt;code&gt;get_regs&lt;&#x2F;code&gt; function so regalloc knows about this, but nothing ensures that the input &lt;em&gt;resides&lt;&#x2F;em&gt; in it at the beginning of the instruction. To resolve this, we’ll introduce a &lt;strong&gt;move instruction&lt;&#x2F;strong&gt; during lowering, that is going to copy the input value into &lt;code&gt;rcx&lt;&#x2F;code&gt;. Then we’re sure it lives there, and register allocation knows it’s used: we’re good to go!&lt;&#x2F;p&gt;
&lt;p&gt;In a nutshell, this shows how lowering and register allocation play together:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;during lowering, we introduce a move from a dynamic shift input value to &lt;code&gt;rcx&lt;&#x2F;code&gt; before the actual shift&lt;&#x2F;li&gt;
&lt;li&gt;in the register usage function, we mark &lt;code&gt;rcx&lt;&#x2F;code&gt; as used&lt;&#x2F;li&gt;
&lt;li&gt;(nothing to do in the register mapping function: &lt;code&gt;rcx&lt;&#x2F;code&gt; is a real register already)&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h4 id=&quot;tied-operands&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#tied-operands&quot; aria-label=&quot;Anchor link for: tied-operands&quot;&gt;🔗&lt;&#x2F;a&gt;Tied operands&lt;&#x2F;h4&gt;
&lt;p&gt;On the other hand, some instructions have operands that are both read and written at the same time: we call them &lt;em&gt;modified&lt;&#x2F;em&gt; in Cranelift and regalloc.rs, but they’re also known as &lt;em&gt;tied operands&lt;&#x2F;em&gt; in the compiler literature. It’s not just that there’s a register that must be read, and a register that must be written to: they &lt;em&gt;must&lt;&#x2F;em&gt; be the same register. How do we model this, then?&lt;&#x2F;p&gt;
&lt;p&gt;Consider a naive solution. We take the input virtual register, and decide it’s allocated to the same register as the output (modified) register. Unfortunately, if the chosen virtual register was going to be reused by another later VCode instruction, then its value would be overwritten (clobbered) by the current instruction. This would result in incorrect code being generated, so this is not acceptable. In general, we can’t clobber the value held in an input register during lowering, because making this kind of decision is regalloc’s role.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; Before register allocation, with virtual registers:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v2 = v0 + v1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;v3 = v0 + 42&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; After register allocation, on a machine with two registers %r0 and %r1:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&#x2F;&#x2F; assign v0 to %r0, v1 to %r1, v2 to %r0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;%r0 += %r1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;... = %r0 + 42 &#x2F;&#x2F; ohnoes! the value in %r0 is v2, not v0 anymore!&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The right solution is, again, to &lt;em&gt;copy&lt;&#x2F;em&gt; this input virtual register into the output virtual register, right before the instruction. This way, we can still reuse the untouched input register in other instructions without modifying it: only the copy is written to.&lt;&#x2F;p&gt;
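&lt;p&gt;As a rough illustration of this copy-then-modify idea, here is a small sketch with a made-up two-address instruction set; &lt;code&gt;ToyInst&lt;&#x2F;code&gt;, &lt;code&gt;lower_add&lt;&#x2F;code&gt; and &lt;code&gt;run&lt;&#x2F;code&gt; are hypothetical names and not part of Cranelift. It assumes that the destination is a fresh virtual register, distinct from both inputs.&lt;&#x2F;p&gt;

```rust
// Hypothetical two-address toy instructions operating on virtual registers.
#[derive(Clone, Copy)]
enum ToyInst {
    // dst = src
    Mov { dst: usize, src: usize },
    // dst = dst + src: `dst` is both read and written (a "modified" operand).
    AddInPlace { dst: usize, src: usize },
}

// Lowers the three-address `dst = lhs + rhs` into two-address form: first copy
// `lhs` into `dst`, then modify `dst` in place. `lhs` itself is never written.
// Assumes `dst` is a fresh virtual register, different from `lhs` and `rhs`.
fn lower_add(dst: usize, lhs: usize, rhs: usize) -> [ToyInst; 2] {
    [
        ToyInst::Mov { dst, src: lhs },
        ToyInst::AddInPlace { dst, src: rhs },
    ]
}

// Interprets the instructions over an array of virtual register values and
// returns the final register state.
fn run(insts: [ToyInst; 2], mut vregs: [i64; 4]) -> [i64; 4] {
    for inst in insts {
        match inst {
            ToyInst::Mov { dst, src } => vregs[dst] = vregs[src],
            ToyInst::AddInPlace { dst, src } => vregs[dst] += vregs[src],
        }
    }
    vregs
}
```

&lt;p&gt;For instance, &lt;code&gt;run(lower_add(2, 0, 1), [10, 32, 0, 0])&lt;&#x2F;code&gt; leaves 42 in v2 while v0 still holds 10, so a later instruction can keep reading v0. Whether the copy survives as an actual machine move is then up to the register allocator.&lt;&#x2F;p&gt;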
&lt;p&gt;Phew! We can now look at the entire lowering for the shift left instruction, edited and commented for clarity:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; Read the instruction operand size from the output&amp;#39;s type.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; size&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; dst_ty&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;bytes&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;() as&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; u8&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; Put the left hand side into a virtual register.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; lhs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; put_input_in_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; inputs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;]);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; Put the right hand side (shift amount) into either an immediate (if it&amp;#39;s&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; statically known at compile time), or into a virtual register.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;count&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rhs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;) =&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    if let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Some&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;cst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;) =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;get_input_as_source_or_const&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;insn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;).&lt;&#x2F;span&gt;&lt;span&gt;constant &lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;{&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;        &#x2F;&#x2F; Mask count, according to Cranelift&amp;#39;s semantics.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;        let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; cst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; = (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;cst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; as&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; u8&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;) &amp;amp; (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;dst_ty&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;bits&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;() as&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; u8&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;        (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Some&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;cst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;),&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; None&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    }&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; else&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;        (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;None&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Some&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;put_input_in_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; inputs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;])))&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    };&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; Get the destination virtual register.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; get_output_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; outputs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;]).&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;only_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;().&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;unwrap&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;();&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; Copy the left hand side into the (modified) output operand, to satisfy the&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; mod constraint.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;emit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;mov_r_r&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;true&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; lhs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;));&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; If the shift count is statically known: nothing particular to do. Otherwise,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; we need to put it in the RCX register.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;if&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; count&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;is_none&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;() {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; w_rcx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Writable&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;from_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;regs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;rcx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;());&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Copy the shift count (which is in rhs) into RCX.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;emit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;mov_r_r&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt;true&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rhs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;unwrap&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(),&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; w_rcx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;));&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; Generate the actual shift instruction.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ctx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;emit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;shift_r&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;size&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt; ShiftKind&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ShiftLeft&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; count&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;));&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;And this is how we tell the register usage collector about our constraints:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ShiftR&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; num_bits&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;, .. } =&amp;gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    if&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; num_bits&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;is_none&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;() {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;        &#x2F;&#x2F; if the shift count is dynamic, mark RCX as used.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;        collector&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;add_use&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #0DB9D7;&quot;&gt;regs&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;rcx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;());&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; In all cases, the destination operand is modified.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    collector&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;add_mod&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(*&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Only the modified register needs to be mapped to its allocated physical register:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;ShiftR&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; { ref&lt;&#x2F;span&gt;&lt;span style=&quot;color: #9D7CD8;font-style: italic;&quot;&gt; mut&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;, .. } =&amp;gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;    map_mod&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;mapper&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; dst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;virtual-registers-copies-and-performance&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#virtual-registers-copies-and-performance&quot; aria-label=&quot;Anchor link for: virtual-registers-copies-and-performance&quot;&gt;🔗&lt;&#x2F;a&gt;Virtual register copies and performance&lt;&#x2F;h3&gt;
&lt;p&gt;Do these virtual register copies sound costly to you? In theory, they could lead to the generation of move instructions, increasing the size of the generated code and causing a small runtime cost. In practice,
register allocation, through its interface, knows how to identify move instructions, their sources and their destinations. By analyzing them, it can see when a source isn’t used after a given move instruction, and thus allocate the same register for the source and the destination of the move. Then, when Cranelift generates the code, it will avoid generating a move from a physical register to the same one&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-7-1&quot;&gt;&lt;a href=&quot;#fn-7&quot;&gt;6&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;. In other words, creating a VCode copy doesn’t necessarily mean that a machine code move instruction will be generated later: the copy is present just in case regalloc &lt;em&gt;needs&lt;&#x2F;em&gt; it, but it can be avoided when it’s spurious.&lt;&#x2F;p&gt;
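&lt;p&gt;To make that last step concrete, here is a minimal sketch on a toy instruction set. The &lt;code&gt;PReg&lt;&#x2F;code&gt; and &lt;code&gt;Inst&lt;&#x2F;code&gt; types and the &lt;code&gt;elide_redundant_moves&lt;&#x2F;code&gt; function are hypothetical stand-ins made up for this illustration, not the actual Cranelift or regalloc API, and the sketch ignores moves that carry extension semantics.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&#x2F;&#x2F; Hypothetical stand-in for a physical register (not the real Cranelift type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PReg(u8);

&#x2F;&#x2F; Hypothetical, heavily simplified instruction after register allocation.
#[derive(Debug, PartialEq)]
enum Inst {
    Mov { src: PReg, dst: PReg },
    Add { dst: PReg, rhs: PReg },
}

&#x2F;&#x2F; Drop every move whose source and destination got the same physical register.
fn elide_redundant_moves(insts: Vec&lt;Inst&gt;) -&gt; Vec&lt;Inst&gt; {
    insts
        .into_iter()
        .filter(|inst| !matches!(inst, Inst::Mov { src, dst } if src == dst))
        .collect()
}

fn main() {
    let (r0, r1) = (PReg(0), PReg(1));
    let allocated = vec![
        Inst::Mov { src: r0, dst: r0 }, &#x2F;&#x2F; spurious: same register on both sides
        Inst::Mov { src: r1, dst: r0 }, &#x2F;&#x2F; real move: must be kept
        Inst::Add { dst: r0, rhs: r1 },
    ];
    let emitted = elide_redundant_moves(allocated);
    assert_eq!(emitted.len(), 2);
    assert_eq!(emitted[0], Inst::Mov { src: r1, dst: r0 });
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The filter only looks at the registers chosen by the allocator, which is why the copy can stay in the VCode at no cost when it turns out to be unneeded.&lt;&#x2F;p&gt;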
&lt;h2 id=&quot;code-generation&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#code-generation&quot; aria-label=&quot;Anchor link for: code-generation&quot;&gt;🔗&lt;&#x2F;a&gt;Code generation&lt;&#x2F;h2&gt;
&lt;p&gt;Oh my, we’re getting closer to actually being able to run the code! Once register allocation has run, we can generate the actual machine code for the VCode instructions. Cool kids call this step of the pipeline &lt;em&gt;codegen&lt;&#x2F;em&gt;, for code generation. This is the part where we decipher the architecture manuals provided by the CPU vendors, and generate the raw machine bytes for our machine instructions. In Cranelift, this means filling a code buffer (there’s a &lt;code&gt;MachBuffer&lt;&#x2F;code&gt; sink interface for this!), returned along with some internal relocations&lt;sup class=&quot;footnote-reference&quot; id=&quot;fr-6-1&quot;&gt;&lt;a href=&quot;#fn-6&quot;&gt;7&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; and additional metadata. Let’s see what happens for our integer addition, when the time comes to generate the code for its VCode equivalent &lt;code&gt;AluRRR&lt;&#x2F;code&gt; on &lt;code&gt;aarch64&lt;&#x2F;code&gt; (in &lt;code&gt;cranelift&#x2F;codegen&#x2F;src&#x2F;isa&#x2F;aarch64&#x2F;inst&#x2F;emit.rs&lt;&#x2F;code&gt;):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;&#x2F;&#x2F; We match on the VCode&amp;#39;s identity here:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Inst&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;AluRRR&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; alu_op&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; } =&amp;gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; First select the top 11 bits based on the ALU subopcode.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; top11&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; match&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; alu_op&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;        ALUOp&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Add32&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 0b00001011_000&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;        ALUOp&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;::&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Add64&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 0b10001011_000&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;        &#x2F;&#x2F; etc&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    };&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Then decide the bits 10 to 15, based on the ALU subopcode as well.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;    let&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; bit15_10&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; match&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; alu_op&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;        &#x2F;&#x2F; other cases&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;        _&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; =&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 0b000000&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    };&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; Then use a helper and pass along the allocated physical&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #51597D;font-style: italic;&quot;&gt;    &#x2F;&#x2F; registers.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;    sink&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;put4&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;enc_arith_rrr&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;top11&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; bit15_10&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;));&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;And what’s this &lt;code&gt;enc_arith_rrr&lt;&#x2F;code&gt; doing, then?&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;fn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; enc_arith_rrr&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;bits_31_21&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; u32&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; bits_15_10&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; u32&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Writable&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;Reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;&amp;gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; Reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;) -&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt; u32&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;    (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;bits_31_21&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; &amp;lt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 21&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;        |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;bits_15_10&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; &amp;lt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 10&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;        |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt; machreg_to_gpr&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;rd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;to_reg&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;())&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;        |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;machreg_to_gpr&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;rn&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; &amp;lt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 5&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt;        |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt; (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #7AA2F7;&quot;&gt;machreg_to_gpr&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #C0CAF5;&quot;&gt;rm&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;span style=&quot;color: #BB9AF7;&quot;&gt; &amp;lt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FF9E64;&quot;&gt; 16&lt;&#x2F;span&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #89DDFF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Encoding the instruction parts (operands, register mentions) is a lot of bit twiddling and fun. We do so for each VCode instruction, until we’ve generated the whole function’s body. If you remember correctly, at this point register allocation may have added some spill, reload and move instructions. From the codegen’s point of view, these are just regular instructions with precomputed operands (either real registers, or memory operands involving the stack pointer), so they get no special treatment and are generated the same way as other VCode instructions.&lt;&#x2F;p&gt;
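&lt;p&gt;As a worked example of the bit twiddling, here is a small sketch that encodes &lt;code&gt;add x0, x1, x2&lt;&#x2F;code&gt; by hand. It assumes a simplified variant of the helper shown earlier: registers are passed as plain hardware numbers (0 to 31) rather than Cranelift register types, so the signature below is an illustration and not the real one.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&#x2F;&#x2F; Simplified stand-in for the helper shown earlier: registers are plain
&#x2F;&#x2F; hardware numbers (0 to 31) instead of Cranelift register types.
fn enc_arith_rrr(bits_31_21: u32, bits_15_10: u32, rd: u32, rn: u32, rm: u32) -&gt; u32 {
    (bits_31_21 &lt;&lt; 21) | (bits_15_10 &lt;&lt; 10) | rd | (rn &lt;&lt; 5) | (rm &lt;&lt; 16)
}

fn main() {
    &#x2F;&#x2F; 64-bit `add x0, x1, x2`: rd = 0, rn = 1, rm = 2.
    let add64 = enc_arith_rrr(0b10001011_000, 0b000000, 0, 1, 2);
    assert_eq!(add64, 0x8B02_0020);

    &#x2F;&#x2F; The 32-bit flavor `add w0, w1, w2` only differs by the top bit.
    let add32 = enc_arith_rrr(0b00001011_000, 0b000000, 0, 1, 2);
    assert_eq!(add32, 0x0B02_0020);

    &#x2F;&#x2F; A64 instruction words are stored little-endian in memory.
    assert_eq!(add64.to_le_bytes(), [0x20, 0x00, 0x02, 0x8B]);
    println!(&quot;{add64:#010x} {add32:#010x}&quot;);
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Each field lands in its own bit range of the 32-bit word (opcode bits at the top, &lt;code&gt;rm&lt;&#x2F;code&gt; at bit 16, &lt;code&gt;rn&lt;&#x2F;code&gt; at bit 5, &lt;code&gt;rd&lt;&#x2F;code&gt; at bit 0), so a bitwise OR is enough to assemble them.&lt;&#x2F;p&gt;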
&lt;p&gt;More work is then done by the codegen backend to optimize block placement, compute final branch offsets, etc. If you’re interested in this, I strongly encourage you to go read &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;cfallin.org&#x2F;blog&#x2F;2021&#x2F;01&#x2F;22&#x2F;cranelift-isel-2&#x2F;&quot;&gt;this blog post&lt;&#x2F;a&gt; by Chris Fallin. After this, we’re finally done: we’ve produced a code buffer, as well as external relocations (to other functions, memory addresses, etc.) for a single function. The code generator’s task is complete: the final steps consist of linking and, optionally, producing an executable binary.&lt;&#x2F;p&gt;
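&lt;p&gt;The branch offset computation can be pictured with a toy model of the “record now, patch later” idea. Everything below is an assumption made for illustration: the &lt;code&gt;Fixup&lt;&#x2F;code&gt; struct, the &lt;code&gt;patch_fixups&lt;&#x2F;code&gt; function and the made-up encoding (a little-endian 32-bit offset relative to the placeholder) are not what Cranelift does internally.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #A9B1D6; background-color: #1A1B26;&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&#x2F;&#x2F; Hypothetical fixup record: where a 4-byte placeholder lives in the code
&#x2F;&#x2F; buffer, and which label (block) the branch should reach.
struct Fixup {
    at: usize,
    label: usize,
}

&#x2F;&#x2F; Patch each placeholder using a made-up encoding: a little-endian i32
&#x2F;&#x2F; offset, relative to the start of the placeholder itself.
fn patch_fixups(buf: &amp;mut [u8], fixups: &amp;[Fixup], label_offsets: &amp;[usize]) {
    for fixup in fixups {
        let rel = label_offsets[fixup.label] as i64 - fixup.at as i64;
        let rel = i32::try_from(rel).expect(&quot;branch target out of range&quot;);
        buf[fixup.at..fixup.at + 4].copy_from_slice(&amp;rel.to_le_bytes());
    }
}

fn main() {
    &#x2F;&#x2F; 16 bytes of code; a branch placeholder sits at offset 4.
    let mut buf = vec![0u8; 16];
    let fixups = [Fixup { at: 4, label: 0 }];
    &#x2F;&#x2F; Once block placement is final, label 0 turns out to be at offset 12.
    let label_offsets = [12usize];
    patch_fixups(&amp;mut buf, &amp;fixups, &amp;label_offsets);
    assert_eq!(buf[4..8].to_vec(), vec![8u8, 0, 0, 0]);
}&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;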
&lt;h2 id=&quot;mission-accomplished&quot;&gt;&lt;a class=&quot;zola-anchor&quot; href=&quot;#mission-accomplished&quot; aria-label=&quot;Anchor link for: mission-accomplished&quot;&gt;🔗&lt;&#x2F;a&gt;Mission accomplished!&lt;&#x2F;h2&gt;
&lt;p&gt;So, we’re done for today! Thanks for reading this far, I hope it has been a useful and pleasant read for you! Feel free to reach out to me on the &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;twitter.com&#x2F;bnjbvr&quot;&gt;twitterz&lt;&#x2F;a&gt; if you have additional remarks&#x2F;questions, and to go contribute to &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;bytecodealliance&#x2F;wasmtime&quot;&gt;Wasmtime&#x2F;Cranelift&lt;&#x2F;a&gt; if this sort of thing is interesting to you 😇. Until next time, take care of yourselves!&lt;&#x2F;p&gt;
&lt;p&gt;Thanks to &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;cfallin.org&quot;&gt;Chris Fallin&lt;&#x2F;a&gt; for reading and suggesting improvements to this blog post.&lt;&#x2F;p&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol class=&quot;footnotes-list&quot;&gt;
&lt;li id=&quot;fn-2&quot;&gt;
&lt;p&gt;Really, Rust &lt;em&gt;is&lt;&#x2F;em&gt; the DSL. It was Python code before, which had the advantage of being faster to update. Yet it was doing a lot of magic behind the curtain, which wasn’t very friendly for new people trying to learn and use Cranelift. Even though a statically typed language helps exploration through tooling, this meta-language is set to partially disappear in the long run; see Chris’ &lt;a rel=&quot;noopener noreferrer external&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;cfallin.org&#x2F;blog&#x2F;2020&#x2F;09&#x2F;18&#x2F;cranelift-isel-1&#x2F;&quot;&gt;blog post&lt;&#x2F;a&gt; on this topic. &lt;a href=&quot;#fr-2-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-4&quot;&gt;
&lt;p&gt;Aarch64 connoisseurs may notice that there are other ways to encode an addition. Say, if one of the input operands was the result of a bit shift instruction by an immediate value, then it’s possible to &lt;em&gt;embed&lt;&#x2F;em&gt; the shift within the add, so we end up with fewer machine instructions (and lower the register pressure). This other possible encoding is sufficiently different in terms of register allocation and code generation that it justifies having its own VCode instruction. &lt;code&gt;AluRRR&lt;&#x2F;code&gt; is simpler in the sense that it’s only concerned with register inputs and outputs, thus a perfect example for this post. &lt;a href=&quot;#fr-4-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-3&quot;&gt;
&lt;p&gt;What’s an integer overflow for signed integer division? Consider a signed integer value represented on &lt;code&gt;N&lt;&#x2F;code&gt; bits. If you try to divide the smallest integer value &lt;code&gt;-2**(N-1)&lt;&#x2F;code&gt; by &lt;code&gt;-1&lt;&#x2F;code&gt;, it should return &lt;code&gt;2**(N-1)&lt;&#x2F;code&gt;, but this is out of range, since the biggest signed integer value we can represent on &lt;code&gt;N&lt;&#x2F;code&gt; bits is &lt;code&gt;2**(N-1) - 1&lt;&#x2F;code&gt;! So this will overflow and be set to &lt;code&gt;-2**(N-1)&lt;&#x2F;code&gt;, which is the initial value, but not the correct result. Good luck debugging this without a software trap! &lt;a href=&quot;#fr-3-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-5&quot;&gt;
&lt;p&gt;Register moves may be introduced because a successor block (in the control flow graph) expects a given virtual register to live in a particular real register, or because a particular instruction requires a virtual register to be allocated to a &lt;em&gt;fixed&lt;&#x2F;em&gt; real register that’s busy: regalloc can then temporarily divert the busy register into another unused register. &lt;a href=&quot;#fr-5-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-8&quot;&gt;
&lt;p&gt;The &lt;code&gt;c&lt;&#x2F;code&gt; in &lt;code&gt;rcx&lt;&#x2F;code&gt; actually stands for &lt;code&gt;count&lt;&#x2F;code&gt;; this is a property inherited from former CPU designs. &lt;a href=&quot;#fr-8-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-7&quot;&gt;
&lt;p&gt;Unless this move carries sign- or zero-extending semantics, which is the case for e.g. x86’s 32-bit &lt;code&gt;mov&lt;&#x2F;code&gt; instructions on a 64-bit architecture. &lt;a href=&quot;#fr-7-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li id=&quot;fn-6&quot;&gt;
&lt;p&gt;Relocations are placeholders for information we don’t have access to &lt;em&gt;yet&lt;&#x2F;em&gt;. For instance, when we’re generating jump instructions, the offsets of the jump targets are not determined yet. So we record where the jump instruction is in the code stream, as well as which control flow block it should jump into, so we can &lt;em&gt;patch it&lt;&#x2F;em&gt; later when the final offsets are known: that’s the content of our relocation. &lt;a href=&quot;#fr-6-1&quot;&gt;↩&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;&#x2F;section&gt;
</content>
    </entry>
</feed>

