Running MMseqs2 on CPUs with SSE4.1 or AVX2 instruction sets
Hi! I'm Ryan Moore, NBA fan & PhD candidate in Eric Wommack's viral ecology lab @ UD. Follow me on Twitter!
MMseqs2 is a software suite for searching and clustering giant protein and nucleotide datasets. It is very fast while still being very sensitive, and if you are using BLAST for homology search, I definitely recommend giving MMseqs2 a try!
Single instruction, multiple data (SIMD)
MMseqs2 requires a 64-bit system with either SSE4.1 or AVX2 instruction sets to run. SSE (Streaming SIMD Extensions) and AVX (Advanced Vector Extensions) are instruction sets that let a single CPU instruction operate on several data values at once. This is data level parallelism, as opposed to instruction level parallelism, where the CPU executes several different instructions simultaneously. CPUs with these instruction sets allow a program to perform the same computation on multiple pieces of data at the same time.
Picking the correct version of MMseqs2
Basically, SSE4.1 and AVX2 allow MMseqs2 to do multiple data operations simultaneously, speeding up the program. Not every CPU has SSE4.1 or AVX2, so to use MMseqs2, you will need to determine which of these instruction sets, if either, your computer supports. Then you can download and use the correct pre-compiled release for that instruction set. The AVX2 build is generally faster than the SSE4.1 build, so you will want to use the AVX2 version of MMseqs2 if you can.
If you are running MMseqs2 locally, this is no problem. You can check whether your computer supports AVX2 or SSE4.1 and then pick the correct MMseqs2 version. However, I generally run MMseqs2 on a computing cluster. The cluster that our lab uses has a lot of nodes with various architectures. Some nodes have SSE4.1, some have AVX2, and some have neither. Because of this, a different version of MMseqs2 is needed depending on where the job is run.
The cluster uses Slurm for job scheduling, which means that I could figure out which nodes have AVX2 and which have SSE4.1 and then specify the allowed nodes in my Slurm submission script (for example, using --nodelist=node_name) whenever I want to run MMseqs2. I got tired of doing this pretty quickly, so I thought it would be nice to write a little shell script to automatically pick the correct version of MMseqs2 depending on the instruction set of the CPU. It turns out that this is pretty easy to do. Let’s walk through it!
Note: Like my last two posts, I’m going to go into a fair bit of detail so that it is accessible to beginners. If you just want to see the final result, skip down to the bottom.
Checking for instruction sets
First, we need a way to figure out which instruction sets are supported by the computer that will run MMseqs2. The check is a bit different depending on whether you are running on Linux or on a Mac. Here is how you would check for the instruction sets on Linux:
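A minimal sketch of the Linux check, assuming CPU flags are listed in /proc/cpuinfo. The show_flag helper and its optional file argument are my own additions so the check can be pointed at another file; the plain command is shown in the comment.

```shell
# Print the lines of a cpuinfo-style file that mention a CPU flag.
#   $1 = flag to look for (avx2 or sse4_1)
#   $2 = file to search; defaults to /proc/cpuinfo (this override is an assumption)
# With the default file this is the same as typing: grep avx2 /proc/cpuinfo
show_flag() {
    grep "$1" "${2:-/proc/cpuinfo}"
}
```

Running show_flag avx2 or show_flag sse4_1 prints the matching lines if the flag is present and prints nothing if it is not.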
And on a Mac…
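A rough sketch of the Mac check, assuming an Intel Mac where sysctl reports CPU features under machdep.cpu.features and machdep.cpu.leaf7_features (Apple Silicon Macs do not expose these keys). The mac_show_feature name is hypothetical.

```shell
# Print the sysctl feature line that mentions the given feature name
# (for example SSE4.1 or AVX2); print nothing if it is not listed.
# Assumes the Intel-Mac sysctl keys named below exist.
mac_show_feature() {
    sysctl -n machdep.cpu.features machdep.cpu.leaf7_features 2>/dev/null |
        grep -iF "$1"
}
```

For example, mac_show_feature SSE4.1 checks for SSE4.1 and mac_show_feature AVX2 checks for AVX2.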
(Here is some more info on interpreting the
If your computer has either of the instruction sets, then the grep command will print a line to the terminal. If not, nothing will be shown. For example, on my Mac, when I check for SSE4.1, grep prints a line with SSE4.1 in it, so I know that my laptop supports SSE4.1. On the other hand, when I run the command to check for AVX2, nothing is printed, so my laptop doesn’t support AVX2. Since my computer has SSE4.1 but not AVX2, I would go and grab a precompiled SSE4.1 binary for Mac from the MMseqs2 GitHub page and use that.
A wrapper script for MMseqs2
As I mentioned above, it’s simple to pick an MMseqs2 binary for my own computer, but our computing cluster has some nodes with AVX2 and some with SSE4.1. To deal with this, we will write a little shell script that determines which instruction set a computer has, and then runs the correct version of MMseqs2 for that instruction set.
First, download both the SSE4.1 and the AVX2 versions of MMseqs2 from the website. Then, put each of those in its own location. For example, I have the AVX2 binary here
and the SSE4.1 binary here
The first thing our script needs is the Shebang line, which specifies the interpreter for our script, and some variables to hold the locations of the different MMseqs2 binaries.
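As a sketch, the top of the script could look like the following. Both paths are placeholders of my own, and the name mmseqs_sse41 is likewise an assumption; substitute wherever you unpacked each release.

```shell
#!/bin/bash

# Locations of the two MMseqs2 binaries. These paths are placeholders,
# not real install locations: point them at your own copies.
mmseqs_avx2="$HOME/software/mmseqs2-avx2/bin/mmseqs"
mmseqs_sse41="$HOME/software/mmseqs2-sse41/bin/mmseqs"
```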
Checking for AVX2 instructions
Since we want to use the AVX2 version of MMseqs2 if possible, we will check for that instruction set first.
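A small sketch of the silent AVX2 check. In the wrapper itself this is the single line shown in the comment; wrapping it in a has_avx2 function with an optional file argument is my own addition.

```shell
# Succeed (exit code 0) if avx2 is listed, without printing anything.
#   $1 = file to search; defaults to /proc/cpuinfo (this override is an assumption)
# As a plain line in the script: grep avx2 /proc/cpuinfo > /dev/null
has_avx2() {
    grep avx2 "${1:-/proc/cpuinfo}" > /dev/null
}
```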
That line is the same command to check for AVX2 on Linux shown above, but with the output redirected to /dev/null. /dev/null is the null device, a special file that discards anything written to it. If we didn’t redirect the grep program’s output to /dev/null, then every time we ran our script, it would print a long line of CPU flags to the terminal, which we don’t want. So redirecting the output in this way will keep our script’s output nice and clean.
Now we need to check whether the command was successful. For this, we can check the exit code of the grep command. If you check the man page for grep, the Exit Status section states that the exit status is 0 if there were any selected lines, and something else otherwise (1 if no lines were selected, and greater than 1 if an error occurred). So, if we check for a zero exit code from the grep command, we will find out whether or not the computer has the AVX2 instructions. Here is an example. (The $ means whatever follows is a command entered into the terminal.)
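To see both outcomes on a single machine, this sketch runs grep against two made-up flags lines saved in temporary files, standing in for /proc/cpuinfo on a computer without AVX2 and on one with it. The file contents and variable names are illustrative only.

```shell
# Two stand-in "flags" lines, one without avx2 and one with it.
no_avx2=$(mktemp)
with_avx2=$(mktemp)
echo "flags : fpu sse4_1 sse4_2" > "$no_avx2"
echo "flags : fpu sse4_1 sse4_2 avx avx2" > "$with_avx2"

# First example: no line matches, so grep exits with 1.
grep avx2 "$no_avx2" > /dev/null
first_status=$?
echo "$first_status"

# Second example: a line matches, so grep exits with 0.
grep avx2 "$with_avx2" > /dev/null
second_status=$?
echo "$second_status"

rm -f "$no_avx2" "$with_avx2"
```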
The $? variable holds the exit status of the last command that was run (in this case, that would be grep). In the first example, grep returns 1, indicating failure (this computer doesn’t have the AVX2 instruction set), whereas in the second case, it returns 0, indicating success (this computer does have AVX2 instructions).
Alright, so let’s add a check like this into our bash script. For this, we can use an if statement. In bash, it will look something like this.
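A sketch of that if statement. The echo messages are placeholders for the real work added later, and the report_avx2 function wrapper with its optional file argument is my own addition.

```shell
# Report whether a cpuinfo-style file lists avx2.
#   $1 = file to search; defaults to /proc/cpuinfo (this override is an assumption)
report_avx2() {
    grep avx2 "${1:-/proc/cpuinfo}" > /dev/null

    # $? still holds grep's exit code here because nothing ran in between.
    if [ $? -eq 0 ]; then
        echo "AVX2 found"
    else
        echo "AVX2 not found"
    fi
}
```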
The [ $? -eq 0 ] part (don’t forget the spaces inside the brackets!) checks whether the exit code of the last command was zero (i.e., whether it was successful). If the first grep command is successful, then the computer has AVX2 instructions, so we want to run the AVX2 version of MMseqs2. Here is how that would look.
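One way this step might look, as a sketch. The path is a placeholder, and the cpuinfo variable and run_if_avx2 function are my own additions so the logic can be exercised without a real MMseqs2 install.

```shell
# Placeholder path; point it at your own AVX2 build of MMseqs2.
mmseqs_avx2="$HOME/software/mmseqs2-avx2/bin/mmseqs"
# File holding the CPU flags on Linux.
cpuinfo=/proc/cpuinfo

run_if_avx2() {
    grep avx2 "$cpuinfo" > /dev/null

    if [ $? -eq 0 ]; then
        # Hand every argument we received to the AVX2 binary. The quotes
        # around $@ keep arguments that contain spaces in one piece.
        "$mmseqs_avx2" "$@"
    fi
}
```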
The "$@" variable expands to all of the command line arguments given to our script, each as its own word, so arguments that contain spaces stay intact. That way, any arguments passed in to our wrapper script will get passed in to the actual mmseqs program. For example, if we name our wrapper script mmseqs_wrapper.sh and call it with some arguments, the "$mmseqs_avx2" "$@" line will basically be like calling the AVX2 binary directly with those same arguments.
Checking for SSE4.1 instructions
If the computer doesn’t have AVX2, we want to check to see if it has SSE4.1. We can use the command grep sse4_1 /proc/cpuinfo > /dev/null for that. We add it to the else branch like this:
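A sketch of the else branch at this stage. To keep it short, an echo stands in for running the AVX2 binary, and the check_simd function and cpuinfo variable are my own additions.

```shell
# File holding the CPU flags on Linux.
cpuinfo=/proc/cpuinfo

check_simd() {
    grep avx2 "$cpuinfo" > /dev/null

    if [ $? -eq 0 ]; then
        # Stand-in for running the AVX2 binary.
        echo "avx2"
    else
        # No AVX2, so look for SSE4.1 next. Its exit code is not acted
        # on yet; that is the next step.
        grep sse4_1 "$cpuinfo" > /dev/null
    fi
}
```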
Now we can check the exit code of that command, and if successful, run the SSE4.1 version of MMseqs2. To do that, we will add another if/else statement similar to the first one. (There may be a cleaner way to do this avoiding the nested if statements, but we can keep it like this for now.)
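A sketch of the nested version. Both paths are placeholders, and the mmseqs_sse41 name, the cpuinfo variable, and the run_mmseqs function are my own additions. The innermost else branch is left empty for now.

```shell
# Placeholder paths; point them at your own copies of each release.
mmseqs_avx2="$HOME/software/mmseqs2-avx2/bin/mmseqs"
mmseqs_sse41="$HOME/software/mmseqs2-sse41/bin/mmseqs"
# File holding the CPU flags on Linux.
cpuinfo=/proc/cpuinfo

run_mmseqs() {
    grep avx2 "$cpuinfo" > /dev/null

    if [ $? -eq 0 ]; then
        "$mmseqs_avx2" "$@"
    else
        grep sse4_1 "$cpuinfo" > /dev/null

        if [ $? -eq 0 ]; then
            "$mmseqs_sse41" "$@"
        else
            # Neither instruction set was found; nothing is done here yet.
            :
        fi
    fi
}
```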
Handling CPUs without AVX2 or SSE4.1
Finally, we need to do something in the last else branch. If the computer has neither AVX2 nor SSE4.1, then MMseqs2 will not work. So let’s print a message to stderr and return a failing exit code. Since 0 is usually used to denote success, we will use 1 to indicate failure.
In bash, to print something to stderr, we can use the >&2 operator with the echo command like this: >&2 echo "hello". Check out this article for more info on redirection and the difference between
To exit our script with a failing exit code, we can use the exit command with an argument. For example, exit 1 would terminate the script and return an exit code of 1.
Adding those two things to our if/else statement, we now have
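A sketch of the inner if/else with the failure branch filled in. The wording of the error message, the placeholder path, and the sse41_or_fail function are my own choices.

```shell
# Placeholder path; point it at your own SSE4.1 build of MMseqs2.
mmseqs_sse41="$HOME/software/mmseqs2-sse41/bin/mmseqs"
# File holding the CPU flags on Linux.
cpuinfo=/proc/cpuinfo

sse41_or_fail() {
    grep sse4_1 "$cpuinfo" > /dev/null

    if [ $? -eq 0 ]; then
        "$mmseqs_sse41" "$@"
    else
        # Send the message to stderr, then stop with a failing exit code.
        >&2 echo "ERROR: this CPU supports neither AVX2 nor SSE4.1"
        exit 1
    fi
}
```

Note that exit ends the whole script, which is what we want here: there is nothing left to try once both checks have failed.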
Alright that’s everything! The final script looks like this:
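Putting the pieces together, a complete wrapper might look like the sketch below. It is my reconstruction of the steps above, not a verbatim copy of any published script: the paths are placeholders, and the run_mmseqs function, the cpuinfo variable, and the argument-count guard at the bottom are my own additions.

```shell
#!/bin/bash

# Placeholder paths; point them at your own copies of each release.
mmseqs_avx2="$HOME/software/mmseqs2-avx2/bin/mmseqs"
mmseqs_sse41="$HOME/software/mmseqs2-sse41/bin/mmseqs"
# File holding the CPU flags on Linux.
cpuinfo=/proc/cpuinfo

run_mmseqs() {
    # Prefer AVX2 when the CPU supports it.
    grep avx2 "$cpuinfo" > /dev/null

    if [ $? -eq 0 ]; then
        "$mmseqs_avx2" "$@"
    else
        # Otherwise fall back to SSE4.1.
        grep sse4_1 "$cpuinfo" > /dev/null

        if [ $? -eq 0 ]; then
            "$mmseqs_sse41" "$@"
        else
            >&2 echo "ERROR: this CPU supports neither AVX2 nor SSE4.1"
            exit 1
        fi
    fi
}

# Forward the arguments when any were given. The guard is an assumption
# of this sketch so the file can be sourced safely; drop it if you also
# want calls with no arguments to reach mmseqs.
if [ "$#" -gt 0 ]; then
    run_mmseqs "$@"
fi
```

Saved as mmseqs_wrapper.sh and made executable, it would be called the same way as mmseqs itself, with all arguments passed straight through.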
If you want to download and use the above code, here is a link to the code.
If you enjoyed this post, consider sharing it on Twitter and subscribing to the RSS feed! If you have questions or comments, you can find me on Twitter or send me an email directly.