Keystone – The Ultimate Assembler

Keystone is a lightweight multi-platform, multi-architecture assembler framework.

Key features:

Multi-architecture, with support for Arm, Arm64 (AArch64/Armv8), Ethereum Virtual Machine, Hexagon, Mips, PowerPC, Sparc, SystemZ, & X86 (including 16/32/64-bit).
Clean/simple/lightweight/intuitive architecture-neutral API.
Implemented in C/C++, with bindings for Java, Masm, Visual Basic, C#, PowerShell, Perl, Python, NodeJS, Ruby, Go, Rust, Haskell & OCaml available.
Native support for Windows & *nix (with Mac OSX, Linux, *BSD & Solaris confirmed).
Thread-safe by design.
Open source.

Keystone is based on LLVM, but it goes much further with a lot more to offer.

See these Black Hat USA 2016 slides for more technical details behind our assembler engine.

Version 0.9.2

June 21, 2020

We are thrilled to announce a stable release, version 0.9.2, of Keystone Engine!

Full source code & precompiled binaries are available in the download section.
A Github repo for Keystone is ready at https://github.com/keystone-engine/keystone.
See documentation for how to compile and install Keystone.
Learn quickly from this tutorial how to program with Keystone in C & Python.

This version fixes some important bugs inside the core of Keystone, adds some new bindings & makes some minor improvements. All users of Keystone are encouraged to upgrade to v0.9.2.

NOTE: Keystone is now available on PyPI in the keystone-engine package. Python 3 users can easily install Keystone with:

pip install keystone-engine

To upgrade from an older version of Keystone, run:

pip install keystone-engine --upgrade

Remember to put “sudo” in front if root privileges are needed.

Note that our Python binding also supports Python 2.

Keypatch 2.1

January 17, 2017

We are very excited to release Keypatch 2.1, the award-winning assembler for IDA Pro!

New features provided by this version include:

Added a new “Search” function to search for assembly instructions, so it is easy to grep for ROP gadgets in the binary. This will be helpful for exploit writers.
Removed the “Assembler” function, which is now redundant since you can also do that with the “Search” function above.
Better documentation for Linux & Windows installs.

Get the full list of new features & the source code of Keypatch at keystone-engine.org/keypatch

A quick tutorial for Keypatch is available.

Programming Tutorial with C & Python

October 2, 2016

This short tutorial shows how the Keystone API works. There are more APIs than those used here, but this is all we need to get started.

1. Tutorial for C language

The following sample code shows how to compile 32-bit X86 assembly instructions in C.

  /* test1.c */
  #include <stdio.h>
  #include <keystone/keystone.h>
  
  // separate assembly instructions by ; or \n
  #define CODE "INC ecx; DEC edx"
  
  int main(int argc, char **argv)
  {
      ks_engine *ks;
      ks_err err;
      size_t count;
      unsigned char *encode;
      size_t size;
  
      err = ks_open(KS_ARCH_X86, KS_MODE_32, &ks);
      if (err != KS_ERR_OK) {
          printf("ERROR: failed on ks_open(), quit\n");
          return -1;
      }
  
      if (ks_asm(ks, CODE, 0, &encode, &size, &count) != KS_ERR_OK) {
          printf("ERROR: ks_asm() failed & count = %zu, error = %u\n",
                 count, ks_errno(ks));
      } else {
          size_t i;
  
          printf("%s = ", CODE);
          for (i = 0; i < size; i++) {
              printf("%02x ", encode[i]);
          }
          printf("\n");
          printf("Compiled: %zu bytes, statements: %zu\n", size, count);
      }
  
      // NOTE: free encode after usage to avoid leaking memory
      ks_free(encode);
  
      // close Keystone instance when done
      ks_close(ks);
  
      return 0;
  }

To compile this file, we need a Makefile like the one below.

KEYSTONE_LDFLAGS = -lkeystone -lstdc++ -lm

all:
	${CC} -o test1 test1.c ${KEYSTONE_LDFLAGS}

clean:
	rm -rf *.o test1

Readers can get this sample code in a zip file here. Compile and run it as follows.

$ make
cc -o test1 test1.c -lkeystone -lstdc++ -lm

$ ./test1
INC ecx; DEC edx = 41 4a 
Compiled: 2 bytes, statements: 2
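
For readers curious where the two bytes in the output above come from: in 32-bit X86, INC and DEC on a 32-bit register have one-byte short forms, built as a base opcode (0x40 for INC, 0x48 for DEC) plus the register number. The sketch below only illustrates this arithmetic in plain Python. It is not how Keystone works internally, and the names REG32, BASE and encode_short are ours, not part of any Keystone API.

```python
# Register numbers used in 32-bit X86 one-byte opcodes
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# Base opcodes of the short forms. These are not valid in 64-bit mode,
# where the bytes 0x40-0x4f are REX prefixes.
BASE = {"inc": 0x40, "dec": 0x48}

def encode_short(mnemonic, reg):
    """Encode INC/DEC r32 as base opcode + register number."""
    return BASE[mnemonic.lower()] + REG32[reg.lower()]

code = [encode_short("INC", "ecx"), encode_short("DEC", "edx")]
print(" ".join("%02x" % b for b in code))  # prints: 41 4a
```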

The C sample is intuitive, but just in case, readers can find below the explanation for each line of test1.c.

Line 3: Include header file keystone.h before we do anything.
Line 6: Assembly string we want to compile. The code in this sample is X86 32bit, in Intel format. You can either separate assembly instructions in this string by “;” or “\n”.
Line 10: Declare a handle variable, a pointer to the data type ks_engine. This handle will be used for every API of Keystone.
Line 11: Declare an error variable of the data type ks_err. This variable is used to check the result returned by the API calls.
Line 12: Declare a variable to hold the number of statements this program compiles (line 22).
Line 13: Declare encode, a pointer variable of the type unsigned char, which points to an array containing the encoding of compiled instructions.
Line 14: Declare size, a variable to hold the size (in bytes) of the array that encode points to.
Line 16 ~ 20: Initialize Keystone with function ks_open. This API accepts 3 arguments: the hardware architecture, the hardware mode and a pointer to the Keystone handle. In this sample, we want to assemble 32-bit code for X86 architecture. In return, we have the handle updated in variable ks. This API can fail in extreme cases, so our sample verifies the returned result against the error code KS_ERR_OK.
Line 22: Compile the input assembly string using the API ks_asm with the handle we got from ks_open. The 2nd argument of ks_asm() is the assembly string we want to compile. The 3rd argument is the address of the first instruction, which can be ignored in some architectures such as X86. In return, this API gives back dynamically allocated memory in the next argument encode, as well as its size in size. Keystone also lets us know how many statements in the input assembly were handled during this process, which gives us a hint about where it stopped in case the input has an error.
Line 25 ~ 34: Print out the instruction encoding of the input assembly, returned in the memory array that encode points to.
Line 37: Use the API ks_free() to free the memory that encode points to, which was allocated by ks_asm().
Line 40: Close the handle when we are done with the API ks_close().

By default, Keystone accepts X86 assembly in Intel syntax. Keystone has an API named ks_option to customize its engine at run-time. Before running ks_asm, we can switch to X86 AT&T syntax by calling ks_option like below.

    ks_option(ks, KS_OPT_SYNTAX, KS_OPT_SYNTAX_ATT);

Sample code test2.c demonstrates X86 AT&T support.

2. Tutorial for Python language

The following code presents the same example as above, but in Python, to compile 32-bit assembly code of X86.

 from keystone import *

 # separate assembly instructions by ; or \n
 CODE = b"INC ecx; DEC edx"
 
 try:
   # Initialize engine in X86-32bit mode
   ks = Ks(KS_ARCH_X86, KS_MODE_32)
   encoding, count = ks.asm(CODE)
   print("%s = %s (number of statements: %u)" %(CODE, encoding, count))
 except KsError as e:
   print("ERROR: %s" %e)

Readers can get this sample code here. Run it with Python as follows.

$ ./test1.py
INC ecx; DEC edx = [65, 74] (number of statements: 2)
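
Note that the Python binding returns the encoding as a list of integers, so [65, 74] is the same two bytes that the C sample prints as 41 4a. A minimal sketch of that conversion, using only the Python standard library (the helper name to_hex is ours, not part of the Keystone binding):

```python
def to_hex(encoding):
    """Format a list of byte values (0-255) as space-separated hex pairs."""
    return " ".join("%02x" % b for b in encoding)

# [65, 74] is the list printed by the Python sample
print(to_hex([65, 74]))  # prints: 41 4a
```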

This Python sample is intuitive, but just in case, readers can find below the explanation for each line of test1.py.

Line 1: Import Keystone module before using it.
Line 4: Assembly string we want to compile. The code in this sample is X86 32bit, in Intel format. You can either separate assembly instructions in this string by “;” or “\n”.
Line 8: Initialize Keystone with class Ks. This class accepts 2 arguments: the hardware architecture and the hardware mode. This sample deals with 32-bit code for X86 architecture. In return, we have an instance of this class in ks.
Line 9: Compile the assembly instructions using method asm. In return, we have a list of encoding bytes, and the number of input statements that Keystone handled during compilation, which gives us a hint about where it stopped in case the input has an error.
Line 10: Print out the instruction encoding and number of assembly statements processed.
Line 11 ~ 12: Handle exceptions of the type KsError, which are raised when something goes wrong.

By default, Keystone accepts X86 assembly in Intel syntax. To handle X86 AT&T syntax, we can simply switch the syntax like below.

   ks = Ks(KS_ARCH_X86, KS_MODE_32)
   ks.syntax = KS_OPT_SYNTAX_ATT

Sample code test2.py demonstrates X86 AT&T support.

3. More examples

This tutorial does not yet cover the whole Keystone API.

For C samples, see the code in directory samples/ in the Keystone source.
For Python samples, see the code in directory bindings/python/ in the Keystone source.

Keypatch 2.0

September 14, 2016

We are very excited to release Keypatch 2.0, a better assembler for IDA Pro!

Following are the new features provided by this version.

Fix some issues with ARM architecture (including Thumb mode)
Better support for Python 2.6 & older IDA versions (confirmed to work on IDA 6.4)
Save original instructions (before patching) in IDA comments.
NOP padding also works when new instruction is longer than original instruction.
You can fill a range of selected code via a new function “Fill Range”
It is now possible to “undo” (revert) the last modification.
All the functions are now available via a popup menu (right-mouse click)

Get the full list of new features & the source code of Keypatch at keystone-engine.org/keypatch

A quick tutorial for Keypatch is available.

Introduction to Keypatch

August 2, 2016

Update: our Black Hat USA 2016 talk is over. Keypatch is now available at http://keystone-engine.org/keypatch.

1. Problem of the built-in IDA assembler

IDA Pro is the de-facto binary analysis tool widely used in the security community. While browsing the assembly code in IDA, we may want to modify the original code to change the behavior of the executable file. IDA offers this functionality in its menu “Edit \ Patch program \ Assemble”, in which we can type in new assembly to overwrite the existing code, as in the screenshot below.

However, this built-in assembler suffers from several significant issues, as follows.

It does not support any architecture other than X86. Due to this, when we open the menu on an ARM binary, IDA refuses with the message “Sorry, this processor module doesn’t support the assembler”.

Even on X86, IDA assembler fails to handle many simple X86_64 instructions. For example, the instruction “PUSH RAX” is refused with error “Invalid operand”.

We anticipated that IDA assembler misses all the latest X86 instructions (such as those from SGX extension), but actually it also fails on many not-so-modern X86 instructions. For example, AVX instruction “VDIVSS XMM2, XMM6, XMM4” is (wrongly) considered illegal with error “Invalid mnemonic”.

X86 assembler seems quite buggy, with many minor issues here and there. Example: if you enter invalid code “PUSH ESI” on an X86_64 binary, IDA assembler would happily accept that, but then overwrite the existing code with one byte “56”, which is actually for “PUSH RSI”.

If the new patched code is shorter than the original code, the orphan bytes after the new code are kept intact, which is mostly undesired. In the example below, 3 original bytes “48 89 FB” (for “MOV RBX, RDI”) are overwritten with 2 bytes “31 C0” (for “XOR EAX, EAX”). The orphan byte “FB” is still there, and is decoded as the instruction “STI”. Due to this, we need to perform one more step to patch this left-over byte with a “NOP” opcode. Unfortunately, IDA does not clear the orphan code.

IDA assembler does not log any changes, making it hard to track what code was modified and where. We would have to keep notes on what we patched, which is cumbersome.

2. Keypatch

Unfortunately, there was no solution for all the above problems of the IDA assembler. We decided to accept the challenge, and build a new assembler plugin for IDA named Keypatch to solve all the existing issues.

Our tool offers some nice features as follows.

Keypatch leverages the power of Keystone assembler engine, so it can support 8 CPUs: X86, ARM, ARM64, Hexagon, Mips, PowerPC, Sparc & SystemZ. On each architecture, Keystone is able to handle the latest CPU instruction sets.
Our GUI makes it much easier to see what you would do: it shows the original code (before modifying), and new code that will patch your binary.

We have an option to automatically pad all the orphan bytes with NOP opcode.
Keypatch can understand & accept IDA symbols, so you can conveniently use them in assembly code, without having to convert them to immediates beforehand.
We make it easier to track what code was modified and where by logging all the changes in the “Output” window of IDA, with content like:

...
Keypatch: attempt to modify "mov rbx, rdi" at 0x166 to "xor eax, eax"
Keypatch: patched 3 byte(s) at 0x166 from [48 89 FB] to [31 C0 90]

Keypatch has another function in its own menu “Edit \ Keypatch \ Assembler”, in which you can experimentally assemble arbitrary code for any architecture supported by Keystone. This convenient tool does not modify the original binary under analysis, so it can be an extra weapon in the reversing process.

Last but not least, Keypatch is open source, so it is easy to fix bugs & add more features.
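
The NOP padding seen in the log above (3 bytes “48 89 FB” replaced by “31 C0 90”) can be sketched in a few lines of Python. This is only an illustration of the idea for X86, where NOP is the single byte 0x90. It is not Keypatch's actual code, and the function name pad_with_nops is hypothetical.

```python
X86_NOP = 0x90  # single-byte NOP opcode on X86

def pad_with_nops(original, patch):
    """Return `patch` extended with NOPs to the length of `original`.

    Raises ValueError when the patch is longer than the original code,
    because this sketch does not handle overwriting the next instructions.
    """
    if len(patch) > len(original):
        raise ValueError("patch is longer than the original code")
    return bytes(patch) + bytes([X86_NOP]) * (len(original) - len(patch))

original = bytes.fromhex("4889FB")  # mov rbx, rdi
patch = bytes.fromhex("31C0")       # xor eax, eax
padded = pad_with_nops(original, patch)
print(" ".join("%02X" % b for b in padded))  # prints: 31 C0 90
```

Padding to the original length keeps every following instruction at its original address, so no orphan byte is left to be decoded as a stray instruction.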

To summarize, Keypatch has everything needed to replace the internal IDA assembler because it can do more, and do it better. We believe that this little IDA plugin will be indispensable in your reverse engineering toolset.

Version 0.9.1

July 27, 2016

We are thrilled to announce a stable release, version 0.9.1, of Keystone Engine!

Full source code & precompiled binaries are available in the download section.
A Github repo for Keystone is ready at https://github.com/keystone-engine/keystone.
See documentation for how to compile and install Keystone.
Learn quickly from this tutorial how to program with Keystone in C & Python.

This version fixes some important bugs inside the core of Keystone (especially the X86 assembler), adds some new bindings & makes some minor improvements, without breaking compatibility. All users of Keystone are encouraged to upgrade to v0.9.1.

NOTE: Keystone is now available on PyPI in the keystone-engine package. This package includes the core, so CMake is required to build the shared library. Python users can then easily install Keystone with:

$ sudo pip install keystone-engine

See below for the changelog.

Core & tool

Fix a segfault in kstool (on missing assembly input).
kstool now allows specifying the instruction address.
Build Mac libraries in universal format by default.
Add “lib32” option to cross-compile to 32-bit *nix (on 64-bit system).
Add “lib_only” option to only build libraries (skip kstool).
New bindings: Haskell & OCaml.

X86

Fix instructions: LJMP, LCALL, CDQE, SHR, SHL, SAR, SAL, LOOP, LOOPE, LOOPNE.
Better handling of many tricky inputs that were previously caught by assert().
Better support for Nasm syntax.

Arm

Fix BLX instruction.

Python binding

Better Python3 support.
Expose @stat_count in KsError class when ks_asm() returns with error. See sample code in bindings/python/sample_asm_count.py

Go binding

Fix Go binding for 32-bit.

First public release!

May 31, 2016

We are very excited to announce the first public release, version 0.9, of Keystone Engine!

Full source code & precompiled binaries are available in the download section.
A Github repo for Keystone is ready at https://github.com/keystone-engine/keystone.
See documentation for how to compile and install Keystone.
Learn quickly from this tutorial how to program with Keystone in C & Python.

We would like to show our gratitude to all the Indiegogo supporters, who financially contributed to the development of Keystone. We will never forget all the testers for their incredible bug reports & code contributions during the beta phase! Without the invaluable help of the community, our project would not have gone this far!

Keystone aims to lay the groundwork for innovative work. We look forward to seeing much advanced research & development in the security area built on this engine. Let the fun begin!

Beta test

April 30, 2016

We are very excited to announce that we have already released the Keystone source code to some early adopters! Together, we will work hard to find and fix as many bugs as possible before making it public later.

We would like to thank those who are willing to put valuable time & effort into helping us in this phase! Believe us, the code is in good hands right now :-)

Thank you!

April 24, 2016

This post is a tribute to all 99 Indiegogo contributors, who helped us achieve the Keystone fundraising goal, and thus effectively made this project possible!

In particular, we would like to express our deep gratitude to the 4 project sponsors, who made big contributions to our project!

Mike Guidry
Synacktiv Digital Security
Tim “diff” Strazzere
Veris Group

Also find below our heroes (excluding anonymous contributors & listed in no particular order).

Sascha Schirra
Le Viet Cuong
Lôi Anh Tuấn
Duncan Ogilvie
Bruce Dang
Blah Cat
Michael Guidry
Joaquim Espinhara
Miao Yu
Rextency
Jaideep Jha
Peter.Dohm
Edgar Barbosa
Robert Yates
@elvanderb
Derek Morris
Phillip Moore
Matt Graeber
Son Tran
Spl3en
Mike A.
Sébastien Duquette
Bill Armstrong
Carlos.G.Prado
Pawel Wylecial
Michal Malik
davkaplan
Yannick F.
SYNACKTIV
Ward Wouts
Michael Eisendle
Oliv’
Ha Que Anh
dnet
Alex Tereshkin
Daniel Collin
Jason Jones
Matteo Favaro
Gionne Cannister
Markus Vervier (X41 D-Sec GmbH)
cstone
@JusticeRage
Vitaly Osipov
Alexander Hanel
Philip Da Silva
Dan Caselden
Edward Marczak
An Bach
Tina Wuest
@ChaosDatumz
Antonio Parata
Greg Lindor
Alex Bender
Jaime Peñalba
Long Le
Antonio Bianchi
Francisco Alonso
Anton Kochkov
Snare
Aditya K Sood
Richö Butts
Brent Dukes
cji
Blackwing
Firmware.RE
me
Colin Newell
Daniel Tomlinson
Robert Grimshaw
Nick Freeman
Postmodern Modulus III
Orta Therox
Fabio Pagani
William Sandin
butter
Haroon Meer (Thinkst)
Remco Verhoef
Sune Marcher
Matthew Daniel
Peter Fillmore
Sébastien Chapuis
Rascagneres Paul
Stefan Koehler
Grant Willcox
Benedikt Schmotzle

IndieGogo campaign closed

April 13, 2016

Our Indiegogo fundraising for the Keystone project was successful with 165% funded! Thanks a lot to all 99 awesome backers, you are the motivation and the very reason why Keystone sees the light of day, and will become available to the public soon!

Since our stretch goal was also met, we will have support for GNU Gas & Nasm syntax in the first release of Keystone.

Here is our quick plan:

We will ship stickers to all the backers from level-32 and up starting next week.
We are still collecting T-shirt requests at the moment. We will order T-shirt printing after that, then post to all the backers from level-128 and up. We hope to start shipping next week.
We are cleaning up code and fixing some issues. All the backers from level-512 and up will get the source code in about 2 weeks.
We will send the source code to all the backers of level-256 sometime in May.
If we can fix all the major issues, the first release of Keystone to the public will be out in May or June.

More updates about our project will be shared soon.

IndieGogo stretch goal

March 30, 2016

We have passed the initial IndieGogo funding goal of $10000 in just 1 week! Thanks a lot to everybody who believed in this project and supported us, you are awesome!

With about 10 more days to go, we decided to set a stretch goal of $15000 to support more types of assembly syntax.

Our motivation is that Keystone is based on LLVM, which only supports LLVM syntax. If we only compile simple instructions, we will be fine as we are. But if the code has directives, macros, comments and so on, then the assembly syntax matters because each assembler has a different way to express its language.

If the stretch goal of $15000 is reached, the first public version of Keystone will support GNU Gas & Nasm syntaxes, while leaving the support for other assemblers open through a plugin interface.

Things we must do to support Gas & Nasm for this stretch goal:

Investigate the syntaxes of these assemblers.
Refactor the assembly parser of Keystone to support external syntaxes.
Design a plugin interface for external assembly syntaxes, so that it is easy to add more syntaxes in the future.
Implement Gas & Nasm support, and allow choosing a non-default syntax at run-time.

When ready, we can enable these syntaxes when setting up the engine, like below.

ks_engine *ks;
ks_open(KS_ARCH_X86, KS_MODE_32, &ks);
ks_option(ks, KS_OPT_SYNTAX, KS_OPT_SYNTAX_NASM);

With Python, this can be done simply like below.

ks = Ks(KS_ARCH_X86, KS_MODE_32)
ks.syntax = KS_OPT_SYNTAX_NASM

We think the option of freely choosing the assembly syntax is important. Please help spread the news of this stretch goal, and do back us so we will finally have a nice, full-featured assembler when Keystone is released!

Indiegogo campaign!

March 17, 2016

We are very excited to launch the crowd-funding campaign for Keystone assembler engine on IndieGogo!

A multi-architecture, multi-platform open source assembler framework is a missing piece in a chain of fundamental engines for reverse engineering. After Capstone & Unicorn, Keystone is the latest in our ongoing effort to bring better tools to the security community.

Keystone involves a lot of hard work, however. Therefore, we hope to have community support via this IndieGogo campaign, so we can push this to the end goal, and all of us can finally have a nice assembler engine!

Get behind the Keystone project, so together we can solve the problem of the missing assembler framework once and for all: https://igg.me/at/keystone/.

Update: if you do not want to use IndieGogo, we accept donations via Paypal & Bitcoin. All the perks from IndieGogo still apply.

Paypal address: keystone.engine -at- gmail.com.
Bitcoin: 1fGz2GYSjiJxUoACpsHXcGmaAhbEDTuWi (link)

Find below how we will share updates on the project.

Keystone now has a mailing list. Subscribe to the list for updates & discussion.
Users are also encouraged to follow us on Twitter for important announcements.
Email keystone.engine -at- gmail.com if you are too shy for public discussion.