Tutorial for Keystone

This short tutorial shows how the Keystone API works. There are more APIs than those used here, but this is all we need to get started.

1. Tutorial for C language

The following sample code shows how to assemble 32-bit X86 instructions in C.

  /* test1.c */
  #include <stdio.h>
  #include <keystone/keystone.h>
  
  // separate assembly instructions by ; or \n
  #define CODE "INC ecx; DEC edx"
  
  int main(int argc, char **argv)
  {
      ks_engine *ks;
      ks_err err;
      size_t count;
      unsigned char *encode;
      size_t size;
  
      err = ks_open(KS_ARCH_X86, KS_MODE_32, &ks);
      if (err != KS_ERR_OK) {
          printf("ERROR: failed on ks_open(), quit\n");
          return -1;
      }
  
      if (ks_asm(ks, CODE, 0, &encode, &size, &count) != KS_ERR_OK) {
          printf("ERROR: ks_asm() failed & count = %zu, error = %u\n",
                 count, ks_errno(ks));
      } else {
          size_t i;
  
          printf("%s = ", CODE);
          for (i = 0; i < size; i++) {
              printf("%02x ", encode[i]);
          }
          printf("\n");
          printf("Compiled: %zu bytes, statements: %zu\n", size, count);
      }
  
      // NOTE: free encode after usage to avoid leaking memory
      ks_free(encode);
  
      // close Keystone instance when done
      ks_close(ks);
  
      return 0;
  }


To build this file, we can use a Makefile like the one below.

KEYSTONE_LDFLAGS = -lkeystone -lstdc++ -lm

all:
	${CC} -o test1 test1.c ${KEYSTONE_LDFLAGS}

clean:
	rm -rf *.o test1


Readers can get this sample code in a zip file here. Compile and run it as follows.

$ make
cc -o test1 test1.c -lkeystone -lstdc++ -lm

$ ./test1
INC ecx; DEC edx = 41 4a 
Compiled: 2 bytes, statements: 2


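The C sample is straightforward. In brief: ks_open() creates an engine handle for the chosen architecture (KS_ARCH_X86) and mode (KS_MODE_32). ks_asm() assembles the string in CODE, taking the address of the first instruction as its third argument (0 here), and returns the encoding, its size in bytes, and the number of statements processed. If it fails, ks_errno() reports the error code. When done, ks_free() releases the encoding and ks_close() closes the engine.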


By default, Keystone accepts X86 assembly in Intel syntax. Keystone has an API named ks_option to customize its engine at run-time. Before running ks_asm, we can switch to X86 AT&T syntax by calling ks_option like below.

    ks_option(ks, KS_OPT_SYNTAX, KS_OPT_SYNTAX_ATT);


Sample code test2.c demonstrates X86 AT&T support.


2. Tutorial for Python language

The following code repeats the example above in Python: it assembles the same 32-bit X86 instructions.

  from keystone import *

  # separate assembly instructions by ; or \n
  CODE = b"INC ecx; DEC edx"

  try:
      # Initialize engine in X86-32bit mode
      ks = Ks(KS_ARCH_X86, KS_MODE_32)
      # asm() returns the encoding (list of byte values) and the statement count
      encoding, count = ks.asm(CODE)
      # decode CODE so it prints without the b'...' prefix on Python 3
      print("%s = %s (number of statements: %u)" % (CODE.decode(), encoding, count))
  except KsError as e:
      print("ERROR: %s" % e)


Readers can get this sample code here. Run it with Python as follows.

$ ./test1.py
INC ecx; DEC edx = [65, 74] (number of statements: 2)
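
The Python sample prints the encoding as a list of decimal byte values, while the C sample prints hexadecimal. The sketch below is a small helper (the name to_hex is hypothetical, not part of Keystone) showing that [65, 74] is the same two bytes as the 41 4a printed by test1. It needs only the standard library.

```python
def to_hex(encoding):
    """Format a sequence of byte values as space-separated two-digit hex."""
    return " ".join("%02x" % b for b in encoding)

# The encoding printed by the Python sample above
encoding = [65, 74]
print(to_hex(encoding))  # prints: 41 4a
```

The same helper can be applied directly to the first value returned by ks.asm().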


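This Python sample is straightforward. In brief: Ks(KS_ARCH_X86, KS_MODE_32) creates an engine for 32-bit X86. ks.asm(CODE) assembles the string and returns the encoding as a list of byte values together with the number of statements processed. Any failure raises KsError, which the except clause reports.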


By default, Keystone accepts X86 assembly in Intel syntax. To handle X86 AT&T syntax, set the syntax attribute of the engine as shown below.

   ks = Ks(KS_ARCH_X86, KS_MODE_32)
   ks.syntax = KS_OPT_SYNTAX_ATT


Sample code test2.py demonstrates X86 AT&T support.
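
test2.py itself is not reproduced here. As a rough sketch of what such a script might look like, the code below combines the two lines above with the earlier sample. The AT&T input string and the helper name assemble_att are assumptions made for illustration, and the import is guarded only so that the script reports a missing binding instead of crashing.

```python
try:
    from keystone import Ks, KS_ARCH_X86, KS_MODE_32, KS_OPT_SYNTAX_ATT
    HAVE_KEYSTONE = True
except ImportError:
    # Keystone Python bindings are not installed
    HAVE_KEYSTONE = False

# Assumed AT&T spelling of the tutorial's "INC ecx; DEC edx"
CODE_ATT = b"inc %ecx; dec %edx"


def assemble_att(code):
    """Assemble 32-bit X86 code written in AT&T syntax.

    Returns the (encoding, count) pair from ks.asm(), or None when the
    keystone bindings are unavailable. (Hypothetical helper.)
    """
    if not HAVE_KEYSTONE:
        return None
    ks = Ks(KS_ARCH_X86, KS_MODE_32)
    ks.syntax = KS_OPT_SYNTAX_ATT  # switch from the default Intel syntax
    return ks.asm(code)


result = assemble_att(CODE_ATT)
if result is None:
    print("keystone Python bindings are not installed")
else:
    encoding, count = result
    print("%s = %s (number of statements: %u)"
          % (CODE_ATT.decode(), encoding, count))
```

Note that the syntax must be set before calling asm(), since it changes how the input string is parsed.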


3. More examples

This tutorial does not yet cover the entire Keystone API.