DragonFFI

Foreign Function Interface and JIT for C code

https://github.com/aguinet/dragonffi

 

FOSDEM 2018 - Adrien Guinet (@adriengnt)
2018/02/04

Content of this talk

  • whoami
  • FFI? (and related work)
  • FFI for C with Clang/LLVM
  • Demo time
  • What's next

Whoami

Adrien Guinet (@adriengnt)

  • Quarkslab with the two previous guys
  • Working on an LLVM-based obfuscator

FFI?

Wikipedia: A foreign function interface (FFI) is a mechanism by which a program written in one programming language can call routines or make use of services written in another.

 

In our case: (compiling and) calling C functions from any language

 

Python code calling a C function

import pydffi
CU = pydffi.FFI().cdef("int puts(const char* s);")
CU.funcs.puts(b"hello world!")

            

What's the big deal?

C functions are usually called from "higher" level languages for performance...

  • ...but C functions are compiled for a specific ABI
  • There isn't *one* ABI: it depends on the system and the architecture
  • It's a huge mess!

=> We don't want to deal with it, we want a library that does this for us!

Related work

  • libffi: reference library, implements many existing ABIs and provides an interface to call a C function

    ffi_cif cif;
    ffi_type *args[] = {&ffi_type_pointer};
    const char *s = "Hello World!";
    void *values[] = {&s};
    ffi_arg rc;

    /* Describe int puts(const char*): one pointer argument, int return */
    ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1,
                 &ffi_type_sint, args);
    ffi_call(&cif, FFI_FN(puts), &rc, values);

  • cffi: uses libffi to provide this interface to Python, and uses pycparser to let the user define C functions/types easily
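
For comparison, Python's standard ctypes module is itself built on libffi, so the same call can be sketched without any C glue. This is only an illustration, not part of DragonFFI: it assumes a Unix-like system where `ctypes.CDLL(None)` exposes the C library's symbols. Note that the prototype has to be restated by hand, which is what a header-parsing FFI tries to avoid.

```python
import ctypes

# Assumption: on Linux/macOS, CDLL(None) gives access to the symbols already
# loaded in the current process, including the C library.
libc = ctypes.CDLL(None)

# Restate the prototype by hand: int puts(const char *s);
libc.puts.argtypes = [ctypes.c_char_p]
libc.puts.restype = ctypes.c_int

# libffi marshals the argument according to the platform ABI.
rc = libc.puts(b"Hello World!")
```

puts returns a non-negative value on success, so `rc` can be checked like any C return code.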

Why another one?

  • libffi: far from trivial to add a new ABI (hand-written assembly); the ms_abi calling convention isn't supported under Linux.
  • cffi: does not support many C constructs:
    
    cffi.FFI().cdef("#include ")
    CDefError: cannot parse "#include "
    :2: Directives not supported yet
    
    
    cffi.FFI().cdef("__attribute__((ms_abi)) int foo(int a, int b) { return a+b; }")
    CDefError: cannot parse "__attribute__((ms_abi)) int foo(int a, int b) { return a+b; }"
    :2:15: before: (
    
                
  • I want to be able to use my libraries' headers out of the box!

FFI for C with Clang/LLVM

Why Clang/LLVM?

 

  • Clang can parse C code: parse headers to gather definitions (types/functions/attributes...)
  • Clang supports many of these ABIs, and LLVM can compile the whole thing
  • So let's put all of this together \o/

But it wouldn't be that easy, right...? :)

Finding the right type abstraction

Let's take this C code:

typedef struct {
  short a;
  int b;
} A;

void print_A(A s) {
  printf("%d %d\n", s.a, s.b);
}

            

$ clang -Xclang -ast-dump a.c
-RecordDecl 0x[..]  line:2:9 struct definition
| |-FieldDecl 0x[..]  col:9 referenced a 'short'
| `-FieldDecl 0x[..]  col:7 referenced b 'int'

						
Too high level: no information about the layout of the structure! (padding?)
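
To make the padding question concrete, here is a minimal sketch that asks ctypes for the layout of the same structure on the host platform. It is an illustration only: the class `A` below is a hand-written mirror of the C typedef, and the numbers it reports depend on the host ABI.

```python
import ctypes

# Hand-written mirror of: typedef struct { short a; int b; } A;
class A(ctypes.Structure):
    _fields_ = [("a", ctypes.c_short), ("b", ctypes.c_int)]

# For each field, record (offset in bytes, size in bytes).
layout = {name: (getattr(A, name).offset, getattr(A, name).size)
          for name, _ in A._fields_}
total = ctypes.sizeof(A)
print(layout, total)
```

On common targets with a 2-byte short and a 4-byte-aligned int, two padding bytes sit between `a` and `b`. That information is missing from the AST dump above.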

typedef struct {
  short a;
  int b;
} A;

void print_A(A s) {
  printf("%d %d\n", s.a, s.b);
}

            

$ clang -S -emit-llvm -o - a.c
[..]
target triple = "x86_64-pc-linux-gnu"
@.str = private unnamed_addr constant [7 x i8] c"%d %d\0A\00", align 1

define void @print_A(i64) local_unnamed_addr #0 {
...
}

						
Too low level (it's LLVM, right?): no structure type is defined, because the structure is coerced to an i64 (ABI-specific!)
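
What "coerced to an i64" means can be sketched in a few lines: the 8 bytes of the structure are passed as a single 64-bit integer. The helper name `coerce_A_to_i64` is hypothetical. The sketch assumes a little-endian target and zeroes the two padding bytes for readability, whereas real padding bytes are unspecified.

```python
import struct

def coerce_A_to_i64(a: int, b: int) -> int:
    # Lay out {short a; int b} as on x86-64 Linux:
    # 2 bytes for a, 2 padding bytes, 4 bytes for b (little-endian).
    raw = struct.pack("<hxxi", a, b)
    assert len(raw) == 8
    # Reinterpret the same 8 bytes as one signed 64-bit integer.
    (as_i64,) = struct.unpack("<q", raw)
    return as_i64

value = coerce_A_to_i64(1, 2)
print(hex(value))
```

The field boundaries are gone from the function signature, which is why the IR alone cannot be used to recover the C type.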

typedef struct {
  short a;
  int b;
} A;

void print_A(A s) {
  printf("%d %d\n", s.a, s.b);
}

            

$ clang -S -emit-llvm -o - -m32 a.c
target triple = "i386-pc-linux-gnu"
%struct.A = type { i16, i32 }

@.str = private unnamed_addr constant [7 x i8] c"%d %d\0A\00", align 1

; Function Attrs: nounwind
define void @print_A(%struct.A* byval nocapture readonly align 4) local_unnamed_addr #0 {
...
}

DWARF debug information to the rescue!

typedef struct {
  short a;
  int b;
} A;

void print_A(A s) {
  printf("%d %d\n", s.a, s.b);
}

            

$ clang -S -emit-llvm -o - -m32 a.c -g
!11 = distinct !DICompositeType(tag: DW_TAG_structure_type, size: 64, elements: !12)
!12 = !{!13, !15}
!13 = !DIDerivedType(tag: DW_TAG_member, name: "a", baseType: !14, size: 16)
!14 = !DIBasicType(name: "short", size: 16, encoding: DW_ATE_signed)
!15 = !DIDerivedType(tag: DW_TAG_member, name: "b", baseType: !16, size: 32, offset: 32)
!16 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

            

DragonFFI type system

DWARF metadata are parsed to create DFFI types:

  • All basic C types (including non-standard ones like (u)int128_t)
  • Arrays, pointers
  • Structures, unions, enums (w/ field offsets)
  • Function types

Every type can be const-qualified!
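
As a toy illustration of this step, the sketch below recovers field names, base types, offsets and sizes from the textual metadata shown earlier. It is not how DragonFFI works internally (a real implementation would walk the metadata through LLVM's API rather than use regular expressions), and the helper names `parse_nodes` and `struct_fields` are hypothetical.

```python
import re

IR_METADATA = """
!11 = distinct !DICompositeType(tag: DW_TAG_structure_type, size: 64, elements: !12)
!12 = !{!13, !15}
!13 = !DIDerivedType(tag: DW_TAG_member, name: "a", baseType: !14, size: 16)
!14 = !DIBasicType(name: "short", size: 16, encoding: DW_ATE_signed)
!15 = !DIDerivedType(tag: DW_TAG_member, name: "b", baseType: !16, size: 32, offset: 32)
!16 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
"""

# Matches lines such as: !13 = !DIDerivedType(key: value, ...)
NODE = re.compile(r"^(!\d+) = (?:distinct )?!(\w+)\((.*)\)$")

def parse_nodes(text):
    """Map each metadata id to (node kind, dict of its key/value fields)."""
    nodes = {}
    for line in text.strip().splitlines():
        m = NODE.match(line.strip())
        if not m:
            continue  # e.g. the plain tuple node !12 = !{...}
        ident, kind, body = m.groups()
        fields = dict(part.split(": ", 1) for part in body.split(", "))
        nodes[ident] = (kind, fields)
    return nodes

def struct_fields(nodes):
    """Return (name, base type, byte offset, byte size) for each member."""
    out = []
    for kind, f in nodes.values():
        if kind == "DIDerivedType" and f.get("tag") == "DW_TAG_member":
            base = nodes[f["baseType"]][1]
            out.append((f["name"].strip('"'), base["name"].strip('"'),
                        int(f.get("offset", 0)) // 8, int(f["size"]) // 8))
    return out

fields = struct_fields(parse_nodes(IR_METADATA))
print(fields)
```

DWARF expresses sizes and offsets in bits, hence the division by 8. A member without an explicit offset sits at offset 0.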

Calling a C function

A DFFI function type is parsed to create a function call wrapper:

// For this function declaration
int puts(const char* s);

// We generate this wrapper
void __dffi_wrapper_0(int32_t ( __attribute__((cdecl)) *__FPtr)(char *),
  int32_t *__Ret,void** __Args) {
  *__Ret = (__FPtr)(*((char **)__Args[0]));
}

            
  • Clang handles all the ABI issues here!
  • Clang emits the associated LLVM IR, that can be jitted, and there we go!
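
The point of the wrapper is its uniform signature: whatever the wrapped function looks like, the caller only passes a function pointer, a pointer to the return slot, and an array of pointers to the arguments. The sketch below mimics that calling convention in Python with ctypes, purely as an illustration. It assumes a Unix-like system where `ctypes.CDLL(None)` exposes `puts`, and `dffi_wrapper_0` is a hand-written stand-in for the generated C wrapper.

```python
import ctypes

libc = ctypes.CDLL(None)  # assumption: Unix-like system

# int32_t (*)(char *): the function pointer type the wrapper expects
PUTS_T = ctypes.CFUNCTYPE(ctypes.c_int32, ctypes.c_char_p)

def dffi_wrapper_0(fptr, ret, args):
    """Python mirror of: *__Ret = (__FPtr)(*((char **)__Args[0]));"""
    # args[0] is the address of the storage holding the char* argument.
    arg0 = ctypes.cast(args[0], ctypes.POINTER(ctypes.c_char_p))[0]
    ret[0] = fptr(arg0)

# The caller stores each argument somewhere and passes pointers to them.
s = ctypes.c_char_p(b"hello world!")
args = (ctypes.c_void_p * 1)(ctypes.addressof(s))
ret = ctypes.c_int32(-1)

dffi_wrapper_0(PUTS_T(("puts", libc)), ctypes.pointer(ret), args)
```

Because every wrapper shares this type-erased signature, the language binding needs a single way to invoke any of them.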

Issues with Clang

Main one

Unused declarations aren't emitted by Clang
(even with -g -femit-all-decls)

typedef struct { 
  short a;
  int b;
} A;

void print_A(A s);

$ clang -S -emit-llvm -g -femit-all-decls -o - a.c |grep print_A |wc -l
0

            

Demo time!

 

What's next

  • Reducing binary size: pydffi.cpython-36m-x86_64-linux-gnu.so is 55 MB
  • Force Clang to emit type definitions
  • JIT and optimize the full glue from Python/Ruby/... to the C function call (easy::jit?)
  • OSX/Windows support
  • Support parsing debug information directly from shared libraries!

Thanks for your attention!

 

https://github.com/aguinet/dragonffi

 

pip install pydffi
For Linux x86/64 users only

 

Twitter: @adriengnt
Mail: adrien@guinet.me