How to start hacking on Valgrind

Presenter Notes

Easy hacks for valgrind

Mark Wielaard

First show next slide, so people can fetch sources, then talk.

Is it really easy?

No. Experience. Recognizing what has been done before.

Contains lies by omission. Looks easy, till you try yourself.

Better be called "cleanup stuff we could use some help with".

Presenter Notes

Get and build the sources

svn co svn://svn.valgrind.org/valgrind/trunk valgrind
cd valgrind
./autogen.sh
./configure --prefix=/opt/local/install
make -j4

And lets build and run the testsuite

make check -j4
make regtest

Presenter Notes

Easy Hack

(Not really)

Figure out why there are failing tests.

Hard because depends on:

  • architecture/instruction set
  • kernel version
  • glibc version
  • random stuff

But you learn a lot and there are setups where we get zero fail.

== 709 tests, 0 stderr failures, 0 stdout failures, 0 stderrB failures,
   0 stdoutB failures, 0 post failures ==

x86_64 Xeon E5-1620 Linux 3.10.0 glibc-2.17

Presenter Notes

Help is all around you. Please ask!

  • There will be a hack session after this talk.
  • There is an irc channel #valgrind-dev on irc.freenode.net
  • Mailinglist valgrind-developers@lists.sourceforge.net
  • bugzilla: https://bugs.kde.org/buglist.cgi?product=valgrind
    • Please, please, please file issues for everything.

Presenter Notes

Get your git

You will need a git repro (or mercurial) to easily find commits that fixed similar issues.

Valgrind uses svn submodules (VEX inside valgrind):

git svn clone svn://svn.valgrind.org/valgrind/trunk valgrind
cd valgrind
git svn clone svn://svn.valgrind.org/vex VEX

The above takes forever, don't do it right now!

Make sure to update them together:

 git svn fetch \
   && git svn rebase \
   && cd VEX \
   && git svn fetch \
   && git svn rebase \
   && cd ..

Presenter Notes

Easy Hack

(More like a full blown infrastructure project)

Create the canonical git mirror, git stitch valgrind and VEX together, convince project to adopt it to maintain sources.

Presenter Notes

true

$ ./vg-in-place /bin/true
==8633== Memcheck, a memory error detector
==8633== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==8633== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright
==8633== Command: /bin/true
==8633== 
==8633== 
==8633== HEAP SUMMARY:
==8633==     in use at exit: 0 bytes in 0 blocks
==8633==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==8633== 
==8633== All heap blocks were freed -- no leaks are possible
==8633== 
==8633== For counts of detected and suppressed errors, rerun with: -v
==8633== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

Perfect! Or...

Presenter Notes

I am being suppressed!

For counts of detected and suppressed errors, rerun with: -v
ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

$ ./vg-in-place -v /bin/true
[...]
--9318-- used_suppression:
2 dl-hack3-cond-1 /home/mark/src/valgrind/./.in_place/default.supp:1206

What is in the default suppressions?

{
   dl-hack3-cond-1
   Memcheck:Cond
   obj:*/lib*/ld-2.17*.so*
   obj:*/lib*/ld-2.17*.so*
   obj:*/lib*/ld-2.17*.so*
}

Presenter Notes

Easy Hack

(A history project)

Clean up the default suppressions

Configure will tell you which ones are used:

Default supp files: exp-sgcheck.supp xfree-3.supp xfree-4.supp
glibc-2.X-drd.supp glibc-2.34567-NPTL-helgrind.supp glibc-2.X.supp

Some comments in those files:

# Errors to suppress by default with XFree86 4.1.0)

# *** And a bunch of other stuff which is completely unrelated
# to X.  The default suppressions are a bit of a mess and could do
# with a good tidying up.

# Resulting from R H 8.0
# MontaVista Linux 4.0.1 on ppc32
# Ubuntu 10.04 on ARM (Thumb).  Not sure why this is necessary.

http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress

Presenter Notes

What is going on?

$ ./vg-in-place --default-suppressions=no /bin/true
==19444== 
==19444== Conditional jump or move depends on uninitialised value(s)
==19444==    at 0x401856B: index (strchr.S:58)
==19444==    by 0x40079D3: expand_dynamic_string_token (dl-load.c:429)
==19444==    by 0x40084F5: _dl_map_object (dl-load.c:2311)
==19444==    by 0x400160D: map_doit (rtld.c:624)
==19444==    by 0x400F313: _dl_catch_error (dl-error.c:177)
==19444==    by 0x4000E8D: do_preload (rtld.c:813)
==19444==    by 0x4004296: dl_main (rtld.c:1629)
==19444==    by 0x4016244: _dl_sysdep_start (dl-sysdep.c:244)
==19444==    by 0x4004DA3: _dl_start_final (rtld.c:329)
==19444==    by 0x4004DA3: _dl_start (rtld.c:555)
==19444==    by 0x4001427: ??? (in /usr/lib64/ld-2.17.so)

index (strchr) is an implementation using sse instructions in /lib64/ld-linux-x86-64.so.2 -> ld-2.17.so.

Presenter Notes

Why does this happen?

  • Real bug in program, bad use of unadressable or undefined memory.
  • Bug in valgrind/VEX instruction emulation or memcheck.

But most likely, see shared/vg_replace_strmem.c

We have our own versions of these functions for two reasons:
(a) it allows us to do overlap checking
(b) some of the normal versions are hyper-optimised, which fools
    Memcheck and cause spurious value warnings.  Our versions are
    simpler.
(c) the glibc SSE-variants can read past the end of the input data
    ranges. This can cause false-positive Memcheck / Helgrind / DRD
    reports.

Note that sometimes that last point may be worked around with --partial-loads-ok=yes. That allows "large" naturally aligned loads with some 'not addressabele' bytes.

Presenter Notes

Easy Hack

(but not in this case)

Provide more vg_preload functions. Check different arches.

Presenter Notes

What does vg_preload do?

shared/vg_replace_strmem.c:

#define STRCHR(soname, fnname) \
   char* VG_REPLACE_FUNCTION_EZU(20020,soname,fnname) ( const char* s, int c ); \
   char* VG_REPLACE_FUNCTION_EZU(20020,soname,fnname) ( const char* s, int c ) \
   { \
      HChar  ch = (HChar)c ; \
      const HChar* p  = s;   \
      while (True) { \
         if (*p == ch) return (HChar *)p; \
         if (*p == 0) return NULL; \
         p++; \
      } \
   }

// Apparently index() is the same thing as strchr()
#if defined(VGO_linux)
 STRCHR(VG_Z_LIBC_SONAME,          strchr)
 STRCHR(VG_Z_LIBC_SONAME,          __GI_strchr)
 STRCHR(VG_Z_LIBC_SONAME,          __strchr_sse2)
 STRCHR(VG_Z_LIBC_SONAME,          __strchr_sse2_no_bsf)
 STRCHR(VG_Z_LIBC_SONAME,          index)
# if !defined(VGP_x86_linux)
  STRCHR(VG_Z_LD_LINUX_SO_2,        strchr)
  STRCHR(VG_Z_LD_LINUX_SO_2,        index)
  STRCHR(VG_Z_LD_LINUX_X86_64_SO_2, strchr)
  STRCHR(VG_Z_LD_LINUX_X86_64_SO_2, index)
# endif

Presenter Notes

But why doesn't it work?

Hint. Look at the backtrace:

==19444== Conditional jump or move depends on uninitialised value(s)
==19444==    at 0x401856B: index (strchr.S:58)
==19444==    by 0x40079D3: expand_dynamic_string_token (dl-load.c:429)
==19444==    by 0x40084F5: _dl_map_object (dl-load.c:2311)
==19444==    by 0x400160D: map_doit (rtld.c:624)
==19444==    by 0x400F313: _dl_catch_error (dl-error.c:177)
==19444==    by 0x4000E8D: do_preload (rtld.c:813)
==19444==    by 0x4004296: dl_main (rtld.c:1629)
==19444==    by 0x4016244: _dl_sysdep_start (dl-sysdep.c:244)
==19444==    by 0x4004DA3: _dl_start_final (rtld.c:329)
==19444==    by 0x4004DA3: _dl_start (rtld.c:555)
==19444==    by 0x4001427: ??? (in /usr/lib64/ld-2.17.so)

Drat, it is doing the actual preloading...

Presenter Notes

Why don't we need this for x86?

Now your git repo comes in handy... valgrind r11992.

x86-linux: don't add redirections for strchr/index in ld.so since they
are already hardwiredly-redirected at startup, and so these are
redundant.

See coregrind/m_trampoline.S

/* ------------------ SIMULATED CPU HELPERS ------------------ */

Presenter Notes

What do we do for x86?

coregrind/m_trampoline.S:

/* There's no particular reason that this needs to be handwritten
   assembly, but since that's what this file contains, here's a
   simple index implementation (written in C and compiled by gcc.)

   unsigned char* REDIR_FOR_index ( const char* s, int c )
   {
      unsigned char  ch = (unsigned char)((unsigned int)c); 
      unsigned char* p  = (unsigned char*)s; 
      while (1) {
         if (*p == ch) return p;
         if (*p == 0)  return 0;
         p++; 
      }
   }
*/
.global VG_(x86_linux_REDIR_FOR_index)
.type   VG_(x86_linux_REDIR_FOR_index), @function
VG_(x86_linux_REDIR_FOR_index):
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        movzbl  12(%ebp), %ecx
        movzbl  (%eax), %edx
        cmpb    %dl, %cl
        jne     .L9
        jmp     .L2
.L11:
        addl    $1, %eax
        movzbl  (%eax), %edx
        cmpb    %dl, %cl
        je      .L2
.L9:
        testb   %dl, %dl
        jne     .L11
        xorl    %eax, %eax
.L2:
        popl    %ebp
        ret
.size VG_(x86_linux_REDIR_FOR_index), .-VG_(x86_linux_REDIR_FOR_index)

Presenter Notes

Lets patch amd64 to do the same

diff --git a/coregrind/m_redir.c b/coregrind/m_redir.c
index d87110b..6e5f1f3 100644
--- a/coregrind/m_redir.c
+++ b/coregrind/m_redir.c
@@ -1275,6 +1275,9 @@ void VG_(redir_initialise) ( void )
    if (0==VG_(strcmp)("Memcheck", VG_(details).name)) {

       add_hardwired_spec(
 +         "ld-linux-x86-64.so.2", "index",
 +         (Addr)&VG_(amd64_linux_REDIR_FOR_index), NULL);
 +      add_hardwired_spec(
           "ld-linux-x86-64.so.2", "strlen",
           (Addr)&VG_(amd64_linux_REDIR_FOR_strlen),
  #        ifndef GLIBC_MANDATORY_STRLEN_REDIRECT
diff --git a/coregrind/m_trampoline.S b/coregrind/m_trampoline.S
index 3d2be09..a3ff66c 100644
--- a/coregrind/m_trampoline.S
+++ b/coregrind/m_trampoline.S
@@ -220,6 +220,30 @@ VG_(amd64_linux_REDIR_FOR_strlen):
.LfnE5:
.size VG_(amd64_linux_REDIR_FOR_strlen), .-VG_(amd64_linux_REDIR_FOR_strlen)

+.global VG_(amd64_linux_REDIR_FOR_index)
+.type   VG_(amd64_linux_REDIR_FOR_index), @function
+VG_(amd64_linux_REDIR_FOR_index):
+        movzbl  (%rdi), %eax
+        movl    %esi, %edx
+        cmpb    %sil, %al
+        jne     .L4
+        jmp     .L5
+.L10:
+        addq    $1, %rdi
+        movzbl  (%rdi), %eax
+        cmpb    %dl, %al
+        je      .L5
+.L4:
+        testb   %al, %al
+        jne     .L10
+        xorl    %eax, %eax
+        ret
+.L5:
+        movq    %rdi, %rax
+        ret
+.size VG_(amd64_linux_REDIR_FOR_index), .-VG_(amd64_linux_REDIR_FOR_strlen)
diff --git a/coregrind/pub_core_trampoline.h b/coregrind/pub_core_trampoline.h
index 411a23c..eba3798 100644
--- a/coregrind/pub_core_trampoline.h
+++ b/coregrind/pub_core_trampoline.h
@@ -71,6 +71,7 @@ extern Addr VG_(amd64_linux_REDIR_FOR_vgettimeofday);
 extern Addr VG_(amd64_linux_REDIR_FOR_vtime);
 extern Addr VG_(amd64_linux_REDIR_FOR_vgetcpu);
 extern UInt VG_(amd64_linux_REDIR_FOR_strlen)( void* );
+extern Char* VG_(amd64_linux_REDIR_FOR_index) ( const Char*, Int );
 #endif

 #if defined(VGP_ppc32_linux)

Presenter Notes

Victory!

$ ./vg-in-place --default-suppressions=no /bin/true
==16308== Memcheck, a memory error detector
==16308== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==16308== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright
==16308== Command: /bin/true
==16308==
==16308==
==16308== HEAP SUMMARY:
==16308==     in use at exit: 0 bytes in 0 blocks
==16308==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==16308==
==16308== All heap blocks were freed -- no leaks are possible
==16308==
==16308== For counts of detected and suppressed errors, rerun with: -v
==16308== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Presenter Notes

Easy Hack

Write normal C functions to be used with add_hardwired_spec.

There really is nothing special about them and we now have multiple architectures wanting to hardwire the same implementation. Having to copy/paste the output of gcc -S is somewhat silly.

Presenter Notes

Is there are drawback to hardwiring?

Yes

It is less flexible than the LD_PRELOAD mechanism and it needs the symbol table of the file to replace a function in. We used to avoid having hardwires because of that. But these days we need some hardwires anyway (see patch above, there is already strlen).

Some distros however strip everything, even the symbol table of ld.so. Unpleasant user experience

Presenter Notes

Easy Hack

Convince distros (Debian) of the importance of README_PACKAGERS:

Do not ship your Linux distro with a completely stripped /lib/ld.so. At least leave the debugging symbol names on -- line number info isn't necessary. If you don't want to leave symbols on ld.so, alternatively you can have your distro install ld.so's debuginfo package by default, or make ld.so.debuginfo be a requirement of your Valgrind RPM/DEB/whatever.

Reason for this is that Valgrind's Memcheck tool needs to intercept calls to, and provide replacements for, some symbols in ld.so at startup (most importantly strlen). If it cannot do that, Memcheck shows a large number of false positives due to the highly optimised strlen (etc) routines in ld.so. This has caused some trouble in the past. As of version 3.3.0, on some targets (ppc32-linux, ppc64-linux), Memcheck will simply stop at startup (and print an error message) if such symbols are not present, because it is infeasible to continue.

It's not like this is going to cost you much space. We only need the symbols for ld.so (a few K at most). Not the debug info and not any debuginfo or extra symbols for any other libraries.

Presenter Notes

Easy Hack

Find and implement missing syscalls (or ioctls).

README_MISSING_SYSCALL_OR_IOCTL is a really good tutorial.

man syscalls has a list of syscalls per kernel version. http://kernelnewbies.org/LinuxVersions background on new syscalls.

Nice example is memfd_create added in valgrind svn r14875. http://bugs.kde.org/343012

Presenter Notes

syscall bonus for some arches

Some arches have diverged and just need to catch up the coregrind/m_syswrap/syswrap-<arch>-linux.c table.

Bug 340922 arm64: unhandled getgroups/setgroups syscalls.

--- a/coregrind/m_syswrap/syswrap-arm64-linux.c
+++ b/coregrind/m_syswrap/syswrap-arm64-linux.c
@@ -968,6 +968,8 @@ static SyscallTableEntry syscall_main_table[] = {
    GENX_(__NR_getpgid,           sys_getpgid),           // 155
    GENX_(__NR_getsid,            sys_getsid),            // 156
    GENX_(__NR_setsid,            sys_setsid),            // 157
 +  GENXY(__NR_getgroups,         sys_getgroups),         // 158
 +  GENX_(__NR_setgroups,         sys_setgroups),         // 159
    GENXY(__NR_uname,             sys_newuname),          // 160
    GENXY(__NR_getrlimit,         sys_old_getrlimit),     // 163
    GENX_(__NR_setrlimit,         sys_setrlimit),         // 164
@@ -1237,8 +1239,6 @@ static SyscallTableEntry syscall_main_table[] = {
  //ZZ    GENX_(__NR_setreuid32,        sys_setreuid),       // 203
  //ZZ    GENX_(__NR_setregid32,        sys_setregid),       // 204
  //ZZ
 -//ZZ    GENXY(__NR_getgroups32,       sys_getgroups),      // 205
 -//ZZ    GENX_(__NR_setgroups32,       sys_setgroups),      // 206
  //ZZ    LINX_(__NR_setresuid32,       sys_setresuid),      // 208
  //ZZ    LINXY(__NR_getresuid32,       sys_getresuid),      // 209

Presenter Notes

Easy Hack

Find and fix bad syscall wrappers.

LTP. Linux Test Project http://linux-test-project.github.io/

The LTP testsuite contains a collection of tools for testing the Linux kernel and related features. Our goal is to improve the Linux kernel and system libraries by bringing test automation to the testing effort.

git clone git://github.com/linux-test-project/ltp
cd ltp
make autotools
./configure
make -j4

Presenter Notes

A crash is easy to find...

$ valgrind ./testcases/kernel/syscalls/getsockname/getsockname01
--32037-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--32037-- si_code=1;  Faulting address: 0x1;  sp: 0x802c99ce0

valgrind: the 'impossible' happened:
   Killed by fatal signal

host stacktrace:
==32037==    at 0x380FEB16: deref_UInt (syswrap-generic.c:1145)
==32037==    by 0x380FEB16: vgModuleLocal_buf_and_len_pre_check (syswrap-generic.c:1152)
==32037==    by 0x380FC750: vgPlain_client_syscall (syswrap-main.c:1586)
==32037==    by 0x380F902A: handle_syscall (scheduler.c:1103)
==32037==    by 0x380FA706: vgPlain_scheduler (scheduler.c:1416)
==32037==    by 0x38109F90: thread_wrapper (syswrap-linux.c:103)
==32037==    by 0x38109F90: run_a_thread_NORETURN (syswrap-linux.c:156)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==32037==    at 0x4F36577: getsockname (syscall-template.S:81)
==32037==    by 0x401406: main (getsockname01.c:124)

Presenter Notes

getsockname

int getsockname(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

The addrlen argument should be initialized to indicate the amount
of space (in bytes) pointed to by addr.  On return it contains the
actual size of the socket address.

coregrind/m_syswrap/syswrap-generic.c:

static UInt deref_UInt ( ThreadId tid, Addr a, const HChar* s )
{
   UInt* a_p = (UInt*)a;
   PRE_MEM_READ( s, (Addr)a_p, sizeof(UInt) );
   if (a_p == NULL)
      return 0;
   else
      return *a_p;
}

void ML_(buf_and_len_pre_check) ( ThreadId tid, Addr buf_p, Addr buflen_p,
                                  const HChar* buf_s, const HChar* buflen_s )
{
   if (VG_(tdict).track_pre_mem_write) {
      UInt buflen_in = deref_UInt( tid, buflen_p, buflen_s);
      if (buflen_in > 0) {
         VG_(tdict).track_pre_mem_write(
            Vg_CoreSysCall, tid, buf_s, buf_p, buflen_in );
      }
   }
}

Presenter Notes

Patch

--- a/coregrind/m_syswrap/syswrap-generic.c
+++ b/coregrind/m_syswrap/syswrap-generic.c
@@ -1138,7 +1138,7 @@ static UInt deref_UInt ( ThreadId tid, Addr a, const HChar* s )
 {
    UInt* a_p = (UInt*)a;
    PRE_MEM_READ( s, (Addr)a_p, sizeof(UInt) );
-   if (a_p == NULL)
+   if (a_p == NULL || ! ML_(safe_to_deref) (a_p, sizeof(UInt)))
       return 0;
    else
       return *a_p;

Presenter Notes

Easy Hacks

  • Figure out why there are failing tests.
  • Create the canonical git mirror.
  • Clean up the default suppressions.
  • Check vg_preload functions for different arches.
  • Write normal C functions for add_hardwired_spec.
  • Convince distros of README_PACKAGERS (ld.so symtab)
  • Find and implement missing syscalls (or ioctls). README_MISSING_SYSCALL_OR_IOCTL
  • Find and fix bad syscall wrappers. (Wrap LTP)

Presenter Notes