Zero'ing memory, compiler optimizations and memset_s
tl;dr: use this code
When a program uses a secret key for some cryptographic operation, it will store it somewhere in memory. This is a problem because it can be trivial for a different program to read what was previously stored in memory. Just write something like this:
#include <stdio.h>
int main(){
    unsigned char a[5000];
    // print the uninitialized buffer as hex, staying within its bounds
    for(size_t i = 0; i < sizeof(a); i++) {
        printf("%02x", a[i]);
    }
    printf("\n");
}
This will print out whatever was previously there in memory, because the buffer a is not initialized to zeros. Actually, C seldom initializes things to zeros: it does so for objects with static storage duration (global and static variables), or when you specifically ask for it, for example by using calloc instead of malloc.
EDIT: as Fred Akalin pointed out to me, it looks like this is fixed in most modern operating systems. Colin Percival notes that there are other issues with not zero’ing memory:
if someone is able to exploit an unrelated problem — a vulnerability which yields remote code execution, or a feature which allows uninitialized memory to be read remotely, for example — then ensuring that sensitive data (e.g., cryptographic keys) is no longer accessible will reduce the impact of the attack. In short, zeroing buffers which contained sensitive information is an exploit mitigation technique.
This is a problem.
To remove a key from memory, developers tend to write something like this:
memset(private_key, 0, sizeof(*private_key));
Unfortunately, when the compiler sees something like this, it may remove it. From the compiler's point of view the call is a dead store: the variable is never read afterwards, so dropping the memset does not change the observable behavior of the program, and the optimizer is free to do so.
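To make the pattern concrete, here is a minimal sketch of the kind of function where this can happen. The function name, the toy "key" and the checksum are all made up for illustration; the only point is that key is never read after the memset, so an optimizing compiler may treat the call as a dead store. Whether it actually gets removed depends on the compiler and the optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: mixes the input with a toy "secret" kept on the stack. */
unsigned checksum_with_secret(const unsigned char *input, size_t len) {
    assert(input != NULL || len == 0);

    /* Stand-in for real key material (illustration only). */
    unsigned char key[16];
    for (size_t i = 0; i < sizeof key; i++) {
        key[i] = (unsigned char)(i * 7 + 1);
    }

    unsigned acc = 0;
    for (size_t i = 0; i < len; i++) {
        acc += input[i] ^ key[i % sizeof key];
    }

    /* key is never read after this point, so the compiler is allowed
       to consider this call useless and drop it. */
    memset(key, 0, sizeof key);
    return acc;
}
```

The return value is the same whether or not the memset survives, which is exactly why the compiler is allowed to remove it.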
How to fix this issue?
A memset_s function was proposed and introduced in C11. It is basically a safe memset (you need to pass in the size of the buffer you’re zero’ing as an extra argument) that will not get optimized out. Unfortunately, as Martin Sebor notes:
memset_s is an optional feature of the C11 standard and as such isn’t really portable. (AFAIK, there also are no conforming C11 implementations that provide the optional Annex K in which the function is defined.)
To use it, you define __STDC_WANT_LIB_EXT1__ before including the headers, and the implementation defines another macro, __STDC_LIB_EXT1__, as a notice that you can now use the memset_s function.
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>
#include <stdlib.h>
// ...
#ifdef __STDC_LIB_EXT1__
memset_s(pointer, size_data, 0, size_to_remove);
#endif
Unfortunately you cannot rely on this for portability. For example, on macOS the two macros are not used and you need to call memset_s directly.
Martin Sebor adds in the same comment:
The GCC -fno-builtin-memset option can be used to prevent compatible compilers from optimizing away calls to memset that aren’t strictly speaking necessary.
Unfortunately, it seems like macOS’ gcc (which is really clang) ignores this argument.
What else can we do?
I asked Robert Seacord, who always has all the answers. Here’s what he gave me in return:
void *erase_from_memory(void *pointer, size_t size_data, size_t size_to_remove) {
    if(size_to_remove > size_data) size_to_remove = size_data;
    volatile unsigned char *p = pointer;
    while (size_to_remove--){
       *p++ = 0;
    }
    return pointer;
}
Does this volatile keyword work?
Time to open gdb (or lldb) to verify what the compiler has done. This can be done after compiling without optimization or with any of the optimization levels (-O1, -O2, -O3).
Let’s write a small program that uses this code and debug it:
int main(){
    char a[6] = "hello";
    printf("%s\n", a);
    erase_from_memory(a, 6, 6);
}

- we open gdb with the program we just compiled
- we set a breakpoint on main
- we run the program which will stop in main

We notice a bunch of movb $0x0 ...
Is this it? Let’s put a breakpoint on the first one and see what the stack pointer (rsp) is pointing to.

It’s pointing to the string “hello” as we guessed.

Going to the next instruction via ni, we can then see that the first letter h has been zero’ed. Going over the next instructions, we see that the full string ends up being zero’ed.

It’s a success!
The full code can be seen here as an erase_from_memory.h header file that you can just include in your codebase:
#ifndef __ERASE_FROM_MEMORY_H__
#define __ERASE_FROM_MEMORY_H__ 1
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h> 
#include <string.h>
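// static inline so that the header can be included from several source files
static inline void *erase_from_memory(void *pointer, size_t size_data, size_t size_to_remove) {
  #ifdef __STDC_LIB_EXT1__
    // Annex K is available: memset_s must not be optimized out
    memset_s(pointer, size_data, 0, size_to_remove);
  #else
    // fallback: the volatile pointer forces the compiler to emit every store
    if(size_to_remove > size_data) size_to_remove = size_data;
    volatile unsigned char *p = pointer;
    while (size_to_remove--){
        *p++ = 0;
    }
  #endif
    return pointer;
}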
#endif // __ERASE_FROM_MEMORY_H__
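As a rough sketch of how a caller might use this, here is a toy example. The struct session type and the session_open, session_close and all_zero helpers are hypothetical names invented for this illustration, and the volatile fallback is copied in only so that the snippet compiles on its own; in a real codebase you would include the header instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy of the volatile fallback so that this sketch compiles on its own. */
static void *erase_from_memory(void *pointer, size_t size_data, size_t size_to_remove) {
    if (size_to_remove > size_data) size_to_remove = size_data;
    volatile unsigned char *p = pointer;
    while (size_to_remove--) {
        *p++ = 0;
    }
    return pointer;
}

/* Hypothetical session type, used only for illustration. */
struct session {
    unsigned char key[32];
    int is_open;
};

/* Fills the key with a placeholder pattern; a real program would
   load actual key material here. */
void session_open(struct session *s) {
    assert(s != NULL);
    memset(s->key, 0xAB, sizeof s->key);
    s->is_open = 1;
}

/* Wipes the key before marking the session as closed. */
void session_close(struct session *s) {
    assert(s != NULL);
    erase_from_memory(s->key, sizeof s->key, sizeof s->key);
    s->is_open = 0;
}

/* Returns 1 if every byte of buf is zero, 0 otherwise. */
int all_zero(const unsigned char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) return 0;
    }
    return 1;
}
```

Note that size_to_remove is clamped to size_data, so asking to erase more bytes than the buffer holds only zeroes the buffer itself. Checking the bytes at runtime like this only shows that the function behaves correctly when it is called; it does not prove that the compiler kept the stores, which is what the debugger session is for.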
Many thanks to Robert Seacord!
PS: here is how libsodium does it
EDIT: As Colin Percival wrote here, this problem is far from being solved. Secrets can get copied around in (special) registers which won’t allow you to easily remove them.
