I am currently procrastinating on undoing the mess I made with the CMake files for a bigger project I’m working on. It’s not hard, it’s just annoying, and I have no one to blame but myself. I did this intentionally because confusingly designed code seemed like a good anti-reversing trick. It’s actually an anti-coding trick. Don’t do it intentionally; you’ll just confuse yourself in the end.

But there is one thing I can use as an excuse to Not Do What I Need To: talking about the way I’m encrypting the code in the DLL Hell I’ve crafted for myself. If you’ve been following along, prepare for another application of forbidden PE knowledge.

The Basic Concept

The tl;dr if you’ve been following along with the previous blogs:

  • Designate specific code and data sections
  • Write a build step that encrypts those sections
  • Decrypt the sections at runtime with a TLS directory

Let’s cover the details of that.

So you’re writing a program. People wanna snoop on your program somehow, and you don’t like that! The most common technique I’ve seen in malware is encrypting strings and other data within the program. It’s just an asshole thing to do: it wastes the analyst’s time, since they have to manually undo whatever technique was applied before they can figure out what’s going on. Ideally we would want to get as granular as encrypting individual functions like SmokeLoader does, but we can still be a runtime pest without being an elite maldev. Enter the TLS Directory.

Whether or not it was intended, we can use the TLS directory callbacks as a pre-load function within our binary. Meaning, we can run code before the main routine. In a previous post, we covered how to allocate code and data in specific .exe segments for the sake of creating shellcode. Now we’ll apply that same technique to creating an encrypted binary!

Ultimately there are multiple ways you can do this, but they come down to two critical forms: does the analyst have the key to decrypt, or don’t they? Naturally, letting them have the key in some way is much easier to contend with, so let’s cover it first and then show the ways we can pull the rug out from under them afterward. Explore the repo to follow along and see how things are built!

Encrypting the Binary

Encrypting the executable is a little bit tricky. We could just encrypt the .text section, but where would we host our decryption code? Not to mention, even if we allocated the decryption routine into its own section of the binary, that routine may still rely on code inside the .text section. It’s cleaner to allocate new sections for the code and data you want encrypted with the section pragma and its friends.

#pragma code_seg(push, r1, ".encc")
#pragma data_seg(push, d1, ".encd")

This puts code and global data into these respective sections for the length of the definition until a corresponding pop on each pragma. We can then encrypt the generated sections post-compile.
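As a minimal sketch of what living inside those pragmas looks like, here is one function and one global placed in the two sections. The `ENC_CODE` and `ENC_DATA` macros, `sheep_url`, and `sheep_url_length` are hypothetical names made up for this example, and the GCC/Clang branch exists only so the sketch also builds outside of MSVC.

```c
#include <assert.h>
#include <string.h>

/* ENC_CODE / ENC_DATA are hypothetical helper macros, not part of the project. */
#if defined(_MSC_VER)
   /* MSVC: everything defined until the matching pop lands in these sections. */
   #pragma code_seg(push, r1, ".encc")
   #pragma data_seg(push, d1, ".encd")
   #define ENC_CODE
   #define ENC_DATA
#elif defined(__GNUC__) && defined(__ELF__)
   /* GCC/Clang on ELF: per-symbol attributes, only so the sketch builds elsewhere. */
   #define ENC_CODE __attribute__((section(".encc"), noinline))
   #define ENC_DATA __attribute__((section(".encd")))
#else
   #define ENC_CODE
   #define ENC_DATA
#endif

/* data_seg only captures initialized globals; uninitialized ones land in .bss. */
ENC_DATA char sheep_url[32] = "https://example.invalid/sheep";

/* This function body ends up in .encc, so it is ciphertext on disk after the build step. */
ENC_CODE size_t sheep_url_length(void) {
   assert(sheep_url[0] != '\0');
   return strlen(sheep_url);
}

#if defined(_MSC_VER)
   /* Pop in reverse order so later definitions go back to the default sections. */
   #pragma data_seg(pop, d1)
   #pragma code_seg(pop, r1)
#endif
```

Anything defined after the pops goes back to the default sections, which is where the decryption routine has to live.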

That lays out our target code, but where do we host the decryption code? Easy: allocate the decryption functionality in its own sections, independent of the encrypted sections! Now how do we encrypt our binary to begin with? We create a post-build command that encrypts the target sections and, for our static key method, inserts the decryption key into the decryption section. Then at runtime, we use a TLS callback to decrypt the code and data we encrypted externally. Here’s what the decryption code can look like.

VOID WINAPI decrypt_sheep(PVOID dll_handle, DWORD reason, PVOID reserved) {
   static bool decrypted = false;

   if (decrypted)
      return;

   uint8_t *bin_data = (uint8_t *)dll_handle;
   PIMAGE_DOS_HEADER dos_header = (PIMAGE_DOS_HEADER)bin_data;
   PIMAGE_NT_HEADERS64 nt_headers = (PIMAGE_NT_HEADERS64)&bin_data[dos_header->e_lfanew];
   PIMAGE_SECTION_HEADER section_table = get_section_table(bin_data);
   PIMAGE_SECTION_HEADER etext, edata;
   etext = NULL;
   edata = NULL;

   for (size_t i=0; i<nt_headers->FileHeader.NumberOfSections; ++i) {
      if (memcmp(&section_table[i].Name[0], ".encc", strlen(".encc")) == 0)
         etext = &section_table[i];
      else if (memcmp(&section_table[i].Name[0], ".encd", strlen(".encd")) == 0)
         edata = &section_table[i];
   }

   assert(etext != NULL && edata != NULL);
   DWORD old_etext, old_edata;
   
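   // Keep VirtualProtect outside assert(): under NDEBUG the assert, and any call inside it, compiles out.
   BOOL unlocked_etext = VirtualProtect(&bin_data[etext->VirtualAddress], etext->Misc.VirtualSize, PAGE_EXECUTE_READWRITE, &old_etext);
   BOOL unlocked_edata = VirtualProtect(&bin_data[edata->VirtualAddress], edata->Misc.VirtualSize, PAGE_READWRITE, &old_edata);
   assert(unlocked_etext && unlocked_edata);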
   
   IMAGE_NT_HEADERS64 original_headers = original_nt_headers((HMODULE)dll_handle);
   
   relocate_section(bin_data, ".encc", (uintptr_t)bin_data, original_headers.OptionalHeader.ImageBase);
   relocate_section(bin_data, ".encd", (uintptr_t)bin_data, original_headers.OptionalHeader.ImageBase);
   rc4(&bin_data[etext->VirtualAddress], etext->SizeOfRawData, (const uint8_t *)RC4_KEY, strlen(RC4_KEY));
   rc4(&bin_data[edata->VirtualAddress], edata->SizeOfRawData, (const uint8_t *)RC4_KEY, strlen(RC4_KEY));
   relocate_section(bin_data, ".encc", original_headers.OptionalHeader.ImageBase, (uintptr_t)bin_data);
   relocate_section(bin_data, ".encd", original_headers.OptionalHeader.ImageBase, (uintptr_t)bin_data);

   DWORD new_etext, new_edata;
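   // Same deal when restoring the original protections: make the call first, then assert on the result.
   BOOL restored_etext = VirtualProtect(&bin_data[etext->VirtualAddress], etext->Misc.VirtualSize, old_etext, &new_etext);
   BOOL restored_edata = VirtualProtect(&bin_data[edata->VirtualAddress], edata->Misc.VirtualSize, old_edata, &new_edata);
   assert(restored_etext && restored_edata);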

   decrypted = true;
}

#pragma code_seg(pop, r1)

#pragma comment(linker, "/INCLUDE:_tls_used")
#pragma comment(linker, "/INCLUDE:decrypt_callback")
#pragma const_seg(push, c1, ".CRT$XLAAA")
const PIMAGE_TLS_CALLBACK decrypt_callback = decrypt_sheep;
#pragma const_seg(pop, c1)

VirtualProtect is called on the section RVAs because the loader loads them into independent segments as designated by the PE section table. By default, those sections are not writable, so we change that to be able to rewrite our ciphertext.

When creating a TLS callback, it’s important to be aware that the function gets called more than once: on process start and exit, and every time a thread is created or destroyed. So it’s important we create a failsafe that doesn’t double-decrypt our binary data.

You might be wondering why I’m calling a function called original_nt_headers in the code. This calls GetModuleFileNameA on the module handle to read the headers from disk. We need this in order to get the original ImageBase, because upon loading, the Windows loader overwrites this value in the headers with its loaded address base. With the original ImageBase value, we can properly undo the relocations on our encrypted sections.
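The relocate_section helper isn’t listed in this post, so here is a rough sketch of what I assume its core looks like: walk one base relocation block and move every 64-bit fixup that lands inside the section from one base address to another. The name `rebase_block`, its parameters, and the load helpers are hypothetical. The block layout (an 8-byte header followed by 16-bit entries, type in the high 4 bits, page offset in the low 12) and the DIR64 type value of 10 come from the PE format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { REL_BASED_DIR64 = 10 };  /* same value as IMAGE_REL_BASED_DIR64 */

static uint16_t load_u16(const uint8_t *p) {
   return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t load_u32(const uint8_t *p) {
   return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
          ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: rebases the DIR64 fixups of ONE relocation block that
   fall inside [sec_rva, sec_rva + sec_size). Returns how many pointers were
   patched. Assumes a little-endian host, like every Windows target. */
size_t rebase_block(uint8_t *image, const uint8_t *block,
                    uint32_t sec_rva, uint32_t sec_size,
                    uint64_t from_base, uint64_t to_base) {
   uint32_t page_rva = load_u32(block);        /* VirtualAddress of the page */
   uint32_t block_size = load_u32(block + 4);  /* SizeOfBlock, header included */
   size_t patched = 0;

   assert(block_size >= 8);

   for (uint32_t off = 8; off + 2 <= block_size; off += 2) {
      uint16_t entry = load_u16(block + off);
      uint32_t rva = page_rva + (entry & 0x0FFFu);  /* low 12 bits: page offset */
      uint64_t value;

      if ((entry >> 12) != REL_BASED_DIR64)         /* high 4 bits: fixup type */
         continue;
      if (rva < sec_rva || rva + 8 > sec_rva + sec_size)
         continue;                                  /* fixup is outside our section */

      memcpy(&value, &image[rva], sizeof(value));
      value = value - from_base + to_base;
      memcpy(&image[rva], &value, sizeof(value));
      ++patched;
   }

   return patched;
}
```

Calling it with the loaded base as `from_base` and the on-disk ImageBase as `to_base` puts the ciphertext back the way the build step left it. Swapping the two arguments after decryption reapplies the fixups to the plaintext.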

The following code is the crux of how we externally encrypt our binary’s juicy data. If you’d like to know why in the hell the section table is acquired the way I’m doing it, that’s because we’re calculating exactly where the section table sits: right after the PE signature, the file header and the optional header, whose size is only known from SizeOfOptionalHeader. Don’t ask me why it’s this way; I probably grumble about it when mostly describing the PE format.

bool encrypt_section(uint8_t *bin_data, size_t bin_size, const char *section, const char *key) {
   PIMAGE_DOS_HEADER dos_header = (PIMAGE_DOS_HEADER)bin_data;
   PIMAGE_FILE_HEADER file_header = (PIMAGE_FILE_HEADER)&bin_data[dos_header->e_lfanew+sizeof(DWORD)];
   PIMAGE_SECTION_HEADER section_table = (PIMAGE_SECTION_HEADER)&bin_data[dos_header->e_lfanew+sizeof(DWORD)+sizeof(IMAGE_FILE_HEADER)+file_header->SizeOfOptionalHeader];

   for (size_t i=0; i<file_header->NumberOfSections; ++i) {
      if (memcmp(&section_table[i].Name[0], &section[0], strlen(section)) != 0)
         continue;

      rc4(&bin_data[section_table[i].PointerToRawData], section_table[i].SizeOfRawData, (const uint8_t *)key, strlen(key));

      return true;
   }

   return false;
}
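If you want to poke at that offset math without windows.h, the same lookup can be sketched with raw offsets. `find_section_header` and the two read helpers are hypothetical names for this example. The offsets themselves (e_lfanew at 0x3C, a 4-byte signature, a 20-byte file header, 40-byte section headers) are the fixed PE layout. Like the code above, it matches on a name prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
   E_LFANEW_OFFSET = 0x3C,     /* IMAGE_DOS_HEADER.e_lfanew */
   PE_SIGNATURE_SIZE = 4,      /* "PE\0\0" */
   FILE_HEADER_SIZE = 20,      /* sizeof(IMAGE_FILE_HEADER) */
   SECTION_HEADER_SIZE = 40    /* sizeof(IMAGE_SECTION_HEADER) */
};

static uint16_t read_u16(const uint8_t *p) {
   return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p) {
   return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
          ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the file offset of the first section header
   whose name starts with `name`, or 0 if there is none (offset 0 is always
   the DOS header, so it can double as "not found"). */
size_t find_section_header(const uint8_t *bin, size_t bin_size, const char *name) {
   assert(bin != NULL && name != NULL);

   size_t name_len = strlen(name);
   if (name_len > 8 || bin_size < (size_t)E_LFANEW_OFFSET + 4)
      return 0;

   size_t nt_offset = read_u32(bin + E_LFANEW_OFFSET);
   if (nt_offset >= bin_size)
      return 0;

   size_t file_header = nt_offset + PE_SIGNATURE_SIZE;
   if (file_header + FILE_HEADER_SIZE > bin_size)
      return 0;

   uint16_t count = read_u16(bin + file_header + 2);      /* NumberOfSections */
   uint16_t opt_size = read_u16(bin + file_header + 16);  /* SizeOfOptionalHeader */
   size_t table = file_header + FILE_HEADER_SIZE + opt_size;

   for (uint16_t i = 0; i < count; ++i) {
      size_t header = table + (size_t)i * SECTION_HEADER_SIZE;
      if (header + SECTION_HEADER_SIZE > bin_size)
         return 0;
      if (memcmp(bin + header, name, name_len) == 0)
         return header;
   }

   return 0;
}
```

From the returned offset, SizeOfRawData sits at +16 and PointerToRawData at +20, which are the two fields the encryption step needs.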

RC4 is an absolute meme for a reason: it’s incredibly easy to implement, it produces relatively entropic ciphertext, and if you open a random sample over on Malware Bazaar, you’ll probably find someone using it. To glue it all together, we stick this encryption in a build helper binary and then stick it in the build chain.

add_custom_command(TARGET static_key
  POST_BUILD
  COMMAND encrypt_section ARGS -b "$<TARGET_FILE:static_key>" -x .encc -d .encd -s .tls1)

The build helper will generate an encryption key for the encryptable sections, then insert that key into our prewritten stub section. At startup, the binary uses this inserted key to decrypt itself. Voila, we have an encrypted sheep downloader with an embedded decryption key.
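The rc4 routine that both the build helper and the TLS callback call is never listed here, so for completeness, this is a textbook RC4 shaped to fit the rc4(data, size, key, key_size) calls in this post. Treat it as a sketch of the standard algorithm rather than the exact code in the repo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Textbook RC4. It transforms the buffer in place, and running it a second
   time with the same key restores the original bytes. */
void rc4(uint8_t *data, size_t size, const uint8_t *key, size_t key_size) {
   uint8_t state[256];
   uint8_t i = 0, j = 0, tmp;

   assert(data != NULL && key != NULL && key_size > 0);

   /* Key scheduling: start from the identity permutation and shuffle it with the key. */
   for (size_t n = 0; n < 256; ++n)
      state[n] = (uint8_t)n;

   for (size_t n = 0; n < 256; ++n) {
      j = (uint8_t)(j + state[n] + key[n % key_size]);
      tmp = state[n];
      state[n] = state[j];
      state[j] = tmp;
   }

   /* Keystream generation: one keystream byte XORed into each data byte. */
   j = 0;
   for (size_t n = 0; n < size; ++n) {
      i = (uint8_t)(i + 1);
      j = (uint8_t)(j + state[i]);
      tmp = state[i];
      state[i] = state[j];
      state[j] = tmp;
      data[n] ^= state[(uint8_t)(state[i] + state[j])];
   }
}
```

Since encryption and decryption are the same operation, a second pass over already-decrypted sections would scramble them again, which is why the callback keeps its decrypted flag.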

So it’s great that we’ve encrypted our binary, but what could we do to be a further asshole? Take the key away from the analyst!

Taking the Key

Do you know what gets an analyst to hate you? Making your key ephemeral in some way or another. Making it bound to the laws of Internet decay, cursing the analyst to arrive just in time to decrypt the data and fail. A common way malware operators do this is by having the key be a command line argument, forcing the analyst to capture the command line arguments at runtime. “But wait,” you may think, “there’s no command line argument in the TLS callback!” Thankfully Windows is the kitchen sink and more, and provides you the function GetCommandLine. This function provides the process command line as a single string, not an argv array. For that you have CommandLineToArgvW, but fuck you if you want CommandLineToArgvA, you’re on your own. I hate Microsoft’s inconsistencies. Here is an implementation of that function.
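To give a feel for the shape of the problem, here is a deliberately simplified splitter. It is not the full Windows parsing algorithm: it only understands whitespace and double quotes and ignores the backslash-escaping rules entirely. `split_command_line` is a hypothetical name, and it works in place on a mutable copy of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified sketch: splits `cmdline` in place on spaces and tabs, treating
   double-quoted runs as part of a single argument and dropping the quotes.
   Backslash escapes are NOT handled. Returns the number of arguments stored
   in `argv` (at most `max_args`). */
size_t split_command_line(char *cmdline, char **argv, size_t max_args) {
   assert(cmdline != NULL && argv != NULL);

   size_t argc = 0;
   char *read = cmdline;

   while (*read != '\0' && argc < max_args) {
      /* Skip the separators in front of the next argument. */
      while (*read == ' ' || *read == '\t')
         ++read;
      if (*read == '\0')
         break;

      /* The argument is compacted onto itself as quotes are removed. */
      char *write = read;
      int in_quotes = 0;
      argv[argc++] = write;

      while (*read != '\0' && (in_quotes || (*read != ' ' && *read != '\t'))) {
         if (*read == '"')
            in_quotes = !in_quotes;
         else
            *write++ = *read;
         ++read;
      }

      /* Step past the separator before terminating, since write may equal read. */
      if (*read != '\0')
         ++read;
      *write = '\0';
   }

   return argc;
}
```

In the TLS callback you would copy the string returned by GetCommandLineA into a scratch buffer first, split that copy, and treat the second entry as the key.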

The idea here is anything you can think of that ultimately keeps the decryption keys out of the analyst’s hands. Hiding the cryptographic key behind some sort of command-and-control gateway works too. Honestly you can’t keep the key entirely away from the analyst (you need to decrypt your code), but you can do everything in your power to make them lose it.

But there’s only so much you can do with the key. You can obfuscate it, sure, why the hell not? But how many layers of obfuscation are you gonna throw on that bad boy before you realize your onion is taking up the majority of your code? We’re fucking with a binary, so what the hell else can we do? Take advantage of the fun engineering edge cases of the Windows loader and incorporate them into our decryption process!

Taking the Loader

How many times have you come across a specification or data structure where some motherfucker stuck in a big fat reserved field? Do you know what hackers really love? Undefined behavior. Why do you think there’s such a love affair with C, C++ and assembly?

Let’s take a look at the function definition of a TLS callback.

VOID WINAPI decrypt_sheep(PVOID dll_handle, DWORD reason, PVOID reserved) {

In a normal loading scenario, there is absolutely no way to set the reserved argument. It is literally dead space. Its normal value is NULL, and Microsoft has no plans to actually make it do anything more. Previously, we covered how to write a custom PE loader. All we have to do to really annoy the person who extracts the binary from the loader is pass our decryption key through the reserved argument.

We take the previous technique.

   char *rc4_key = argv[1];

And combine it with the custom loader.

   if (tls_rva != 0) {
      PIMAGE_TLS_DIRECTORY64 tls_dir = (PIMAGE_TLS_DIRECTORY64)&valloc_buffer[tls_rva];
      void (**callbacks)(PVOID, DWORD, PVOID) = (void (**)(PVOID, DWORD, PVOID))tls_dir->AddressOfCallBacks;

      while (*callbacks != NULL) {
         (*callbacks)(valloc_buffer, DLL_PROCESS_ATTACH, rc4_key);
         ++callbacks;
      }
   }

This stupid little trick forces the analyst to use the loader shipped with the program in order to get the program to work. It can be undone by decrypting the sections, removing the TLS directory from the PE headers and resuming analysis, but that’s a significant timesink, which is exactly what we want as a maldev.
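The receiving end of that handoff isn’t shown above, so here is a portable sketch of both halves using plain function pointers in place of the Windows types. Every name in it (`tls_callback_t`, `demo_callback`, `run_tls_callbacks`, `received_key`) is hypothetical, and treating a NULL reserved value as "no key, stay encrypted" is my assumption about how the callback should behave under the stock loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REASON_PROCESS_ATTACH 1u  /* same value as DLL_PROCESS_ATTACH */

/* Stand-in for PIMAGE_TLS_CALLBACK so the sketch builds without windows.h. */
typedef void (*tls_callback_t)(void *image_base, uint32_t reason, void *reserved);

/* Where the demo callback stashes the key it was handed. */
static char received_key[64];

/* Callee side: the stock loader always passes NULL in `reserved`, so only a
   custom loader can deliver a key through it. */
static void demo_callback(void *image_base, uint32_t reason, void *reserved) {
   (void)image_base;

   if (reason != REASON_PROCESS_ATTACH || reserved == NULL)
      return;  /* no key supplied: leave the sections encrypted */

   strncpy(received_key, (const char *)reserved, sizeof(received_key) - 1);
   received_key[sizeof(received_key) - 1] = '\0';
}

/* Loader side: walk the NULL-terminated callback array and smuggle the key
   through the reserved argument. Returns how many callbacks were invoked. */
size_t run_tls_callbacks(tls_callback_t *callbacks, void *image_base, void *key) {
   size_t called = 0;

   assert(callbacks != NULL);

   while (*callbacks != NULL) {
      (*callbacks)(image_base, REASON_PROCESS_ATTACH, key);
      ++callbacks;
      ++called;
   }

   return called;
}
```

In the real thing the callback would feed the key straight into the section decryption instead of copying it into a buffer.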

Conclusions

To be as elite as VMProtect takes significant effort. The most we can hope for is to either hunker down and craft something similar, or do our damnedest to waste the analyst’s time with tedium. Either way, I hope you learned something reading this!