How to Get Hacked by North Korea

Good news: exploiting memory safety vulnerabilities is becoming more difficult. Traditional security vulnerabilities will remain a serious threat, but attackers prefer to take the path of least resistance, and nowadays that is to attack developers rather than the software itself. Once the attackers control your computer, they can attempt to perform a supply chain attack and insert backdoors into your software, compromising all of your users at once.

If you’re a software developer, it’s time to start focusing on the possibility that attackers will target you personally. Yes, you. If you use Linux, macOS, or Windows, take a moment to check your home directory for a hidden .n2 folder. If it exists, alas! You have been hacked by the North Koreans. (Future malware campaigns will presumably be more stealthy than this.)

Attackers who target developers are currently employing two common strategies:

  • Fake job interview: you’re applying to job postings and the recruiter asks you to solve a programming problem of some sort, which involves installing software from NPM (or PyPI, or another language ecosystem’s package manager).
  • Fake debugging request: you receive a bug report and the reporter helpfully provides a script for reproducing the bug. The script may have dependencies on packages in NPM (or PyPI, or another language ecosystem’s package manager) to make it harder to notice that it’s malware. I saw a hopefully innocent bug report that was indistinguishable from such an attack just last week.

But of course you would never execute such a script without auditing the dependencies thoroughly! (Insert eyeroll emoji here.)

By now, most of us are hopefully already aware of typosquatting attacks, and the high risk that untrustworthy programming language packages may be malicious. But you might not have considered that attackers might target you personally. Exercise caution whenever somebody asks you to install packages from these sources. Take the time to start a virtual machine before running the code. (An isolated podman container probably works too?) Remember, attackers will target the path of least resistance. Virtualization or containerization raises the bar considerably, and the attacker will probably move along to an easier target.

Forcibly Set Array Size in Vala

Vala programmers probably already know that a Vala array is equivalent to a C array plus an integer size. (Unfortunately, the size is gint rather than size_t, which seems likely to be a source of serious bugs.) When you create an array in Vala, the size is set for you automatically by Vala. But what happens when receiving an array that’s created by a library?

Here is an example. I recently wanted to use the GTK function gdk_texture_new_for_surface(), which converts from a cairo_surface_t to a GdkTexture, but unfortunately this function is internal to GTK (presumably to avoid exposing more cairo stuff in GTK’s public API). I decided to copy it into my application, which is written in Vala. I could have put it in a separate C source file, but it’s nicer to avoid a new file and rewrite it in Vala. Alas, I hit an array size roadblock along the way.

Now, gdk_texture_new_for_surface() calls cairo_image_surface_get_data() to return an array of data to be used for creating a GBytes object to pass to gdk_memory_texture_new(). The size of the array is cairo_image_surface_get_height (surface) * cairo_image_surface_get_stride (surface). This is GTK’s code to create the GBytes object:

bytes = g_bytes_new_with_free_func (cairo_image_surface_get_data (surface),
                                    cairo_image_surface_get_height (surface)
                                    * cairo_image_surface_get_stride (surface),
                                    (GDestroyNotify) cairo_surface_destroy,
                                    cairo_surface_reference (surface));

The C function declaration of g_bytes_new_with_free_func() looks like this:

GBytes*
g_bytes_new_with_free_func (
  gconstpointer data,
  gsize size,
  GDestroyNotify free_func,
  gpointer user_data
)

Notice it takes both the array data and its size size. But because Vala arrays already know their size, Vala bindings do not contain a separate parameter for the size of the array. The corresponding Vala API looks like this:

public Bytes.with_free_func (owned uint8[]? data, DestroyNotify? free_func, void* user_data)

Notice there is no way to pass the size of the data array, because the array is expected to know its own size.

I actually couldn’t figure out how to pass a DestroyNotify in Vala. There’s probably some way to do it (please let me know in the comments!), but I don’t know how. Anyway, I compromised by creating a GBytes that copies its data instead, using this simpler constructor:

public Bytes (uint8[]? data)

My code looked something like this:

unowned uchar[] data = surface.get_data ();
return new Gdk.MemoryTexture (surface.get_width (),
                              surface.get_height (),
                              BYTE_ORDER == ByteOrder.LITTLE_ENDIAN ? Gdk.MemoryFormat.B8G8R8A8_PREMULTIPLIED : Gdk.MemoryFormat.A8R8G8B8_PREMULTIPLIED,
                              new Bytes (data),
                              surface.get_stride ());

But this code crashes because the size of data is unset. Without any clear way to pass the size of the array to the Bytes constructor, I was initially stumped.

My first solution was to create my own array on the stack, use Posix.memcpy to copy the data to my new array, then pass the new array to the Bytes constructor. And that actually worked! But now the data is being copied twice, when the C version of the code doesn’t need to copy the data at all. I knew I had to copy the data at least once (because I didn’t know how else to manually call cairo_surface_destroy() and cairo_surface_ref() in Vala, at least not without resorting to custom language bindings), but copying it twice seemed slightly egregious.

Eventually I discovered there is an easier way:

data.length = surface.get_height () * surface.get_stride ();

It’s that simple. Vala arrays have a length property, and you can assign to it. In this case, the bindings didn’t know how to set the length, so it defaulted to the maximum possible value, leading to the crash. After setting the length manually, the code worked properly. Here it is.