GPU side channel attacks can enable spying on web activity, password stealing

( Original text )

Computer scientists at the University of California, Riverside have revealed for the first time how easily attackers can use a computer’s graphics processing unit, or GPU, to spy on web activity, steal passwords, and break into cloud-based applications.

GPU side channel attacks

Threat scenarios

Marlan and Rosemary Bourns College of Engineering computer science doctoral student Hoda Naghibijouybari and post-doctoral researcher Ajaya Neupane, along with Associate Professor Zhiyun Qian and Professor Nael Abu-Ghazaleh, reverse engineered a Nvidia GPU to demonstrate three attacks on both graphics and computational stacks, as well as across them.

All three attacks require the victim to first acquire a malicious program embedded in a downloaded app. The program is designed to spy on the victim’s computer.

Web browsers use GPUs to render graphics on desktops, laptops, and smart phones. GPUs are also used to accelerate applications on the cloud and data centers. Web graphics can expose user information and activity. Computational workloads enhanced by the GPU include applications with sensitive data or algorithms that might be exposed by the new attacks.

GPUs are usually programmed using application programming interfaces, or APIs, such as OpenGL. OpenGL is accessible by any application on a desktop with user-level privileges, making all attacks practical on a desktop. Since desktop or laptop machines by default come with the graphics libraries and drivers installed, the attack can be implemented easily using graphics APIs.

The first attack tracks user activity on the web. When the victim opens the malicious app, it uses OpenGL to create a spy process that infers the behavior of the browser as it uses the GPU. Every website has a unique trace in terms of GPU memory utilization due to the different number and sizes of objects being rendered. This signal is consistent across repeated loads of the same website and is unaffected by caching.

The researchers monitored either GPU memory allocations over time or GPU performance counters and fed these features to a machine learning-based classifier, achieving website fingerprinting with high accuracy. The spy can reliably observe all allocation events, revealing what the user has been doing on the web.
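To make the mechanics concrete, here is a minimal sketch (not the researchers' code) of the sampling half of such a spy. It assumes the nvidia-ml-py package (pynvml) as a user-level interface for reading GPU memory usage; the paper itself relied on graphics APIs and performance counters, so treat this only as an illustration of turning a memory-usage trace into classifier features.

gpu_mem_trace.py
import time
import pynvml

def sample_memory_trace(duration_s=5.0, interval_s=0.01):
    """Record GPU memory usage (bytes used) at a fixed sampling interval."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    trace = []
    end = time.time() + duration_s
    while time.time() < end:
        info = pynvml.nvmlDeviceGetMemoryInfo(handle)
        trace.append(info.used)
        time.sleep(interval_s)
    pynvml.nvmlShutdown()
    return trace

def to_features(trace, buckets=64):
    """Downsample a variable-length trace into a fixed-length feature vector."""
    step = max(1, len(trace) // buckets)
    chunks = [trace[i:i + step] for i in range(0, len(trace), step)]
    feats = [max(c) for c in chunks[:buckets]]
    feats += [feats[-1]] * (buckets - len(feats))  # pad short traces
    return feats

if __name__ == "__main__":
    # While this runs, GPU memory usage fluctuates as other applications
    # (for example a browser rendering a page) allocate and free GPU buffers.
    print(to_features(sample_memory_trace()))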

In the second attack, the authors extracted user passwords. Each time the user types a character, the whole password textbox is uploaded to the GPU as a texture to be rendered. Monitoring the interval between consecutive memory allocation events leaked the number of password characters and the inter-keystroke timing, both well-established signals for recovering passwords.
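A minimal sketch of the timing analysis, under the assumption stated above that each keystroke produces one observable allocation event; the timestamps are made up for illustration.

keystroke_timing.py
def keystroke_timing(event_timestamps):
    """event_timestamps: increasing times (seconds) of the observed allocation events."""
    n_chars = len(event_timestamps)  # roughly one render event per typed character
    intervals = [t2 - t1 for t1, t2 in zip(event_timestamps, event_timestamps[1:])]
    return n_chars, intervals

# Made-up timestamps: a 6-character password typed over about 1.2 seconds.
ts = [0.00, 0.21, 0.48, 0.62, 0.95, 1.19]
chars, gaps = keystroke_timing(ts)
print(f"{chars} keystrokes, inter-keystroke intervals: {gaps}")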

The third attack targets a computational application in the cloud. The attacker launches a malicious computational workload on the GPU which operates alongside the victim’s application. Depending on neural network parameters, the intensity and pattern of contention on the cache, memory and functional units differ over time, creating measurable leakage. The attacker uses machine learning-based classification on performance counter traces to extract the victim’s secret neural network structure, such as number of neurons in a specific layer of a deep neural network.
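As a rough illustration of the classification step (not the paper's actual pipeline), the sketch below trains an off-the-shelf classifier on synthetic performance-counter traces whose statistics depend on a made-up secret label; it assumes only, as described above, that contention patterns differ measurably between network structures. Requires numpy and scikit-learn.

counter_classifier.py
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, trace_len = 200, 50
# Synthetic stand-in: the secret label (e.g. one of four candidate layer sizes)
# shifts the mean level of the contention measured by the performance counters.
labels = rng.integers(0, 4, size=n_samples)
traces = rng.normal(loc=labels[:, None], scale=0.8, size=(n_samples, trace_len))

X_train, X_test, y_train, y_test = train_test_split(traces, labels, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("accuracy on held-out synthetic traces:", clf.score(X_test, y_test))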

The researchers reported their findings to Nvidia, who responded that they intend to publish a patch that offers system administrators the option to disable access to performance counters from user-level processes. They also shared a draft of the paper with the AMD and Intel security teams to enable them to evaluate their GPUs with respect to such vulnerabilities.

In the future the group plans to test the feasibility of GPU side channel attacks on Android phones.


Patching nVidia GPU driver for hot-unplug on Linux

( original text by @whitequark )

Recently, I’ve been using an extremely cursed setup where my XPS 13 9360 laptop is connected to a Sonnet EchoExpress 2 box rewired for Thunderbolt 3 that has an nVidia Quadro 600 GPU, and Linux is set up for render offload to the eGPU and then frame transfer back to the iGPU to be displayed on the laptop’s integrated display, which (to my sheer surprise) not only works quite reliably, but even gives me higher FPS in Team Fortress 2 than the iGPU.

There’s only really one downside: if the eGPU falls off the bus, either because someone™ pulled out the cable, or because the stars didn’t align quite right this morning and it decided to enumerate seemingly at random (sometimes this is preceded by whining from PCIe AER, sometimes not; I think it’s some sort of hardware issue like a badly inserted PCIe card, but I’m not entirely sure), the nVidia driver… hangs. Hangs quite deliberately, as the sources to the kernel driver show. This leaves the Xorg instance bound to the eGPU hung forever (which confuses bumblebee, but is otherwise not especially bad), and also prevents any new ones from using the eGPU (which is bad).

Anyway, I was kind of annoyed by rebooting every time this happens, so I decided to reboot a few dozen more times instead while patching the driver. This has indeed worked, and left me with something similar to a functional hot-unplug, mildly crippled by the fact that nvidia-modeset is a completely opaque blob that keeps some internal state and tries to act on it, getting stuck when it tries to do something to the now-missing eGPU.

Turns out, there are only a few issues preventing functional hot-unplug.

  1. In nvidia_remove, the driver actually checks if anyone’s still trying to use it, and if yes, it tries to just hang the removal process. This doesn’t actually work, or rather, it mostly works by accident. It starts an infinite loop calling os_schedule() while having taken the NV_LINUX_DEVICES lock. While in the default configuration this indeed hangs any reentrant requests into the driver by virtue of NV_CHECK_PCI_CONFIG_SPACE taking the same lock (in verify_pci_bars), passing the NVreg_CheckPCIConfigSpace=0 module option eliminates that accidental safety mechanism and allows reentrant requests to proceed. They do not crash due to memory being deallocated in nvidia_remove (so you don’t get an unhandled kernel page fault), but they still crash due to being unable to access the GPU.
  2. The NVKMS component (in the nvidia-modeset module) tries to maintain some state, and change it when e.g. the Xorg instance quits and closes the /dev/nvidia-modeset file. Unfortunately, it does not expect the GPU to go away, and first spews a few messages to dmesg similar to nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000857d:0:0:0x0000000f, after which it appears to hang somewhere inside the blob, which has been conveniently stripped of all symbols. This needs to be prevented, but…
  3. The NVKMS component effectively only exposes a single opaque ioctl, and all the communication, including communication of the GPU bus ID, happens out of band with regards to the open source parts of the nvidia-modeset module. Fortunately, NVKMS calls back into NVRM, and this allows us to associate each /dev/nvidia-modeset fd with the GPU bus ID.
  4. When unloading NVKMS, it also tries to act on its internal state and change the GPU state, which leads to the same hang.

All in all, this allows a patch to be written that detects when a GPU goes away, ignores all further NVKMS requests related to that specific GPU (and returns -ENOENT in response to ioctls, which Xorg appropriately interprets as a fault condition), correctly releases the resources by requesting NVRM, and improperly unloads NVKMS so it doesn’t try to reset the GPU state. (All actual resources should be released by this point, and NVKMS doesn’t have any resource allocation callbacks other than those we already intercept, so in theory this doesn’t have any bad consequences. But I’m not working for nVidia, so this might be completely wrong.)

After the GPU is plugged back in, NVKMS will try to act on its internal state again; in this case, it doesn’t hang, but it doesn’t initialize the GPU correctly either, so the nvidia-modeset kernel module has to be (manually) reloaded. It’s not easy to do this automatically because in a hypothetical system with more than one nVidia GPU the module would still be in use when one of them dies, and so just hard reloading NVKMS would have unfortunate consequences. (Though, I don’t really know whether NVKMS would try to access the dead GPU in response to the request acting on the other GPU anyway. I decided to do it conservatively.) Once it’s reloaded you’re back in the game though!

Here’s the patch, written against the nvidia-legacy-390xx-390.87 Debian source package:

nvidia-hot-gpu-on-gpu-unplug-action.patch
diff -ur original/common/inc/nv-linux.h patchedl/common/inc/nv-linux.h
--- original/common/inc/nv-linux.h	2018-09-23 12:20:02.000000000 +0000
+++ patched/common/inc/nv-linux.h	2018-10-28 07:19:21.526566940 +0000
@@ -1465,6 +1465,7 @@
 typedef struct nv_linux_state_s {
     nv_state_t nv_state;
     atomic_t usage_count;
+    atomic_t dead;
 
     struct pci_dev *dev;
 
diff -ur original/common/inc/nv-modeset-interface.h patched/common/inc/nv-modeset-interface.h
--- original/common/inc/nv-modeset-interface.h	2018-08-22 00:55:23.000000000 +0000
+++ patched/common/inc/nv-modeset-interface.h	2018-10-28 07:22:00.768238371 +0000
@@ -25,6 +25,8 @@
 
 #include "nv-gpu-info.h"
 
+#include <asm/atomic.h>
+
 /*
  * nvidia_modeset_rm_ops_t::op gets assigned a function pointer from
  * core RM, which uses the calling convention of arguments on the
@@ -115,6 +117,8 @@
 
     int (*set_callbacks)(const nvidia_modeset_callbacks_t *cb);
 
+    atomic_t * (*gpu_dead)(NvU32 gpu_id);
+
 } nvidia_modeset_rm_ops_t;
 
 NV_STATUS nvidia_get_rm_ops(nvidia_modeset_rm_ops_t *rm_ops);
diff -ur original/common/inc/nv-proto.h patched/common/inc/nv-proto.h
--- original/common/inc/nv-proto.h	2018-08-22 00:55:23.000000000 +0000
+++ patched/common/inc/nv-proto.h	2018-10-28 07:20:49.939494812 +0000
@@ -81,6 +81,7 @@
 NvBool      nvidia_get_gpuid_list       (NvU32 *gpu_ids, NvU32 *gpu_count);
 int         nvidia_dev_get              (NvU32, nvidia_stack_t *);
 void        nvidia_dev_put              (NvU32, nvidia_stack_t *);
+atomic_t *  nvidia_dev_dead             (NvU32);
 int         nvidia_dev_get_uuid         (const NvU8 *, nvidia_stack_t *);
 void        nvidia_dev_put_uuid         (const NvU8 *, nvidia_stack_t *);
 int         nvidia_dev_get_pci_info     (const NvU8 *, struct pci_dev **, NvU64 *, NvU64 *);
diff -ur original/nvidia/nv.c patched/nvidia/nv.c
--- original/nvidia/nv.c	2018-09-23 12:20:02.000000000 +0000
+++ patched/nvidia/nv.c	2018-10-28 07:48:05.895025112 +0000
@@ -1944,6 +1944,12 @@
     unsigned int i;
     NvBool bRemove = NV_FALSE;
 
+    if (NV_ATOMIC_READ(nvl->dead))
+    {
+        nv_printf(NV_DBG_ERRORS, "NVRM: nvidia_close called on dead device by pid %d!\n",
+                  current->pid);
+    }
+
     NV_CHECK_PCI_CONFIG_SPACE(sp, nv, TRUE, TRUE, NV_MAY_SLEEP());
 
     /* for control device, just jump to its open routine */
@@ -2106,6 +2112,12 @@
     size_t arg_size;
     int arg_cmd;
 
+    if (NV_ATOMIC_READ(nvl->dead))
+    {
+        nv_printf(NV_DBG_ERRORS, "NVRM: nvidia_ioctl called on dead device by pid %d!\n",
+                  current->pid);
+    }
+
     nv_printf(NV_DBG_INFO, "NVRM: ioctl(0x%x, 0x%x, 0x%x)\n",
         _IOC_NR(cmd), (unsigned int) i_arg, _IOC_SIZE(cmd));
 
@@ -3217,6 +3229,7 @@
     NV_INIT_MUTEX(&nvl->ldata_lock);
 
     NV_ATOMIC_SET(nvl->usage_count, 0);
+    NV_ATOMIC_SET(nvl->dead, 0);
 
     if (!rm_init_event_locks(sp, nv))
         return NV_FALSE;
@@ -4018,14 +4031,38 @@
         nv_printf(NV_DBG_ERRORS,
                   "NVRM: Attempting to remove minor device %u with non-zero usage count!\n",
                   nvl->minor_num);
+        nv_printf(NV_DBG_ERRORS,
+                  "NVRM: YOLO, waiting for usage count to drop to zero\n");
         WARN_ON(1);
 
-        /* We can't continue without corrupting state, so just hang to give the
-         * user some chance to do something about this before reboot */
-        while (1)
+        NV_ATOMIC_SET(nvl->dead, 1);
+
+        /* Insanity check: wait until all clients die, then hope for the best. */
+        while (1) {
+            UNLOCK_NV_LINUX_DEVICES();
             os_schedule();
-    }
+            LOCK_NV_LINUX_DEVICES();
+
+            nvl = pci_get_drvdata(dev);
+            if (!nvl || (nvl->dev != dev))
+            {
+                goto done;
+            }
+
+            if (NV_ATOMIC_READ(nvl->usage_count) == 0)
+            {
+                break;
+            }
+        }
 
+        nv_printf(NV_DBG_ERRORS,
+                  "NVRM: Usage count is now zero, proceeding to remove the GPU\n");
+        nv_printf(NV_DBG_ERRORS,
+                  "NVRM: This is not actually supposed to work lol. Hope it does tho 👍\n");
+        nv_printf(NV_DBG_ERRORS,
+                  "NVRM: You probably want to reload nvidia-modeset now if you want any "
+                  "of this to ever start up again, but like, man, that's your choice entirely\n");
+    }
     nv = NV_STATE_PTR(nvl);
     if (nvl == nv_linux_devices)
         nv_linux_devices = nvl->next;
@@ -4712,6 +4749,22 @@
     up(&nvl->ldata_lock);
 }
 
+atomic_t *nvidia_dev_dead(NvU32 gpu_id)
+{
+    nv_linux_state_t *nvl;
+    atomic_t *ret;
+
+    /* Takes nvl->ldata_lock */
+    nvl = find_gpu_id(gpu_id);
+    if (!nvl)
+        return NV_FALSE;
+
+    ret = &nvl->dead;
+    up(&nvl->ldata_lock);
+
+    return ret;
+}
+
 /*
  * Like nvidia_dev_get but uses UUID instead of gpu_id. Note that this may
  * trigger initialization and teardown of unrelated devices to look up their
diff -ur original/nvidia/nv-modeset-interface.c patched/nvidia/nv-modeset-interface.c
--- original/nvidia/nv-modeset-interface.c	2018-08-22 00:55:22.000000000 +0000
+++ patched/nvidia/nv-modeset-interface.c	2018-10-28 07:20:25.959243110 +0000
@@ -114,6 +114,7 @@
         .close_gpu      = nvidia_dev_put,
         .op             = rm_kernel_rmapi_op, /* provided by nv-kernel.o */
         .set_callbacks  = nvidia_modeset_set_callbacks,
+        .gpu_dead       = nvidia_dev_dead,
     };
 
     if (strcmp(rm_ops->version_string, NV_VERSION_STRING) != 0)
diff -ur original/nvidia/nv-reg.h patched/nvidia/nv-reg.h
diff -ur original/nvidia-modeset/nvidia-modeset-linux.c patched/nvidia-modeset/nvidia-modeset-linux.c
--- original/nvidia-modeset/nvidia-modeset-linux.c	2018-09-23 12:20:02.000000000 +0000
+++ patched/nvidia-modeset/nvidia-modeset-linux.c	2018-10-28 07:47:14.738703417 +0000
@@ -75,6 +75,9 @@
 
 static struct semaphore nvkms_lock;
 
+static NvU32 clopen_gpu_id;
+static NvBool leak_on_unload;
+
 /*************************************************************************
  * NVKMS executes queued work items on a single kthread.
  *************************************************************************/
@@ -89,6 +92,9 @@
 struct nvkms_per_open {
     void *data;
 
+    NvU32 gpu_id;
+    atomic_t *gpu_dead;
+
     enum NvKmsClientType type;
 
     union {
@@ -711,6 +717,9 @@
     nvidia_modeset_stack_ptr stack = NULL;
     NvBool ret;
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "nvkms_open_gpu called with %08x, pid %d\n",
+           gpuId, current->pid);
+
     if (__rm_ops.alloc_stack(&stack) != 0) {
         return NV_FALSE;
     }
@@ -719,6 +728,10 @@
 
     __rm_ops.free_stack(stack);
 
+    if (ret) {
+        clopen_gpu_id = gpuId;
+    }
+
     return ret;
 }
 
@@ -726,12 +739,17 @@
 {
     nvidia_modeset_stack_ptr stack = NULL;
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "nvkms_close_gpu called with %08x, pid %d\n",
+           gpuId, current->pid);
+
     if (__rm_ops.alloc_stack(&stack) != 0) {
         return;
     }
 
     __rm_ops.close_gpu(gpuId, stack);
 
+    clopen_gpu_id = gpuId;
+
     __rm_ops.free_stack(stack);
 }
 
@@ -771,8 +789,14 @@
 
     popen->type = type;
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_open_common, pid %d\n",
+           current->pid);
+
     *status = down_interruptible(&nvkms_lock);
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_open_common, pid %d\n",
+           current->pid);
+
     if (*status != 0) {
         goto failed;
     }
@@ -781,6 +805,9 @@
 
     up(&nvkms_lock);
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_open_common, pid %d\n",
+           current->pid);
+
     if (popen->data == NULL) {
         *status = -EPERM;
         goto failed;
@@ -799,10 +826,16 @@
 
     *status = 0;
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "exiting in nvkms_open_common, pid %d\n",
+           current->pid);
+
     return popen;
 
 failed:
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "error in nvkms_open_common, pid %d\n",
+           current->pid);
+
     nvkms_free(popen, sizeof(*popen));
 
     return NULL;
@@ -816,14 +849,36 @@
      * mutex.
      */
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_close_common, pid %d\n",
+           current->pid);
+
     down(&nvkms_lock);
 
-    nvKmsClose(popen->data);
+    printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_close_common, pid %d\n",
+           current->pid);
+
+    if (popen->gpu_id != 0 && atomic_read(popen->gpu_dead) != 0) {
+        printk(KERN_ERR NVKMS_LOG_PREFIX "awwww u need cleanup :3 "
+               "in nvkms_close_common, pid %d\n",
+               current->pid);
+
+        nvkms_close_gpu(popen->gpu_id);
+
+        popen->gpu_id = 0;
+        popen->gpu_dead = NULL;
+
+        leak_on_unload = NV_TRUE;
+    } else {
+        nvKmsClose(popen->data);
+    }
 
     popen->data = NULL;
 
     up(&nvkms_lock);
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_close_common, pid %d\n",
+           current->pid);
+
     if (popen->type == NVKMS_CLIENT_KERNEL_SPACE) {
         /*
          * Flush any outstanding nvkms_kapi_event_kthread_q_callback() work
@@ -844,6 +899,9 @@
     }
 
     nvkms_free(popen, sizeof(*popen));
+
+    printk(KERN_INFO NVKMS_LOG_PREFIX "exiting nvkms_close_common, pid %d\n",
+           current->pid);
 }
 
 int NVKMS_API_CALL nvkms_ioctl_common
@@ -855,20 +913,58 @@
     int status;
     NvBool ret;
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "entering nvkms_ioctl_common, pid %d\n",
+           current->pid);
+
     status = down_interruptible(&nvkms_lock);
     if (status != 0) {
         return status;
     }
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "taken lock in nvkms_ioctl_common, pid %d\n",
+           current->pid);
+
+    if (popen->gpu_id != 0 && atomic_read(popen->gpu_dead) != 0) {
+        goto dead;
+    }
+
+    clopen_gpu_id = 0;
+
     if (popen->data != NULL) {
         ret = nvKmsIoctl(popen->data, cmd, address, size);
     } else {
         ret = NV_FALSE;
     }
 
+    if (clopen_gpu_id != 0) {
+        if (!popen->gpu_id) {
+            printk(KERN_INFO NVKMS_LOG_PREFIX "detected gpu %08x open in nvkms_ioctl_common, "
+                   "pid %d\n", clopen_gpu_id, current->pid);
+            popen->gpu_id = clopen_gpu_id;
+            popen->gpu_dead = __rm_ops.gpu_dead(clopen_gpu_id);
+        } else {
+            printk(KERN_INFO NVKMS_LOG_PREFIX "detected gpu %08x close in nvkms_ioctl_common, "
+                   "pid %d\n", clopen_gpu_id, current->pid);
+            popen->gpu_id = 0;
+            popen->gpu_dead = NULL;
+        }
+    }
+
     up(&nvkms_lock);
 
+    printk(KERN_INFO NVKMS_LOG_PREFIX "given up lock in nvkms_ioctl_common, pid %d\n",
+           current->pid);
+
     return ret ? 0 : -EPERM;
+
+dead:
+    up(&nvkms_lock);
+
+    printk(KERN_ERR NVKMS_LOG_PREFIX "*notices ur gpu is dead* owo whats this "
+           "in nvkms_ioctl_common, pid %d\n",
+           current->pid);
+
+    return -ENOENT;
 }
 
 /*************************************************************************
@@ -1239,9 +1335,14 @@
 
     nvkms_proc_exit();
 
-    down(&nvkms_lock);
-    nvKmsModuleUnload();
-    up(&nvkms_lock);
+    if(leak_on_unload) {
+        printk(KERN_ERR NVKMS_LOG_PREFIX "im just gonna leak all the kms junk ok? "
+               "haha nvm wasnt a question. in nvkms_exit\n");
+    } else {
+        down(&nvkms_lock);
+        nvKmsModuleUnload();
+        up(&nvkms_lock);
+    }
 
     /*
      * At this point, any pending tasks should be marked canceled, but

Here are some handy scripts I was using while debugging it:

insmod.sh
#!/bin/sh -ex
modprobe acpi_ipmi
insmod nvidia.ko NVreg_ResmanDebugLevel=-1 NVreg_CheckPCIConfigSpace=0
insmod nvidia-modeset.ko
dmesg -w
rmmod.sh
#!/bin/sh
rmmod nvidia-modeset
rmmod nvidia
xorg.sh
#!/bin/sh
exec Xorg :8 -config /etc/bumblebee/xorg.conf.nvidia -configdir /etc/bumblebee/xorg.conf.d -sharevts -nolisten tcp -noreset -verbose 3 -isolateDevice PCI:06:00:0 -modulepath /usr/lib/nvidia/nvidia,/usr/lib/xorg/modules

And finally, here are the relevant kernel and Xorg log messages, showing what happens when a GPU is unplugged:

dmesg.log
[  219.524218] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  390.87  Tue Aug 21 12:33:05 PDT 2018 (using threaded interrupts)
[  219.527409] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  390.87  Tue Aug 21 16:16:14 PDT 2018
[  224.780721] nvidia-modeset: nvkms_open_gpu called with 00000600, pid 4560
[  224.807370] nvidia-modeset: detected gpu 00000600 open in nvkms_ioctl_common, pid 4560
[  239.061383] NVRM: GPU at PCI:0000:06:00: GPU-9fe1319c-8dd3-44e4-2b74-de93f8b02c6a
[  239.061387] NVRM: Xid (PCI:0000:06:00): 79, GPU has fallen off the bus.
[  239.061389] NVRM: GPU at 0000:06:00.0 has fallen off the bus.
[  239.061398] NVRM: A GPU crash dump has been created. If possible, please run
               NVRM: nvidia-bug-report.sh as root to collect this data before
               NVRM: the NVIDIA kernel module is unloaded.
[  240.209498] NVRM: Attempting to remove minor device 0 with non-zero usage count!
[  240.209501] NVRM: YOLO, waiting for usage count to drop to zero
[  241.433499] nvidia-modeset: *notices ur gpu is dead* owo whats this in nvkms_ioctl_common, pid 4560
[  241.433851] nvidia-modeset: awwww u need cleanup :3 in nvkms_close_common, pid 4560
[  241.433853] nvidia-modeset: nvkms_close_gpu called with 00000600, pid 4560
[  250.440498] NVRM: Usage count is now zero, proceeding to remove the GPU
[  250.440513] NVRM: This is not actually supposed to work lol. Hope it does tho 👍
[  250.440520] NVRM: You probably want to reload nvidia-modeset now if you want any of this to ever start up again, but like, man, that's your choice entirely
[  250.440870] pci 0000:06:00.1: Dropping the link to 0000:06:00.0
[  250.440950] pci_bus 0000:06: busn_res: [bus 06] is released
[  250.440982] pci_bus 0000:07: busn_res: [bus 07-38] is released
[  250.441012] pci_bus 0000:05: busn_res: [bus 05-38] is released
[  251.000794] pci_bus 0000:02: Allocating resources
[  251.001324] pci_bus 0000:02: Allocating resources
[  253.765953] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[  253.765969] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[  253.765976] pcieport 0000:00:1c.0:   device [8086:9d10] error status/mask=00002001/00002000
[  253.765982] pcieport 0000:00:1c.0:    [ 0] Receiver Error         (First)
[  253.841064] pcieport 0000:02:02.0: Refused to change power state, currently in D3
[  253.843882] pcieport 0000:02:00.0: Refused to change power state, currently in D3
[  253.846177] pci_bus 0000:03: busn_res: [bus 03] is released
[  253.846248] pci_bus 0000:04: busn_res: [bus 04-38] is released
[  253.846300] pci_bus 0000:39: busn_res: [bus 39] is released
[  253.846348] pci_bus 0000:02: busn_res: [bus 02-39] is released
[  353.369487] nvidia-modeset: im just gonna leak all the kms junk ok? haha nvm wasnt a question. in nvkms_exit
[  357.600350] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  390.87  Tue Aug 21 16:16:14 PDT 2018
Xorg.8.log
[   244.798] (EE) NVIDIA(GPU-0): WAIT (2, 8, 0x8000, 0x000011f4, 0x00001210)

Running system commands through Nvidia signed binaries

I started looking for other similar binaries developed by Nvidia that could execute system commands while hiding behind the legitimacy of an Nvidia signature.

I found this one: nvudisp.exe.

Running it looked very promising.

The list of commands includes all the ones found by Hexacorn:

«AddUninstall, Call, CheckPath, CheckRAID, ClassSweep, Copy, CopyV, CreateDevice, CreateShortcut, Del, DelBoot, DelBootQuiet, DelIniIfMatched, DelOemInfs, DelReg, DelRegE, DirAndApply, Echo, EnumDevices, EnumRegCmd, EnumRegNamesCmd, Eval, FindOEMInf, GetDrivePort, GetFolderPath, GetInfGUID, GetReg, Help, If, InstallDriver, InstallDriverEx, KillApp, RemoveDevice, Run, RunOnce, SendMessage, Set, SetEnv, SetReg, Sleep, Splash, StartLogging, StopLogging, SysCallAndWait, System, UnifyUninst, Uninstall, UnInstallEx, UninstallGUI, UninstallService, WaitOnRegDel»

plus these additional ones:

«Decrement Increment DisplayControlPanel AskToCloseAndExitIfRunning RemoveDriverStore RemoveDeviceEx DisableDevice RemoveUpperFilter StopService RmString DelAll»

Here are the descriptions of these additional commands:

  • Decrement: Decrements a variable numerically.
  • Increment: Increments a variable numerically.
  • DisplayControlPanel:  Displays message about Display Control Panel uninstall.
  • AskToCloseAndExitIfRunning: Given an application name, enumerates all running applications for a match. If found, prompts the user to close the application.
  • RemoveDriverStore: Remove any device matched with the given description from the system using setupdi calls. Enum can be (PCI, EISA, etc), HWID usually is VEN_10DE and device type can be DISPLAY, HDC, MEDIA, NET, SYSTEM.
  • RemoveDeviceEx: Remove any device matched with the given description from the system using setupdi calls. Enum can be (PCI, EISA, etc), HWID usually is VEN_10DE and device type can be DISPLAY, HDC, MEDIA, NET, SYSTEM.
  • DisableDevice: Disable any device matched with the given description from the system using setupdi calls. Enum can be (PCI, EISA, etc), HWID usually is VEN_10DE and device type can be DISPLAY, HDC, MEDIA, NET, SYSTEM.
  • RemoveUpperFilter: Remove filter service from any devices that specify it.
  • StopService: Uninstalls the given service name.
  • RmString: Removes the string from the original string of words if found and saves the result in new variable.
  • DelAll: Delete the given folder if it exists, it also deletes the contents within the folder.

Running calc.exe
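A purely hypothetical sketch of such an invocation, wrapped in Python only for consistency with the other examples here; the "System" verb comes from the command list above, but the exact argument syntax expected by nvudisp.exe is an assumption and untested.

run_via_nvudisp.py
# Hypothetical illustration only: the "System" verb appears in the command list
# above, but the argument syntax assumed here is a guess, not documented behavior.
# The point is merely that an Nvidia-signed helper exposing such verbs could be
# made to start an arbitrary program (here calc.exe) on behalf of an attacker.
import subprocess

subprocess.run(["nvudisp.exe", "System", "calc.exe"])  # assumed syntax, untested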

Dumping the manifest shows us that the file requires Administrator privileges (exactly like the binaries nvuhda.exe and nvuhda6.exe described by Hexacorn).

Sigcheck -m nvudisp.exe

This is a promising avenue to explore and could be used by real attackers to bypass standard EDR detection rules.

That’s all folks!