PE-sieve is a light-weight tool that helps to detect malware running on the system, as well as to collect the potentially malicious material for further analysis. Recognizes and dumps variety of implants within the scanned process: replaced/injected PEs, shellcodes, hooks, and other in-memory patches.
Detects inline hooks, Process Hollowing, Process Doppelgänging, Reflective DLL Injection, etc.
example: classic unmapping (2) vs remapping (3) — with remapping full virtual content of the section is preserved, so it helps i.e. if the full section was unpacked in memory, or if virtual caves were used
In recent years, the exploit of GDI objects to complete arbitrary memory address R/W in kernel exploitation has become more and more useful. In many types of vulnerabilityes such as pool overflow, arbitrary writes, and out-of-bound write, use after free and double free, you can use GDI objects to read and write arbitrary memory. We call this GDI data-only attack.
Microsoft introduced the win32k type isolation after the Windows 10 build 1709 release to mitigate GDI data-only attack in kernel exploitation. I discovered a mistake in Win32k TypeIsolation when I reverse win32kbase.sys. It have resulted GDI data-only attack worked again in certain common vulnerabilities. In this paper, I will share this new attack scenario.
Windows 10 rs3 16299.371
2 GDI data-only attack
GDI data-only attack is one of the common methods which used in kernel exploitation. Modify GDI object member-variables by common vulnerabilities, you can use the GDI API in win32k to complete arbitrary memory read and write. At present, two GDI objects commonly used in GDI data-only attacks are Bitmap and Palette. An important structure of Bitmap is:
In the kernel structure of Bitmap and Palette, two important member-variables related to GDI data-only attack are Bitmap->pvScan0 and Palette->pFirstColor. Two member-variables point to Bitmap and Palette’s data field, and you can read or write data from data field through the GDI APIs. As long as we modify two member-variables to any memory address by triggering a vulnerability, we can use GetBitmapBits/SetBitmapBits or GetPaletteEntries/SetPaletteEntries to read and write arbitrary memory address.
About using the Bitmap and Palette to complete the GDI data-only attack Now that there are many related technical papers on the Internet, and it is not the focus of this paper, there will be no more deeply sharing. The relevant information can refer to the fifth part.
3 Win32k TypeIsolation
The exploit of GDI data-only attack greatly reduces the difficulty of kernel exploitation and can be used in most common types of vulnerabilities. Microsoft has added a new mitigation after Windows 10 rs3 build 1709 —- Win32k Typeisolation, which manages the GDI objects through a doubly-linked list, and separates the head of the GDI object from the data field. This is not only mitigate the exploit of pool fengshui which create a predictable pool and uses a GDI object to occupy the pool hole and modify member-variables by vulnerabilities. but also mitigate attack scenario which modifies other member-variables of GDI object header to increase the controllable range of the data field, because the head and data field is no longer adjacent.
About win32k typeisolation mechanism can refer to the following figure:
Here I will explain the important parts of the mechanism of win32k typeisolation. The detailed operation mechanism of win32k typeisolation, including the allocation, and release of GDI object, can be referred to in the fifth part.
In win32k typeisolation, GDI object is managed uniformly through the CSectionEntry doubly linked list. The view field points to a 0x28000 memory space, and the head of the GDI object is managed here. The view field is managed by view array, and the array size is 0x1000. When assigning to a GDI object, RTL_BITMAP is used as an important basis for assigning a GDI object to a specified view field.
In CSectionEntry, bitmap_allocator points to CSectionBitmapAllocator, and xored_view, xor_key, xored_rtl_bitmap are stored in CSectionBitmapAllocator, where xored_view ^ xor_key points to the view field and xored_rtl_btimap ^ xor_key points to RTL_BITMAP.
In RTL_BITMAP, bitmap_buffer_ptr points to BitmapBuffer,and BitmapBuffer is used to record the status of the view field, which is 0 for idle and 1 for in use. When applying for a GDI object, it starts traversing the CSectionEntry list through win32kbase!gpTypeIsolation and checks whether the current view field contains a free memory by CSectionBitmapAllocator. If there is a free memory, a new GDI object header will be placed in the view field.
I did some research in the reverse engineering of the implementation of GDI object allocation and release about the CTypeIsolation class and the CSectionEntry class, and then I found a mistake. TypeIsolation traverses the CSectionEntry doubly linked list, uses the CSectionBitmapAllocator to determine the state of the view field, and manages the GDI object SURFACE which stored in the view field, but does not check the validity of CSectionEntry->view and CSectionEntry->bitmap_allocator pointers, that is to say if we can construct a fake view and fake bitmap_allocator, and we can use the vulnerability to modify CSectionEntry->view and CSectionEntry->bitmap_allocator to point to fake struct, we can re-use GDI object to complete the data-only attack.
4 Save and reborn gdi data-only attack!
In this section, I would like to share the idea of this attack scenario. HEVD is a practice driver developed by Hacksysteam that has typical kernel vulnerabilities. There is an Arbitrary Write vulnerability in HEVD. We use this vulnerability as example to share my attack scenario.
First look at the allocation of CSectionEntry, CSectionEntry will allocate 0x40 size session paged pool, CSectionEntry allocate pool memory implementation in NSInstrumentation::CSectionEntry::Create().
In other words, we can still use the pool fengshui to create a predictable session paged pool hole and it will be occupied with CSectionEntry. Therefore, in the exploit scenario of HEVD Arbitrary write, we use the tagWND to create a stable pool hole. , and use the HMValidateHandle to leak tagWND kernel object address. Because the current vulnerability instance is an arbitrary write vulnerability, if we can reveal the address of the kernel object, it will facilitate our understanding of this attack scenario, of course, in many attack scenarios, we only need to use pool fengshui to create a predictable pool.
Kd> g//make a stable pool hole by using tagWND
Break instruction exception - code 80000003 (first chance)
0033:00007ff6`89a61829 cc int 3
0033:00007ff6`89a6182a 488b842410010000 mov rax,qword ptr [rsp+110h]
0033:00007ff6`89a61832 4839842400010000 cmp qword ptr [rsp+100h],rax
Kd> r rax
Kd> !pool ffff862e827ca220
Pool page ffff862e827ca220 region is Unknown
Ffff862e827ca000 size: 150 previous size: 0 (Allocated) Gh04
Ffff862e827ca150 size: 10 previous size: 150 (Free) Free
Ffff862e827ca160 size: b0 previous size: 10 (Free ) Uscu
*ffff862e827ca210 size: 40 previous size: b0 (Allocated) *Ustx Process: ffffd40acb28c580
Pooltag Ustx : USERTAG_TEXT, Binary : win32k!NtUserDrawCaptionTemp
Ffff862e827ca250 size: e0 previous size: 40 (Allocated) Gla8
Ffff862e827ca330 size: e0 previous size: e0 (Allocated) Gla8```
0xffff862e827ca220 is a stable session paged pool hole, and 0xffff862e827ca220 will be released later, in a free state.
Now we need to create the CSecitionEntry to occupy 0xffff862e827ca220. This requires the use of a feature of TypeIsolation. As mentioned in the second section, when the GDI object is requested, it will traverse the CSectionEntry and determine whether there is any free in the view field, if the view field of the CSectionEntry is full, the traversal will continue to the next CSectionEntry, but if CTypeIsolation doubly linked list, all the view fields of the CSectionEntrys are full, then NSInstrumentation::CSectionEntry::Create is invoked to create a new CSectionEntry.
Therefore, we allocate a large number of GDI objects after we have finished creating the pool hole to fill up all the CSectionEntry’s view fields to ensure that a new CSectionEntry is created and occupy a pool hole of size 0x40.
Kd> g//create a large number of GDI objects, 0xffff862e827ca220 is occupied by CSectionEntry
Kd> !pool ffff862e827ca220
Pool page ffff862e827ca220 region is Unknown
Ffff862e827ca000 size: 150 previous size: 0 (Allocated) Gh04
Ffff862e827ca150 size: 10 previous size: 150 (Free) Free
Ffff862e827ca160 size: b0 previous size: 10 (Free) Uscu
*ffff862e827ca210 size: 40 previous size: b0 (Allocated) *Uiso
Pooltag Uiso : USERTAG_ISOHEAP, Binary : win32k!TypeIsolation::Create
Ffff862e827ca250 size: e0 previous size: 40 (Allocated) Gla8 ffff86b442563150 size:
Next we need to construct the fake CSectionEntry->view and fake CSectionEntry->bitmap_allocator and use the Arbitrary Write to modify the member-variable pointer in the CSectionEntry in the session paged pool hole to point to the fake struct we constructed.
The view field of the new CSectionEntry that was created when we allocate a large number of GDI objects may already be full or partially full by SURFACEs. If we construct the fake struct to construct the view field as empty, then we can deceive TypeIsolation that GDI object will place SURFACE in a known location.
We use VirtualAllocEx to allocate the memory in the userspace to store the fake struct, and we set the userspace memory property to READWRITE.
Among them, 0x1f0000 points to the view field, 0x1c0000 points to CSectionBitmapAllocator, and the fake view field is used to store the GDI object. The structure of CSectionBitmapAllocator needs thoughtful construction because we need to use it to deceive the typeisolation that the CSectionEntry we control is a free view item.
The above CSectionBitmapAllocator structure compares with 0x1c0000 structure, and I defined xor_key as 0xdeadbeefdeadb33f, as long as the xor_key ^ xor_view and xor_key ^ xor_rtl_bitmap operation point to the view field and RTL_BITMAP. In the debugging I found that the pushlock must point to a valid structure pointer, otherwise it will trigger BUGCHECK, so I allocate memory 0x1e0000 to store pushlock content.
As described in the second section, bitmap_hint_index is used as a condition to quickly index in the RTL_BITMAP, so this value also needs to be set to 0x00 to indicate the index in RTL_BITMAP. In the same way we look at the structure of RTL_BITMAP.
Here I select a valid RTL_BITMAP as a template, where the first member-variable represents the RTL_BITMAP size, the second member-variable points to the bitmap_buffer, and the immediately adjacent bitmap_buffer represents the state of the view field in bits. To deceive typeisolation, we will all of them are set to 0, indicating that the view field of the current CSectionEntry item is all idle, referring to the 0x190000 fake RTL_BITMAP structure.
Next, we only need to modify the CSectionEntry view and CSectionBitmapAllocator pointer through the HEVD’s Arbitrary write vulnerability.
Kd> dq ffff862e827ca220//before trigger
Ffff862e`827ca220 ffff862e`827cf4f0 ffff862e`827ef300
Ffff862e`827ca230 ffffc383`08613880 ffff862e`84780000
Ffff862e`827ca240 ffff862e`827f33c0 00000000`00000000
Kd> g / / trigger vulnerability, CSectionEntry-> view and CSectionEntry-> bitmap_allocator is modified
Break instruction exception - code 80000003 (first chance)
0033:00007ff7`abc21e35 cc int 3
Kd> dq ffff862e827ca220
Ffff862e`827ca220 ffff862e`827cf4f0 ffff862e`827ef300
Ffff862e`827ca230 ffffc383`08613880 00000000`001f0000
Ffff862e`827ca240 00000000`001c0000 00000000`00000000
Next, we normally allocate a GDI object, call CreateBitmap to create a bitmap object, and then observe the state of the view field.
You can see that the bitmap kernel object is placed in the fake view field. We can read the bitmap kernel object directly from the userspace. Next, we only need to directly modify the pvScan0 of the bitmap kernel object stored in the userspace, and then call the GetBitmapBits/SetBitmapBits to complete any memory address read and write.
Summarize the exploit process:
Fix for full exploit:
In the course of completing the exploit, I discovered that BSOD was generated some time, which greatly reduced the stability of the GDI data-only attack. For example,
Kd> !analyze -v
* Bugcheck Analysis *
An exception happened while performing a system service routine.
Arg1: 00000000c0000005, Exception code that caused the bugcheck
Arg2: ffffd7d895bd9847, Address of the instruction which caused the bugcheck
Arg3: ffff8c8f89e98cf0, Address of the context record for the exception that caused the bugcheck
Arg4: 0000000000000000, zero.
OVERLAPPED_MODULE: Address regions for 'dxgmms1' and 'dump_storport.sys' overlap
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - 0x%08lx
Ffffd7d8`95bd9847 488b1e mov rbx, qword ptr [rsi]
CONTEXT: ffff8c8f89e98cf0 -- (.cxr 0xffff8c8f89e98cf0)
Rax=ffffdb0039e7c080 rbx=ffffd7a7424e4e00 rcx=ffffdb0039e7c080
Rdx=ffffd7a7424e4e00 rsi=00000000001e0000 rdi=ffffd7a740000660
Rip=ffffd7d895bd9847 rsp=ffff8c8f89e996e0 rbp=0000000000000000
R8=ffff8c8f89e996b8 r9=0000000000000001 r10=7ffffffffffffffc
R11=0000000000000027 r12=00000000000000ea r13=ffffd7a740000680
Iopl=0 nv up ei pl nz na po nc
Cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00010206
Ffffd7d8`95bd9847 488b1e mov rbx, qword ptr [rsi] ds:002b:00000000`001e0000=????????????????
After many tracking, I discovered that the main reason for BSOD is that the fake struct we created when using VirtualAllocEx is located in the process space of our current process. This space is not shared by other processes, that is, if we modify the view field through a vulnerability. After the pointer to the CSectionBitmapAllocator, when other processes create the GDI object, it will also traverse the CSecitionEntry. When traversing to the CSectionEntry we modify through the vulnerability, it will generate BSoD because the address space of the process is invalid, so here I did my first fix when the vulnerability was triggered finish.
In the first fix, I modified the bitmap_hint_index and the rtl_bitmap to deceive the typeisolation when traverse the CSectionEntry and think that the view field of the fake CSectionEntry is currently full and will skip this CSectionEntry.
We know that the current CSectionEntry has been modified by us, so even if we end the exploit exit process, the CSectionEntry will still be part of the CTypeIsolation doubly linked list, and when our process exits, The current process space allocated by VirtualAllocEx will be released. This will lead to a lot of unknown errors. We have already had the ability to read and write at any address. So I did my second fix.
In the second fix, I obtained CSectionEntry->previous and CSectionEntry->next, which unlinks the current CSectionEntry so that when the GDI object allocates traversal CSectionEntry, it will deal with fake CSectionEntry no longer.
After completing the two fixes, you can successfully use GDI data-only attack to complete any memory address read and write. Here, I directly obtained the SYSTEM permissions for the latest version of Windows10 rs3, but once again when the process completely exits, it triggers BSoD. After the analysis, I found that this BSoD is due to the unlink after, the GDI handle is still stored in the GDI handle table, then it will find the corresponding kernel object in CSectionEntry and free away, and we store the bitmap kernel object CSectionEntry has been unlink, Caused the occurrence of BSoD.
The problem occurs in NtGdiCloseProcess, which is responsible for releasing the GDI object of the current process. The call chain associated with SURFACE is as follows
bDeleteSurface is responsible for releasing the SURFACE kernel object in the GDI handle table. We need to find the HBITMAP which stored in the fake view in the GDI handle table, and set it to 0x0. This will skip the subsequent free processing in bDeleteSurface. Then call HmgNextOwned to release the next GDI object. The key code for finding the location of HBITMAP in the GDI handle table is in HmgSharedLockCheck. The key code is as follows:
It is worth mentioning here is the need to leak the base address of win32kbase.sys, in the case of Low IL, we need vulnerability to leak info. And I use NtQuerySystemInformation in Medium IL to leak win32kbase.sys base address to calculate the gpHandleManager address, after Find the position of the target bitmap object in the GDI handle table in the fake view, and set it to 0x0. Finally complete the full exploit.
Now that the exploit of the kernel is getting harder and harder, a full exploitation often requires the support of other vulnerabilities, such as the info leak. Compared to the oob writes, uaf, double free, and write-what-where, the pool overflow is more complicated with this scenario, because it involves CSectionEntry->previous and CSectionEntry->next problems, but it is not impossible to use this scenario in pool overflow.
If you have any questions, welcome to discuss with me. Thank you!
PE-sieve (previously known as Hook Finder) is open source tool based on libpeconv.
It scans a given process, searching for manually loaded or modified modules. When found, it dumps the modified/suspicious PE along with a report in JSON format, detailing about the found indicators.
Currently it detects inline hooks, hollowed processes, Process Doppelgänging, injected PE files etc. In case if the PE file was patched in the memory, it gives a detailed report about where are the changed bytes (and few other properties).
The tool is under rapid development, so expect frequent updates.
PE-sieve is available in 2 versions – as standalone executable, and as a DLL. The DLL version became a base of my other project: HollowsHunter – that makes an automated scan of all the running processes. More about it in the further part of the post.
It has a simple, commandline interface. When run without parameters, it displays info about the version and required arguments:
When you run it giving a PID of the running process, it scans all the PE modules in its memory (the main executable, but also all the loaded DLLs). At the end, you can see the summary of how many anomalies have been detected of which type.
In case if some modified modules has been detected, they are dumped to a folder of a given process, for example:
Short history & features
Detecting inline hooks and patches
I started creating it for the purpose of searching and examining inline hooks. You can see it in action here (old version):
It not only detects that there IS an anomaly/patch, but also WHERE exactly it is. For each dumped PE where the patches were found, it creates a file with tags, that can be loaded by PE-bear.
Thanks to this, we can easily browse the found hooks and check the code that was overwritten.
For example – in the application presented above, the Entry Point was patched and the execution was redirected to the added, malicious section:
Detecting hollowed processes
Later, I extended it to detect process hollowing etc – and it turned out to be pretty convenient unpacker:
Detecting Process Doppelgänging
In a similar manner, it can detects some other methods of impersonating a processes, for example Process Doppelgänging. The malicious payload is directly dumped and ready to be analyzed:
Recovering erased imports
PE-sieve has an ability to recover erased imports. In order to enable it, deploy it with appropriate option. Example – unpacking manually loaded payloads with imports erased (Emotet):
The project is still not finished and I have many ideas how to make it better. I am planning to detect not only code modifications, but also other types of hooking, such as IAT and EAT patching.
Some in-memory patches are done by legitimate applications, so, in the future version I will provide capability of whitelisting defined patches.
I am also planning to extend its dumping capabilities against the malicious processes that are trying to defend themselves against dumpers etc.
PE-sieve as a DLL
During the development process I got an idea to make a DLL version of the PE-sieve, so that it can be incorporated in other projects.
Building PE-sieve from sources as a DLL is very easy – you just need to set one CMake option: PE_SIEVE_AS_DLL:
The PE-sieve DLL exposes a minimalistic API. Two functions are exported:
PESieve_help – displays a short info and the version of the DLL.
PESieve_scan – a typical scan with a given parameters, like in the PE-Sieve.exe
The necessary headers needs to be included from the folder “pe-sieve\include“:
This weekend I installed the Windows 10 Spring Update, and was pretty excited to start playing with the new, builtin OpenSSH tools.
Using OpenSSH natively in Windows is awesome since Windows admins no longer need to use Putty and PPK formatted keys. I started poking around and reading up more on what features were supported, and was pleasantly surprised to see ssh-agent.exe is included.
I found some references to using the new Windows ssh-agent in this MSDN article, and this part immediately grabbed my attention:
I’ve had some good fun in the past with hijacking SSH-agents, so I decided to start looking to see how Windows is «securely» storing your private keys with this new service.
I’ll outline in this post my methodology and steps to figuring it out. This was a fun investigative journey and I got better at working with PowerShell.
Private keys are protected with DPAPI and stored in the HKCU registry hive. I released some PoC code here to extract and reconstruct the RSA private key from the registry
Using OpenSSH in Windows 10
The first thing I tested was using the OpenSSH utilities normally to generate a few key-pairs and adding them to the ssh-agent.
First, I generated some password protected test key-pairs using ssh-keygen.exe:
Then I made sure the new ssh-agent service was running, and added the private key pairs to the running agent using ssh-add:
Running ssh-add.exe -L shows the keys currently managed by the SSH agent.
Finally, after adding the public keys to an Ubuntu box, I verified that I could SSH in from Windows 10 without needing the decrypt my private keys (since ssh-agent is taking care of that for me):
Monitoring SSH Agent
To figure out how the SSH Agent was storing and reading my private keys, I poked around a little and started by statically examining ssh-agent.exe. My static analysis skills proved very weak, however, so I gave up and just decided to dynamically trace the process and see what it was doing.
I used procmon.exe from Sysinternals and added a filter for any process name containing «ssh».
With procmon capturing events, I then SSH’d into my Ubuntu machine again. Looking through all the events, I saw ssh.exe open a TCP connection to Ubuntu, and then finally saw ssh-agent.exe kick into action and read some values from the Registry:
Two things jumped out at me:
The process ssh-agent.exe reads values from HKCU\Software\OpenSSH\Agent\Keys
After reading those values, it immediately opens dpapi.dll
Just from this, I now knew that some sort of protected data was being stored in and read from the Registry, and ssh-agent was using Microsoft’s Data Protection API
Testing Registry Values
Sure enough, looking in the Registry, I could see two entries for the keys I added using ssh-add. The key names were the fingerprint of the public key, and a few binary blobs were present:
After reading StackOverflow for an hour to remind myself of PowerShell’s ugly syntax (as is tradition), I was able to pull the registry values and manipulate them. The «comment» field was just ASCII encoded text and was the name of the key I added:
The (default) value was just a byte array that didn’t decode to anything meaningful. I had a hunch this was the «encrypted» private key if I could just pull it and figure out how to decrypt it. I pulled the bytes to a Powershell variable:
Unprotecting the Key
I wasn’t very familiar with DPAPI, although I knew a lot of post exploitation tools abused it to pull out secrets and credentials, so I knew other people had probably implemented a wrapper. A little Googling found me a simple oneliner by atifaziz that was way simpler than I imagined (okay, I guess I see why people like Powershell…. 😉 )
I still had no idea whether this would work or not, but I tried to unprotect the byte array using DPAPI. I was hoping maybe a perfectly formed OpenSSH private key would just come back, so I base64 encoded the result:
Could it be that the binary format is the same? I pulled down the Python scriptlinked from the blog and fed it the unprotected base64 blob I got from the Windows registry:
It worked! I have no idea how the original author soleblaze figured out the correct format of the binary data, but I am so thankful he did and shared. All credit due to him for the awesome Python tool and blogpost.
Putting it all together
After I had proved to myself it was possible to extract a private key from the registry, I put it all together in two scripts.
The first is a Powershell script (extract_ssh_keys.ps1) which queries the Registry for any saved keys in ssh-agent. It then uses DPAPI with the current user context to unprotect the binary and save it in Base64. Since I didn’t even know how to start parsing Binary data in Powershell, I just saved all the keys to a JSON file that I could then import in Python. The Powershell script is only a few lines:
I heavily borrowed the code from parse_mem_python.py by soleblaze and updated it to use Python3 for the next script: extractPrivateKeys.py. Feeding the JSON generated from the Powershell script will output all the RSA private keys found:
These RSA private keys are unencrypted. Even though when I created them I added a password, they are stored unencrypted with ssh-agent so I don’t need the password anymore.
To verify, I copied the key back to a Kali linux box and verified the fingerprint and used it to SSH in!
Obviously my PowerShell-fu is weak and the code I’m releasing is more for PoC. It’s probably possible to re-create the private keys entirely in PowerShell. I’m also not taking credit for the Python code — that should all go to soleblaze for his original implementation.
I present some work that I did involving automatic deobfuscation of obfuscated control flow constructs with abstract interpretation. Considering the image below, this project is responsible for taking graphs like the one on the left (where most of the «conditional» branches actually only go in one direction and are only present to thwart static analysis) and converting them into graphs like the one on the right.
Much work on deobfuscation relies on pattern-matching at least to some extent; I have coded such tools myself. I have some distaste for such methods, since they stop working when the patterns change (they are «syntactic»). I prefer to code my deobfuscation tools as generically («semantically») as possible, such that they capture innate properties of the obfuscation method in question, rather than hard-coding individual instances of the obfuscation.
The slides present a technique based on abstract interpretation, a form of static program analysis, for deobfuscating control flow transfers. I translate the x86 code into a different («intermediate») language, and then perform an analysis based on three-valued logic over the translated code. The end result is that certain classes of opaque predicates (conditional jumps that are either always taken or always not taken) are detected and resolved. I have successfully used this technique to break several protections making use of similar obfuscation techniques.
Although I invented and implemented these techniques independently, given the wealth of work in program analysis, it wouldn’t surprise me to learn that the particular technique has been previously invented. Proper references are appreciated.
Code is also included. The source relies upon my Pandemic program analysis framework, which is not publicly available. Hence, the code is for educational purposes only. Nonetheless, I believe it is one of very few examples of publicly-available source code involving abstract interpretation on binaries.
Since Office 2010 all the Office applications including Microsoft Access and VBA are available as a 64-bit edition in addition to the classic 32-bit edition.
To clear up an occasional misconception. You do not need to install Office/Access as 64-bit application just because you got a 64-bit operating system. Windows x64 provides an excellent 32-bit subsystem that allows you to run any 32-bit application without drawbacks.
For now, 64-bit Office/Access still is rather the exception than the norm, but this is changing more and more.
Access — 32-bit vs. 64-bit
If you are just focusing on Microsoft Access there is actually no compelling reason to use the 64-bit edition instead of the 32-bit edition. Rather the opposite is true. There are several reasons not to use 64Bit Access.
Many ActiveX-Controls that are frequently used in Access development are still not available for 64-bit. Yes, this is still a problem in 2017, more than 10 years after the first 64Bit Windows operating system was released.
Drivers/Connectors for external systems like ODBC-Databases and special hardware might not be available. – Though this should rarely be an issue nowadays. Only if you need to connect to some old legacy systems this might still be a factor.
And finally, Access applications using the Windows API in their VBA code will require some migration work to function properly in an x64-environment.
There is only one benefit of 64-bit Access I’m aware of. When you open multiple forms at the same time that contain a large number of sub-forms, most likely on a tab control, you might run into out-of-memory-errors on 32-bit systems. The basic problem exists with 64-bit Access as well, but it takes much longer until you will see any memory related error.
Unfortunately (in this regard) Access is part of the Office Suite as is Microsoft Excel. For Excel, there actually are use cases for the 64-Bit edition. If you use Excel to calculate large data models, e.g. financial risk calculations, you will probably benefit from the additional memory available to a 64-bit application.
So, whether you as an Access developer like it or not, you might be confronted with the 64-bit edition of Microsoft Access because someone in your or your client’s organization decided they will install the whole Office Suite in 64-bit. – It is not possible to mix and match 32- and 64-bit applications from the Microsoft Office suite.
I can’t do anything about the availability of third-party-components, so in this article, I’m going to focus on the migration of Win-API calls in VBA to 64-bit compatibility.
Migrate Windows API-Calls in VBA to 64-bit
Fortunately, the Windows API was completely ported to 64-bit. You will not encounter any function, which was available on 32-bit but isn’t anymore on 64-bit. – At least I do not know of any.
However, I frequently encounter several common misconceptions about how to migrate your Windows API calls. I hope I will be able to debunk them with this text.
But first things first. The very first thing you will encounter when you try to compile an Access application with an API declaration that was written for 32-bit in VBA in 64-bit Access is an error message.
Compile error: The code in this project must be updated for use on 64-bit systems. Please review and update Declare statements and then mark them with the PtrSafe attribute.
This message is pretty clear about the problem, but you need further information to implement the solution.
With the introduction of Access 2010, Microsoft published an article on 32- and 64-Compatibility in Access. In my opinion, that article was comprehensive and pretty good, but many developers had the opinion it was insufficient.
Just recently there was a new, and in my opinion excellent, introduction to the 64-bit extensions in VBA7 published on MSDN. It actually contains all the information you need. Nevertheless, it makes sense to elaborate on how to apply it to your project.
The PtrSafe keyword
With VBA7 (Office 2010) the new PtrSafe keyword was added to the VBA language. This new keyword can (should) be used in DeclareStatements for calls to external DLL-Libraries, like the Windows API.
What does PtrSafe do? It actually does … nothing. Correct, it has no effect on how the code works at all.
The only purpose of the PtrSafe attribute is that you, as the developer, explicitly confirm to the VBA runtime environment that you checked your code to handle any pointers in the declared external function call correctly.
As the data type for pointers is different in a 64-bit environment (more on that in a moment) this actually makes sense. If you would just run your 32-bit API code in a 64-bit context, it would work; sometimes. Sometimes it would just not work. And sometimes it would overwrite and corrupt random areas of your computer’s memory and cause all sorts of random application instability and crashes. These effects would be very hard to track down to the incorrect API-Declarations.
For this understandable reason, the PtrSafe keyword is mandatory in 64-bit VBA for each external function declaration with the DeclareStatement. The PtrSafe keyword can be used in 32-bit VBA as well but is optional there for downward compatibility.
The data types Integer (16-bit Integer) and Long (32-bit Integer) are unchanged in 64-bit VBA. They are still 2 bytes and 4 bytes in size and their range of possible values is the same as it were before on 32-bit. This is not only true for VBA but for the whole Windows 64-bit platform. Generic data types retain their original size.
Now, if you want to use a true 64-bit Integer in VBA, you have to use the new LongLong data type. This data type is actually only available in 64-bit VBA, not in the 32-bit version. In context with the Windows API, you will actually use this data type only very, very rarely. There is a much better alternative.
The LongPtr data type
On 32-bit Windows, all pointers to memory addresses are 32-bit Integers. In VBA, we used to declare those pointer variables as Long. On 64-bit Windows, these pointers were changed to 64-bit Integers to address the larger memory space. So, obviously, we cannot use the unchanged Long data type anymore.
In theory, you could use the new LongLong type to declare integer pointer variables in 64-bit VBA code. In practice, you absolutely should not. There is a much better alternative.
Particularly for pointers, Microsoft introduced an all new and very clever data type. The LongPtr data type. The really clever thing about the LongPtr type is, it is a 32-bit Integer if the code runs in 32-bit VBA and it becomes a 64-bit Integer if the code runs in 64-bit VBA.
LongPtr is the perfect type for any pointer or handle in your Declare Statement. You can use this data type in both environments and it will always be appropriately sized to handle the pointer size of your environment.
Misconception: “You should change all Long variables in your Declare Statements and Type declarations to be LongPtr variables when adapting your code for 64-bit.”
As mentioned above, the size of the existing, generic 32-bit data types has not changed. If an API-Function expected a Long Integer on 32-bit it will still expect a Long Integer on 64-bit.
Only if a function parameter or return value is representing a pointer to a memory location or a handle (e.g. Window Handle (HWND) or Picture Handle), it will be a 64-bit Integer. Only these types of function parameters should be declared as LongPtr.
If you use LongPtr incorrectly for parameters that should be plain Long Integer your API calls may not work or may have unexpected side effects. Particularly if you use LongPtr incorrectly in Type declarations. This will disrupt the sequential structure of the type and the API call will raise a type mismatch exception.
The hWnd argument is a handle of a window, so it needs to be a LongPtr. nCmdShow is an int32, it should be declared as Long in 32-bit and in 64-bit as well.
Do not forget a very important detail. Not only your Declare Statement should be written with the LongPtr data type, your procedures calling the external API function must, in fact, use the LongPtr type as well for all variables, which are passed to such a function argument.
VBA7 vs WIN64 compiler constants
Also new with VBA7 are the two new compiler constantsWin64 and VBA7. VBA7 is true if your code runs in the VBA7-Environment (Access/Office 2010 and above). Win64 is true if your code actually runs in the 64-bit VBA environment. Win64 is not true if you run a 32-Bit VBA Application on a 64-bit system.
Misconception: “You should use the WIN64 compiler constants to provide two versions of your code if you want to maintain compatibility with 32-bit VBA/Access.”
For 99% of all API declarations, it is completely irrelevant if your code runs in 32-bit VBA or in 64-bit VBA.
As explained above, the PtrSafe Keyword is available in 32-bit VBA as well. And, more importantly, the LongPtr data type is too. So, you can and should write API code that runs in both environments. If you do so, you’ll probably never need to use conditional compilation to support both platforms with your code.
However, there might be another problem. If you only target Access (Office) 2010 and newer, my above statement is unconditionally correct. But if your code should run with older version of Access as well, you need to use conditional compilation indeed. But you still do not need to care about 32/64-Bit. You need to care about the Access/VBA-Version you code is running in.
You can use the VBA7 compiler constant to write code for different versions of VBA. Here is an example for that.
PrivateConstSW_MAXIMIZEAsLong=3#If VBA7 Then PrivateDeclarePtrSafeFunctionShowWindowLib«USER32»_(ByValhwndAsLongPtr,ByValnCmdShowAsLong)AsBooleanPrivateDeclarePtrSafeFunctionFindWindowLib«USER32»Alias«FindWindowA»_(ByVallpClassNameAsString,ByVallpWindowNameAsString)AsLongPtr#Else PrivateDeclareFunctionShowWindowLib«USER32»_(ByValhwndAsLong,ByValnCmdShowAsLong)AsBooleanPrivateDeclareFunctionFindWindowLib«USER32»Alias«FindWindowA»_(ByVallpClassNameAsString,ByVallpWindowNameAsString)AsLong#End If PublicSubMaximizeWindow(ByValWindowTitleAsString)#If VBA7 Then DimhwndAsLongPtr#Else DimhwndAsLong#End If hwnd=FindWindow(vbNullString,WindowTitle)Ifhwnd<>0ThenCallShowWindow(hwnd,SW_MAXIMIZE)EndIfEndSub
Now, here is a screenshot of that code in the 64-bit VBA-Editor. Notice the red highlighting of the legacy declaration. This code section is marked, but it does not produce any actual error. Due to the conditional compilation, it will never be compiled in this environment.
When to use the WIN64 compiler constant?
There are situations where you still want to check for Win64. There are some new API functions available on the x64 platform that simply do not exist on the 32-bit platform. So, you might want to use a new API function on x64 and a different implementation on x86 (32-bit).
A good example for this is the GetTickCount function. This function returns the number of milliseconds since the system was started. Its return value is a Long. The function can only return the tick count for 49.7 days before the maximum value of Long is reached. To improve this, there is a newer GetTickCount64 function. This function returns an ULongLong, a 64-bit unsigned integer. The function is available on 32-bit Windows as well, but we cannot use it there because we have no suitable data type in VBA to handle its return value.
If you want to use this the 64bit version of the function when your code is running in a 64-bit environment, you need to use the Win64constant.
#If Win64 Then PublicDeclarePtrSafeFunctionGetTickCountLib«Kernel32»Alias«GetTickCount64»()AsLongPtr#Else PublicDeclarePtrSafeFunctionGetTickCountLib«Kernel32»()AsLongPtr#End If
In this sample, I reduced the platform dependent code to a minimum by declaring both versions of the function as GetTickCount. Only on 64-bit, I use the alias GetTickCount64 to map this to the new version of this function. The “correct” return value declaration would have been LongLong for the 64-bit version and just Long for the 32-bit version. I use LongPtr as return value type for both declarations to avoid platform dependencies in the calling code.
A common pitfall — The size of user-defined types
There is a common pitfall that, to my surprise, is hardly ever mentioned.
Many API-Functions that need a user-defined type passed as one of their arguments expect to be informed about the size of that type. This usually happens either by the size being stored in a member inside the structure or passed as a separate argument to the function.
Frequently developers use the Len-Function to determine the size of the type. That is incorrect, but it works on the 32-bit platform — by pure chance. Unfortunately, it frequently fails on the 64-bit platform.
To understand the issue, you need to know two things about Window’s inner workings.
The members of user-defined types are aligned sequentially in memory. One member after the other.
Windows manages its memory in small chunks. On a 32-bit system, these chunks are always 4 bytes big. On a 64-bit system, these chunks have a size of 8 bytes.
If several members of a user-defined type fit into such a chunk completely, they will be stored in just one of those. If a part of such a chunk is already filled and the next member in the structure will not fit in the remaining space, it will be put in the next chunk and the remaining space in the previous chunk will stay unused. This process is called padding.
Regarding the size of user-defined types, the Windows API expects to be told the complete size the type occupies in memory. Including those padded areas that are empty but need to be considered to manage the total memory area and to determine the exact positions of each of the members of the type.
The Len-Function adds up the size of all the members in a type, but it does not count the empty memory areas, which might have been created by the padding. So, the size computed by the Len-Function is not correct! — You need to use the LenB-Function to determine the total size of the type in memory.
On 32-bit the Integer is two bytes in size but it will occupy 4 bytes in memory because the Long is put in the next chunk of memory. The remaining two bytes in the first chunk of memory are not used. The size of the members adds up to 10 bytes, but the whole type is 12 bytes in memory.
On 64-bit the Integer and the Long are 6 bytes total and will fit into the first chunk together. The LongPtr (now 8 bytes in size) will be put into the net chunk of memory and once again the remaining two bytes in the first chunk of memory are not used. The size of the members adds up to 14 bytes, but the whole type is 16 bytes in memory.
So, if the underlying mechanism exists on both platforms, why is this not a problem with API calls on 32-bit? — Simply by pure chance. To my knowledge, there is no Windows API function that explicitly uses a datatype smaller than a DWORD (Long) as a member in any of its UDT arguments.
With the content covered in this article, you should be able to adapt most of your API-Declarations to 64-bit.
Many samples and articles on this topic available on the net today are lacking sufficient explanation to highlight the really important issues. I hope I was able to show the key facts for a successful migration.
Always keep in mind, it is actually not that difficult to write API code that is ready for 64-bit. — Good luck!