BIOS NX/XD, Windows Data Execution Prevention (DEP) & LoadLibraryA & Error 998

So I noticed a very subtle use of LoadLibrary, namely LoadLibraryEx with the flag to load the exe/dll as a data file in some code posted by @also here

Last night, I kept getting intermittent Error 998s with LoadLibraryA
System Error Codes (500-999) (WinError.h) - Win32 apps | Microsoft Learn

ERROR_NOACCESS
998 (0x3E6)
Invalid access to memory location.

I knew this worked, I knew the code had not been changed, and it’s worked reliably for months. After all, how many ways can you mess up LoadLibraryA?

This was intermittent, probably because of the speed I was working at. Anyway, what it appears to be, though I could be wrong, is the /NXCOMPAT switch, aka Data Execution Prevention or DEP.
/NXCOMPAT (Compatible with Data Execution Prevention) | Microsoft Learn

Now using LoadLibraryA to load the Clarion 11\bini\c4otSDX.dll, aka the developer’s edition of the ODBC driver add-on, causes the SV window to pop up, reminding everyone it’s the developer edition. However, using LoadLibraryExA with the LOAD_LIBRARY_AS_DATAFILE flag means the SV ODBC developer edition window doesn’t appear on screen, which was the first clue that the two calls load a DLL or EXE differently. I code on a virtual XP with C6 for expediency before porting to C11, until I’m up to speed with the C11 IDE.

So this got me thinking, what else is LoadLibraryA calling? Could this be triggering Data Execution Prevention and thus causing the intermittent error 998 I’m seeing?

Bit of digging turns up
Data Execution Prevention - Win32 apps | Microsoft Learn

Data Execution Prevention (DEP) is a system-level memory protection feature that is built into the operating system starting with Windows XP and Windows Server 2003. DEP enables the system to mark one or more pages of memory as non-executable. Marking memory regions as non-executable means that code cannot be run from that region of memory, which makes it harder for the exploitation of buffer overruns

Part 3: Memory Protection Technologies | Microsoft Learn

Beginning with Windows XP Service Pack 2, the 32-bit version of Windows utilizes the no-execute page-protection (NX) processor feature as defined by AMD or the Execute Disable bit feature as defined by Intel. In order to use these processor features, the processor must be running in Physical Address Extension (PAE) mode. The 64-bit versions of Windows XP uses the NX processor feature on 64-bit extensions and certain values of the access rights page table entry (PTE) field on IPF processors.

So the Execute Disable bit is something mentioned in computer BIOSes, certainly earlier ones, but it’s not always visible in the latest UEFI BIOSes.
Execute Disable Bit for Intel® Processors

Intel call it XD in the BIOS, short for Execute Disable.
AMD call it NX in the BIOS, short for No Execute.

It’s recommended that this BIOS option is switched on or enabled if your BIOS or your customer’s BIOS has it. It’s an old setting not always seen in newer UEFI BIOS versions. Intel consider it legacy, and their wording suggests it still exists but won’t be mentioned.

But then you also need Windows DEP to be switched on to see these LoadLibrary 998 errors. If either one is not switched on or enabled, you won’t see the 998 error. If it’s not a BIOS option, then it needs to be switched on in Windows:

XP, Control Panel, System Icon, Advanced Tab, Performance Setting button, Data Execution Prevention tab, toggle option accordingly & Reboot

Win10, Start button, gear icon called Settings, Update & Security, Windows Security, Open Windows Security, App & Browser control, Exploit Protection Settings, System Settings, Data Execution Prevention dropdown list, toggle accordingly & Reboot

One of the things I noticed, despite running this on a virtual PC, namely XP, is that even rebooting the virtual PC didn’t stop the behaviour! So I suspect DEP on the host, i.e. my Win10 machine, is affecting the virtual PC, and it was switched on by default.

But here it gets weird. This morning I was again reliably able to get the 998 error by triggering an access violation. Switched off DEP in Win10, 998 no longer appears. Switch off DEP in XP, 998 still doesn’t appear. Switch DEP back on in XP to see if the virtual PC is running fully independent of the Win10 host, trigger the access violation, can’t get the 998 to appear. Switch on DEP in Win10, repeat the access violation and it still doesn’t appear.

So not only is this an intermittent problem, its a standalone pc and whilst I’m aware that RAM chips can be made to transmit over wifi frequencies over a range of 180cm with some malware for one way snooping purposes, I’m fairly certain this pc is properly air gapped!

So I’m now wondering if there is some other malware I’ve yet to load in my resource editor.

TLDR: I thought DEP turned an access violation into Windows error 998 when using LoadLibraryA, found it to be reproducible, but then found it’s not! :thinking:

Edit. Sometimes being a dog has its advantages!

Error 998 means that the Windows kernel caught the access violation exception caused by a mismatch between the type of memory access operation and the protection settings of the accessed memory region. Examples:

  • an attempt to access memory at an address below 64K
  • an attempt to access memory outside any allocated region
  • an attempt to execute code stored in a memory block having none of the PAGE_EXECUTE, PAGE_EXECUTE_READ, PAGE_EXECUTE_READWRITE or PAGE_EXECUTE_WRITECOPY flags in its protection attribute.

DEP is only one of the possible scenarios.

err998.clw (7.9 KB)
The attached program is a modification of the test for enumerating resources in an EXE/DLL. It must cause error 998 on execution. This is because the EnumResTypes function is invoked with the second parameter equal to MAKEINTRESOURCE(n). The code generated to store a CONST *CSTRING parameter to a local of &CSTRING type involves a call to the RTL’s _nullstrlen function. _nullstrlen handles a NULL parameter, but a pointer equal to MAKEINTRESOURCE(n) will raise the access violation exception.
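To make the MAKEINTRESOURCE(n) point concrete, here is a minimal portable C sketch. It assumes nothing about the Clarion RTL: the MY_ macros are local stand-ins modelled on the Win32 macros (not the real winuser.h definitions), and resource_name_length and resource_id are hypothetical helpers invented for illustration. The idea is that an integer resource is a "pointer" whose value is below 64K, so it has to be tested before anything treats it as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins modelled on the Win32 macros. The MY_ names are
   illustrative only and are not part of any SDK header. */
#define MY_MAKEINTRESOURCE(i) ((const char *)(uintptr_t)(uint16_t)(i))
#define MY_IS_INTRESOURCE(p)  ((((uintptr_t)(p)) >> 16) == 0)

/* Hypothetical helper: returns the string length for a genuine name,
   or -1 when the "pointer" is really an integer ID (or NULL). */
long resource_name_length(const char *name)
{
    if (MY_IS_INTRESOURCE(name)) {
        return -1;              /* value below 64K: an ID, not an address */
    }
    return (long)strlen(name);  /* safe: a real string pointer */
}

/* Hypothetical helper: extracts the numeric ID from an integer resource. */
unsigned resource_id(const char *name)
{
    assert(MY_IS_INTRESOURCE(name));   /* only valid for integer IDs */
    return (unsigned)((uintptr_t)name & 0xFFFFu);
}
```

A string-length routine that only checks for NULL would go on to read from address n here, which is the kind of access that ends up reported as 998.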

Where do you get this information from? I can’t find this level of info anywhere and I’ve certainly spent years trying to find this sort of info.

Thanks for the update.

Edit.

Looking at what I was doing last night, which was accessing the Resource Name ID’s which are in the 64K region, this might have triggered it, namely trying to access the multitude of icons and cursors in the Resource Type RT_Group_Cursor and RT_Group_Icon sections. ie I was seeing if I could display more of the icons.

So what you have put above makes sense.

Just using a long with a 64K value in it instead of the MakeIntResource macro had worked, but I also created a procedure called MakeIntResource which took the Long/Ulong* and returned a Ushort, in case there was some internal Clarion runtime thing taking place.

MAKEINTRESOURCEA macro (winuser.h) - Win32 apps | Microsoft Learn
c++ - Why does MAKEINTRESOURCE() work? - Stack Overflow

*Some APIs which work with EnumResourceNames show ULONGs and others LONGs. There’s a bit of a mismatch in MS’s docs.

Edit.

I’m not too sure, because I’m getting plagued by 998’s again now and all I’ve done is shutdown and restart the computer, call LoadLibrary, and then call EnumResourceTypesA and it throws a 998 error, every time!

DEP is switched on on the host Win10 machine and the virtual XP machine. The other change I have noticed is with the Clarion Debugger. It usually shows a source code window in the top left pane, but it’s stopped doing this and now shows nothing when I toggle the DEP settings. Weird to say the least, and with the other problems with this computer it makes me think this machine is not air gapped properly!

Edit.

I have DEP switched on on the Win10 host and the virtual XP OS and it’s still throwing 998 errors, so it’s looking like malware.

There’s got to be some malware or something being loaded with LoadLibraryA which isn’t being loaded with LoadLibraryExA, and it’s totally random when it strikes, i.e. I can reboot the physical and virtual OSes and it will strike straight away, throwing a 998 with DEP switched off or on.

Ergo there has to be some malware messing around with this computer that LoadLibraryA is triggering, assuming there isn’t some stealth means to mess with this computer that bridges the air gap.

It needs to be compiled in Debug configuration or with Debugging information min or full to raise 998 here.

FWIW, deduced from your neat explanation: the previous test.clw with the LONG parameter on EnumResTypes can raise 998 too, by changing for example Q &= 0 + param to Q &= 0 !+ param, or by adding a PEEK(0,A#). This raises it with and without debug information.

Where do you get the A# implicit variable from in PEEK(0,A#)? It’s a non-command because 0 is a 0 address and it’s peeking into an A# implicit variable, so where do you get that from?

Edit.

I’m beginning to think this website is a sock puppet website run by the British Security Services, like this.
www.theguardian.com

Once again: if the called API function needs to invoke a passed callback or to access some memory-allocated data object (e.g. a buffer) passed to the call, Windows

  1. protects such actions with an exception handling frame
  2. if an exception occurs and is caught, the exception handler terminates execution of the function and
    sets the error code returned by GetLastError according to the exception parameters
  3. if the exception code is C0000005 and the reason for the exception is accessing memory in a bad manner (e.g. writing to a read-only block) or at a wrong address, the error code is set to 998.

So my code in the callback procedure is directly adding data to a global Q. I’m not passing a reference to the Q like you do using the lParam.

However I do use this technique of passing a class reference to a callback procedure in the lParam defined in a class, in order to access the class properties and methods.

However, the compiler doesn’t complain when I directly use the global Q in the callback procedure, and this works intermittently, i.e. it’s worked for days and I’ve used this technique elsewhere previously. So are we not supposed to use global Q’s or global variables in callback procedures?

Richard, address 0 was chosen to intentionally throw the exception, and A# was chosen just to make it simpler.

Accessing a global queue can’t itself be the reason for an exception. Something else is wrong.

The compiler has no reason to complain.

This is what I think and I’ll get to the bottom of it.

Edit.

Getting 998 with LoadLibraryExA and Load_Library_As_DataFile

So I’ll try passing the Q as a reference in the lParam and see if that shows up any 998 errors. This is with DEP switched off on the virtual XP machine and the host Win10 machine.

Edit. So I’m getting an access violation with the line of code
Select(?SheetResource,4) !Bitmap
in this app.

These problems started when I reinstalled Clarion 11 on this machine. Beforehand, it wasn’t playing up like this. I’ve installed Clarion 11 off a CD, reinstalled the 2013 C++ redistributable, and now I’m just getting intermittent, inconsistent problems.

The LoadLibrary call and the callback shouldn’t give a problem, just like this line of code selecting a tab shouldn’t cause an access violation, and currently all the Windows security options are switched on.

Is this code in a callback function?

EnumResType in my test program can raise the exception if passed type is a pseudo-string produced by MAKEINTRESOURCE with some non-standard RT_* parameter. Corrected text is in attachment.
test.clw (8.1 KB)

No its not code in a callback function.

I’ll give this a go and see what happens, but I’m cutting back code and commenting out code in order to restore some stability and get to the bottom of the intermittent problems.

Thanks!

I can’t imagine how LoadLibraryEx with the LOAD_LIBRARY_AS_DATAFILE flag can cause an exception which is then mapped to error 998. Which EXE/DLL are you trying to load?

I had swapped the Peek(type,GloQ.Cstringvalue) with Glo:Q.CstringValue = Addr2String(type) when I’d seen your test.clw, and the Addr2String was causing the problems. I’ve put it back to Peek(… and its working again. I’m guessing there is no datatype conversion or something else going on when I use Addr2String().

Addr2CString (Cla$PushCString) can cause a GPF in 2 cases:

  1. The value passed as a parameter is not a valid address
  2. The passed address points to a string that is not 0-terminated

PEEK must cause a GPF in case (1) too. Because PEEK copies SIZE(destination variable) bytes from the memory pointed to by the passed address, the value of the destination CSTRING variable would be corrupted in case (2).

Cla$PushCString pushes characters from the passed address up to the first 0 character onto the string stack. This value is popped from the stack into a destination variable if Addr2CString is used on the right side of an assignment. Popping from the string stack can cause a GPF only if the destination address is wrong.
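The difference between the two copy styles described above can be modelled in portable C. This is only an analogy under stated assumptions: peek_like, cstring_copy_like and is_terminated are hypothetical names, not Clarion RTL functions, and the src_limit bound is something added here so the sketch never reads out of bounds (the description above mentions no such limit).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Models a fixed-size copy as described for PEEK: copy exactly
   dest_size bytes, wherever (or whether) a 0 terminator occurs in src. */
void peek_like(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);
    memcpy(dest, src, dest_size);
}

/* Models a copy "up to the first 0 character". src_limit is an
   assumption of this sketch so the scan stays in bounds.
   Returns 1 if a terminated string was copied, 0 otherwise. */
int cstring_copy_like(char *dest, size_t dest_size,
                      const char *src, size_t src_limit)
{
    const char *end = memchr(src, '\0', src_limit);
    if (end == NULL) {
        return 0;                   /* case (2): no terminator found */
    }
    size_t len = (size_t)(end - src);
    if (len + 1 > dest_size) {
        return 0;                   /* would not fit in the destination */
    }
    memcpy(dest, src, len + 1);     /* copy the text plus its terminator */
    return 1;
}

/* Returns 1 if buf holds a 0 terminator somewhere within size bytes. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

With a 4-byte source holding 'A','B','C','D' and no terminator, peek_like fills a 4-byte destination that is_terminated then reports as unterminated, while cstring_copy_like refuses the copy and returns 0.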

At the moment I’ve been using Peek in the callback and also to load the manifest file from the resource’s #1 for exe, #2 for Dll into a cstring.

But I am also getting api calls with no handles returned, like with CreateIconFromResource(Ex) and no windows error either, ie GetLastError returns 0! How does that work in windows?

I’m using
Assert(Loc:Handle/Loc:Whatever,'API whatever failed Windows Error' & GetLastError())
after every API call so I know straight away when the API has failed. It’s so intermittent: code that previously worked now doesn’t. It’s an absolute mystery what is going on. I’ve rebooted the virtual PC and rebooted the main Win10 host in a bid to clear any potential instability. I’m using code translated from Stack Overflow; I get it working and then the next day it stops working. I have not got a clue what is going on with the machine!

At the moment, I have the manifests embedded in resources working, i.e. it loads them and puts the contents into a cstring and text control. Some manifests in some of the exe’s/dll’s have the file signature EF BB BF, which is the UTF-8 byte order mark (List of file signatures - Wikipedia), before the start of the manifest text <?xml version=“1.0”… . Because I didn’t know if this file signature was causing a problem with the text control, I’ve added code to strip it out just in case, but at the moment it doesn’t matter; the text control displays the file signature, which is what I would prefer anyway.
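For reference, stripping the EF BB BF signature amounts to a three-byte comparison at the start of the buffer. A minimal sketch in portable C, where skip_utf8_bom is a hypothetical helper name and the manifest bytes are assumed to be already loaded into memory with a known length:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if data starts with the UTF-8 signature
   EF BB BF, returns a pointer just past it and reduces *len by 3.
   Otherwise returns data unchanged and leaves *len alone. */
const char *skip_utf8_bom(const char *data, size_t *len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    assert(data != NULL && len != NULL);
    if (*len >= 3 && memcmp(data, bom, 3) == 0) {
        *len -= 3;
        return data + 3;
    }
    return data;
}
```

The length check comes first so that a manifest shorter than three bytes is never compared past its end.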

TLDR I dont know WTF is going on with this computer, but I dont think David A Bayliss could criticize the defensive programming employed in this app.

Could you post the Clarion prototypes of the APIs used?

It’s what I’m doing at the moment, double checking them all.

Question. Does Clarion allow windows API overloading?

So just like we can do procedure overloading by changing the parameter prototypes for procedures we create in Clarion, can we do the same with Windows API’s?

I have some windows API’s which are procedure overloaded, in that they have different parameter prototypes eg:

IconInfo Group,Type
fIcon      long
xHotSpot   ulong
yHotSpot   ulong
hbmMask    ulong
hbmColor   ulong
         End

IS_GetIconInfo(long iconhandle, long lpIconInfoStruct),bool,raw,pascal,name('GetIconInfo')
IS_GetIconInfo(long iconhandle, *IconInfo),bool,raw,pascal,name('GetIconInfo')

Loc:IconInfo Group(IconInfo)
             End

     Code
Loc:RVBool = IS_GetIconInfo(Loc:Handle,Address(Loc:IconInfo))
DebugView('Loc:RVBool[' & Loc:RVBool &'] = IS_GetIconInfo(Loc:Handle[' & Loc:Handle &'],Address(Loc:IconInfo)[' & Address(Loc:IconInfo) &'])')
Assert(Loc:RVBool,'GetIconInfo failed - Windows Error ' & IS_GetLastError() )

Loc:RVBool = IS_GetIconInfo(Loc:Handle,Loc:IconInfo)
DebugView('Loc:RVBool[' & Loc:RVBool &'] = IS_GetIconInfo(Loc:Handle[' & Loc:Handle &'],Loc:IconInfo[' & Address(Loc:IconInfo) &'])')
Assert(Loc:RVBool,'GetIconInfo failed - Windows Error ' & IS_GetLastError() )

GetIconInfo function (winuser.h) - Win32 apps | Microsoft Learn

The other question I have is about the first API prototype, with the long lpIconInfoStruct. I’ve added Raw as a safeguard to not send the string size, in case the Clarion runtime is somehow recognising that the address is being used with an external API, but does anyone know if I need to add Raw if I’m using Address()?

I know I can add Raw to API prototypes even when it’s not necessary, and it won’t cause any problems, but I’m wondering just how much the Clarion runtime can and can’t do.

The other question I have is, where an EX version of the api exists, and where an EX version does not exist eg LoadResource
LoadResource function (libloaderapi.h) - Win32 apps | Microsoft Learn

Can I use the LoadLibraryExA with LoadResource,
LoadLibraryExA function (libloaderapi.h) - Win32 apps | Microsoft Learn
or do I have to use the non EX version, called LoadLibraryA
LoadLibraryA function (libloaderapi.h) - Win32 apps | Microsoft Learn

TIA

The compiler does not know about the Windows API. If you use the NAME attribute in a prototype, the label of the procedure can be completely different from the parameter of NAME.

The RAW attribute instructs the compiler to pass only the address part of the actual parameter. If all formal parameters have a simple numeric type (e.g. LONG, BYTE) or only the address of a value is passed (e.g. CONST *CSTRING), the RAW attribute is not required.

LoadLibraryEx with the LOAD_LIBRARY_AS_DATAFILE flag (or LOAD_LIBRARY_AS_DATAFILE_EXCLUSIVE, or LOAD_LIBRARY_AS_IMAGE_RESOURCE) is far preferable because the loader does not resolve imports, does not invoke initialization of the DLL’s statics, and skips the other actions performed by LoadLibrary.

I appreciate the clarification, but now even OutputDebugStringA, aka DebugView, has stopped working.

I think its time to wipe and reinstall it again. Its only been reinstalled since 22nd July and its been offline all that time, bar USB sticks being used to copy the Win10 SDK onto it and copying screen shots off of the machine to post on here.