
crawdstrike. Why?


July 19th, 2024. Windows machines around the world crashed.


We all know there was a bug in the latest CrowdStrike update that affected many companies around the world.

How did it happen? Why didn’t we just get an error pop-up instead of a blue screen?

The short answer: CrowdStrike Falcon runs in “Kernel mode”.

What is Kernel Mode?

In the beginning, the Microsoft operating system (MS-DOS) was totally unrestricted. Any program could read or write all of the memory, or access any file on the disk, at will.

That led to the development of technologies to protect the data and the stability of the system.

  • For logical file access, Microsoft implemented NTFS, whose permissions are enforced by the operating system.
  • For memory access, protection is enforced by the hardware.

The first time I heard about hardware-protected mode was when reading the "Assembler for Intel 80286 processor" book. The last chapter was very hard for the young me to understand. It explained how to switch to and operate in protected mode, the CPU feature that kernel mode is built on.

When the CPU is turned on, it starts in real mode (compatible with legacy software).

When a program (usually the OS kernel) asks the processor to switch to protected mode, the kernel can run programs and assign each one a virtual memory address space. That means the program sees its assigned memory as if it were physical memory starting from address zero, while every memory segment actually points to the block assigned to that program.

The operating system can then move these memory blocks to the page file (on disk) based on usage statistics, but that’s another chapter.
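The address translation described above can be sketched in a few lines. This is a simplified model, assuming one segment per program with a base and a limit; the names `Segment`, `translate`, and `ProtectionFault` are made up for illustration, and a real CPU does this in hardware with descriptor and page tables.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base: int   # physical address where the program's block starts
    limit: int  # size of the block in bytes


class ProtectionFault(Exception):
    """Raised when a program touches memory outside its assigned block."""


def translate(segment: Segment, virtual_address: int) -> int:
    """Map a program's address (which starts at zero) to a physical address."""
    if not 0 <= virtual_address < segment.limit:
        raise ProtectionFault(f"address {virtual_address:#x} is outside the segment")
    return segment.base + virtual_address


# Two hypothetical programs; each one sees its memory starting at address zero.
program_a = Segment(base=0x10000, limit=0x4000)
program_b = Segment(base=0x20000, limit=0x4000)

print(hex(translate(program_a, 0x0)))  # 0x10000
print(hex(translate(program_b, 0x0)))  # 0x20000
```

Both programs use address zero, yet they end up in different physical blocks, and any access beyond the limit is rejected instead of reaching another program's memory.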

CrowdStrike Falcon

Falcon is not just an antivirus that analyzes files; it also scans memory. To do that, it needs to run in kernel mode.

Even though there’s no hardware involved, CrowdStrike wrote a device driver so that Falcon can be loaded into kernel space.

For device drivers that need to run in kernel mode, Microsoft has the WHQL certification. Because code running in kernel mode can affect system stability, getting that certification involves passing several tests.

Once you get the certification, Microsoft signs your driver so that Windows can authenticate it.

Any hardware vendor that updates its drivers submits the test logs to Microsoft in order to get the new code signed by Microsoft.

So… How did it happen?

In this case, CrowdStrike wants to provide protection against new zero-day exploits. In order to bypass the “slow” Microsoft process, it seems the certified driver loads its updates from “data files” (CrowdStrike calls them channel files) that do not go through certification. The driver reads and interprets their content at run time, so a bad data file can change what the kernel-mode code does.

The bug

It seems that the Falcon driver referenced an invalid memory address. That access is caught by the CPU, which sends a critical notification (an exception, a kind of interrupt) to the OS.

The OS detects that the fault did not come from a user-mode program, so letting the system continue running would be very dangerous, and it triggers the blue screen.
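The difference between the two outcomes can be sketched like this. It is only a toy model, assuming the OS knows which mode the faulting code was running in; `Mode` and `handle_invalid_access` are hypothetical names, not Windows APIs.

```python
from enum import Enum


class Mode(Enum):
    USER = "user"
    KERNEL = "kernel"


def handle_invalid_access(mode: Mode, component: str) -> str:
    """Decide what happens after the CPU reports an invalid memory access."""
    if mode is Mode.USER:
        # Only that process is affected, so it can be killed safely.
        return f"terminate '{component}' and show an error pop-up"
    # Kernel-mode code shares memory with the OS itself, so nothing can be trusted.
    return f"stop the whole system (blue screen) because '{component}' runs in kernel mode"


print(handle_invalid_access(Mode.USER, "notepad.exe"))
print(handle_invalid_access(Mode.KERNEL, "falcon driver"))
```

A crashing user-mode program only takes itself down; the same bug inside a driver takes the whole machine with it.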

Why isn’t Windows just disabling the offending driver on the next boot?

In this case, the driver is set as a boot-start driver. These drivers are identified as critical and are not allowed to be disabled, like the one Windows uses to read the disk.

The only way to fix the issue is to delete the offending files from the disk before the OS boots normally.
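Here is a minimal sketch of why the machine keeps crashing until the file is removed. The numeric values follow the start types Windows documents for driver services (0 = boot, 4 = disabled); the `boot` function itself is a made-up simulation, not how the Windows loader is implemented.

```python
from enum import IntEnum


class StartType(IntEnum):
    BOOT = 0      # loaded by the boot loader, before most of the OS is running
    SYSTEM = 1
    AUTO = 2
    DEMAND = 3
    DISABLED = 4


def boot(start_type: StartType, bad_file_on_disk: bool) -> str:
    """Simulate one boot attempt with a driver that crashes on a bad file."""
    if start_type is StartType.DISABLED:
        return "booted (driver not loaded)"
    if bad_file_on_disk:
        return "blue screen"
    return "booted"


# A boot-start driver is loaded on every attempt, so every attempt crashes.
attempts = [boot(StartType.BOOT, bad_file_on_disk=True) for _ in range(3)]
print(attempts)  # ['blue screen', 'blue screen', 'blue screen']

# Once the bad file is deleted, the same driver loads fine.
print(boot(StartType.BOOT, bad_file_on_disk=False))  # booted
```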

– Did CrowdStrike do it wrong?

– Should it have let clients decide how quickly to receive the updates?

– Should it have tested the update first?

– We can all make mistakes, like I did in the title

– Draw your own conclusions.
