PHD Computer Consultants Ltd
Writing Windows NT Device Drivers
Last modified: 8 August 1997.

Kicking ass in NT's Underwear

This article is a short introduction to writing device drivers for Windows NT. If you are still keen after reading this article, then I recommend Art Baker's The Windows NT Device Driver Book which I have also reviewed.

Before you start, ask whether you really need to write a device driver. There may be other tools or techniques which achieve the same end. I have been told that there are generic device drivers available which you can customise with a script.
Talking to your proprietary dongle or a graphics card will usually need a device driver.

Hardware

OK... so you have got some serious bit twiddling to do. Your first job will be to understand fully how the hardware works. If you are talking to another device at the far end of a link, try to get its source code as well. If they have not done it yet, get your hardware engineers to finalise their spec!

Standard parallel and serial ports have a pretty well defined interface. However, your device at the other end may not! There are several flavours of parallel port, configurable in the BIOS set up, eg unidirectional or bidirectional. Unidirectional parallel ports can be made readable by reading data 4 bits at a time via the status port.
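
As a rough sketch of the 4-bits-at-a-time technique, the following plain C fragment assembles one byte from two reads of the status port. It assumes, purely for illustration, that the far end presents each nibble on status bits 3 to 6; the real bit assignment depends on your cable and device. The function names are hypothetical, and in a driver the two status values would come from port reads.

```c
#include <assert.h>

/* Assumed wiring (hypothetical): the 4 data bits arrive on status bits 3..6. */
#define DONG_STATUS_NIBBLE_SHIFT 3
#define DONG_STATUS_NIBBLE_MASK  0x0F

/* Extract the 4-bit nibble from one raw status port value. */
unsigned char DongNibbleFromStatus(unsigned char status)
{
    return (unsigned char)((status >> DONG_STATUS_NIBBLE_SHIFT) &
                           DONG_STATUS_NIBBLE_MASK);
}

/* Combine two status reads (low nibble first) into one data byte. */
unsigned char DongByteFromStatusPair(unsigned char lowStatus,
                                     unsigned char highStatus)
{
    unsigned char low = DongNibbleFromStatus(lowStatus);
    unsigned char high = DongNibbleFromStatus(highStatus);

    /* Sanity check: each half must fit in 4 bits. */
    assert(low <= 0x0F && high <= 0x0F);

    return (unsigned char)((high << 4) | low);
}
```

The handshake that tells the far end when to present each nibble is left out; it is specific to the device.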

I/O devices may be on standard PC motherboards or be on one of the various types of bus. A device may need to be serviced by the processor, or DMA transfers can be set up. Perhaps the card can even act as a bus master itself.

Hardware may or may not produce interrupts, and signals may be level sensitive or edge triggered. You may have to allow time for data values to become stable on output lines before toggling a signal wire. You may need to de-bounce very slow inputs. Devices have a habit of timing out, running out of paper, not being turned on, causing framing errors, CRC errors, overrunning their buffers, etc., etc.
I have not looked at Plug and Play, but presumably you have got to cope with configuring your device dynamically.
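
De-bouncing can be sketched in a few lines of plain C. This is one simple approach among several, not a prescribed method: a change of level is accepted only after a given number of consecutive samples agree. The structure and function names are hypothetical, and how often you sample is up to your hardware.

```c
#include <assert.h>

/* Hypothetical de-bounce state for one slow input line. */
typedef struct {
    int stableState;     /* last accepted level, 0 or 1 */
    unsigned count;      /* consecutive samples that disagree with it */
    unsigned threshold;  /* disagreeing samples needed to accept a change */
} DONG_DEBOUNCE;

void DongDebounceInit(DONG_DEBOUNCE *d, int initialState, unsigned threshold)
{
    assert(threshold > 0);
    d->stableState = initialState ? 1 : 0;
    d->count = 0;
    d->threshold = threshold;
}

/* Feed one raw sample; returns the de-bounced level. */
int DongDebounceSample(DONG_DEBOUNCE *d, int raw)
{
    raw = raw ? 1 : 0;

    if (raw == d->stableState) {
        /* Input agrees with the accepted level: forget any pending change. */
        d->count = 0;
    } else if (++d->count >= d->threshold) {
        /* Enough consecutive samples at the new level: accept it. */
        d->stableState = raw;
        d->count = 0;
    }
    return d->stableState;
}
```

With a threshold of 3, a single stray sample is ignored and the output only changes after three samples in a row at the new level.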

However, by the time you have got the driver infrastructure set up, talking to the electronics might seem the easy part.

Preparation

Arm yourself with a suitable compiler (eg Visual C++) and get a Microsoft Developer Network Professional Subscription. Hunt down the Platform SDK and NT DDK CD-ROMs and install them, source and all. There are two batch files to run to set the environment variables properly. Check you can compile the DDK examples.

Brace yourself for a development style that went out with the ark: the command line. In fact you might need a brace of NT computers for deep debugging.
Most people will just write their driver in plain C and use Microsoft's command line utility build. "Free" (retail) and "checked" (debug) versions of your driver can be built. The rebase utility will strip out all remaining symbols from your final release retail version.

A Microsoft technical note says that you can write in C++ and debug from the Visual C++ IDE.

You really need to understand an awful lot before you can write code. And you need to design your driver well from the start. Remember that data areas can be accessed by several different parts of your driver, which may be running at more or less the same time (or even at the same time if in a multi-processor machine). Several Win 32 programs could fire off lots of overlapped I/O requests. So think "re-entrant" and avoid global variables. And remember that all strings must be in Unicode.

Your driver design might not take the expected course. The NT parallel port printer driver does not use interrupts for example. The parallel port interrupts often conflict with other devices, so it uses a system thread to poll the printer.

Before you get started proper, it is worth while tuning your NT start up. Like it or not, you will be rebooting your computer quite a lot, so cut out anything that is unnecessary. Hopefully you will not see the "blue screen of death" too often. However, it is easy to get device names left behind when a test driver unloads. And you will be fiddling with the registry quite a bit; some changes are only recognised at boot time. However, once you are past the initial stages, you should be able to install new versions of your driver using the Control Panel Drivers window.

Do not be afraid to pinch techniques from existing code: that's what all the DDK source is there for.

Background

A driver is a trusted part of the NT kernel. In between the kernel and user programs is the Win 32 sub-system.

To Win 32 programmers, your driver will appear as one or more "file" devices, eg they will open a device file, read and write and then close the handle. You can also handle DeviceIoControl() requests to do any sort of I/O you wish.

If you are writing a DongLpt driver for a dongle on the Parallel port, then you might provide devices "\\.\DongLpt1" for the first parallel port, "\\.\DongLpt2" for the second, etc. (If you want to support old Win 16 programs that access "LPT1", etc., see box.)
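
Here is a minimal user-mode sketch of this numbering. It assumes that the Win 32 names count from one while the kernel device behind each name is numbered from zero, which is the usual NT convention. The helper names are hypothetical and exist only for illustration; a real driver builds these names as Unicode strings.

```c
#include <assert.h>
#include <stdio.h>

/* Format the Win 32 name, eg \\.\DongLpt1, for a one-based port number.
   Returns the name length, or -1 if the buffer is too small. */
int DongFormatWin32Name(char *buf, size_t size, unsigned long win32Number)
{
    int n;

    assert(win32Number >= 1);
    n = snprintf(buf, size, "\\\\.\\DongLpt%lu", win32Number);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Format the matching kernel name, eg \Device\DongLpt0.
   The kernel number is one less than the Win 32 number.
   Returns the name length, or -1 if the buffer is too small. */
int DongFormatKernelName(char *buf, size_t size, unsigned long win32Number)
{
    int n;

    assert(win32Number >= 1);
    n = snprintf(buf, size, "\\Device\\DongLpt%lu", win32Number - 1);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Note the doubled backslashes: they are only C string escapes, so the first function produces the two leading backslashes that a Win 32 program would pass to CreateFile().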

"\\.\DongLpt1" is just a Win 32 symbolic link to the real hidden NT kernel device name "\Device\DongLpt0". Note that kernel device numbers are zero-based by convention.

Drivers are controlled by the I/O Manager and talk to the electronics using NT's Hardware Abstraction Layer (HAL).

More sophisticated drivers fit into a hierarchy. For example, the "parport" low level driver exists simply to arbitrate between access requests for the parallel ports. The "parallel" printer class driver sits on top; it uses "parport" to get exclusive use of the parallel port but then talks to the hardware directly. If your driver talks to the parallel port then it should use the same technique. The documentation recommends that the parallel port is just reserved on a per-IRP basis (ie for each read and write). However this does not seem appropriate for some applications, so you might want to allocate a port when a device is opened and release it when the handle is closed.
There is documentation for the "parport" driver interface. However only by looking at its source code will you find an extra internal device I/O control request.

Other I/O areas also are arranged in layers. NT's generic SCSI port driver does its job using SCSI mini-port drivers. To write a driver for your new SCSI card, you do not need to write a whole SCSI interface, just the documented interface defined for a mini-port driver.

A similar approach applies to video card drivers. The internals of file system drivers seem to be an undocumented wasteland.

"Filter" drivers can sit unseen above a driver, intercepting all I/O requests. A filter driver could, for example, transparently add compression or encryption without altering the underlying driver design.

Foreground

Your driver has one standard entry point, DriverEntry(). This discovers any hardware, creates any devices and loads up a table of other entry points.

Apart from the DriverUnload() routine, all the other main calls will be the result of I/O Request Packets (IRPs), discussed below. You can set up other call back routines; apart from interrupt handlers, there are deferred procedure calls, completion and cancel routines.

A useful convention is for all your driver routines to have a common name prefix, eg DongDispatchOpen(), DongDispatchWrite(), etc.

For each routine, it is worth while noting carefully which interrupt level it may run at. All the IRP dispatch routines run at PASSIVE_LEVEL while interrupt routines run at one of the processor-specific DIRQLs. In between, deferred procedure calls run at DISPATCH_LEVEL.

You cannot use any standard C libraries. Instead there are many kernel routines in various groups as identified by their initial letters:
DDK Kernel Support Routine Categories
Prefix      Category
Ex...()     Executive Support
Hal...()    Hardware Abstraction Layer
Io...()     I/O Manager
Ke...()     Kernel
Mm...()     Memory Manager
Ob...()     Object Manager
Ps...()     Process Structure
Rtl...()    Runtime Library
Se...()     Security Reference Monitor
Zw...()     Other routines (kernel-mode versions of native system services)

In addition, there are Hardware Abstraction Layer routines such as READ_PORT_UCHAR(), which reads a byte from a port, and READ_PORT_BUFFER_UCHAR(), which reads a run of bytes from a port into a buffer.

As another example, ExAllocatePool() allocates memory, with one of its parameters specifying the memory type, eg Non-paged or Paged, in various guises.

There are different things you can and cannot do at each interrupt level. If running at DISPATCH_LEVEL or higher, you must not touch paged memory. And note that these routines are usually not running in the context of a user's thread. Interrupt routines need to run very quickly, and if necessary ask a deferred procedure call to do any post-transfer processing, eg mark an IRP as completed.

There is a data structure for each device. However most of your working variables will be in the associated device extension, which you define. Besides IRPs and devices, there are oodles of kernel data structures lurking around. Low down, there are funny UNICODE_STRINGs and LARGE_INTEGERs. Controller, adapter (DMA) and interrupt objects can only be used by one device at a time. Zone buffers, lookaside lists and linked lists are different ways of organising your memory.

You can use spin locks to guard access to data areas. The Cancel spin lock is used to protect access to the cancel fields of an IRP. It is a useful cheat to use this Cancel spin lock to guard data areas in all your dispatch points.

A minor point to note is that the status values you return to the I/O Manager are not identical to the values Win 32 programmers see, ie a non-obvious mapping occurs.
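
To give a feel for this mapping, here is a small plain C sketch with a handful of status values and the Win 32 error codes they surface as, to the best of my knowledge; check them against ntstatus.h and winerror.h before relying on them. The DONG_ names, the lookup function and the "unmapped" sentinel are all hypothetical; the real translation is done for you by the system.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of a few well-known values (verify against the headers). */
#define DONG_STATUS_SUCCESS                 0x00000000UL
#define DONG_STATUS_INVALID_PARAMETER       0xC000000DUL
#define DONG_STATUS_ACCESS_DENIED           0xC0000022UL
#define DONG_STATUS_INSUFFICIENT_RESOURCES  0xC000009AUL

#define DONG_ERROR_SUCCESS                  0UL
#define DONG_ERROR_ACCESS_DENIED            5UL
#define DONG_ERROR_INVALID_PARAMETER        87UL
#define DONG_ERROR_NO_SYSTEM_RESOURCES      1450UL
#define DONG_ERROR_UNMAPPED                 0xFFFFFFFFUL /* hypothetical sentinel */

typedef struct {
    unsigned long status;      /* value returned by the driver */
    unsigned long win32Error;  /* value a Win 32 caller sees */
} DONG_STATUS_MAP;

static const DONG_STATUS_MAP DongStatusMap[] = {
    { DONG_STATUS_SUCCESS,                DONG_ERROR_SUCCESS },
    { DONG_STATUS_INVALID_PARAMETER,      DONG_ERROR_INVALID_PARAMETER },
    { DONG_STATUS_ACCESS_DENIED,          DONG_ERROR_ACCESS_DENIED },
    { DONG_STATUS_INSUFFICIENT_RESOURCES, DONG_ERROR_NO_SYSTEM_RESOURCES },
};

/* Look up the Win 32 error for a driver status value. */
unsigned long DongWin32ErrorFromStatus(unsigned long status)
{
    size_t i;

    /* Status values are 32 bits wide, even where unsigned long is wider. */
    assert(status <= 0xFFFFFFFFUL);

    for (i = 0; i < sizeof(DongStatusMap) / sizeof(DongStatusMap[0]); i++) {
        if (DongStatusMap[i].status == status)
            return DongStatusMap[i].win32Error;
    }
    return DONG_ERROR_UNMAPPED;
}
```

The point to take away is simply that the two sets of numbers bear no arithmetic relation to each other, so document for your Win 32 users which errors they should expect.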

IRPs

I/O Request Packets are the basis of all interactions.

Here is a list of common IRPs.
Common IRPs
IRP                             Function                              Win 32 Call
IRP_MJ_CREATE                   Request for a handle                  CreateFile()
IRP_MJ_CLEANUP                  Cancel any pending IRPs               CloseHandle()
IRP_MJ_CLOSE                    Close the handle                      CloseHandle()
IRP_MJ_READ                     Read data from device                 ReadFile()
IRP_MJ_WRITE                    Write data to device                  WriteFile()
IRP_MJ_DEVICE_CONTROL           Control operation                     DeviceIoControl()
IRP_MJ_INTERNAL_DEVICE_CONTROL  Control operation from other drivers
IRP_MJ_SHUTDOWN                 System shutting down                  InitiateSystemShutdown()

As described above, you must write a separate handler for each IRP. You need not implement all these IRP function codes. However, create, cleanup, close and read or write are a useful minimum.

An IRP has a header area followed by several stack locations. Each stack location holds a function code and various parameters, eg for the read, write and device I/O control functions.

The I/O Manager gives an IRP one stack location for each driver in the layered stack that will handle it, so if yours is the only driver there will be just one. A driver that passes the IRP to a lower level driver fills in the next stack location for that driver. Note that the new stack location can have a different function code.

As an example, a transport network layer driver could accept data transfers of any length. The lower level driver might have a maximum transfer size, so the transport driver will keep calling the lower level driver until all the data is sent.
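
The splitting loop can be sketched in plain C as follows. This models only the arithmetic, not real IRP handling: the lower level driver is represented by a hypothetical callback, and DongCountingSend() is a stand-in that merely records what it was asked to send.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a call into the lower level driver.
   Returns 0 on success, non-zero on failure. */
typedef int (*DONG_LOWER_SEND)(const unsigned char *data, size_t length,
                               void *context);

/* Send 'length' bytes in pieces no larger than 'maxChunk'.
   Returns the number of bytes the lower level accepted. */
size_t DongSendInChunks(const unsigned char *data, size_t length,
                        size_t maxChunk, DONG_LOWER_SEND send, void *context)
{
    size_t sent = 0;

    assert(maxChunk > 0 && send != NULL);

    while (sent < length) {
        size_t chunk = length - sent;
        if (chunk > maxChunk)
            chunk = maxChunk;           /* respect the lower level's limit */
        if (send(data + sent, chunk, context) != 0)
            break;                      /* stop at the first failure */
        sent += chunk;
    }
    return sent;
}

/* Example lower level: just counts calls and bytes. */
typedef struct {
    unsigned calls;
    size_t bytes;
} DONG_SEND_LOG;

int DongCountingSend(const unsigned char *data, size_t length, void *context)
{
    DONG_SEND_LOG *log = (DONG_SEND_LOG *)context;

    (void)data;                         /* contents are not inspected here */
    log->calls++;
    log->bytes += length;
    return 0;
}
```

Sending 10 bytes with a 4 byte limit results in three calls of 4, 4 and 2 bytes. The return value lets the caller report a short transfer if the lower level fails part way through.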

Alternatively, a higher level driver can allocate whole new IRPs. The transport driver could therefore allocate all the necessary IRPs and send them all off to the lower level driver at once. Obviously it would need to check carefully that all the IRPs completed successfully.

With Buffered I/O, the I/O Manager copies any user write data into non-paged memory for you (and vice versa). With Direct I/O, the I/O Manager locks the user buffer into physical memory for the duration. Direct I/O is slightly more complicated to use but has less overhead.

DOS Device Support

An NT device driver can be accessed from legacy DOS or Win 16 programs, provided you follow certain rules.

Only standard DOS device names can be used, eg LPT1, COM1, etc. Note that LPT1, etc. are output only, while COM1, etc. are bidirectional.

So you must set up a symbolic link from a DOS device name to your NT kernel device name. NT's own parallel and serial drivers will try to allocate any appropriate device names. For your driver, a simple approach is to make (an unused) COM9 actually refer to your port; this could be a parallel port.

If you really need your DongLpt driver to allocate LPT1 for example, then you need to stop NT allocating it. In fact, what you need to do is nab LPT1 before NT tries to allocate it, by setting up the driver group load order correctly.

The NT parallel port arbitrator "parport" driver is in group "Parallel arbitrator". The parallel class driver "parallel" is one of several drivers in group "Extended Base". If we make "DongLpt" load after "parport" and "Parallel arbitrator" but before "Extended Base" then it can reserve the name LPT1 before "parallel" does. "parallel" will only moan minorly to the event log.

Note that this latter technique implies that your driver must start at boot time and so will reserve LPT1 for the entire NT session. In contrast, making a link from COM9 allows your driver to be started only when needed.

NB I did not find out how to allocate AUX before NT does.

Installation

From the word go, you will need various entries in the registry to make NT realise your driver is there, eg in C:\WINNT\System32\Drivers\DongLpt.sys. Use REGEDT32 instead of RegEdit as it can handle all the necessary registry types.

DongLpt's main driver registry key is HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DongLpt. In here are several standard values, eg if Start has a value of 2 then the driver is loaded automatically on re-boot. You can specify which Group your driver is in, and which groups and drivers must be loaded before yours. A driver may well have a Parameters sub-key. This could have an ErrorLogLevel value which indicates the level of event log messages required.

A separate series of registry entries tell the event viewer where to find the driver's event log messages.

For the final cut, you can write a script file or write an installation program by hand. The Win 32 Service Control Manager functions can be used to install, start and stop drivers, so a reboot will usually not be necessary.

Development and Debugging

It is best to write your driver in stages, checking that each part works before moving onto the next. First, get it to load and unload. Then find your hardware, allocate it on load and release it on unload. Make your devices and symbolic links on load and release them on unload.

Now start handling your IRPs. You may just dispatch these to be run straight away in your Start I/O routine, or you could put them in your own internal queue for processing later. Now catch your first interrupts. You can catch time-outs with the basic one second I/O Timer. Or custom timers can be set for any time interval. In NT 4.0, these can fire repeatedly, while earlier versions were one-shot.
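
An internal queue of pending requests can be modelled in plain C as a simple first-in, first-out list. This is only a sketch of the idea with hypothetical names; in a driver the queued items would be IRPs, you would normally use the kernel's own list support, and every access would have to be guarded, eg by a spin lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queued request; a driver would queue IRPs instead. */
typedef struct DONG_REQUEST {
    struct DONG_REQUEST *next;
    int id;
} DONG_REQUEST;

typedef struct {
    DONG_REQUEST *head;   /* next request to process */
    DONG_REQUEST *tail;   /* most recently queued request */
} DONG_QUEUE;

void DongQueueInit(DONG_QUEUE *q)
{
    q->head = NULL;
    q->tail = NULL;
}

/* Add a request to the end of the queue. */
void DongQueueInsertTail(DONG_QUEUE *q, DONG_REQUEST *r)
{
    assert(r != NULL);
    r->next = NULL;
    if (q->tail != NULL)
        q->tail->next = r;
    else
        q->head = r;      /* queue was empty */
    q->tail = r;
}

/* Remove and return the oldest request, or NULL if the queue is empty. */
DONG_REQUEST *DongQueueRemoveHead(DONG_QUEUE *q)
{
    DONG_REQUEST *r = q->head;

    if (r != NULL) {
        q->head = r->next;
        if (q->head == NULL)
            q->tail = NULL;   /* queue is now empty */
        r->next = NULL;
    }
    return r;
}
```

Requests come off in the order they were queued, which is usually what Win 32 callers expect from a device.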

You are recommended not to tie up the processor for more than 50 micro-seconds. If you need to run for longer, then consider using a design based on system threads. Event, mutex, semaphore and timer objects may be used to coordinate thread activities. Make sure you tell your thread to exit when it is not needed, as the kernel will not stop it if you forget.

For debugging, I found that writing messages to NT's event log was the easiest way to find out what the driver was up to (ie a bit like putting printfs in). Write the event logging straight away. Messages for the event log are stored in an .mc file and compiled into a resource using the mc utility.
However you can do source code debugging between two NT machines. Your development machine runs the retail NT and the windbg utility. A serial line connects to the other computer running the NT "checked" kernel build and your test driver.

Analysing a "blue screen of death" bugcheck screen can give a useful hint of where your driver failed. Usually only the top three lines give useful information. You can ask NT to do a bugcheck memory dump, but I would recommend avoiding this technique unless you really have to use it.

Non-paged memory is a precious resource. The alloc_text pragma can be used to mark appropriate routines as pageable, and initialisation routines as discardable.

The DDK has a test suite to test drivers in stressful situations. Once you are happy with this, you can go all the way and submit your driver to Microsoft's Compatibility labs for certification testing.

Example

Here is an incomplete example to give you a taste of real device driver code.

This is the initialisation code for a DongLpt driver which talks to a dongle on the parallel port.

After initialising its event log, DriverEntry() sets the other entry points for the driver.

For each of the parallel ports that NT has found, DongCreateDevice() first creates an NT kernel device. This driver uses buffered I/O. The device extension is initialised.

DongCreateDevice() then links to the corresponding NT "parport" device to retrieve information about the port. Finally the appropriate Win 32 symbolic link name is created.

DongGetPortInfoFromPortDevice() builds a new Internal Device I/O Control IRP to send to "parport" to retrieve the port information. The routine simply uses a notification event to wait for "parport" to complete processing of this IRP. Various hardware details are stored on return. "parport"'s routine TryAllocatePort() is called directly by DongLpt later when it wants to do some I/O. FreePort() makes the port available again.


#define	DONG_NT_DEVICE_NAME			L"\\Device\\DongLpt"
#define	DONG_NT_PORT_DEVICE_NAME	L"\\Device\\ParallelPort"
#define	DONG_WIN32_DEVICE_NAME		L"\\DosDevices\\DongLpt"
#define	DONG_DOS_DEVICES			L"\\DosDevices\\"
#define	DONG_DRIVER_NAME			L"DongLpt"

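#define	DONG_MAX_NAME_LENGTH		50

// Forward declarations for routines used before they are defined.
// PDEVICE_EXTENSION is assumed to come from the driver's own header (not shown).
static NTSTATUS DongCreateDevice(IN PDRIVER_OBJECT pDriverObject, IN ULONG NtDeviceNumber);
static NTSTATUS DongGetPortInfoFromPortDevice(IN OUT PDEVICE_EXTENSION pDevExt);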


NTSTATUS
DriverEntry(
	IN PDRIVER_OBJECT pDriverObject,
	IN PUNICODE_STRING pRegistryPath
	)
{
	ULONG NtDeviceNumber, NumParallelPorts;
	NTSTATUS status = STATUS_SUCCESS;

	DongInitializeEventLog(pDriverObject);

	// Export other driver entry points...
 	pDriverObject->DriverUnload = DongDriverUnload;

	pDriverObject->MajorFunction[ IRP_MJ_CREATE ] = DongDispatchOpen;
	pDriverObject->MajorFunction[ IRP_MJ_CLOSE ] = DongDispatchClose;
	pDriverObject->MajorFunction[ IRP_MJ_WRITE ] = DongDispatchWrite;
	pDriverObject->MajorFunction[ IRP_MJ_READ ] = DongDispatchRead;
	pDriverObject->MajorFunction[ IRP_MJ_CLEANUP ] = DongDispatchCleanup;

	// Initialize a Device object for each parallel port
	NumParallelPorts = IoGetConfigurationInformation()->ParallelCount;

	for( NtDeviceNumber=0; NtDeviceNumber<NumParallelPorts; NtDeviceNumber++)
	{
   		status = DongCreateDevice( pDriverObject, NtDeviceNumber);
   		if( !NT_SUCCESS(status))
			return status;
	}

	// Log that we've started
	// ...

	return status;
}

static NTSTATUS
DongCreateDevice (
	IN PDRIVER_OBJECT pDriverObject,
	IN ULONG NtDeviceNumber
	)
{
	NTSTATUS status;
	
	PDEVICE_OBJECT pDevObj;
	PDEVICE_EXTENSION pDevExt;

	UNICODE_STRING deviceName, portName, linkName, number;
	WCHAR deviceNameBuffer[DONG_MAX_NAME_LENGTH];
	WCHAR portNameBuffer[DONG_MAX_NAME_LENGTH];
	WCHAR linkNameBuffer[DONG_MAX_NAME_LENGTH];
	WCHAR numberBuffer[10];

	PFILE_OBJECT        pFileObject;

	// Initialise strings
	number.Buffer = numberBuffer;
	number.MaximumLength = 20;
	deviceName.Buffer = deviceNameBuffer;
	deviceName.MaximumLength = DONG_MAX_NAME_LENGTH*2;
	portName.Buffer = portNameBuffer;
	portName.MaximumLength = DONG_MAX_NAME_LENGTH*2;
	linkName.Buffer = linkNameBuffer;
	linkName.MaximumLength = DONG_MAX_NAME_LENGTH*2;

	/////////////////////////////////////////////////////////////////////////
   	// Form the base NT device name...

	deviceName.Length = 0;
   	RtlAppendUnicodeToString( &deviceName, DONG_NT_DEVICE_NAME);
	number.Length = 0;
	RtlIntegerToUnicodeString( NtDeviceNumber, 10, &number); 
	RtlAppendUnicodeStringToString( &deviceName, &number);

	// Create a Device object for this device...
	status = IoCreateDevice(
				pDriverObject,
				sizeof( DEVICE_EXTENSION ),
				&deviceName,
				FILE_DEVICE_PARALLEL_PORT,
				0,
				TRUE,
				&pDevObj);

	if( !NT_SUCCESS(status))
	{
		DongReportUnexpectedFailure(DONG_ERRORLOG_INIT,DONG_INIT_IoCreateDevice);
		return status;
	}

	/////////////////////////////////////////////////////////////////////////
	// Use buffered I/O

	pDevObj->Flags |= DO_BUFFERED_IO;

	/////////////////////////////////////////////////////////////////////////
	// Initialize the Device Extension

	pDevExt = pDevObj->DeviceExtension;
	RtlZeroMemory(pDevExt, sizeof(DEVICE_EXTENSION));

	pDevExt->DeviceObject = pDevObj;
	pDevExt->NtDeviceNumber = NtDeviceNumber;

	/////////////////////////////////////////////////////////////////////////
	// Attach to parport device
	portName.Length = 0;
   	RtlAppendUnicodeToString( &portName, DONG_NT_PORT_DEVICE_NAME);
	number.Length = 0;
	RtlIntegerToUnicodeString( NtDeviceNumber, 10, &number); 
	RtlAppendUnicodeStringToString( &portName, &number);

	status = IoGetDeviceObjectPointer(&portName, FILE_READ_ATTRIBUTES,
										&pFileObject,
										&pDevExt->PortDeviceObject);
	if (!NT_SUCCESS(status))
	{
		IoDeleteDevice(pDevObj);
		DongReportUnexpectedFailure(DONG_ERRORLOG_INIT,DONG_INIT_IoGetDeviceObjectPointer);
		return status;
	}

	ObReferenceObjectByPointer(	pDevExt->PortDeviceObject,FILE_READ_ATTRIBUTES,
								NULL,KernelMode);
	ObDereferenceObject(pFileObject);

	pDevExt->DeviceObject->StackSize = pDevExt->PortDeviceObject->StackSize + 1;

	// Get the port information from the port device object.
	status = DongGetPortInfoFromPortDevice(pDevExt);
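	if (!NT_SUCCESS(status))
	{
		// Release our reference to the parport device before the
		// device extension that holds the pointer is deleted
		ObDereferenceObject(pDevExt->PortDeviceObject);
		IoDeleteDevice(pDevObj);
		return status;
	}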

	/////////////////////////////////////////////////////////////////////////
   	// Form the Win32 symbolic link name.

	linkName.Length = 0;
	RtlAppendUnicodeToString( &linkName, DONG_WIN32_DEVICE_NAME);
	number.Length = 0;
	RtlIntegerToUnicodeString( NtDeviceNumber + 1, 10, &number); 
	RtlAppendUnicodeStringToString( &linkName, &number);

	// Create a symbolic link so our device is visible to Win32...
 	status = IoCreateSymbolicLink( &linkName, &deviceName);
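	if( !NT_SUCCESS(status)) 
	{
		// Release our reference to the parport device before the
		// device extension that holds the pointer is deleted
		ObDereferenceObject(pDevExt->PortDeviceObject);
		IoDeleteDevice( pDevObj );
		DongReportUnexpectedFailure(DONG_ERRORLOG_INIT,DONG_INIT_IoCreateSymbolicLink);
		return status;
	}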

	return status;
}

static NTSTATUS
DongGetPortInfoFromPortDevice(
	IN OUT  PDEVICE_EXTENSION   pDevExt
	)
{
	KEVENT                      event;
	PIRP                        irp;
	PARALLEL_PORT_INFORMATION   portInfo;
	IO_STATUS_BLOCK             ioStatus;
	NTSTATUS                    status;

	/////////////////////////////////////////////////////////////////////////
	// Get parallel port information

	KeInitializeEvent(&event, NotificationEvent, FALSE);

	irp = IoBuildDeviceIoControlRequest(
				IOCTL_INTERNAL_GET_PARALLEL_PORT_INFO,
				pDevExt->PortDeviceObject,
				NULL, 0, &portInfo,
				sizeof(PARALLEL_PORT_INFORMATION),
				TRUE, &event, &ioStatus);

	if (!irp)
	{
		DongReportUnexpectedFailure(DONG_ERRORLOG_INIT,DONG_INIT_IoBuildDeviceIoControlRequest);
		return STATUS_INSUFFICIENT_RESOURCES;
	}

	status = IoCallDriver(pDevExt->PortDeviceObject, irp);

	// Only wait if parport has not finished yet; the final result
	// is then reported in the I/O status block.
	if (status == STATUS_PENDING)
	{
		KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);
		status = ioStatus.Status;
	}

	if (!NT_SUCCESS(status))
	{
		DongReportUnexpectedFailure(DONG_ERRORLOG_INIT,DONG_INIT_IoCallDriver);
		return status;
	}

	pDevExt->OriginalController = portInfo.OriginalController;
	pDevExt->Controller = portInfo.Controller;
	pDevExt->SpanOfController = portInfo.SpanOfController;
	pDevExt->FreePort = portInfo.FreePort;
	pDevExt->TryAllocatePort = portInfo.TryAllocatePort;
	pDevExt->PortContext = portInfo.Context;

	// Check register span
	if (pDevExt->SpanOfController < DONG_REGISTER_SPAN)
	{
		DongReportUnexpectedFailure(DONG_ERRORLOG_INIT,DONG_INIT_RegisterSpan);
		return STATUS_INSUFFICIENT_RESOURCES;
	}

	return STATUS_SUCCESS;
}

Conclusion

Once you have done your homework, writing NT device drivers is a pretty well documented task: a useful skill in your repertoire if you really need to twiddle bits in NT's underwear.

Author

Chris Cant    © 1997 PHD Computer Consultants Ltd