I'm trying to implement a simple firewall which filters network connections made by Windows processes.
The firewall should either allow or block each connection.
In order to intercept connections made by any process, I created a kernel driver which uses the Windows Filtering Platform (WFP).
I registered a ClassifyFn (FWPS_CALLOUT_CLASSIFY_FN1) callback at the filtering layer FWPM_LAYER_ALE_AUTH_CONNECT_V4:
FWPM_CALLOUT m_callout = { 0 };
m_callout.applicableLayer = FWPM_LAYER_ALE_AUTH_CONNECT_V4;
...
status = FwpmCalloutAdd(filter_engine_handle, &m_callout, NULL, NULL);
The decision whether to allow or block a connection should be made at userlevel.
I communicate with userlevel using FltSendMessage,
which cannot be called at IRQL DISPATCH_LEVEL.
Following the Microsoft documentation on how to process callouts asynchronously,
I call FwpsPendOperation0 before calling FltSendMessage.
After the call to FltSendMessage, I resume packet processing by calling FwpsCompleteOperation0.
The FwpsPendOperation0 documentation suggests that pending the operation makes it possible to do work that requires PASSIVE_LEVEL:
A callout can pend the current processing operation on a packet when
the callout must perform processing on one of these layers that may
take a long interval to complete or that should occur at IRQL =
PASSIVE_LEVEL if the current IRQL > PASSIVE_LEVEL.
However, when the ClassifyFn callback is called at DISPATCH_LEVEL, I still sometimes get a BSOD in FltSendMessage (INVALID_PROCESS_ATTACH_ATTEMPT).
I don't understand what's wrong.
Thank you in advance for any advice that could point me in the right direction.
Here is the relevant code of the ClassifyFn callback:
/*************************
ClassifyFn Function
**************************/
void example_classify(
const FWPS_INCOMING_VALUES * inFixedValues,
const FWPS_INCOMING_METADATA_VALUES * inMetaValues,
void * layerData,
const void * classifyContext,
const FWPS_FILTER * filter,
UINT64 flowContext,
FWPS_CLASSIFY_OUT * classifyOut)
{
NTSTATUS status;
BOOLEAN bIsReauthorize = FALSE;
BOOLEAN SafeToOpen = TRUE; // Value returned by userlevel which signals to allow/deny packet
classifyOut->actionType = FWP_ACTION_PERMIT;
remote_address = inFixedValues->incomingValue[FWPS_FIELD_ALE_AUTH_CONNECT_V4_IP_REMOTE_ADDRESS].value.uint32;
remote_port = inFixedValues->incomingValue[FWPS_FIELD_ALE_AUTH_CONNECT_V4_IP_REMOTE_PORT].value.uint16;
bIsReauthorize = IsAleReauthorize(inFixedValues);
if (!bIsReauthorize)
{
// First time receiving packet (not a reauthorized packet)
// Communicate with userlevel asynchronously
HANDLE hCompletion;
status = FwpsPendOperation0(inMetaValues->completionHandle, &hCompletion);
//
// FltSendMessage call here
// ERROR HERE:
// INVALID_PROCESS_ATTACH_ATTEMPT BSOD on the FltSendMessage call when at IRQL DISPATCH_LEVEL
//
FwpsCompleteOperation0(hCompletion, NULL);
}
if (!SafeToOpen) {
// Packet blocked
classifyOut->actionType = FWP_ACTION_BLOCK;
}
else {
// Packet allowed
}
return;
}
You need to invoke FltSendMessage() on another thread running at PASSIVE_LEVEL. You can use IoQueueWorkItem() or implement your own mechanism to process it on a system worker thread created via PsCreateSystemThread().
I am trying to write a simple firewall application which can allow or block network connection attempts made by userlevel processes.
To do so, following the WFPStarterKit tutorial, I created a WFP driver which intercepts data at the FWPM_LAYER_OUTBOUND_TRANSPORT_V4 layer.
The ClassifyFn callback function is responsible for intercepting the connection attempt and either allowing or denying it.
Once the ClassifyFn callback gets hit, the ProcessID associated with the packet is sent, along with some other information, to a userlevel process through the FltSendMessage function.
The userlevel process receives the message, checks the ProcessID, and replies to the driver with a boolean allow/deny command.
While this approach works when blocking a first packet, in some cases (especially when allowing multiple packets) the code generates a BSOD with the INVALID_PROCESS_ATTACH_ATTEMPT error code.
The error is triggered at the call to FltSendMessage.
While I am still unable to pinpoint the exact problem,
it seems that making the callout thread wait (through FltSendMessage) for a reply from userlevel can generate this BSOD under some conditions.
I would be very grateful if you could point me in the right direction.
Here is the function where I register the callout:
NTSTATUS register_example_callout(DEVICE_OBJECT * wdm_device)
{
NTSTATUS status = STATUS_SUCCESS;
FWPS_CALLOUT s_callout = { 0 };
FWPM_CALLOUT m_callout = { 0 };
FWPM_DISPLAY_DATA display_data = { 0 };
if (filter_engine_handle == NULL)
return STATUS_INVALID_HANDLE;
display_data.name = EXAMPLE_CALLOUT_NAME;
display_data.description = EXAMPLE_CALLOUT_DESCRIPTION;
// Register a new Callout with the Filter Engine using the provided callout functions
s_callout.calloutKey = EXAMPLE_CALLOUT_GUID;
s_callout.classifyFn = example_classify;
s_callout.notifyFn = example_notify;
s_callout.flowDeleteFn = example_flow_delete;
status = FwpsCalloutRegister((void *)wdm_device, &s_callout, &example_callout_id);
if (!NT_SUCCESS(status)) {
DbgPrint("Failed to register callout functions for example callout, status 0x%08x", status);
goto Exit;
}
// Setup a FWPM_CALLOUT structure to store/track the state associated with the FWPS_CALLOUT
m_callout.calloutKey = EXAMPLE_CALLOUT_GUID;
m_callout.displayData = display_data;
m_callout.applicableLayer = FWPM_LAYER_OUTBOUND_TRANSPORT_V4;
m_callout.flags = 0;
status = FwpmCalloutAdd(filter_engine_handle, &m_callout, NULL, NULL);
if (!NT_SUCCESS(status)) {
DbgPrint("Failed to register example callout, status 0x%08x", status);
}
else {
DbgPrint("Example Callout Registered");
}
Exit:
return status;
}
Here is the callout function:
/*************************
ClassifyFn Function
**************************/
void example_classify(
const FWPS_INCOMING_VALUES * inFixedValues,
const FWPS_INCOMING_METADATA_VALUES * inMetaValues,
void * layerData,
const void * classifyContext,
const FWPS_FILTER * filter,
UINT64 flowContext,
FWPS_CLASSIFY_OUT * classifyOut)
{
UNREFERENCED_PARAMETER(layerData);
UNREFERENCED_PARAMETER(classifyContext);
UNREFERENCED_PARAMETER(flowContext);
UNREFERENCED_PARAMETER(filter);
UNREFERENCED_PARAMETER(inMetaValues);
NETWORK_ACCESS_QUERY AccessQuery;
BOOLEAN SafeToOpen = TRUE;
classifyOut->actionType = FWP_ACTION_PERMIT;
AccessQuery.remote_address = inFixedValues->incomingValue[FWPS_FIELD_OUTBOUND_TRANSPORT_V4_IP_REMOTE_ADDRESS].value.uint32;
AccessQuery.remote_port = inFixedValues->incomingValue[FWPS_FIELD_OUTBOUND_TRANSPORT_V4_IP_REMOTE_PORT].value.uint16;
// Get Process ID
AccessQuery.ProcessId = (UINT64)PsGetCurrentProcessId();
if (!AccessQuery.ProcessId)
{
return;
}
// Here we connect to our userlevel application using FltSendMessage.
// Some checks are done and SafeToOpen is populated with a BOOLEAN that indicates whether to allow or block the packet.
// However, a BSOD with an INVALID_PROCESS_ATTACH_ATTEMPT error is sometimes generated on the FltSendMessage call.
QueryUserLevel(QUERY_NETWORK, &AccessQuery, sizeof(NETWORK_ACCESS_QUERY), &SafeToOpen, NULL, 0);
if (!SafeToOpen) {
classifyOut->actionType = FWP_ACTION_BLOCK;
}
return;
}
WFP drivers typically communicate with user-mode applications using the inverted call model. In this model, the user-mode application sends an IRP that the kernel-mode driver keeps pending; whenever the driver wants to send data to user mode, it completes the IRP with that data.
The problem was that the ClassifyFn callback can sometimes be called at IRQL DISPATCH_LEVEL.
FltSendMessage does not support DISPATCH_LEVEL; it can only be called at IRQL <= APC_LEVEL.
Calling it at DISPATCH_LEVEL can cause a BSOD.
I solved the problem by invoking FltSendMessage from a worker thread, which runs at IRQL PASSIVE_LEVEL.
The work can be queued to a system worker thread using IoQueueWorkItem.
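The hand-off described above can be modeled outside the kernel. What follows is a user-mode sketch in plain C, not WDK code: work_queue, classify_defer, worker_drain and allow_even_pids are hypothetical names, and the toy policy stands in for the reply that FltSendMessage would bring back from userlevel. It only illustrates the control flow: the callback records the query and returns without blocking, and the blocking call happens later in the worker.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

typedef enum { VERDICT_PENDING, VERDICT_PERMIT, VERDICT_BLOCK } verdict;

typedef struct {
    unsigned long process_id;
    verdict       result;
} pending_query;

typedef struct {
    pending_query items[MAX_PENDING];
    size_t        count;
} work_queue;

/* Callback type for the blocking question asked of userlevel. */
typedef bool (*ask_user_fn)(unsigned long pid);

/* Stand-in for the classify callback: it must not block, so it only
   records the query and reports that the decision is still pending. */
static verdict classify_defer(work_queue *q, unsigned long pid)
{
    if (q->count == MAX_PENDING) {
        return VERDICT_BLOCK; /* assumption: fail closed when the queue is full */
    }
    q->items[q->count].process_id = pid;
    q->items[q->count].result = VERDICT_PENDING;
    q->count++;
    return VERDICT_PENDING;
}

/* Stand-in for the work-item routine: blocking is allowed here, so this
   is where the FltSendMessage round trip would take place. Returns the
   number of queries that were resolved. */
static size_t worker_drain(work_queue *q, ask_user_fn ask)
{
    size_t done = 0;
    for (size_t i = 0; i < q->count; ++i) {
        if (q->items[i].result == VERDICT_PENDING) {
            q->items[i].result =
                ask(q->items[i].process_id) ? VERDICT_PERMIT : VERDICT_BLOCK;
            ++done;
        }
    }
    return done;
}

/* Toy policy used only for illustration: allow even process IDs. */
static bool allow_even_pids(unsigned long pid)
{
    return (pid % 2) == 0;
}
```

In a real driver, the worker would presumably also be the place to call FwpsCompleteOperation0 once the reply has arrived, rather than calling it from ClassifyFn itself.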
My application can turn network adapters off, or enable them for either DHCP or static configuration. IP configuration is done via the WMI Win32_NetworkAdapterConfiguration class, while disabling/enabling adapters is done via SetupApi for various reasons. Starting from the point where the adapter has been enabled, I noticed the following (Windows 7 SP1, 32-bit):
The EnableDHCP method returns error 84 (IP not enabled). So I thought I needed to wait until the "IpEnabled" property becomes true and polled it every second, but it always returned false (BTW: I monitored the value using WMIC and could see that it had actually become true).
Next, in order to avoid an infinite loop, I changed my "poll 'IpEnabled == true' loop" to bail out after 10 tries and carry on with the remaining steps. And behold: EnableDHCP succeeded (ret == 0), and IpEnabled suddenly became true as well.
EDIT
Situation 1:
int ret;
// ...
// Returns error 84
ret = wmiExecMethod(clsName, "EnableDHCP", true, objPath);
// ...
Situation 2:
int ret;
// ...
// Will never get out of this
while (!wmiGetBool(pWMIObj, "IPEnabled"))
{
printf("Interface.IpEnabled=False\n");
Sleep(1000);
}
// ...
ret = wmiExecMethod(clsName, "EnableDHCP", true, objPath);
Situation 3:
int count = 10;
int ret;
// ...
// Will occur until count becomes 0
while (!wmiGetBool(pWMIObj, "IPEnabled") && count--)
{
printf("Interface.IpEnabled=False - remaining trials: %d\n", count);
Sleep(1000);
}
// ...
// After this "delay", EnableDHCP returns 0 (SUCCESS)
ret = wmiExecMethod(clsName, "EnableDHCP", true, objPath);
// wmiGetBool(pWMIObj, "IPEnabled") now returns true too...
Do you have any idea what is going wrong here? Thanks in advance for your help.
Best regards
Willi K.
The "real" problem behind this is that the Win32_NetworkAdapterConfiguration::EnableDHCP method fails if the interface is not connected to a network (offline). The only way I found to configure the interface for DHCP is to modify the registry...
I am using the QMI SDK to start a data session with a Sierra Wireless MC7354 card and a Telus SIM card. So far I can detect the device and the SIM card (for example, reading the device info and the IMSI number); however, I am having trouble starting the data session. Following the instructions in the QMI SDK documents, I wrote the following code:
//set the default profile
ULONG rc3 = SetDefaultProfile(0,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL);
fprintf(stderr, "SetProfile - Return Code: %lu\n", rc3);
//start the session
ULONG technology = 1;
ULONG profile_idx = 1;
struct ssdatasession_params session;
session.action = 1;
session.pTechnology = &technology;
session.pProfileId3GPP = &profile_idx;
session.pProfileId3GPP2 = NULL;
session.ipfamily = 4;
ULONG rc4 = SLQSStartStopDataSession(&session);
fprintf(stderr, "Start Session - Return Code: %lu\n",rc4);
SetDefaultProfile works fine, since it returns the success code, but SLQSStartStopDataSession always returns code "1026", which means
Requested operation would have no effect
Does anyone know where my mistake is and how I should modify the code? What does this return code mean?
A "No Effect" error in WDS Start Network (the underlying command sent when you use SLQSStartStopDataSession()) actually means that the device is already connected. You likely have configured an automatic connection setup in the modem.
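One way to act on this is to treat the "no effect" code as "session already up" and not as a failure. A minimal sketch, assuming that 0 is the success code; the value 1026 comes from the question, while RC_SUCCESS, RC_NO_EFFECT and data_session_is_up are hypothetical names and not taken from the QMI SDK headers.

```c
#include <stdbool.h>

/* Assumption: 0 means success. 1026 is the code reported in the question.
   Both names are hypothetical, not QMI SDK identifiers. */
#define RC_SUCCESS   0ul
#define RC_NO_EFFECT 1026ul

/* Returns true when the return code of a "start session" request implies
   that a data session exists, whether newly started or already active. */
static bool data_session_is_up(unsigned long start_rc)
{
    return start_rc == RC_SUCCESS || start_rc == RC_NO_EFFECT;
}
```

With this helper, the caller could check data_session_is_up(rc4) after SLQSStartStopDataSession instead of comparing rc4 against success alone.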
I am developing an NDIS filter driver, and I found that its FilterReceiveNetBufferLists is never called (the network is blocked) under certain conditions (such as opening Wireshark or clicking its "Interface List" button). But when I start capturing, FilterReceiveNetBufferLists is called normally again (the network is restored), which is strange.
I found that when I manually return NDIS_STATUS_FAILURE instead of calling NdisFOidRequest at the place where the WinPcap driver originates OIDs (the BIOCQUERYOID & BIOCSETOID switch branch of NPF_IoControl), the driver no longer blocks the network (but WinPcap can't work either).
Is there something wrong with the NdisFOidRequest call?
The DeviceIO routine in Packet.c that originates OID requests:
case BIOCQUERYOID:
case BIOCSETOID:
TRACE_MESSAGE(PACKET_DEBUG_LOUD, "BIOCSETOID - BIOCQUERYOID");
//
// gain ownership of the Ndis Handle
//
if (NPF_StartUsingBinding(Open) == FALSE)
{
//
// MAC unbinding or unbound
//
SET_FAILURE_INVALID_REQUEST();
break;
}
// Extract a request from the list of free ones
RequestListEntry = ExInterlockedRemoveHeadList(&Open->RequestList, &Open->RequestSpinLock);
if (RequestListEntry == NULL)
{
//
// Release ownership of the Ndis Handle
//
NPF_StopUsingBinding(Open);
SET_FAILURE_NOMEM();
break;
}
pRequest = CONTAINING_RECORD(RequestListEntry, INTERNAL_REQUEST, ListElement);
//
// See if it is an Ndis request
//
OidData = Irp->AssociatedIrp.SystemBuffer;
if ((IrpSp->Parameters.DeviceIoControl.InputBufferLength == IrpSp->Parameters.DeviceIoControl.OutputBufferLength) &&
(IrpSp->Parameters.DeviceIoControl.InputBufferLength >= sizeof(PACKET_OID_DATA)) &&
(IrpSp->Parameters.DeviceIoControl.InputBufferLength >= sizeof(PACKET_OID_DATA) - 1 + OidData->Length))
{
TRACE_MESSAGE2(PACKET_DEBUG_LOUD, "BIOCSETOID|BIOCQUERYOID Request: Oid=%08lx, Length=%08lx", OidData->Oid, OidData->Length);
//
// The buffer is valid
//
NdisZeroMemory(&pRequest->Request, sizeof(NDIS_OID_REQUEST));
pRequest->Request.Header.Type = NDIS_OBJECT_TYPE_OID_REQUEST;
pRequest->Request.Header.Revision = NDIS_OID_REQUEST_REVISION_1;
pRequest->Request.Header.Size = NDIS_SIZEOF_OID_REQUEST_REVISION_1;
if (FunctionCode == BIOCSETOID)
{
pRequest->Request.RequestType = NdisRequestSetInformation;
pRequest->Request.DATA.SET_INFORMATION.Oid = OidData->Oid;
pRequest->Request.DATA.SET_INFORMATION.InformationBuffer = OidData->Data;
pRequest->Request.DATA.SET_INFORMATION.InformationBufferLength = OidData->Length;
}
else
{
pRequest->Request.RequestType = NdisRequestQueryInformation;
pRequest->Request.DATA.QUERY_INFORMATION.Oid = OidData->Oid;
pRequest->Request.DATA.QUERY_INFORMATION.InformationBuffer = OidData->Data;
pRequest->Request.DATA.QUERY_INFORMATION.InformationBufferLength = OidData->Length;
}
NdisResetEvent(&pRequest->InternalRequestCompletedEvent);
if (*((PVOID *) pRequest->Request.SourceReserved) != NULL)
{
*((PVOID *) pRequest->Request.SourceReserved) = NULL;
}
//
// submit the request
//
pRequest->Request.RequestId = (PVOID) NPF6X_REQUEST_ID;
ASSERT(Open->AdapterHandle != NULL);
Status = NdisFOidRequest(Open->AdapterHandle, &pRequest->Request);
//Status = NDIS_STATUS_FAILURE;
}
else
{
//
// Release ownership of the Ndis Handle
//
NPF_StopUsingBinding(Open);
//
// buffer too small
//
SET_FAILURE_BUFFER_SMALL();
break;
}
if (Status == NDIS_STATUS_PENDING)
{
NdisWaitEvent(&pRequest->InternalRequestCompletedEvent, 1000);
Status = pRequest->RequestStatus;
}
//
// Release ownership of the Ndis Handle
//
NPF_StopUsingBinding(Open);
//
// Complete the request
//
if (FunctionCode == BIOCSETOID)
{
OidData->Length = pRequest->Request.DATA.SET_INFORMATION.BytesRead;
TRACE_MESSAGE1(PACKET_DEBUG_LOUD, "BIOCSETOID completed, BytesRead = %u", OidData->Length);
}
else
{
if (FunctionCode == BIOCQUERYOID)
{
OidData->Length = pRequest->Request.DATA.QUERY_INFORMATION.BytesWritten;
if (Status == NDIS_STATUS_SUCCESS)
{
//
// check for the stupid bug of the Nortel driver ipsecw2k.sys v. 4.10.0.0 that doesn't set the BytesWritten correctly
// The driver is the one shipped with Nortel client Contivity VPN Client V04_65.18, and the MD5 for the buggy (unsigned) driver
// is 3c2ff8886976214959db7d7ffaefe724 *ipsecw2k.sys (there are multiple copies of this binary with the same exact version info!)
//
// The (certified) driver shipped with Nortel client Contivity VPN Client V04_65.320 doesn't seem affected by the bug.
//
if (pRequest->Request.DATA.QUERY_INFORMATION.BytesWritten > pRequest->Request.DATA.QUERY_INFORMATION.InformationBufferLength)
{
TRACE_MESSAGE2(PACKET_DEBUG_LOUD, "Bogus return from NdisRequest (query): Bytes Written (%u) > InfoBufferLength (%u)!!", pRequest->Request.DATA.QUERY_INFORMATION.BytesWritten, pRequest->Request.DATA.QUERY_INFORMATION.InformationBufferLength);
Status = NDIS_STATUS_INVALID_DATA;
}
}
TRACE_MESSAGE1(PACKET_DEBUG_LOUD, "BIOCQUERYOID completed, BytesWritten = %u", OidData->Length);
}
}
ExInterlockedInsertTailList(&Open->RequestList, &pRequest->ListElement, &Open->RequestSpinLock);
if (Status == NDIS_STATUS_SUCCESS)
{
SET_RESULT_SUCCESS(sizeof(PACKET_OID_DATA) - 1 + OidData->Length);
}
else
{
SET_FAILURE_INVALID_REQUEST();
}
break;
Three Filter OID routines:
_Use_decl_annotations_
NDIS_STATUS
NPF_OidRequest(
NDIS_HANDLE FilterModuleContext,
PNDIS_OID_REQUEST Request
)
{
POPEN_INSTANCE Open = (POPEN_INSTANCE) FilterModuleContext;
NDIS_STATUS Status;
PNDIS_OID_REQUEST ClonedRequest=NULL;
BOOLEAN bSubmitted = FALSE;
PFILTER_REQUEST_CONTEXT Context;
BOOLEAN bFalse = FALSE;
TRACE_ENTER();
do
{
Status = NdisAllocateCloneOidRequest(Open->AdapterHandle,
Request,
NPF6X_ALLOC_TAG,
&ClonedRequest);
if (Status != NDIS_STATUS_SUCCESS)
{
TRACE_MESSAGE(PACKET_DEBUG_LOUD, "FilerOidRequest: Cannot Clone Request\n");
break;
}
Context = (PFILTER_REQUEST_CONTEXT)(&ClonedRequest->SourceReserved[0]);
*Context = Request;
bSubmitted = TRUE;
//
// Use same request ID
//
ClonedRequest->RequestId = Request->RequestId;
Open->PendingOidRequest = ClonedRequest;
Status = NdisFOidRequest(Open->AdapterHandle, ClonedRequest);
if (Status != NDIS_STATUS_PENDING)
{
NPF_OidRequestComplete(Open, ClonedRequest, Status);
Status = NDIS_STATUS_PENDING;
}
}while (bFalse);
if (bSubmitted == FALSE)
{
switch(Request->RequestType)
{
case NdisRequestMethod:
Request->DATA.METHOD_INFORMATION.BytesRead = 0;
Request->DATA.METHOD_INFORMATION.BytesNeeded = 0;
Request->DATA.METHOD_INFORMATION.BytesWritten = 0;
break;
case NdisRequestSetInformation:
Request->DATA.SET_INFORMATION.BytesRead = 0;
Request->DATA.SET_INFORMATION.BytesNeeded = 0;
break;
case NdisRequestQueryInformation:
case NdisRequestQueryStatistics:
default:
Request->DATA.QUERY_INFORMATION.BytesWritten = 0;
Request->DATA.QUERY_INFORMATION.BytesNeeded = 0;
break;
}
}
TRACE_EXIT();
return Status;
}
//-------------------------------------------------------------------
_Use_decl_annotations_
VOID
NPF_CancelOidRequest(
NDIS_HANDLE FilterModuleContext,
PVOID RequestId
)
{
POPEN_INSTANCE Open = (POPEN_INSTANCE) FilterModuleContext;
PNDIS_OID_REQUEST Request = NULL;
PFILTER_REQUEST_CONTEXT Context;
PNDIS_OID_REQUEST OriginalRequest = NULL;
BOOLEAN bFalse = FALSE;
FILTER_ACQUIRE_LOCK(&Open->OIDLock, bFalse);
Request = Open->PendingOidRequest;
if (Request != NULL)
{
Context = (PFILTER_REQUEST_CONTEXT)(&Request->SourceReserved[0]);
OriginalRequest = (*Context);
}
if ((OriginalRequest != NULL) && (OriginalRequest->RequestId == RequestId))
{
FILTER_RELEASE_LOCK(&Open->OIDLock, bFalse);
NdisFCancelOidRequest(Open->AdapterHandle, RequestId);
}
else
{
FILTER_RELEASE_LOCK(&Open->OIDLock, bFalse);
}
}
//-------------------------------------------------------------------
_Use_decl_annotations_
VOID
NPF_OidRequestComplete(
NDIS_HANDLE FilterModuleContext,
PNDIS_OID_REQUEST Request,
NDIS_STATUS Status
)
{
POPEN_INSTANCE Open = (POPEN_INSTANCE) FilterModuleContext;
PNDIS_OID_REQUEST OriginalRequest;
PFILTER_REQUEST_CONTEXT Context;
BOOLEAN bFalse = FALSE;
TRACE_ENTER();
Context = (PFILTER_REQUEST_CONTEXT)(&Request->SourceReserved[0]);
OriginalRequest = (*Context);
//
// This is an internal request
//
if (OriginalRequest == NULL)
{
TRACE_MESSAGE1(PACKET_DEBUG_LOUD, "Status= %p", Status);
NPF_InternalRequestComplete(Open, Request, Status);
TRACE_EXIT();
return;
}
FILTER_ACQUIRE_LOCK(&Open->OIDLock, bFalse);
ASSERT(Open->PendingOidRequest == Request);
Open->PendingOidRequest = NULL;
FILTER_RELEASE_LOCK(&Open->OIDLock, bFalse);
//
// Copy the information from the returned request to the original request
//
switch(Request->RequestType)
{
case NdisRequestMethod:
OriginalRequest->DATA.METHOD_INFORMATION.OutputBufferLength = Request->DATA.METHOD_INFORMATION.OutputBufferLength;
OriginalRequest->DATA.METHOD_INFORMATION.BytesRead = Request->DATA.METHOD_INFORMATION.BytesRead;
OriginalRequest->DATA.METHOD_INFORMATION.BytesNeeded = Request->DATA.METHOD_INFORMATION.BytesNeeded;
OriginalRequest->DATA.METHOD_INFORMATION.BytesWritten = Request->DATA.METHOD_INFORMATION.BytesWritten;
break;
case NdisRequestSetInformation:
OriginalRequest->DATA.SET_INFORMATION.BytesRead = Request->DATA.SET_INFORMATION.BytesRead;
OriginalRequest->DATA.SET_INFORMATION.BytesNeeded = Request->DATA.SET_INFORMATION.BytesNeeded;
break;
case NdisRequestQueryInformation:
case NdisRequestQueryStatistics:
default:
OriginalRequest->DATA.QUERY_INFORMATION.BytesWritten = Request->DATA.QUERY_INFORMATION.BytesWritten;
OriginalRequest->DATA.QUERY_INFORMATION.BytesNeeded = Request->DATA.QUERY_INFORMATION.BytesNeeded;
break;
}
(*Context) = NULL;
NdisFreeCloneOidRequest(Open->AdapterHandle, Request);
NdisFOidRequestComplete(Open->AdapterHandle, OriginalRequest, Status);
TRACE_EXIT();
}
Below is the mail I received from Jeffrey; I think it is the best answer to this question :)
The packet filter works differently for LWFs versus Protocols. Let me give you some background. You’ll already know some of this, I’m sure, but it’s always helpful to review the basics, so we can be sure that we’re both on the same page. The NDIS datapath is organized like a tree:
Packet filtering happens at two places in this stack:
(a) once in the miniport hardware, and
(b) at the top of the stack, just below the protocols.
NDIS will track each protocol’s packet filter separately, for efficiency. If one protocol asks to see ALL packets (promiscuous mode), then the other protocols do not have to sort through all that traffic. So really, there are (P+1) different packet filters in the system, where P is the number of protocols:
Now if there are all these different packet filters, how does an OID_GEN_CURRENT_PACKET_FILTER actually work? NDIS tracks each protocol’s packet filter, but also merges the filters at the top of the miniport stack. So suppose protocol0 requests a packet filter of A+B, and protocol1 requests a packet filter of C, and protocol2 requests a packet filter of B+D:
Then at the top of the stack, NDIS merges the packet filters to A+B+C+D. This is what gets sent down the filter stack, and eventually to the miniport.
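Since packet filters are bit flags, the merge in this example amounts to a bitwise OR. A small sketch in plain C, where FLAG_A to FLAG_D and merge_packet_filters are hypothetical stand-ins for the letters used above, not NDIS identifiers:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags standing in for A, B, C and D in the example. */
enum {
    FLAG_A = 1u << 0,
    FLAG_B = 1u << 1,
    FLAG_C = 1u << 2,
    FLAG_D = 1u << 3
};

/* OR together the packet filter of every protocol; the result is the
   single filter that would be sent down the stack. */
static uint32_t merge_packet_filters(const uint32_t *filters, size_t count)
{
    uint32_t merged = 0;
    for (size_t i = 0; i < count; ++i) {
        merged |= filters[i];
    }
    return merged;
}
```

Feeding it the three filters from the example (A+B, C, B+D) yields A+B+C+D, and changing any single entry cannot remove a flag that another entry still requests.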
Because of this merging process, no matter what protocol2 sets as its packet filter, protocol2 cannot affect the other protocols. So protocols don’t have to worry about “sharing” the packet filter. However, the same is not true for a LWF. If LWF1 decides to set a new packet filter, it does not get merged:
In the above picture, LWF1 decided to change the packet filter to C+E. This overwrote the protocols’ packet filter of A+B+C+D, meaning that flags A, B, and D will never make it to the hardware. If the protocols were relying on flags A, B, or D, then the protocols’ functionality will be broken.
This is by design – LWFs have great power, and they can do anything to the stack. They are designed to have the power to veto the packet filters of all other protocols. But in your case, you don’t want to mess with other protocols; you want your filter to have minimal effects on the rest of the system.
So what you want to do is to always keep track of what the packet filter is, and never remove flags from the current packet filter. That means that you should query the packet filter when your filter attaches, and update your cached value whenever you see an OID_GEN_CURRENT_PACKET_FILTER come down from above.
If your usermode app needs more flags than what the current packet filter has, you can issue the OID and add additional flags. This means that the hardware’s packet filter will have more flags. But no protocol’s packet filter will change, so the protocols will still see the same stuff.
In the above example, the filter LWF1 is playing nice. Even though LWF1 only cares about flag E, LWF1 has still passed down all flags A, B, C, and D too, since LWF1 knows that the protocols above it want those flags to be set.
The code to manage this isn’t too bad, once you get the idea of what needs to be done to manage the packet filter:
Always track the latest packet filter from protocols above.
Never let the NIC see a packet filter that has fewer flags than the protocols’ packet filter.
Add in your own flags as needed.
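The three rules above can be sketched as a little bookkeeping around two cached values. This is plain C for illustration only, under the assumption that the filter sees every OID_GEN_CURRENT_PACKET_FILTER set request from above; filter_state, on_upper_filter_set and on_own_flags_changed are hypothetical names, not NDIS APIs.

```c
#include <stdint.h>

/* Hypothetical per-filter-module bookkeeping; not an NDIS structure. */
typedef struct {
    uint32_t upper_filter; /* latest packet filter requested from above */
    uint32_t own_flags;    /* extra flags this filter wants for itself  */
} filter_state;

/* Call when a packet filter set request arrives from the protocols.
   Returns the value to pass down instead of the raw request. */
static uint32_t on_upper_filter_set(filter_state *s, uint32_t requested)
{
    s->upper_filter = requested;
    return s->upper_filter | s->own_flags;
}

/* Call when the user-mode app changes what the filter itself needs.
   Returns the value to put in the filter's own OID request. */
static uint32_t on_own_flags_changed(filter_state *s, uint32_t flags)
{
    s->own_flags = flags;
    return s->upper_filter | s->own_flags;
}
```

Because both paths return the OR of the two cached values, the value sent toward the NIC never has fewer flags than the protocols asked for, even when the filter drops its own flags again.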
Ok, hopefully that gives you a good idea of what the packet filter is and how to manage it. The next question is how to map “promiscuous mode” and “non-promiscuous mode” into actual flags? Let’s define these two modes carefully:
Non-promiscuous mode: The capture tool only sees the receive traffic that the operating system would normally have received. If the hardware filters out traffic, then we don’t want to see that traffic. The user wants to diagnose the local operating system in its normal state.
Promiscuous mode: Give the capture tool as many receive packets as possible – ideally every bit that is transferred on the wire. It doesn’t matter whether the packet was destined for the local host or not. The user wants to diagnose the network, and so wants to see everything happening on the network.
I think when you look at it that way, the consequences for the packet filter flags are fairly straightforward. For non-promiscuous mode, do not change the packet filter. Just let the hardware packet filter be whatever the operating system wants it to be. Then for promiscuous mode, add in the NDIS_PACKET_TYPE_PROMISCUOUS flag, and the NIC hardware will give you everything it possibly can.
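The mapping from capture mode to flags can be written down in a few lines. A sketch in plain C: capture_mode and filter_for_mode are hypothetical names, and PACKET_TYPE_PROMISCUOUS is redefined locally with the same value as NDIS_PACKET_TYPE_PROMISCUOUS so that the example builds without the WDK headers.

```c
#include <stdint.h>

/* Same value as NDIS_PACKET_TYPE_PROMISCUOUS in ntddndis.h, redefined
   here only so the sketch compiles outside the WDK. */
#define PACKET_TYPE_PROMISCUOUS 0x00000020u

typedef enum { CAPTURE_NORMAL, CAPTURE_PROMISCUOUS } capture_mode;

/* os_filter is whatever packet filter the operating system asked for.
   Non-promiscuous capture leaves it untouched; promiscuous capture only
   adds the promiscuous bit and never clears anything. */
static uint32_t filter_for_mode(uint32_t os_filter, capture_mode mode)
{
    if (mode == CAPTURE_PROMISCUOUS) {
        return os_filter | PACKET_TYPE_PROMISCUOUS;
    }
    return os_filter;
}
```

Note that neither branch adds a loopback flag, which matches the advice later in the mail that an LWF should normally not set one.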
So if it’s that simple for a LWF, why did the old protocol-based NPF driver need so many more flags? The old protocol-based driver had a couple problems:
It can’t get “non-promiscuous mode” perfectly correct
It can’t easily capture the send-packets of other protocols
The first problem with NPF-protocol is that it can’t easily implement our definition of “non-promiscuous mode” correctly. If NPF-the-protocol wants to see receive traffic just as the OS sees it, then what packet filter should it use? If it sets a packet filter of zero, then NPF won’t see any traffic. So NPF can set a packet filter of Directed|Broadcast|Multicast. But that’s only an assumption of what TCPIP and other protocols are setting. If TCPIP decided to set a Promiscuous flag (certain socket flags cause this to happen), then NPF would actually be seeing fewer packets than what TCPIP would see, which is wrong. But if NPF sets the Promiscuous flag, then it will see more traffic than TCPIP would see, which is also wrong. So it’s tough for a capturing protocol to decide which flags to set so that it sees exactly the same packets that the rest of the OS sees. LWFs don’t have that problem, since LWFs get to see the combined OID after all protocols’ filters are merged.
The second problem with NPF-protocol is that it needed loopback mode to capture sent-packets. LWFs don’t need loopback -- in fact, it would be actively harmful. Let’s use the same diagram to see why. Here’s NPF capturing the receive path in promiscuous mode:
Now let’s see what happens when a unicast packet is received:
Since the packet matches the hardware’s filter, the packet comes up the stack. Then when the packet gets to the protocol layer, NDIS gives the packet to both protocols, tcpip and npf, since both protocols’ packet filters match the packet. So that works well enough.
But now the send path is tricky:
tcpip sent a packet, but npf never got a chance to see it! To solve this problem, NDIS added the notion of a “loopback” packet filter flag. This flag is a little bit special, since it doesn’t go to the hardware. Instead, the loopback packet filter tells NDIS to bounce all send-traffic back up the receive path, so that diagnostics tools like npf can see the packets. It looks like this:
Now the loopback path is really only used for diagnostics tools, so we haven’t spent much time optimizing it. And, since it means that all send packets travel across the stack twice (once for the normal send path, and again in the receive path), it has at least double the CPU cost. This is why I said that an NDIS LWF would be able to capture at a higher throughput than a protocol, since LWFs don’t need the loopback path.
Why not? Why don’t LWFs need loopback? Well if you go back and look at the last few diagrams, you’ll see that all of our LWFs saw all the traffic – both send and receive – without any loopback. So the LWF meets the requirements of seeing all traffic, without needing to bother with loopback. That’s why a LWF should normally never set any loopback flags.
Ok, that email got longer than I wanted, but I hope that clears up some of the questions around the packet filter, the loopback path, and how LWFs are different from protocols. Please let me know if anything wasn’t clear, or if the diagrams didn’t come through.