Finding Memory Utilization using GetProcessMemoryInfo

In my application I was trying to determine the memory utilization of a particular process on a Windows machine using the API below.
GetProcessMemoryInfo(hProcess, &info, sizeof(info));
When I checked the value of info.WorkingSetSize, it was exactly 14757395258967641292.
So I want to clarify whether the returned value is in bytes (to the naked eye this cannot be a byte count). If not, how do I convert it into bytes or kilobytes?
void PrintProcessNameAndID( DWORD processID )
{
TCHAR szProcessName[MAX_PATH] = TEXT("<unknown>");
PROCESS_MEMORY_COUNTERS info, info1, info2;
SIZE_T MemoryUsage;
SIZE_T one,two,three, four;
// Get a handle to the process.
HANDLE hProcess = OpenProcess( PROCESS_QUERY_INFORMATION |
PROCESS_VM_READ,
FALSE, processID );
// Get the process name.
if (NULL != hProcess )
{
HMODULE hMod;
DWORD cbNeeded;
if ( EnumProcessModules( hProcess, &hMod, sizeof(hMod),
&cbNeeded) )
{
GetModuleBaseName( hProcess, hMod, szProcessName,
sizeof(szProcessName)/sizeof(TCHAR) );
}
}
// Print the process name and identifier.
//_tprintf( TEXT("%s (PID: %u)"), szProcessName, processID );
GetProcessMemoryInfo(hProcess, &info, sizeof(info));
MemoryUsage = (info.WorkingSetSize);
}

Some Windows processes can only be opened with a lesser access right than PROCESS_QUERY_INFORMATION (e.g. PROCESS_QUERY_LIMITED_INFORMATION).
The result is that OpenProcess may return NULL.
Your code tests for this before getting the process name, but it then calls GetProcessMemoryInfo unconditionally.
That call fails, and info is left as uninitialized memory, so you read a garbage value (0xccccccccccccd000).
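To answer the unit question directly: WorkingSetSize is documented as a byte count. The value quoted in the question is not a real size, though. 14757395258967641292 is 0xCCCCCCCCCCCCCCCC, the pattern that MSVC debug builds write into uninitialized stack variables. A minimal portable sketch that checks for that pattern and converts a byte count to KiB follows; both helper names are hypothetical and not part of any Windows API:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when every byte of v is 0xCC, the fill
 * value MSVC debug builds use for uninitialized stack memory. */
int looks_like_msvc_stack_fill(uint64_t v)
{
    return v == UINT64_C(0xCCCCCCCCCCCCCCCC);
}

/* Hypothetical helper: WorkingSetSize is a byte count, so dividing by
 * 1024 gives KiB (rounded down). */
uint64_t bytes_to_kib(uint64_t bytes)
{
    return bytes / 1024;
}
```

In real code the more important fix is to check the return values of OpenProcess and GetProcessMemoryInfo before reading info at all.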

Related

MmCopyVirtualMemory failing, code is correct

My situation is that MmCopyVirtualMemory almost always (99% of the time) returns STATUS_PARTIAL_COPY.
(I'm operating in a ring 0 driver.)
I've tried many different things, such as different variable sizes and types, different addresses, etc. It always returns STATUS_PARTIAL_COPY.
Nothing online has helped either; it's not a very common error.
Error description:
{Partial Copy} Due to protection conflicts not all the requested bytes could be copied.
My way of reading a processes memory:
DWORD64 Read(DWORD64 SourceAddress, SIZE_T Size)
{
SIZE_T Bytes;
NTSTATUS Status = STATUS_SUCCESS;
DWORD64 TempRead;
DbgPrintEx(0, 0, "\nRead Address:%p\n", SourceAddress); // Always Prints Correct Address
DbgPrintEx(0, 0, "Read szAddress:%x\n", Size); // Prints Correct Size 8 bytes
Status = MmCopyVirtualMemory(Process, SourceAddress, PsGetCurrentProcess(), &TempRead, Size, KernelMode, &Bytes);
DbgPrintEx(0, 0, "Read Bytes:%x\n", Bytes); // Copied bytes - prints 0
DbgPrintEx(0, 0, "Read Output:%p\n", TempRead); // prints 0 as expected since it failed
if (!NT_SUCCESS(Status))
{
DbgPrintEx(0, 0, "Read Failed:%p\n", Status);
return NULL;
}
return TempRead;
}
Example of how I use it:
NetMan = Read(BaseAddr + NET_MAN, sizeof(DWORD64));
//BaseAddr is a DWORD64 and NetMan is also a DWORD64
I've triple-checked my code many times, and all of it appears to be right.
After investigating MiDoPoolCopy (MmCopyVirtualMemory calls this), it seems that it's failing during the move operation:
// MiDoPoolCopy function
// Probe to make sure that the specified buffer is accessible in
// the target process.
//Wont execute (supplied KernelMode)
if ((InVa == FromAddress) && (PreviousMode != KernelMode)){
Probing = TRUE;
ProbeForRead (FromAddress, BufferSize, sizeof(CHAR));
Probing = FALSE;
}
//Failing either here, copying inside the target process's address space to the buffer
RtlCopyMemory (PoolArea, InVa, AmountToMove);
KeUnstackDetachProcess (&ApcState);
KeStackAttachProcess (&ToProcess->Pcb, &ApcState);
//
// Now operating in the context of the ToProcess.
//
//Wont execute (supplied KernelMode)
if ((InVa == FromAddress) && (PreviousMode != KernelMode)){
Probing = TRUE;
ProbeForWrite (ToAddress, BufferSize, sizeof(CHAR));
Probing = FALSE;
}
Moving = TRUE;
//or failing here - moving from the Target Process to Source (target process->Kernel)
RtlCopyMemory (OutVa, PoolArea, AmountToMove);
Here's the SEH handler returning STATUS_PARTIAL_COPY (wrapped in try and except):
//
// If the failure occurred during the move operation, determine
// which move failed, and calculate the number of bytes
// actually moved.
//
*NumberOfBytesRead = BufferSize - LeftToMove;
if (Moving == TRUE) {
//
// The failure occurred writing the data.
//
if (ExceptionAddressConfirmed == TRUE) {
*NumberOfBytesRead = (SIZE_T)((ULONG_PTR)(BadVa - (ULONG_PTR)FromAddress));
}
}
return STATUS_PARTIAL_COPY;
MmCopyVirtualMemory (undocumented function)
NTSTATUS NTAPI MmCopyVirtualMemory
(
PEPROCESS SourceProcess,
PVOID SourceAddress,
PEPROCESS TargetProcess,
PVOID TargetAddress,
SIZE_T BufferSize,
KPROCESSOR_MODE PreviousMode,
PSIZE_T ReturnSize
);
Here is the source for both MmCopyVirtualMemory and MiDoPoolCopy:
https://lacicloud.net/custom/open/leaks/Windows%20Leaked%20Source/wrk-v1.2/base/ntos/mm/readwrt.c
Any help would be greatly appreciated, I've been stuck on this for too long...
I know this is old, but I had the same issue and fixed it, so maybe this helps someone who stumbles on it in the future.
Reading your code, the issue is the target address you copy the data to. You are passing the address of a single local DWORD64, which at run time is just a variable (in a register or on the stack) and not a memory region you can write to.
The solution is to provide a proper buffer that you allocated beforehand. The example uses malloc; in a kernel driver the equivalent would be a pool allocation such as ExAllocatePoolWithTag.
Example (you want to read a DWORD64, based on your code with some adjustments):
DWORD64* buffer = malloc(sizeof(DWORD64));
Status = MmCopyVirtualMemory(Process, (void*)SourceAddress, PsGetCurrentProcess(), (void*)buffer, sizeof(DWORD64), UserMode, &Bytes);
Do not forget to free your buffer after use to prevent a memory leak.

Execute PE from memory

I tried to write my own LoadLibrary function in C.
The function gets a path to a DLL file which, normally, pops up message box when it loads (I tried to run that DLL file using the original LoadLibrary function and it works).
Basically, the DLL content is loaded into a buffer, parsed and runs from entry point.
In VirtualAllocEx function, I use PAGE_READWRITE protection mode. Then, when running the line f(nth->OptionalHeader.ImageBase, DLL_PROCESS_ATTACH, NULL), I get the following error message: Exception thrown at 0x10011032 in PE.exe: 0xC0000005: Access violation executing location 0x10011032. (0x10011032 is entry point address).
If I change the mode to PAGE_EXECUTE_READWRITE, the error message is: Exception thrown at 0x00019644 in PE.exe: 0xC0000005: Access violation executing location 0x00019644. (No idea what is that address).
I think it's clear why it's not smart to allow execution in all sections of the PE, but I did it for testing purposes only. In the final code, I'll need to set the protections properly.
My code is attached.
(BTW, if you have other suggestions that are not related to my question, I'd be glad to hear them.)
#include <Windows.h>
typedef HMODULE func(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved);
HMODULE LoadLibraryFromMem(char* dllPath)
{
DWORD read;
HANDLE handle;
handle = CreateFileA(dllPath, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
DWORD size = GetFileSize(handle, NULL);
PVOID vDll = VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE);
BOOL r = ReadFile(handle, vDll, size, &read, NULL);
CloseHandle(handle);
PIMAGE_DOS_HEADER dosh = (PIMAGE_DOS_HEADER)vDll;
PIMAGE_NT_HEADERS nth = (PIMAGE_NT_HEADERS)((PBYTE)vDll + dosh->e_lfanew);
handle = GetCurrentProcess();
PVOID vImg = VirtualAllocEx(
handle,
nth->OptionalHeader.ImageBase,
nth->OptionalHeader.SizeOfImage,
MEM_RESERVE | MEM_COMMIT,
PAGE_READWRITE
); // HERE --> PAGE_EXECUTE_READWRITE
WriteProcessMemory(
handle,
vImg,
vDll,
nth->OptionalHeader.SizeOfHeaders,
0
);
PIMAGE_SECTION_HEADER sech = IMAGE_FIRST_SECTION(nth);
for (size_t i = 0; i < nth->FileHeader.NumberOfSections; i++)
WriteProcessMemory(
handle,
(PBYTE)vImg + sech[i].VirtualAddress,
(PBYTE)vDll + sech[i].PointerToRawData,
sech[i].SizeOfRawData,
0
);
PIMAGE_IMPORT_DESCRIPTOR impd = nth->OptionalHeader.ImageBase + nth->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress;
HANDLE proc;
while (((PIMAGE_IMPORT_DESCRIPTOR)impd)->Name)
{
LPSTR dllName = (nth->OptionalHeader.ImageBase + ((PIMAGE_IMPORT_DESCRIPTOR)impd)->Name);
HMODULE dllAdr = LoadLibraryA(dllName);
PDWORD iat = nth->OptionalHeader.ImageBase + ((PIMAGE_IMPORT_DESCRIPTOR)impd)->FirstThunk;
while (*iat)
{
LPSTR funcName = ((PIMAGE_IMPORT_BY_NAME)(nth->OptionalHeader.ImageBase + *iat))->Name;
proc = GetProcAddress(dllAdr, funcName);
if (!proc)
return NULL;
impd->FirstThunk = proc;
iat++;
}
impd++;
}
func* f = (func*)(nth->OptionalHeader.ImageBase + nth->OptionalHeader.AddressOfEntryPoint);
f(nth->OptionalHeader.ImageBase, DLL_PROCESS_ATTACH, NULL);
}
int main()
{
LoadLibraryFromMem("mydll.dll");
return 0;
}
Thanks in advance!
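One step the loader above skips is validating the headers before trusting e_lfanew. A minimal, portable sketch of that check follows. It assumes only the standard PE layout: the "MZ" magic at offset 0, the little-endian e_lfanew field at offset 0x3C, and the "PE\0\0" signature at the offset e_lfanew names. The function name pe_nt_headers_offset is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the NT headers inside buf,
 * or -1 if buf does not look like a PE image. */
ptrdiff_t pe_nt_headers_offset(const unsigned char *buf, size_t size)
{
    uint32_t e_lfanew;

    if (size < 0x40)                        /* too small for a DOS header */
        return -1;
    if (buf[0] != 'M' || buf[1] != 'Z')     /* DOS signature */
        return -1;

    /* e_lfanew is a 32-bit little-endian value at offset 0x3C. */
    e_lfanew = (uint32_t)buf[0x3C]
             | ((uint32_t)buf[0x3D] << 8)
             | ((uint32_t)buf[0x3E] << 16)
             | ((uint32_t)buf[0x3F] << 24);

    /* The 4-byte NT signature must lie entirely inside the buffer. */
    if (e_lfanew > size || size - e_lfanew < 4)
        return -1;
    if (memcmp(buf + e_lfanew, "PE\0\0", 4) != 0)
        return -1;

    return (ptrdiff_t)e_lfanew;
}
```

This only guards the first two dereferences. It does not address the access violation itself, and a complete loader would also have to check the results of CreateFileA, ReadFile and VirtualAllocEx.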

Linked list in Linux kernel freezes machine

I wrote a kernel module that needs to push messages to user space. The idea is that the kernel module buffers the message and signals the user space program, then the user space program goes and gets the message by requesting it over a netlink socket. My problem is that after buffering 90 messages, the machine locks and I need to restart. I can't figure out what I'm doing wrong, and I'm using linked lists elsewhere in the kernel module successfully.
//
// A message from the kernel space to user space.
//
typedef struct CoreLinkMessage
{
unsigned int id;
char* data;
unsigned int length;
struct list_head list; // kernel's list structure
} CoreLinkMessage;
This function initializes the list and semaphore:
// Constructor
void
ctsRtNetlinkSystem_init( void )
{
sema_init(&cmd_sem_, 1);
INIT_LIST_HEAD(&cmd_list_.list);
}
This is the function that must be causing the problem. It simply pushes an item on to the tail of the linked list. If I comment out adding items to the linked list and only call a signal, the program runs indefinitely, so I don't think the problem is the signaling.
//
// Allows the kernel module to buffer messages until requested by
// the user space
//
void
ctsRtNetlinkSystem_addMessage(char* data, unsigned int length)
{
CoreLinkMessage* msg;
int sem_ret;
BOOL doSignal = FALSE;
//
// LOCK the semaphore
//
sem_ret = down_interruptible(&cmd_sem_);
if ( !sem_ret )
{
msg = (CoreLinkMessage*)kmalloc(sizeof(CoreLinkMessage), GFP_KERNEL );
if ( msg == NULL )
{
PRINTF(CTSMSG_INFO
"ctsRtNetlinkSystem_addMessage failed to allocate memory! \n" );
goto unlock;
}
memset( msg, 0, sizeof(CoreLinkMessage) );
msg->data = (char*)kmalloc( length, GFP_KERNEL );
if ( msg->data == NULL )
{
kfree( msg );
PRINTF(CTSMSG_INFO
"ctsRtNetlinkSystem_addMessage failed to allocate data memory!\n" );
goto unlock;
}
memcpy( msg->data, data, length );
msg->length = length;
lastMessageId_ += 1;
msg->id = lastMessageId_;
list_add_tail(&(msg->list), &cmd_list_.list);
doSignal = TRUE;
unlock:
up( &cmd_sem_ );
if ( doSignal )
sendMessageSignal( msg->id );
}
else
{
PRINTF(CTSMSG_INFO
"CtsRtNetlinkSystem_addMessage -- failed to get semaphore\n" );
}
}
//
// Signal the user space that a message is waiting. Pass along the message
// id
//
static BOOL
sendMessageSignal( unsigned int id )
{
int ret;
struct siginfo info;
struct task_struct *t;
memset(&info, 0, sizeof(struct siginfo));
info.si_signo = SIGNAL_MESSAGE;
info.si_code = SI_QUEUE; // this is bit of a trickery:
// SI_QUEUE is normally used by sigqueue
// from user space,
// and kernel space should use SI_KERNEL.
// But if SI_KERNEL is used the real_time data
// is not delivered to the user space signal
// handler function.
// tell the user space application the index of the message
// real time signals may have 32 bits of data.
info.si_int = id;
rcu_read_lock();
//find the task_struct associated with this pid
t = // find_task_by_pid_type( PIDTYPE_PID, registeredPid_ );
// find_task_by_pid_type_ns(PIDTYPE_PID, nr, &init_pid_ns);
pid_task(find_vpid(registeredPid_), PIDTYPE_PID);
if(t == NULL)
{
PRINTF(CTSMSG_INFO
"CtsRtNetlinkSystem::sendMessageSignal -- no such pid\n");
rcu_read_unlock();
registeredPid_ = 0;
return FALSE;
}
rcu_read_unlock();
//send the signal
ret = send_sig_info(SIGNAL_MESSAGE, &info, t);
if (ret < 0)
{
PRINTF(CTSMSG_INFO
"CtsRtNetlinkSystem::sendMessageSignal -- \n"
"\t error sending signal %d \n", ret );
return FALSE;
}
return TRUE;
}
I'm currently testing the program on a VM, so I created a timer that ticks every 7 seconds and adds a message to the buffer.
//
// Create a timer to call the process thread
// with nanosecond resolution.
//
static void
createTimer(void)
{
hrtimer_init(
&processTimer_, // instance of process timer
CLOCK_MONOTONIC, // Pick a specific clock. CLOCK_MONOTONIC is
// guaranteed to move forward, no matter what.
// It's akin to jiffies tick count
// CLOCK_REALTIME matches the current real-world time
HRTIMER_MODE_REL ); // Timer mode (HRTIMER_ABS or HRTIMER_REL)
processTimer_.function = &cyclic_task;
processTimerNs_ = ktime_set(1, FREQUENCY_NSEC);
//
// Start the timer. It will callback the .function
// when the timer expires.
//
hrtimer_start(
&processTimer_, // instance of process timer
processTimerNs_, // time, nanoseconds
HRTIMER_MODE_REL ); // HRTIMER_REL indicates that time should be
// interpreted relative
// HRTIMER_ABS indicates time is an
// absolute value
}
static enum hrtimer_restart
cyclic_task(struct hrtimer* timer)
{
char msg[255];
sprintf(msg, "%s", "Testing the buffer.");
ctsRtNetlink_send( &msg[0], strlen(msg) );
hrtimer_forward_now(
&processTimer_,
processTimerNs_ );
return HRTIMER_RESTART;
}
Thanks in advance for any help.
Though your code flow is not very clear from the question, I feel the list addition may not be the problem. You must handle the list elsewhere too, where you remove messages from it. I suspect a deadlock somewhere between the list addition and the removal. Also check the place where you copy the messages to user space, remove them from the list, and free them. I assume you are not trying to reference msg directly from user space, as one of the commenters suggested above.
Also,
memset( msg, 0, sizeof(CoreLinkMessage) );
if ( msg == NULL )
{
These two statements have to be in the reverse order; otherwise, if the allocation has failed, memset writes through a NULL pointer and your system is doomed.
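Using GFP_ATOMIC instead of GFP_KERNEL for kmalloc solved the problem. Three days of run time so far, and no crashing. I suspect the cause is that code run from an hrtimer callback executes in atomic (interrupt) context and must not sleep, whereas a GFP_KERNEL allocation may sleep.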
msg = (CoreLinkMessage*)kmalloc(sizeof(CoreLinkMessage), GFP_ATOMIC );
Thanks everyone for your insights!
Insufficient memory allocated
Be sure to allocate enough memory for the string length + 1 to store its terminator.
When sending, length + 1 may be needed as well.
// ctsRtNetlink_send( &msg[0], strlen(msg) );
ctsRtNetlink_send( &msg[0], strlen(msg) + 1); // +1 for \0
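The same point as a small user-space sketch: the length that gets allocated and copied has to include the terminator, otherwise the receiver has no '\0' to stop at. The helper name copy_with_terminator is hypothetical, and malloc stands in for kmalloc here only so the example can run outside the kernel:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicates src into a new heap buffer, including
 * the terminating '\0'. The number of bytes copied is stored in *out_len
 * when out_len is not NULL. Returns NULL if the allocation fails. */
char *copy_with_terminator(const char *src, size_t *out_len)
{
    size_t len = strlen(src) + 1;   /* +1 for the '\0' terminator */
    char *dst = malloc(len);

    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len);          /* copies the terminator as well */
    if (out_len != NULL)
        *out_len = len;
    return dst;
}
```

For the test string "Testing the buffer." strlen reports 19, so 20 bytes are allocated and copied.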

CreateProcess problems when using PROC_THREAD_ATTRIBUTE_PREFERRED_NODE or PROC_THREAD_ATTRIBUTE_GROUP_AFFINITY

I keep getting error 87, ERROR_INVALID_PARAMETER, when I call CreateProcess and use the PROC_THREAD_ATTRIBUTE_GROUP_AFFINITY extended attribute. I use the exact same code to call CreateRemoteThreadEx, and that works fine. Also, PROC_THREAD_ATTRIBUTE_PREFERRED_NODE seems to have no effect. So what am I doing wrong?
Microsoft Windows Server 2008 R2 Enterprise, 6.1.7601 SP1 Build 7601
I even installed this service pack: "A child process cannot be created by calling a CreateProcess function that uses the PROC_THREAD_ATTRIBUTE_PREFERRED_NODE parameter in Windows 7 or in Windows Server 2008 R2"
Here is example code:
#include <windows.h>
#include <stdio.h> // printf
typedef unsigned __int64 QWORD;
class CErr {
public:
CErr(LPCSTR szFunc, DWORD nErr) {
char szBuf[0x10000];
DWORD fFlags = FORMAT_MESSAGE_IGNORE_INSERTS|FORMAT_MESSAGE_FROM_SYSTEM;
DWORD fLang = MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT);
if (!nErr)
nErr = GetLastError();
FormatMessage(fFlags, NULL, nErr, fLang, szBuf, sizeof(szBuf) - 1, NULL);
printf("%s: %s", szFunc, szBuf);
}
};
int main(int argc, char* argv[])
{
DWORD nErr;
size_t cb;
char sAttribsBuf[4096];
auto pAttribs = (PPROC_THREAD_ATTRIBUTE_LIST)sAttribsBuf;
if (!InitializeProcThreadAttributeList(NULL, 1, 0, &cb)
&& ((nErr = GetLastError()) != ERROR_INSUFFICIENT_BUFFER))
throw CErr("InitializeProcThreadAttributeList", nErr);
if (!InitializeProcThreadAttributeList(pAttribs, 1, 0, &cb))
throw CErr("InitializeProcThreadAttributeList", 0);
#if 1 // if enabled, CreateProcess succeeds, but doesn't set affinity
WORD iNuma = 1; // WORD is the only size that does not error here
if (!UpdateProcThreadAttribute(pAttribs, 0, PROC_THREAD_ATTRIBUTE_PREFERRED_NODE,
&iNuma, sizeof(iNuma), NULL, NULL))
throw CErr("UpdateProcThreadAttribute", 0);
#else // if enabled, CreateProcess fails with ERROR_INVALID_PARAMETER(87)
GROUP_AFFINITY GrpAffinity = { 0 };
GrpAffinity.Mask = 1;
if (!UpdateProcThreadAttribute(pAttribs, 0, PROC_THREAD_ATTRIBUTE_GROUP_AFFINITY,
&GrpAffinity, sizeof(GrpAffinity), NULL, NULL))
throw CErr("UpdateProcThreadAttribute", 0);
#endif
auto fCreationFlags = EXTENDED_STARTUPINFO_PRESENT;
PROCESS_INFORMATION pi = { 0 };
STARTUPINFOEX si = { 0 };
si.StartupInfo.cb = sizeof(si);
si.lpAttributeList = pAttribs;
if (!CreateProcess(NULL, "notepad.exe", NULL, NULL, false, fCreationFlags,
NULL, NULL, &si.StartupInfo, &pi))
throw CErr("CreateProcess", 0); // error if ...ATTRIBUTE_GROUP_AFFINITY
// SetProcessAffinityMask(pi.hProcess,1); // if enabled, notepad's affinity is set
WaitForSingleObject(pi.hProcess, INFINITE);
DeleteProcThreadAttributeList(pAttribs);
return 0;
}
It is not clear from the documentation, but I think I figured it out. PROC_THREAD_ATTRIBUTE_PREFERRED_NODE is only supposed to be used with CreateProcess(). PROC_THREAD_ATTRIBUTE_IDEAL_PROCESSOR and PROC_THREAD_ATTRIBUTE_GROUP_AFFINITY are only supposed to be used with CreateThread().
PROC_THREAD_ATTRIBUTE_PREFERRED_NODE might be setting the affinity of the process to all the processors in the same group as the node. I can't verify it, since my test system only has 12 cores on two NUMA nodes. Setting PROC_THREAD_ATTRIBUTE_PREFERRED_NODE to 0 or to 1 sets the affinity to all the cores. I did verify that the stack of the process created by CreateProcess is located on the NUMA node indicated by PROC_THREAD_ATTRIBUTE_PREFERRED_NODE. Also not documented: the size of the node number passed in must be 2 bytes (a WORD).

How to know the address range when searching for a function by its signature?

I'm trying to search for a function by its "signature".
However, I can't figure out what address range I'm supposed to be searching.
I've had a look at VirtualQuery() and GetNativeSystemInfo(), but I'm not sure whether I'm on the right path.
Edit: Question re-attempt.
Using Win32 API I'm trying to find out how to get the start and end address of the executable pages of the process my code is executing in.
This is what I've tried:
SYSTEM_INFO info;
ZeroMemory( &info, sizeof( SYSTEM_INFO ) );
GetNativeSystemInfo( &info ); // GetSystemInfo() might be wrong on WOW64.
info.lpMinimumApplicationAddress;
info.lpMaximumApplicationAddress;
HANDLE thisProcess = GetCurrentProcess();
MEMORY_BASIC_INFORMATION memInfo;
ZeroMemory( &memInfo, sizeof( memInfo ) );
DWORD addr = (DWORD)info.lpMinimumApplicationAddress;
do
{
if ( VirtualQueryEx( thisProcess, (LPVOID)addr, &memInfo, sizeof( memInfo ) ) == 0 )
{
DWORD gle = GetLastError();
if ( gle != ERROR_INVALID_PARAMETER )
{
std::stringstream str;
str << "VirtualQueryEx failed with: " << gle;
MessageBoxA( NULL, str.str().c_str(), "Error", MB_OK );
}
break;
}
if ( memInfo.Type == MEM_IMAGE )
{
// TODO: Scan this memory block for the signature
}
addr += info.dwPageSize;
}
while ( addr < (DWORD)info.lpMaximumApplicationAddress );
The reason for doing this is that I'm looking for an un-exported function by its signature as asked here:
Find a function by it signature in Windows DLL
See the answer about "code signature scanning".
While this does enumerate an address range, I don't know whether it is correct, since I don't know what the expected range should be. It's just the best I could come up with from looking around MSDN.
The address range when signature scanning a module runs from the start of the code section to the start plus the section size. The start of the code section and its size are in the PE headers. Most tools take the lazy route and scan the entire module (again using the PE headers to get the size, but with the module handle as the start address).
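Once the range is known, the scan itself is a plain byte search with wildcards. A minimal sketch follows, assuming base and size are the start and length of the code section (or of the whole module) obtained from the PE headers as described above. The function name find_signature and the 'x'/'?' mask convention are illustrative choices, not part of any Windows API:

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the first position in
 * [base, base + size) where pattern matches, or NULL if there is none.
 * mask[j] == 'x' means byte j must match exactly; any other character
 * (conventionally '?') is a wildcard. */
const unsigned char *find_signature(const unsigned char *base, size_t size,
                                    const unsigned char *pattern,
                                    const char *mask, size_t pattern_len)
{
    size_t i, j;

    if (pattern_len == 0 || pattern_len > size)
        return NULL;

    /* Stop early enough that the pattern never reads past the range. */
    for (i = 0; i + pattern_len <= size; i++) {
        for (j = 0; j < pattern_len; j++) {
            if (mask[j] == 'x' && base[i + j] != pattern[j])
                break;
        }
        if (j == pattern_len)
            return base + i;
    }
    return NULL;
}
```

Wildcards are useful for bytes that change between builds, such as relative call targets or absolute addresses embedded in the instruction stream.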
