Getting character attributes - c

I am using the WinAPI to get the attribute of the character located at line y, column x of the console screen.
This is what I am trying to do after a call to GetConsoleScreenBufferInfo(GetStdHandle(STD_OUTPUT_HANDLE), &nativeData); where the console cursor is set to the specified location. This doesn't work: it returns the most recently set attribute instead.
How do I obtain the attributes used on all the characters on their locations?
EDIT:
The code I used to test ReadConsoleOutput() : http://hastebin.com/atohetisin.pl
It returns garbage values.

I see several problems off the top of my head:
No error checking. You must check the return value for ReadConsoleOutput and other functions, as documented. If the function fails, you must call GetLastError() to get the error code. If you don't check for errors, you're flying blind.
You don't allocate a buffer to receive the data in. (Granted, the documentation confusingly implies that it allocates the buffer for you, but that's obviously wrong since there's no way for it to return a pointer to it. Also, the sample code clearly shows that you have to allocate the buffer yourself. I've added a note.)
It looks as if you had intended to read the characters you had written, but you are writing to (10,5) and reading from (0,0).
You're passing newpos, which is set to (10,5), as dwBufferCoord when you call ReadConsoleOutput, but you specified a buffer size of (2,1). It doesn't make sense for the target coordinates to be outside the buffer.
Taking those last two points together I think perhaps you have dwBufferCoord and lpReadRegion confused, though I'm not sure what you meant the coordinates (200,50) to do.
You're interpreting CHAR_INFO as an integer in the final printf statement. The first element of CHAR_INFO is the character itself, not the attribute. You probably wanted to say chiBuffer[0].Attributes rather than just chiBuffer[0]. (Of course, this is moot at the moment, since chiBuffer points to a random memory address.)
If you do want to retrieve the character, you'll first need to work out whether the console is in Unicode or ASCII mode, and retrieve UnicodeChar or AsciiChar accordingly.

Related

Difficulties understanding how to take elements from a file and store them in C

I'm working on an assignment that is supposed to go over the basics of reading a file and storing the information from that file. I'm personally new to C and struggling with the lack of a "String" variable.
The file that the program is supposed to work with contains temperature values, but we are supposed to account for "corrupted data". The assignment states:
"Every input item read from the file should be treated as a stream of characters (string), you can use the function atof() to convert a string value into a floating point number (invalid data can be set to a value lower than the lowest minimum to identify it as corrupt)."
The number of elements in the file is undetermined but an example given is:
37.8, 38.a, 139.1, abc.5, 37.9, 38.8, 40.5, 39.0, 36.9, 39.8
After reading the file we're supposed to allow a user to query these individual entries, but as mentioned if the data entry contains a non-numeric value, we are supposed to state that the specific data entry is corrupted.
Overall, I understand how to functionally write a program that can fulfill those requirements. My issue is not knowing what data structure to use and/or how to store the information to be called upon later.
The closest to an actual string datatype which you find in C is a sequence of chars which is terminated by a '\0' value. That is used for most things which you'd expect to do with strings.
Storing them just requires sufficient memory, as offered by a sufficiently large array of char, or by malloc().
I think the requirements of your assignment would be met by making a char array as buffer, then reading in with fgets(), making sure to not read more than fits into your array and making sure that there is a '\0' at the end.
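Then you can use atof() on the content of the array and handle corrupted input if the conversion fails. Note, though, that atof() has no way to report failure: it returns 0.0 for input such as "abc.5" and silently converts the leading "38." of "38.a". I would therefore prefer sscanf() for its better feedback via a separate return value.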
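As a minimal sketch of the approach above, the following splits one line (as read by fgets()) on commas and converts each token. It swaps in strtod() rather than atof() or sscanf(), because its end pointer shows whether the whole token was consumed. The names CORRUPT_TEMP, parse_temperature and parse_line are hypothetical, and the sentinel value is an assumption about what counts as "lower than the lowest minimum".

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel: assumed to be below any real temperature reading. */
#define CORRUPT_TEMP (-1000.0)

/* Convert one token to a double. Returns CORRUPT_TEMP unless the whole
 * token (ignoring surrounding whitespace) is a valid number. */
double parse_temperature(const char *token)
{
    char *end;
    double value = strtod(token, &end);   /* skips leading whitespace itself */

    if (end == token)                     /* nothing converted, e.g. "abc.5" */
        return CORRUPT_TEMP;
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;                            /* allow trailing whitespace */
    if (*end != '\0')                     /* leftover characters, e.g. "38.a" */
        return CORRUPT_TEMP;
    return value;
}

/* Split one comma-separated line into at most max values.
 * The line is modified in place. Returns the number of values stored. */
size_t parse_line(char *line, double *out, size_t max)
{
    size_t count = 0;
    char *token = strtok(line, ",");

    while (token != NULL && count < max) {
        out[count++] = parse_temperature(token);
        token = strtok(NULL, ",");
    }
    return count;
}
```

A caller would read each line with fgets() into a char array, pass it to parse_line(), and later report any entry equal to CORRUPT_TEMP as corrupted when the user queries it.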

What are the alignments referred to when discussing the strings section of a process address space

I'm trying to write a program to expose the arguments of other pids on macOS. I've made the KERN_PROCARGS2 sysctl call, but it turns out that everyone and their dog uses this wrong, including Apple's ps and Google's Chrome. The exec family of functions all allow you to pass an empty string as argv[0], which is not great, but it can happen and so must be dealt with. In this case, the standard approach of skipping forward past the NULLs following the exec_path in the returned buffer doesn't work, as the last NULL before the rest of the arguments is actually the terminating NULL of an empty string. You wind up skipping an argument you didn't mean to, which can result in printing an env var as an argument (I've confirmed this behaviour in many programs).
To do this properly one must calculate how many nulls to skip, instead of skipping them all every time. There are references around the web to the different parts of the returned buffer being pointer aligned, however no matter what part of the buffer I try to check with len % 8 I don't get a correct count of padding NULLs.
https://github.com/apple/darwin-xnu/blob/main/bsd/kern/kern_sysctl.c#L1528
https://lists.apple.com/archives/darwin-kernel/2012/Mar/msg00025.html
https://chromium.googlesource.com/crashpad/crashpad/+/refs/heads/master/util/posix/process_info_mac.cc#153
I wrote a library to do this correctly: https://getargv.narzt.cam

Parsing an iCalendar file in C

I am looking to parse iCalendar files using C. I already have a structure set up and the file reading in place, and want to parse line by line into components.
For example I would need to parse something like the following:
UID:uid1@example.com
DTSTAMP:19970714T170000Z
ORGANIZER;CN=John Doe;SENT-BY="mailto:smith@example.com":mailto:john.doe@example.com
CATEGORIES:Project Report, XYZ, Weekly Meeting
DTSTART:19970714T170000Z
DTEND:19970715T035959Z
SUMMARY:Bastille Day Party
Here are some of the rules:
The first word on each line is the property name
The property name will be followed by a colon (:) or a semicolon (;)
If it is a colon, then the property value is everything to the right of the colon, up to the end of the line
A further layer of complexity is added here, as a comma-separated list of values is allowed, which would then be stored in an array. So CATEGORIES, for example, would have 3 elements in an array for the values
If a semicolon follows the property name, then optional parameters follow
The optional parameter format is ParamName=ParamValue. Again a comma separated list is supported here.
There can be more than one optional parameter as seen on the ORGANIZER line. There would just be another semicolon followed by the next parameter and value.
And to throw in yet another wrench, quotations are allowed in the values. If something is in quotes for the value it would need to be treated as part of the value instead of being part of the syntax. So a semicolon in a quotation would not mean that there is another parameter it would be part of the value.
I was going about this using strchr() and strtok() and have got some basic elements from that, however it is getting very messy and unorganized and does not seem to be the right way to do this.
How can I implement such a complex parser with the standard C libraries (or the POSIX regex library)? (not looking for whole solution, just starting point)
This answer is supposing that you want to roll your own parser using Standard C. In practice it is usually better to use an existing parser because they have already thought of and handled all the weird things that can come up.
My high level approach would be:
Read a line
Pass pointer to start of this line to a function parse_line:
Use strcspn on the pointer to identify the location of the first : or ; (aborting if no marker found)
Save the text so far as the property name
While the parsing pointer points to ;:
Call a function extract_name_value_pair passing address of your parsing pointer.
That function will extract and save the name and value, and update the pointer to point to the ; or : following the entry. Of course this function must handle quote marks in the value and the fact that there might be ; or : in the value
(At this point the parsing pointer is always on :)
Pass the rest of the string to a function parse_csv which will look for comma-separated values (again, being aware of quote marks) and store the results it finds in the right place.
The functions parse_csv and extract_name_value_pair should in fact be developed and tested first. Make a test suite and check that they work properly. Then write your overall parser function which calls those functions as needed.
Also, write all the memory allocation code as separate functions. Think of what data structure you want to store your parsed result in. Then code up that data structure, and test it, entirely independently of the parsing code. Only then, write the parsing code and call functions to insert the resulting data in the data structure.
You really don't want to have memory management code mixed up with parsing code. That makes it exponentially harder to debug.
When making a function that accepts a string (e.g. all three named functions above, plus any other helpers you decide you need) you have a few options as to their interface:
Accept pointer to null-terminated string
Accept pointer to start and one-past-the-end
Accept pointer to start, and integer length
Each way has its pros and cons: it's annoying to write null terminators everywhere and then unwrite them later if need be; but it's also annoying when you want to use strcspn or other string functions but you received a length-counted piece of string.
Also, when the function needs to let the caller know how much text it consumed in parsing, you have two options:
Accept pointer to character; return the number of characters consumed. The calling function adds the two together to find where parsing stopped
Accept pointer to pointer to character, and update the pointer to character. Return value could then be used for an error code.
There's no one right answer; with experience you will get better at deciding which option leads to the cleanest code.
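To make the pointer-to-pointer interface concrete, here is a minimal sketch of a quote-aware helper that the functions described above could be built on. The names find_unquoted and take_field are hypothetical, and the sketch assumes that a double quote always toggles quoting (no escape sequences), which may be too simple for real iCalendar data.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first character of s that appears in stops and
 * is not inside double quotes, or to the terminating '\0' if there is none. */
const char *find_unquoted(const char *s, const char *stops)
{
    int in_quotes = 0;

    for (; *s != '\0'; s++) {
        if (*s == '"')
            in_quotes = !in_quotes;        /* assumption: no escaped quotes */
        else if (!in_quotes && strchr(stops, *s) != NULL)
            return s;
    }
    return s;
}

/* Copy the text from *cursor up to the next unquoted stop character into
 * out, then advance *cursor so it points at that stop character.
 * Returns 0 on success, or -1 if the text does not fit in out. */
int take_field(const char **cursor, const char *stops,
               char *out, size_t outsize)
{
    const char *end = find_unquoted(*cursor, stops);
    size_t len = (size_t)(end - *cursor);

    if (outsize == 0 || len >= outsize)
        return -1;
    memcpy(out, *cursor, len);
    out[len] = '\0';
    *cursor = end;
    return 0;
}
```

Called with stops set to ":;" at the start of a line, take_field() yields the property name and leaves the cursor on the separator, so the caller can branch on whether it is ; or :. The same helper with "=" and then ":;" would walk through each parameter name and value, with quoted values copied including their quote marks.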

Prevent crash in string manipulation crashing whole application

I created a program which at regular intervals downloads a text file from a website, which is in csv format, and parses it, extracting relevant data, which then is displayed.
I have noticed that occasionally, every couple of months or so, it crashes. The crash is rare, considering the cycle of data downloading and parsing can happen every 5 minutes or even less. I am pretty sure it crashes inside the function that parses the string and extracts the data. When it crashes it happens during a congested internet connection, i.e. heavy downloads and/or a slow connection. Occasionally the remote site may be handing corrupt or incomplete data.
I used a test application which saves the data to be processed before processing it and it indeed shows it was not complete when a crash happens.
I have adapted the function to accommodate a number of cases of invalid or incomplete data, and it checks all return values. I also check the return values of the various functions used to connect to the remote site and download the data, and do not proceed when a return value indicates failure.
The core of the function uses strsep() to walk through the data and extract information out of it:
/*
 * delimiters typically contains: <;>, <">, < >
 * strsep() is used to split part of the string using delimiter
 * and copy into token which then is copied into the array
 * normally the function stops way before ARRAYSIZE which is just a safeguard
 * it would normally stop when the end of file is reached, i.e. \0
 */
for(n=0;n<ARRAYSIZE;n++)
{
    token=strsep(&copy_of_downloaded_data, delimiters);
    if (token==NULL)
        break;
    data->array[n].example=strndup(token, strlen(token));
    if (data->array[n].example!=NULL)
    {
        token=strsep(&copy_of_downloaded_data, delimiters);
        if (token==NULL)
            break;
        (..)
    copy_of_downloaded_data=strchr(copy_of_downloaded_data,'\n'); /* find newline */
    if (copy_of_downloaded_data==NULL)
        break;
    copy_of_downloaded_data=copy_of_downloaded_data+1;
    if (copy_of_downloaded_data=='\0') /* find end of text */
        break;
}
Since I suspect I can not account for all ways in which data can be corrupted I would like to know if there is a way to program this so the function when run does not crash the whole application in case of corrupted data.
If that is not possible what could I do to make it more robust.
Edit: One possible instance of a crash is when the data ends abruptly, where the middle of a field is cut off, i.e.
"test","example","this data is brok
At least I noticed it by looking through the saved data, however I found it not being consistent. Will have to stress test it as was suggested below.
The best thing to do would be to figure out what input causes the function to crash, and fix the function so that it does not crash. Since the function is doing string processing, this should be possible to do by feeding it lots of dummy/test data (or feeding it the "right" test data if it's a particular input that causes the crash). You basically want to torture-test the function until you find out how to make it crash on demand; at that point you can start investigating exactly where and why it crashes, and once you understand that, the necessary changes to fix the crash will probably become obvious to you.
Running the program under valgrind might also point you to the bug.
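One candidate worth testing first, offered as an assumption rather than a confirmed diagnosis: strsep() sets the string pointer to NULL once it has returned the last token, so on truncated input a later strchr() on that pointer could be handed NULL. The sketch below shows a loop that stops safely in that case; count_tokens is a hypothetical name, and the feature-test macro assumes glibc.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc, where this exposes strsep() */
#include <stddef.h>
#include <string.h>

/* Count the non-empty tokens in text (modified in place), stopping
 * safely once strsep() reports that the input is exhausted. */
size_t count_tokens(char *text, const char *delimiters)
{
    size_t count = 0;
    char *rest = text;

    while (rest != NULL) {      /* strsep() sets rest to NULL after the last token */
        char *token = strsep(&rest, delimiters);
        if (token == NULL)
            break;
        if (*token != '\0')     /* skip empty tokens between adjacent delimiters */
            count++;
    }
    return count;
}
```

Feeding a function like this the truncated sample from the question is one way to start the torture test: the point is that every use of the remaining-text pointer is guarded by a NULL check, not only the returned token.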
If for some reason you can't fix the bug, the other option is to spawn a child process and run the buggy code inside the child process. That way if it crashes, only the child process is lost and not the parent. (You can spawn the child process under most OS's by calling fork(); you'll need to come up with some way for the child process to communicate its results back to the parent process, of course). (Note that doing it this way is a kludge and will likely not be very efficient, and could also introduce a security hole into your application if someone malicious who has the ability to send your program input can figure out how to manipulate the bug in order to take control of the child process -- so I don't recommend this approach!)
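For completeness, here is a minimal POSIX sketch of the child-process idea, with the same caveats as above. The names run_isolated, job_ok and job_crash are hypothetical; the two jobs only stand in for the real parsing code, and the sketch reports nothing but an exit status back to the parent, so real results would still need a pipe or similar channel.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run job() in a child process. Returns the child's exit status (0-255)
 * if it exited normally, or -1 if it crashed or could not be started. */
int run_isolated(int (*job)(void))
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;                  /* fork failed */
    if (pid == 0)
        _exit(job() & 0xff);        /* child: report the result as exit status */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                      /* terminated by a signal, e.g. SIGSEGV */
}

/* Hypothetical stand-ins for the parsing function. */
int job_ok(void)
{
    return 0;
}

int job_crash(void)
{
    raise(SIGSEGV);                 /* simulate a crash inside the parser */
    return 0;
}
```

With this shape, run_isolated(job_crash) returns -1 while the parent keeps running, which is the only property the approach buys; it does nothing to fix the underlying bug.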
What does the coredump point to?
strsep() has no memory synchronization mechanism, so protect it as a critical section (take a lock while you call strsep())?
See whether strsep() can handle a big chunk (ARRAYSIZE is not going to help you here).
Check the stack size of the thread/program that receives copy_of_downloaded_data (I know you are only referencing it, so look at the function that receives it).
I would suggest that one should try to write code that keeps track of string lengths deliberately and doesn't care whether strings are zero-terminated or not. Even though null pointers have been termed the "billion dollar mistake"(*) I think zero-terminated strings are far worse. While there may be some situations where code using zero-terminated strings might be "simpler" than code that tracks string lengths, extra effort required to make sure that nothing can cause string-handling code to exceed buffer boundaries exceeds that required when working with known-length strings.
If, for example, one wants to store the concatenation of strings of length length1 and length2 into a buffer of length BUFF_SIZE, one can test easily whether length1+length2 <= BUFF_SIZE if one isn't expecting strings to be null-terminated, or length1+length2 < BUFF_SIZE if one expects a gratuitous null byte to follow every string. When using zero-terminated strings, one would have to determine the length of the two strings before concatenation, and having done so one could just as well use memcpy() rather than strcpy() or the useless strcat().
(*) There are many situations where it's much better to have a recognizably-invalid pointer than to require that pointers which can't point to anything meaningful must instead point to something meaningless. Many null-pointer related problems actually stem from a failure of implementations to trap arithmetic with null pointers; it's not fair to blame null pointers for problems that could have been, but weren't avoided.
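The concatenation example above could be sketched as follows, for the case where no terminating null byte is stored. The function name concat_counted and the BUFF_SIZE value are assumptions made for illustration; the bounds check is written so that the addition of the two lengths cannot overflow.

```c
#include <stddef.h>
#include <string.h>

#define BUFF_SIZE 16   /* assumed buffer size, for illustration only */

/* Concatenate two length-counted strings into buff, which must hold
 * BUFF_SIZE bytes. No terminator is written. Returns the combined
 * length, or (size_t)-1 if the result would not fit. */
size_t concat_counted(char *buff,
                      const char *s1, size_t length1,
                      const char *s2, size_t length2)
{
    /* Same test as length1 + length2 <= BUFF_SIZE, without overflow. */
    if (length1 > BUFF_SIZE || length2 > BUFF_SIZE - length1)
        return (size_t)-1;
    memcpy(buff, s1, length1);
    memcpy(buff + length1, s2, length2);
    return length1 + length2;
}
```

Because the caller gets the combined length back, nothing downstream needs to rescan the buffer for a terminator.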

How can I get how many bytes sscanf_s read in its last operation?

I wrote up a quick memory reader class that emulates the same functions as fread and fscanf.
Basically, I used memcpy and increased an internal pointer to read the data like fread, but I have a fscanf_s call. I used sscanf_s, except that doesn't tell me how many bytes it read out of the data.
Is there a way to tell how many bytes sscanf_s read in the last operation in order to increase the internal pointer of the string reader? Thanks!
EDIT:
An example format I am reading is:
|172|44|40|128|32|28|
fscanf reads that fine, so does sscanf. The only issue is that, if it were to be:
|0|0|0|0|0|0|
The length would be different. What I'm wondering is how fscanf knows where to put the file pointer, but sscanf doesn't.
With scanf and family, use %n in the format string. It won't read anything in, but it will cause the number of characters read so far (by this call) to be stored in the corresponding parameter (expects an int*).
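A minimal sketch of that idea, applied to the |172|44| format from the question. It uses standard sscanf rather than sscanf_s, on the assumption that %n behaves the same way there, which is not verified here; read_field is a hypothetical helper name.

```c
#include <stdio.h>

/* Read one "|number" field starting at *cursor and advance *cursor
 * past it. Returns 1 on success, 0 if no field could be read. */
int read_field(const char **cursor, int *value)
{
    int consumed = 0;

    /* %n stores the number of characters consumed so far and is not
     * counted in the return value, so one conversion means success. */
    if (sscanf(*cursor, "|%d%n", value, &consumed) != 1)
        return 0;
    *cursor += consumed;
    return 1;
}
```

Calling read_field() in a loop walks through the buffer field by field regardless of how many digits each number has, which is the bookkeeping that fscanf does internally with the file position.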
Maybe I'm silly, but I'm going to try anyway. It seems from the comment threads that there's still some misconception. You need to know the number of bytes, but the function returns only the number of fields read, or EOF.
To get the number of bytes, either use something that you can easily count, or use a size specifier in the format string. Otherwise, you won't stand a chance of finding out how many bytes were read, other than going over the fields one by one. Also, what you may mean is that
sscanf_s(source, "%d%d"...)
will succeed on both inputs "123 456" and "10\t30", which have different lengths. In these cases, there's no way to tell the size unless you convert it back. So: use a fixed-size field, or be left in oblivion....
Important note: remember that when using %c it's the only way to include the field separators (newline, tab and space) in the output. All others will skip the field boundaries, making it harder to find the right amount of bytes.
EDIT:
From "C++ The Complete Reference" I just read that:
%n Receives an integer value equal to the number of characters read so far
Isn't that precisely what you were after? Just add it in the format string. This is confirmed here, but I haven't tested it with sscanf_s.
From MSDN:
sscanf_s, _sscanf_s_l, swscanf_s, _swscanf_s_l
Each of these functions returns the number of fields successfully converted and assigned; the return value does not include fields that were read but not assigned. A return value of 0 indicates that no fields were assigned. The return value is EOF for an error or if the end of the string is reached before the first conversion.

Resources