Need base64 encode/decode in C

I need one function to base64-encode a string and one to decode a base64 string in C. I found http://base64.sourceforge.net/b64.c, but its functions work on files, not strings, and add line breaks. I need something that simply encodes and decodes strings. Where can I find such source code?

Get the functions from libb64.

If you have OpenSSL available (most *nix distros seem to ship it these days), it provides robust, well-tested base64 encoding and decoding. This site has a decent code sample: Howto base64 decode with C/C++ and OpenSSL

When I needed to use Base64 encoding to build an encrypted email server, I decided to build my own implementation.
Currently, it lives inside a C++ class, but I wrote the encoding functions without any C++-specific code, so you can copy and paste as you please.
This implementation is not approved by any organization, but it should help you learn how the algorithm works while giving you just base64 encoding, i.e. no extra libraries get included.
https://github.com/AlexBestoso/Base64
Use as you please. The functions take chars, map them to integers via bitwise operations, and then produce your result.
Currently, the output contains no newlines or carriage returns, which the MIME variant specifies.
If you decide to use the code and find any bugs, let me know through GitHub.
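For illustration, here is a minimal sketch of string-based base64 in plain C. It is not the code from libb64, OpenSSL, or the repository above. The names b64_encode and b64_decode are made up for this example, and it assumes the standard alphabet with = padding and no line breaks.

```c
#include <stdlib.h>
#include <string.h>

static const char B64_ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode len bytes from src. Returns a malloc'd, NUL-terminated string
   (caller frees), or NULL if allocation fails. */
char *b64_encode(const unsigned char *src, size_t len)
{
    size_t i, o = 0;
    char *out = malloc(4 * ((len + 2) / 3) + 1);
    if (!out) return NULL;
    for (i = 0; i < len; i += 3) {
        /* Pack up to three bytes into a 24-bit group. */
        unsigned long v = (unsigned long)src[i] << 16;
        if (i + 1 < len) v |= (unsigned long)src[i + 1] << 8;
        if (i + 2 < len) v |= src[i + 2];
        /* Emit four 6-bit symbols, padding with '=' where input ran out. */
        out[o++] = B64_ALPHABET[(v >> 18) & 63];
        out[o++] = B64_ALPHABET[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? B64_ALPHABET[(v >> 6) & 63] : '=';
        out[o++] = (i + 2 < len) ? B64_ALPHABET[v & 63] : '=';
    }
    out[o] = '\0';
    return out;
}

/* Map one base64 symbol to its 6-bit value, or -1 if it is not in the alphabet. */
static int b64_value(char c)
{
    const char *p = (c != '\0') ? strchr(B64_ALPHABET, c) : NULL;
    return p ? (int)(p - B64_ALPHABET) : -1;
}

/* Decode a padded base64 string. Returns a malloc'd buffer (caller frees)
   with a trailing NUL added for convenience and stores the decoded length
   in *out_len. Returns NULL on malformed input or allocation failure. */
unsigned char *b64_decode(const char *src, size_t *out_len)
{
    size_t len = strlen(src), i, o = 0;
    unsigned char *out;
    if (len % 4 != 0) return NULL;
    out = malloc(len / 4 * 3 + 1);
    if (!out) return NULL;
    for (i = 0; i < len; i += 4) {
        unsigned long v = 0;
        int pad = 0, j;
        for (j = 0; j < 4; j++) {
            char c = src[i + j];
            int d = 0;
            if (c == '=' && i + 4 == len && j >= 2) {
                pad++;                      /* padding, only at the very end */
            } else {
                d = b64_value(c);
                if (d < 0 || pad) { free(out); return NULL; }
            }
            v = (v << 6) | (unsigned long)d;
        }
        out[o++] = (unsigned char)(v >> 16);
        if (pad < 2) out[o++] = (unsigned char)(v >> 8);
        if (pad < 1) out[o++] = (unsigned char)v;
    }
    out[o] = '\0';
    *out_len = o;
    return out;
}
```

Calling b64_encode((const unsigned char *)"Hello, World!", 13) yields "SGVsbG8sIFdvcmxkIQ==", and passing that string to b64_decode gives back the original 13 bytes. Both results are heap-allocated, so free them when done.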

Related

Parsing and editing ASN1 binary blob in C

I have a valid encoded ASN1 binary blob, which I want to modify.
Moreover, I don't have the definitions file for the encoded ASN1, but I know its structure (e.g. let's say it's a sequence that contains a few integers and an octet string).
Therefore I'd prefer to modify the encoded binary by iterating over the sequence and its fields, setting new values, and encoding the modified binary blob.
How can I do that? That is, how can I parse the encoded ASN1 binary, modify it, and re-encode it in C? Is there any library that is able to do that?
I'm developing a software module in C for Windows. This is important to note because many libraries are Linux-oriented, and I have had trouble building them for Windows.
Thanks.
I used asn1c for this in a past project. You do need the specification: asn1c generates a decoder and an encoder based on it. It sounds like in your case it wouldn't be hard to write one.
It will work on Windows. The FAQ claims the compiler now requires GCC and cannot be compiled with MSVC though. You can get GCC for Windows from www.mingw.org or Cygwin.

C Language. How to use a string value as delimiter in SSCANF

Is there a way to use a string as a delimiter?
We can use characters as delimiters with sscanf().
Example:
I have
char url[]="username=jack&pwd=jack123&email=jack#example.com";
I can use
char username[100],pwd[100],email[100];
sscanf(url, "username=%[^&]&pwd=%[^&]&email=%[^\n]", username,pwd,email);
It works fine for this string, but for
url="username=jack&jill&pwd=jack&123&email=jack#example.com"
it can't be used. This is to prevent SQL injection, but I want to learn a trick for using
&pwd and &email as delimiters, not necessarily with sscanf.
Update: The solution doesn't necessarily need to be in C. I only want to know of a way to use a string as a delimiter.
Just code your own parsing. In many cases, representing in memory the AST you have parsed is useful. But do specify and document your input language (perhaps using EBNF notation).
Your input language (which you have not defined in your question) seems to be similar to the MIME type application/x-www-form-urlencoded used in HTTP POST requests. So you might look, at least for inspiration, into the source code of free software libraries related to HTTP server processing (like libonion) and HTTP client processing (like libcurl).
You could read an entire line with getline (or perhaps fgets) and then parse it appropriately. sscanf with %n, or strtok, might be useful, but you can also parse the line "manually" (consider writing e.g. your own recursive descent parser). You might also use strchr or strstr.
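As a rough sketch of the strstr approach applied to the string from the question: the helper below (extract_between is a made-up name, not a library function) copies whatever lies between two delimiter strings. It assumes the delimiters "&pwd=" and "&email=" never appear inside a value; if they can, the input format itself is ambiguous and needs escaping.

```c
#include <stddef.h>
#include <string.h>

/* Copy the text that follows the first occurrence of `start` and precedes
   the next occurrence of `end` into `out`. If `end` is NULL, copy up to the
   end of the string. Returns 0 on success, -1 if a delimiter is missing or
   the result does not fit in out_size bytes (including the NUL). */
int extract_between(const char *s, const char *start, const char *end,
                    char *out, size_t out_size)
{
    const char *b = strstr(s, start);
    const char *e;
    size_t n;

    if (!b) return -1;
    b += strlen(start);              /* skip past the opening delimiter */

    if (end) {
        e = strstr(b, end);          /* multi-character closing delimiter */
        if (!e) return -1;
        n = (size_t)(e - b);
    } else {
        n = strlen(b);
    }

    if (n >= out_size) return -1;
    memcpy(out, b, n);
    out[n] = '\0';
    return 0;
}
```

With url set to "username=jack&jill&pwd=jack&123&email=jack#example.com", calling extract_between(url, "username=", "&pwd=", username, sizeof username) stores "jack&jill", the pair "&pwd=" and "&email=" stores "jack&123", and "&email=" with a NULL end stores "jack#example.com".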
BTW, in many cases, using common textual representations like JSON, YAML, XML can be helpful, and you can easily find many libraries to handle them.
Notice also that strings can be processed as FILE* by using fmemopen and/or open_memstream.
You could use parser generators such as bison (with flex).
In some cases, regular expressions could be useful. See regcomp and friends.
So what you want to achieve is quite easy to do and is standard practice. But you need more than just sscanf, and you may want to combine several things.
Many external libraries (e.g. glib from GTK) provide some parsing. And you should care about UTF-8 (today, you have UTF-8 everywhere).
On Linux, if permitted to do so, you might use GNU readline instead of getline when you want interactive input (with editing abilities and autocompletion). Then take inspiration from the source code of GNU bash (or of RefPerSys, if you are interested in C++).
If you are unfamiliar with usual parsing techniques, read a good book such as the Dragon Book. Most large programs deal somewhere with parsing, so you need to know how that can be done.

How to basically encrypt text in an ELF binary?

I've seen some binary files where the developer was a bit paranoid it seems and obfuscated all text in a binary. I hadn't seen anything like it before and didn't find any obvious options to compile an ELF with hidden text. Even standard OS API strings were hidden which was strange given they are usually visible.
These programs wouldn't exactly have any text that isn't exposed when they run, except text I don't know about. But hiding the whole lot just raises red flags and makes the binary look suspicious.
Are there easy ways to hide text that is compiled into an ELF, for instance with simple compiler or linker options? I imagine a decoder could be inserted at main(), but how could the text section be easily encoded?
I can imagine a custom way to do it would be to have an implicit decoder in the code with a key. Then use that key to encode text of the ELF. So that it is easily encoded.
You must have been looking at compressed executable files.
There are various tools available to compress executable files and decompress them at load time, such as UPX for Linux. Most text in the binary file will become unreadable to the naked eye, but be aware that this is a very ineffective way to hide sensitive data, as hackers will have no difficulty decompressing the executable to gain access to the actual data.
Using encrypted strings in your executable, with contents produced by a script during the build process, is a better approach, but the code to decrypt them must still be available somewhere in the executable; it is just harder to locate. If the data is sufficiently valuable (database password, bitcoin keys...), hackers will get it.
I guess that by "text" you mean human readable text (and not the code segment a.k.a. text segment).
You could just encrypt or obfuscate it into a read-only array:
const char encrypted_text[] = {
// a lot of encrypted bytes like 0x01, 0x43, etc
// the C file containing that would be generated by some script
};
Then you'll use your de-obfuscation or decryption routines to get the real (unciphered) text.
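To make that concrete, here is a small sketch using a single-byte XOR, which is obfuscation rather than real encryption. The key value 0x5A, the sample plaintext "Hello", and the function name deobfuscate are all assumptions chosen for this illustration; in practice a build script would generate the byte array.

```c
#include <stddef.h>

#define OBF_KEY 0x5A  /* hypothetical key, also visible in the binary */

/* The bytes of "Hello", each XORed with OBF_KEY. A build-time script would
   normally generate this array, so the plaintext never appears in the
   source or in the output of `strings`. */
static const unsigned char encrypted_text[] = {
    0x12, 0x3F, 0x36, 0x36, 0x35
};

/* Recover the plaintext at runtime. `out` must have room for len + 1 bytes. */
void deobfuscate(const unsigned char *in, size_t len, char *out)
{
    size_t i;
    for (i = 0; i < len; i++)
        out[i] = (char)(in[i] ^ OBF_KEY);
    out[len] = '\0';
}
```

Calling deobfuscate(encrypted_text, sizeof encrypted_text, buf) with a 6-byte buf recovers "Hello". Since the key and the routine both ship in the executable, this only defeats casual inspection.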
I'm not sure it is worth the trouble. Life is too short.
I've normally seen this when analyzing malware. The authors do this to prevent static analysis tools like strings from working. Additionally, such authors might load functions by using dlopen and dlsym to get the functions that they need.
For example, in the code snippet below:
printf("Hello World");
I would see the string "Hello World" in the output of strings, and by looking at the import section of the ELF file, I'd see that the program is making use of printf. So without running the program it is possible to get a sense of what it is doing.
Now let's assume that the author wrote a function char* decrypt(int). This function takes an index into a string table (in which each string is encrypted) and returns the decrypted string. The above one line of code would now notionally look like
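void* pfile = dlopen(decrypt(3), RTLD_LAZY);  /* library name is hidden */
int (*pfunct)(const char*, ...) = (int (*)(const char*, ...))dlsym(pfile, decrypt(15));  /* so is the function name */
pfunct(decrypt(5));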
Again, remember that the above is closer to pseudo-code than actually compilable code. In this case, static analysis tools would not show the strings or the function names (in the import section).
Additionally, if we were attempting to reverse engineer the code, we would need to take time to decrypt the strings and work through the logic to determine which functions are being called. It's not that this can't be done, but it will slow down the analyst, which means it will take longer until a mitigation for the malware is created.
And now to your question:
Are there easy ways to hide text that is compiled into an ELF? Be that
with easy compiler/linking options. I imagine a decoder could be
inserted at main() but how could the text section be easily encoded?
There is no compiler or linker option that does this. The author would need to choose to do this, write the appropriate functions (i.e. decrypt above), and write a utility to produce the encrypted forms of the strings. Additionally, as others have suggested, once this is done the entire application can be encrypted or compressed (think of a self-extracting zip file), so the only thing you see initially with static analysis tools would be the stub that decrypts or decompresses the file.
See https://www.ioactive.com/pdfs/ZeusSpyEyeBankingTrojanAnalysis.pdf for an example of this. (Granted, this is Windows-based, but the techniques for encryption and dynamically loading functions are the same. Look at the section on API calls.)
If interested, you can also see https://www.researchgate.net/publication/224180021_On_the_analysis_of_the_Zeus_botnet_crimeware_toolkit and https://arxiv.org/pdf/1406.5569.pdf

UTF-8 to UTF-16 API wrapper libraries for Windows?

Is there any wrapper library out there that mimics the Windows "ANSI" function names (e.g. CreateFileA), assumes the inputs are in UTF-8, converts them to UTF-16, calls the UTF-16 version of the function (e.g. CreateFileW), and converts the outputs back to UTF-8 for the program?
It would allow ASCII programs to use UTF-8 almost seamlessly.
Rather than wrapping the API functions, it's easier to wrap the strings in a conversion function. Then you'll be future-proof when the next version of Windows adds more API functions.
As others said, there are too many WinAPI functions to make such a library feasible. However one can hack it on the tool-chain level or using something like http://research.microsoft.com/en-us/projects/detours/.
EDIT: Windows 10 added support for the UTF-8 code page in the ANSI API.
There is this thing called WDL, it has some UTF-8 wrappers (win32_utf8). I have never tried it so I don't know how complete the support is.

Modify env variable name inside a library with hex editor?

Is it possible to modify an environment variable's name inside a library with some sort of editor? I'm thinking maybe a hex editor.
I wish to modify the name but without altering its length:
envfoobar (9 chars)
yellowbar (9 chars)
Obviously, recompilation would be perfect but I do not know what exact flags were used to compile this library.
What's stopping you? You can even use a text editor (as long as it's a decent editor that knows how to handle binary data, like vim does). If the library refers to the name of the environment variable through a string, and the string is in the library's data segment (i.e. it's not a string built at runtime), then it's trivial to edit a library in this way. Just don't delete or introduce new characters. I've done this under Linux. Some other OSes may digitally sign binaries and prevent this from working. Some OSes use a standard checksum or hash, in which case you'll have to recompute it.
If you can find the name with the strings command on the library, it might work. You could load the library up in your favorite hex editor, change the string, and give it a shot.
It's a hacky thing to do, but it could work. Let us know.
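If you would rather script the edit than do it by hand, the core of it can be sketched in a few lines of C. The function name patch_same_length is made up for this example, and it assumes you have already read the whole library file into a buffer and will write the buffer back afterwards. It refuses to patch unless both names have the same length, so no offsets in the file shift.

```c
#include <stddef.h>
#include <string.h>

/* Overwrite every occurrence of old_name in buf with new_name.
   Both names must have the same non-zero length so the file size and all
   offsets stay unchanged. Returns the number of replacements, or -1 if
   the lengths differ. */
long patch_same_length(unsigned char *buf, size_t buf_len,
                       const char *old_name, const char *new_name)
{
    size_t n = strlen(old_name);
    size_t i = 0;
    long count = 0;

    if (n == 0 || n != strlen(new_name)) return -1;

    while (i + n <= buf_len) {
        /* memcmp, not strstr: binary data contains embedded NUL bytes. */
        if (memcmp(buf + i, old_name, n) == 0) {
            memcpy(buf + i, new_name, n);
            count++;
            i += n;
        } else {
            i++;
        }
    }
    return count;
}
```

For the names in the question, patch_same_length(buf, len, "envfoobar", "yellowbar") rewrites the name in place. Check the return value: more than one hit may mean the bytes also occur somewhere you did not intend to change.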
