Question
I am wondering why we connect to sockets using the hton* functions to take care of endianness when we could have sent the IP as a plain char array.
Say we want to connect to 184.54.12.169
There is an explanation for this, but I cannot figure out why we use integers instead of char, and so get ourselves involved in endianness hell.
I think char out_ip[] = "184.54.12.169" could theoretically have done it.
Please explain the subtleties I'm missing here.
The basic networking APIs are low-level functions. They are very thin wrappers around kernel system calls. Removing these low-level functions and forcing everything to use strings would be rather bad for a low-level API like that, especially considering how tedious string handling is in C. As a concrete hurdle, IP strings are not even fixed length, so handling them is a lot more complex than handling plain 32-bit integers. And moving string handling into the kernel goes against what a kernel is supposed to be; handling arbitrary user strings is really a user-space problem.
So, you want to create higher-level functions which would accept strings and do the conversion in the library. But, adding such higher level "convenience" functions all over the place in the core libraries would bloat them, because certainly passing IP numbers is not the only place for such convenience. These functions would need to be maintained forever and included everywhere, after they became part of standard (official like POSIX, or de-facto) libraries.
So, removing the low-level functions is not really an option, and adding more functions for higher-level API in the same library is not a good option either.
So the solution is to use another library to provide a higher-level networking API, which could for example handle address strings directly. I'm not sure what's out there for C, but it's almost a given for other languages, which also have "real" strings built in, so using them is not a hassle.
Because that's how an IP is transmitted in a packet. The "www.xxx.yyy.zzz" string form is really just a human readable form of a 4 byte integer that allows us to see the hierarchical nature a little easier. Sending a whole string would take up a lot more space as well.
Take the number 127536: as a string it requires 7 bytes (six digits plus a terminator), not four. In addition, you need to parse it.
I.e. integers are more efficient, and the receiver does not have to deal with invalid values.
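To make the "string versus 4 bytes" point concrete, here is a minimal sketch using the POSIX inet_pton() call. The helper name ipv4_text_to_bytes is made up for this illustration; it turns the dotted text into the 4 bytes that actually travel in the packet and rejects text that is not a valid address.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: parse dotted-quad text into the 4 address bytes.
 * Returns 1 on success, 0 if the text is not a valid IPv4 address. */
int ipv4_text_to_bytes(const char *text, uint8_t out[4])
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return 0;                      /* parsing and validation happen here */

    /* s_addr is already stored in network byte order, so a plain copy
     * yields the bytes in the order they appear in the address. */
    memcpy(out, &addr.s_addr, 4);
    return 1;
}
```

Calling it with "184.54.12.169" fills out with the bytes 184, 54, 12, 169. The string form needs 14 bytes including its terminator, and the parsing step is done once in user space instead of on every packet.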
Related
I need to serialize a C struct to a file in a portable way, so that I can read the file on other machines and can be guaranteed that I will get the same thing that I put in.
The file format doesn't matter as long as it is reasonably compact (writing out the in-memory representation of a struct would be ideal if it wasn't for the portability issues.)
Is there a clean way to easily achieve this?
You are essentially designing a binary network protocol, so you may want to use an existing library (like Google's protocol buffers). If you still want to design your own, you can achieve reasonable portability of writing raw structs by doing this:
Pack your structs (GCC's __attribute__((packed)), MSVC's #pragma pack). This is compiler-specific.
Make sure your integer endianness is correct (htons, htonl). This is architecture-specific.
Do not use pointers for strings (use character buffers).
Use C99 exact integer sizes (uint32_t etc).
Ensure that the code only compiles where CHAR_BIT is 8, which is the most common, or otherwise handles transformation of character strings to a stream of 8-bit octets. There are some environments where CHAR_BIT != 8, but they tend to be special-purpose hardware.
With this you can be reasonably sure you will get the same result on the other end as long as you are using the same struct definition. I am not sure about floating point numbers representation, however, but I usually avoid sending those.
Another thing, unrelated to portability, that you may want to address is backwards compatibility, by introducing a length as the first field and/or using a version tag.
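A rough sketch of the checklist above, with one deliberate variation: instead of packing the struct and calling htonl/htons, it writes each field byte by byte at a fixed offset, which avoids the compiler-specific packing pragmas. The struct record and its fields are invented for the example.

```c
#include <stdint.h>
#include <string.h>

#define NAME_LEN    16                   /* fixed character buffer, no pointers */
#define RECORD_SIZE (4 + 2 + NAME_LEN)   /* size of the encoded form in bytes */

struct record {                          /* hypothetical example struct */
    uint32_t id;
    uint16_t flags;
    char     name[NAME_LEN];
};

/* Write each field at a fixed offset, most significant byte first. */
void record_encode(const struct record *r, uint8_t out[RECORD_SIZE])
{
    out[0] = (uint8_t)(r->id >> 24);
    out[1] = (uint8_t)(r->id >> 16);
    out[2] = (uint8_t)(r->id >> 8);
    out[3] = (uint8_t)(r->id);
    out[4] = (uint8_t)(r->flags >> 8);
    out[5] = (uint8_t)(r->flags);
    memcpy(out + 6, r->name, NAME_LEN);
}

/* Rebuild the struct from the encoded bytes, independent of host endianness. */
void record_decode(const uint8_t in[RECORD_SIZE], struct record *r)
{
    r->id = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
            ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    r->flags = (uint16_t)(((unsigned)in[4] << 8) | in[5]);
    memcpy(r->name, in + 6, NAME_LEN);
}
```

The encoded buffer can be written with fwrite and read back with fread on any machine. Because the shifts operate on values rather than on memory layout, neither struct padding nor host byte order affects the file contents.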
You could try using a library such as protocol buffers; rolling your own is probably not worth the effort.
Write one function for output.
Use fprintf to print an ASCII representation of each field to the file,
one field per line.
Write one function for input.
Use fgets to load each line from the file.
Use sscanf to convert to binary, directly into the field in your structure.
If you plan on doing this with a lot of different structures,
consider adding a header to each file, which identifies what kind of structure
it represents.
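As a sketch of the text-based approach just described, assuming a made-up struct point: one function prints each field on its own line, and the other reads a line at a time with fgets and converts it with sscanf.

```c
#include <stdio.h>

struct point {              /* hypothetical example struct */
    int    x;
    int    y;
    double weight;
};

/* Output: one field per line, as ASCII text. Returns 0 on success. */
int point_write(FILE *fp, const struct point *p)
{
    /* %.17g prints enough digits for a double to survive the round trip */
    if (fprintf(fp, "%d\n%d\n%.17g\n", p->x, p->y, p->weight) < 0)
        return -1;
    return 0;
}

/* Input: load one line per field and convert it directly into the struct.
 * Returns 0 on success, -1 on a missing or malformed line. */
int point_read(FILE *fp, struct point *p)
{
    char line[128];

    if (!fgets(line, sizeof line, fp) || sscanf(line, "%d", &p->x) != 1)
        return -1;
    if (!fgets(line, sizeof line, fp) || sscanf(line, "%d", &p->y) != 1)
        return -1;
    if (!fgets(line, sizeof line, fp) || sscanf(line, "%lf", &p->weight) != 1)
        return -1;
    return 0;
}
```

The file is larger than a binary dump, but it is independent of endianness and struct layout, and it can be inspected in a text editor.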
novice to aes. in reading http://en.wikipedia.org/wiki/AES_implementations, I am a bit surprised. I should need just one function
char16 *aes128(char16 key, char16 *secrets, int len);
where char16 is an 8*16=128bit character type. and, presumably, ignoring memory leaks,
assert( bcmp( anystring, aes128(anykey, aes128(anykey, anystring, len), len), len ) == 0 );
I am looking over the description of the algorithm on wikipedia, and although I can see myself making enough coding mistakes to take me a few days to debug my own implementation, it does not seem too complex. maybe 100 lines? I did see versions in C#, such as Using AES encryption in C#. that seem themselves almost as long as the algorithm itself. earlier recommendations on stackoverflow mostly recommend the use of individual functions inside larger libraries, but it would be nice to have a go-to function for this task that one could compile into one's code.
so, is AES implementation too complex to be for the faint of heart? or is it reasonably short and simple?
how many lines does a C implementation take? is there a self-contained aes128() C function already in free form somewhere for the taking?
another question: is each block independently encoded? presumably, it would strengthen the encryption if the first block would create a salt that the second block would then use. otoh, this would mean that disk corruption of one block would make every subsequent block undecryptable.
/iaw
You're not seeing a single function like you expect because there are so many options. For example, the block encoding mechanism you described (CBC) is just one option or mode in AES encryption. See here for more information: http://www.heliontech.com/aes_modes_basic.htm
The general rule of thumb in any language is: Don't reinvent something that's already been done and done well. This is especially true in anything related to cryptography.
well, using just the bare AES function is basically insecure, because with key K any block X will always be encoded to the same block Y, which is too much information to give an attacker... (according to cryptographers)
so you use some method to vary what goes into the block cipher at each block. you can use a nonce or Cipher Block Chaining or some other mode. there is a pretty good example on wikipedia (the penguin picture): http://en.wikipedia.org/wiki/Electronic_code_book#Electronic_codebook_.28ECB.29
so in short you can implement AES in one function that is secure (as a block cipher), but it isn't secure if you have data that is longer than 16 bytes.
also AES is fairly complex because of all the round keys... I wouldn't really want to implement it, especially with all of the many good implementations around, but I guess it wouldn't be so bad if you had a good reason to do it.
so in short, to construct a secure stream cipher from a block cipher you need to adopt some strategy to change the effective key along the stream.
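To illustrate the "same block in, same block out" problem, here is a small sketch. Important assumption: toy_encrypt_block is a made-up stand-in, NOT AES and not secure in any way; it only serves as a keyed, deterministic 16-byte transform so the difference between the two modes is visible. In real code a vetted AES implementation would take its place.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK 16

/* TOY stand-in for a block cipher: NOT AES and NOT secure. */
static void toy_encrypt_block(const uint8_t key[BLOCK],
                              const uint8_t in[BLOCK], uint8_t out[BLOCK])
{
    for (int i = 0; i < BLOCK; i++)
        out[i] = (uint8_t)((in[i] ^ key[i]) + 31u * (unsigned)i);
}

/* ECB: every block is encrypted independently.
 * len must be a multiple of BLOCK. */
void ecb_encrypt(const uint8_t key[BLOCK],
                 const uint8_t *in, uint8_t *out, size_t len)
{
    for (size_t off = 0; off < len; off += BLOCK)
        toy_encrypt_block(key, in + off, out + off);
}

/* CBC: each plaintext block is XORed with the previous ciphertext block
 * (the IV for the first block) before it is encrypted.
 * len must be a multiple of BLOCK. */
void cbc_encrypt(const uint8_t key[BLOCK], const uint8_t iv[BLOCK],
                 const uint8_t *in, uint8_t *out, size_t len)
{
    uint8_t prev[BLOCK], mixed[BLOCK];

    memcpy(prev, iv, BLOCK);
    for (size_t off = 0; off < len; off += BLOCK) {
        for (int i = 0; i < BLOCK; i++)
            mixed[i] = (uint8_t)(in[off + i] ^ prev[i]);
        toy_encrypt_block(key, mixed, out + off);
        memcpy(prev, out + off, BLOCK);
    }
}
```

Encrypting two identical plaintext blocks with ecb_encrypt gives two identical ciphertext blocks, which is what the penguin picture shows. With cbc_encrypt the second block depends on the first, which is also why corrupting one ciphertext block damages the decryption of the block after it.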
ok, so I found a reasonable standalone implementation:
http://www.literatecode.com/aes256
About 400 lines. I will probably use this one.
hope it helps others, too.
When writing safe code in straight C, I'm sick and tired of coming up with arbitrary
numbers to represent limitations -- specifically, the maximum amount of
memory to allocate for a single line of text. I know I can always say
stuff like
#define MAX_LINE_LENGTH 1024
and then pass that macro to functions such as snprintf().
I work and code in NetBSD, which has a sysctl(3) variable called
"user.line_max" designed for this very purpose. So I don't need to come up
with an arbitrary number like MAX_LINE_LENGTH, above. I just read the
"user.line_max" sysctl variable, which by the way is settable by the user.
My question is whether this is the Right Thing in terms of safety and
portability. Perhaps different operating systems have a different name for
this sysctl, but I'm more interested in whether I should be using this
technique at all.
And for the record, "portability" excludes Microsoft Windows in this case.
Well, the Linux sysctl(2) man page has this to say in the Notes section:
Glibc does not provide a wrapper for this system call; call it using syscall(2).
Or rather... don't call it: use of this system call has long been discouraged, and it is so unloved that it is likely to disappear in a future kernel version. Remove it from your programs now; use the /proc/sys interface instead.
So that is one consideration.
Not a good idea. Even if it weren't for what Duck told you, relying on a system-wide setting that's runtime-variable is bad design and error-prone. If you're going to go to the trouble of having buffer size limits be variable (which typically requires dynamic allocation and checking for failure) then you should go the last step and make it configurable on a more local scope.
With your example of buffer size limits, opinions differ as to what's the best practice. Some people think you should always use dynamically-growing buffers with no hard limit. Others prefer fixed limits sufficiently large that reasonable data would not exceed them. Or, as you've noted, configurable limits are an option. In choosing what's right for your application, I would consider the user experience implications. Sure users don't like arbitrary limits, but they also don't like it when accidentally (or by somebody else's malice) reading data with no newlines in it causes your application to consume unbounded amounts of memory, start swapping, and/or eventually crash or bog down the whole system.
The nearest portable construct for this is "getconf LINE_MAX" or the equivalent C.
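A minimal sketch of that C equivalent, assuming a POSIX system: sysconf(_SC_LINE_MAX) reports the same value as "getconf LINE_MAX". The wrapper name line_max and the fallback policy are choices made for this example.

```c
#include <unistd.h>

/* Hypothetical wrapper: return the system's LINE_MAX, or a fallback
 * when sysconf() reports an error or an indeterminate limit. */
long line_max(void)
{
    long n = sysconf(_SC_LINE_MAX);

    /* 2048 is _POSIX2_LINE_MAX, the minimum POSIX allows for LINE_MAX */
    return (n > 0) ? n : 2048;
}
```

The value is queried at run time, so it still has to be treated as a variable limit (dynamic allocation plus a failure check) rather than as a compile-time array size.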
1) Check out the Single Unix Specification, keyword: "limits"
2) s/safety/security/
I am using C and want to know whether XML messages are preferable to plain text messages for communication over a socket connection.
Is there any other good option available besides XML?
Which is the best parser(or parsing option) available for parsing XML in C?
Is there any standard library which comes with C and helps to parse XML messages?
You design the protocol, so you decide. You can use text or binary communication. Whatever format you use, you decide how to serialize/deserialize and interpret the data. If you use XML, you can leverage XML-RPC or SOAP. You can use JSON-RPC as well. In my last project, I used binary in a very simple yet efficient way: the first field identifies the method/function to call, the next 2 bytes give the length of the data (up to 64K - 1 bytes), and the rest is data. Take note of big/little endianness.
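A sketch of that kind of simple binary framing. The answer does not say how wide the method identifier was, so the 1-byte id here is an assumption, as are the function names; the 2-byte length is written big-endian so both ends agree regardless of host byte order.

```c
#include <stdint.h>
#include <string.h>

#define HEADER_SIZE 3   /* assumed layout: 1-byte method id + 2-byte length */

/* Build a frame in out (capacity cap).
 * Returns the total frame size, or 0 if out is too small. */
size_t frame_pack(uint8_t method, const void *data, uint16_t len,
                  uint8_t *out, size_t cap)
{
    if (cap < (size_t)HEADER_SIZE + len)
        return 0;
    out[0] = method;
    out[1] = (uint8_t)(len >> 8);       /* length, high byte first */
    out[2] = (uint8_t)(len & 0xFF);
    if (len > 0)
        memcpy(out + HEADER_SIZE, data, len);
    return (size_t)HEADER_SIZE + len;
}

/* Parse a frame. Returns a pointer to the payload inside buf,
 * or NULL if buf does not hold a complete frame. */
const uint8_t *frame_unpack(const uint8_t *buf, size_t buflen,
                            uint8_t *method, uint16_t *len)
{
    if (buflen < HEADER_SIZE)
        return NULL;

    uint16_t n = (uint16_t)(((unsigned)buf[1] << 8) | buf[2]);
    if (buflen - HEADER_SIZE < n)
        return NULL;                    /* payload not fully received yet */

    *method = buf[0];
    *len = n;
    return buf + HEADER_SIZE;
}
```

On a stream socket the receiver would keep reading until frame_unpack stops returning NULL, since a single recv call may deliver only part of a frame.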
It's very subjective. You could use validating or non-validating parsers. TinyXML is a lightweight one, though it is a C++ library. For plain C you can look into Mini-XML and Expat. libxml2 is heavier.
So far XML parsing is not in standard libraries of C or C++. You could use the aforementioned libraries.
Good luck!
EDIT:
By the way, if you want to use binary format to exchange data, just use any of these 3:
http://tpl.sourceforge.net/ - C serialization library.
http://www.s11n.net/c11n/ - A powerful and complicated C serialization library.
Siseria - http://sourceforge.net/projects/siseria/ , purely in C. I wrote for an embedded system project. It runs without any dependency and is very fast! Compared to other 2, mine is very simple and does not use heap and dynamic memory at all. Everything is on the stack!
There are any number of possible solutions. I would look at a few other options before picking XML, I think. XML has quite a lot of overhead; unless you're going to compress your streams it might be a bit costly. XML also isn't easy to edit for humans, although of course more so than a binary format.
You might want to look at JSON, it's a very popular format and is far simpler than XML. There are plenty of implementations available.
I can highly recommend Protobuf, Google's data interchange format. We're using it for communicating between two processes, and it works great. It has built-in support for C++, Python, and Java, and third-party libraries for a bunch of others (Jon Skeet maintains the C# port).
The main question is what are your performance requirements.
If you are going to send one message per second, feel free. If you have human interface on one end, use XML or any other text format.
If you design a machine-to-machine interface, you'd rather consider binary data. Remember to convert everything to network-standard byte order in this case.
I'm a bit new to C, but I've done my homework (some tutorials, books, etc.) and I need to program a simple server to handle requests from clients and interact with a db. I've gone through Beej's Guide to Network programming, but I'm a bit unsure how to piece together and handle different parts of the data getting sent back and forth.
For instance, say the client is sending some information that the server will put in multiple fields. How do I piece together that data to be sent and then break it back up on the server side?
Thanks,
Eric
If I understand correctly, you're asking, "how does the server understand the information the client sends it"?
If that's what you're asking, the answer is simple: it's mutually agreed upon ahead of time that the data structures each uses will be compatible. I.e. you decide upon what your communication protocol will be ahead of time.
So, for example, if I have a client-server application where the client connects and can ask for things such as "time", "date" and can say "settime " and "setdate ", I need to write my server in such a way that it will understand those commands.
Obviously, in the above case it's trivial, since it'd just be a text-based protocol. But let's say you're writing an application that will return a struct of information, i.e.
struct Person {
    char* name;
    int age;
    int heightInInches;
    // ... other fields ...
};
You might write the entire struct out from the server/client. In this case there are a few things to be aware of:
You need to apply the hton*/ntoh* conversions properly
You need to make sure that your client and server can both understand the struct in question.
You may or may not have to align on a 4-byte boundary (because if you don't, different C compilers may do different things, which may or may not burn you between the client and the server).
In general, though, when writing a client/server app, the most important thing to get right is the communication protocol.
I'm not sure if this quite answers your question, though. Is this what you were after, or were you asking more about how, exactly, do you use the send/recv functions?
First, you define how the packet will look - what information will be in it. Make sure the definition is in an architecture-neutral format. That means that you specify it in a sequence that does not depend on whether the machine is big-endian or little-endian, for example, nor on whether you are compiling with 32-bit long or 64-bit long values. If the content is of variable length, make sure the definition contains the information needed to tell how long each part is - in particular, each variable length part should be preceded by a suitable count of its length.
When you need to package the data for transmission, you will take the raw (machine-specific) values and write them into a buffer (think 'character array') at the appropriate positions, in the appropriate format.
This buffer will be sent across the wire to the receiver, which will read it into another buffer, and then reverse the process to obtain the information from the buffer into local variables.
There are functions such as ntohs() to convert from a network ('n') to host ('h') format for a 'short' (meaning 16-bit) integer, and htonl() to convert from a host 'long' (32-bit integer) to network format - etc.
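The packaging steps above can be sketched roughly as follows for a person record with a name, an age and a height. The wire layout (2-byte name length, name bytes, then two 32-bit integers) and the function names are assumptions made for this example. The name is sent as counted bytes rather than as a pointer, since a pointer value means nothing on the other machine.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Pack the fields into buf. Returns the number of bytes written,
 * or 0 if buf is too small or the name is too long. */
size_t person_pack(const char *name, uint32_t age, uint32_t height_in,
                   uint8_t *buf, size_t cap)
{
    size_t name_len = strlen(name);

    if (name_len > UINT16_MAX || cap < 2 + name_len + 8)
        return 0;

    uint16_t n16 = htons((uint16_t)name_len);   /* host -> network order */
    uint32_t a32 = htonl(age);
    uint32_t h32 = htonl(height_in);

    uint8_t *p = buf;
    memcpy(p, &n16, 2);        p += 2;          /* count precedes the name */
    memcpy(p, name, name_len); p += name_len;
    memcpy(p, &a32, 4);        p += 4;
    memcpy(p, &h32, 4);        p += 4;
    return (size_t)(p - buf);
}

/* Reverse the process. Returns the number of bytes consumed,
 * or 0 if the input is truncated or name_cap is too small. */
size_t person_unpack(const uint8_t *buf, size_t len,
                     char *name, size_t name_cap,
                     uint32_t *age, uint32_t *height_in)
{
    uint16_t n16;
    uint32_t v;

    if (len < 2)
        return 0;
    memcpy(&n16, buf, 2);
    size_t name_len = ntohs(n16);               /* network -> host order */

    if (len < 2 + name_len + 8 || name_cap < name_len + 1)
        return 0;

    memcpy(name, buf + 2, name_len);
    name[name_len] = '\0';
    memcpy(&v, buf + 2 + name_len, 4);     *age = ntohl(v);
    memcpy(&v, buf + 2 + name_len + 4, 4); *height_in = ntohl(v);
    return 2 + name_len + 8;
}
```

The sender passes the packed buffer to send, and the receiver runs person_unpack on the bytes it collected with recv. Using memcpy instead of pointer casts avoids alignment problems when a field lands at an odd offset.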
One good book for networking is Stevens' "UNIX Network Programming, Vol 1, 3rd Edn". You can find out more about it at its web site, including example code.
As already mentioned above, what you need is a previously agreed means of communication. One thing that helps me is to use XML to communicate.
E.g. if you need to send the time to the client, include it in a tag called time.
Then parse it on the client side and read the tag value.
The biggest advantage is that once you have a parser in place on the client side, then even if you have to send some new information, you just have to agree on a tag name that will be parsed on the client side.
It helps me, I hope it helps you too.