I have a Windows UI, written in C#, that calls a DLL, written in C.
Data is exchanged between the C# UI and the C DLL using the marshaling techniques available through P/Invoke. Both the UI and the DLL are legacy code.
All of the software runs on the cloud; specifically, on Amazon Web Services (AWS). But it is portable to any cloud service provider (Azure, Google, etc).
I need to write a new piece of C code ("NewCode"), that runs on a separate AWS (or other) instance, that does nothing except read data from a proprietary database and service data requests from the existing DLL.
For lots of reasons, NewCode needs to run on its own instance, so that it has exclusive access to memory, CPU, and disk. NewCode needs to service a variety of data requests: a single number, a char string, an array of numbers, an array of strings, etc. NewCode will be portable C, so it can run under Linux, Unix, etc.
My question:
What are my options for having the existing C DLL communicate with NewCode? I know it is too broad a topic to ask for a list of options and their relative merits, so all I'm asking for here is what should be on the list so I can begin my research. I am a complete newbie in this area, but so far I have determined that sockets and pipes should be on the list. What else should be on the list?
Since NewCode will be communicating over the network, I would look into protocol buffers. Protocol buffers are a compact binary format and would likely be an efficient choice for communicating between two processes on separate machines that may be running different operating systems. There are protocol buffer implementations for many different languages, all of which use the same predefined structure definitions.
Of course, there are other options, like XML, JSON, or your own binary protocol.
https://code.google.com/p/protobuf/
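To make the "own binary protocol" option a little more concrete, here is a minimal sketch of a tagged, length-prefixed message format that could carry a single number or a string between the DLL and NewCode. The tag values, the 5-byte header layout, and all function names are assumptions made up for illustration; they are not part of protocol buffers or any existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message tags; not part of any existing protocol. */
enum msg_type { MSG_INT32 = 1, MSG_STRING = 2 };

/* Store a 32-bit value in big-endian (network) byte order. */
void put_u32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Load a 32-bit big-endian value. */
uint32_t get_u32(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* Frame layout: 1-byte tag, 4-byte payload length, payload bytes.
   Returns the total frame size, or 0 if the buffer is too small. */
size_t encode_msg(uint8_t tag, const void *payload, uint32_t len,
                  uint8_t *out, size_t cap)
{
    assert(out != NULL);
    if (cap < 5 || len > cap - 5)
        return 0;
    out[0] = tag;
    put_u32(out + 1, len);
    if (len > 0)
        memcpy(out + 5, payload, len);
    return 5 + (size_t)len;
}

/* Encode a single number as a fixed 4-byte big-endian payload. */
size_t encode_int32(int32_t value, uint8_t *out, size_t cap)
{
    uint8_t be[4];
    put_u32(be, (uint32_t)value);
    return encode_msg(MSG_INT32, be, 4, out, cap);
}

/* Encode a char string without its terminating NUL. */
size_t encode_string(const char *s, uint8_t *out, size_t cap)
{
    return encode_msg(MSG_STRING, s, (uint32_t)strlen(s), out, cap);
}

/* Parse the header of a received frame. Returns a pointer to the
   payload, or NULL if the frame is truncated. */
const uint8_t *decode_msg(const uint8_t *in, size_t n,
                          uint8_t *tag, uint32_t *len)
{
    if (n < 5)
        return NULL;
    *tag = in[0];
    *len = get_u32(in + 1);
    if (*len > n - 5)
        return NULL;
    return in + 5;
}
```

Fixing the byte order explicitly is what lets a Windows DLL and a Linux server agree on the bytes. Arrays of numbers or strings would need further tags, which is the kind of bookkeeping a library such as protocol buffers takes over.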
I've found that Elixir programs can run C code either via NIFs (native implemented functions) or via OS-level ports. Having read those and similar links, I'm not a hundred percent clear on when to use one method or the other (or something else entirely?), and feel it would be good to have a direct comparison available, for myself and other novices. Can anyone provide one?
What are ports?
Ports are basically separate programs which are run separately from the Erlang VM. The Erlang VM communicates with the running port over standard input/output, and the resulting port lives behind an Erlang process that owns it and can facilitate communication between the port and the rest of your Erlang or Elixir application. Ports are "safe" in the sense that if the port crashes, it doesn't bring down the whole Erlang VM.
Porcelain might be of interest as a possible improvement and expansion over what's already provided in the Port module. System.cmd/3 also uses ports in its underlying implementation.
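For a sense of what the external side of a port looks like in C, here is a minimal sketch of the framing an external program has to handle. It assumes the port is opened with the `{:packet, 2}` option (for example `Port.open({:spawn, "./my_port"}, [:binary, {:packet, 2}])`, where `./my_port` is a hypothetical executable), so every message in either direction is prefixed with a 2-byte big-endian length. The function names and the echo behavior are illustrative choices only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly len bytes from fd. Returns len, or -1 on EOF or error. */
int read_exact(int fd, uint8_t *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted: retry */
        if (n <= 0)
            return -1;              /* EOF or real error */
        got += (size_t)n;
    }
    return (int)len;
}

/* Write exactly len bytes to fd. Returns len, or -1 on error. */
int write_exact(int fd, const uint8_t *buf, size_t len)
{
    size_t put = 0;
    while (put < len) {
        ssize_t n = write(fd, buf + put, len - put);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        put += (size_t)n;
    }
    return (int)len;
}

/* Read one packet: 2-byte big-endian length, then the payload.
   Returns the payload length, or -1 on EOF, error, or oversized packet. */
int read_packet(int fd, uint8_t *buf, size_t cap)
{
    uint8_t hdr[2];
    assert(buf != NULL);
    if (read_exact(fd, hdr, 2) != 2)
        return -1;
    size_t len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > cap)
        return -1;
    return read_exact(fd, buf, len);
}

/* Write one packet with the same 2-byte length header. */
int write_packet(int fd, const uint8_t *payload, size_t len)
{
    uint8_t hdr[2];
    if (len > 0xFFFF)
        return -1;
    hdr[0] = (uint8_t)(len >> 8);
    hdr[1] = (uint8_t)(len & 0xFF);
    if (write_exact(fd, hdr, 2) != 2)
        return -1;
    return write_exact(fd, payload, len);
}

/* Echo every request back until read_packet fails (EOF, error,
   or a packet larger than the local buffer). */
int port_loop(int in_fd, int out_fd)
{
    uint8_t buf[1024];
    int n;
    while ((n = read_packet(in_fd, buf, sizeof buf)) >= 0) {
        if (write_packet(out_fd, buf, (size_t)n) < 0)
            return -1;
    }
    return 0;
}
```

A port program built on this would call `port_loop(0, 1)` from its `main`, since the VM talks to it over standard input and standard output. If the C code crashes, only this OS process dies and the owning Erlang process is notified.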
What are NIFs?
Native implemented functions or "NIFs" are functions defined in what are essentially shared libraries / DLLs loaded by the Erlang VM and written using some language which exposes a C-compatible ABI. NIFs are more efficient than ports (since they don't have to communicate over STDIN/STDOUT) and are simpler in many respects (since you don't have to deal with encoding and decoding data between your Elixir and non-Elixir codebases), but they're also much less safe; a NIF can crash the Erlang VM, and a long-running NIF can potentially lock up the Erlang VM (since the scheduler can't reason about native code).
What are port drivers?
Port drivers are kind of an in-between approach to integrating external code with an Erlang or Elixir codebase. Like NIFs, they're loaded into the Erlang VM, and a port driver can therefore crash or hang the whole VM. Like ports, they behave similarly to Erlang processes.
When should I use a port?
You want your external code to behave like an ordinary Erlang process (at least enough for such a process to wrap it and send/receive messages on behalf of your external code)
You want the Erlang VM to be able to survive your external code crashing
You want to implement a long-running task in your external code
You want to write your external code in a language that does not support C-compatible FFI (or otherwise don't want to deal with your language's FFI facilities)
When should I use a NIF?
You want your external code to behave like a collection of ordinary Erlang functions (particularly if you want to define an Erlang/Elixir module that exports functions implemented in native-compiled code)
You want to avoid any potential performance hits / overhead from communicating via standard input/output and/or you want to avoid having to translate between Erlang terms and something your external code understands
You are reasonably confident that the things your external code is doing are neither long-running nor likely to crash (including, in the latter case, if you're writing your NIFs in something like Rust; see also: Rustler), or...
You are reasonably confident that crashing or hanging the Erlang VM is acceptable for your use case (e.g. your code is both distributed and able to survive the sudden loss of an Erlang node, or you're writing a desktop application and an application-wide crash is not a big deal aside from being an inconvenience to users)
When should I use port drivers?
You want your external code to behave like an Erlang process
You want to avoid the overhead and/or complexity of communicating over standard input/output
You are reasonably confident that your port driver won't crash or hang the Erlang VM, or...
You are reasonably confident that a crash or hang of the Erlang VM is not a critical issue
What do you recommend?
There are two aspects to weigh here:
Process-like v. module-like
Safe v. efficient
If you want maximum safety behind a process-like interface, go with a port.
If you want maximum safety behind a module-like interface, go with a module whose functions either wrap System.cmd/3 or directly use a port to communicate with your external code.
If you want better efficiency behind a process-like interface, go with a port driver.
If you want better efficiency behind a module-like interface, go with NIFs.
I have an application that began its life as a C#-based Windows GUI that used marshalling to talk to a C DLL.
I now need to separate the Windows client and DLL so that the client is installed on a remote PC and communicates with the C DLL over the internet.
A further complication is that I want to have multiple Windows clients connecting to the C DLL.
This whole world is new to me, so excuse me if the following are naive questions.
My questions:
0) What is the best method for having the client communicate with the DLL over the internet? TCP/IP Sockets?
1) I need to make modifications to my DLL to have it service multiple clients. But I need some piece of middleware that collects the queries from the different clients, feeds them to the DLL, and then sends the results back to the appropriate client. Is there any code (such as node.js) that would facilitate this?
Regarding: What is the best method for having the client communicate with the DLL over the internet?
Your suggestion of using TCP/IP could certainly (and likely will) be part of the solution, but there will be other components as well. The direction you choose will depend in part on whether you are using standard marshaling (COM) or custom marshaling. At the very least, your problem description suggests a scenario requiring interprocess communication.
There are many ways to implement this. This diagram maps out a general approach that, based on your description, might apply:
[Figure: Components of Interprocess Communications]
Regarding: make modifications to my DLL to have it service multiple clients...
The DLL is simply a file like any other. Several processes can read, and subsequently own content from, a file as long as the processes doing the reading adhere to common file access rules. I do not think you will have to modify your DLL, at least for that reason. Just make sure the processes accessing the DLL comply with safe file access protocols.
I need to make LabVIEW communicate with a C/C++ application. Both applications run on the same machine. What is the IPC mechanism with the lowest overhead and highest speed available in LabVIEW?
TCP, UDP, ActiveX, DDE, file transactions, or perhaps just directly calling a dll are the solutions that come to mind.
First I'd just call a dll if you can manage with that. Assuming you're tied in to using two separate applications then:
I'd use TCP or UDP. File transactions are clunky but easy to implement, DDE is older but might be viable (I'd recommend against it).
Basic TCP/IP in Labview
TCP/IP and UDP in Labview
Calling a dll from Labview
Have you investigated straight up TCP or UDP?
It'll make it easy if you ever need to separate the applications onto different machines later on down the road. Implementation is pretty straightforward too, although it may not give the fastest throughput.
What speeds are we talking about here?
NI has provided a thorough document explaining that: Using External Code in LabVIEW [pdf]. In brief, you can use:
Shared Libraries (on Windows they are called DLLs). According to the above document, "any language can be used to write DLLs as long as the DLLs can be called using one of the calling conventions LabVIEW supports, either stdcall or C."
Code Interface Node (CIN), which is a block diagram node that links C/C++ source code to LabVIEW.
.NET technology.
Note that "Shared Libraries" and "Code Interface Node" are supported on Windows, Mac OS X, Linux and Solaris.
I want to send a lot of strings (~250000) in under 1 second from a C application to a C# application. When I do it with WM_COPYDATA and SendMessage, my C# application hangs. What else can I do? Named pipes are included only in .NET 4, and I'm using .NET 2.
EDIT:
I'm gonna stick to WM_COPYDATA and appending to a list (which is a fast operation). Then post processing this list.
The fastest option is probably to use named pipes via P/Invoke. This is still much higher performance than most other IPC options.
Shared memory, or a memory-mapped file (MMF), is the fastest method. It is as fast as the kernel objects used to signal data availability. More importantly, you can first open the shared memory, then put your data directly there (which saves you one copy operation) and signal the other application. That other application can consume the data directly from shared memory (again, with no need to copy).
Not the fastest on win32 currently, but worth investigating: 0mq
Uses TCP sockets on Windows, but very efficiently.
For a closed source solution I don't think 29 West's Ultra Messaging can easily be trumped; it includes the rare feature of zero-copy messaging in .NET.
I am developing some experimental setup in C.
I am exploring a scenario as follows and I need help to understand it.
I have a system A which has a lot of Applications using cryptographic algorithms.
But these crypto calls (OpenSSL calls) should be sent to another system, B, which takes care of cryptography.
Therefore, I have to send any calls to cryptographic (OpenSSL) engines via a socket to a remote system (B) which has OpenSSL support.
My plan is to have a small socket prog on System A which forwards these calls to system B.
What I'm still unclear at this moment is how I handle the received commands at System B.
Do I actually get these commands and translate them into corresponding calls to openssl locally in my system? This means I have to program whatever is done on System A right?
Or is there a way to tunnel/send these raw lines of code to the OpenSSL libs directly, just receive the result, and then send it back to System A?
How do you think I should go about the problem?
PS: Oh, by the way, the calls to cryptography (like EngineUpdate, VerifyFinal, or Digest) on System A can be in either Java or C. I already wrote a Java/C program to send these commands to System B via sockets...
The problem is only on System B and how I have to handle..
You could use sockets on B, but that means you need to define a protocol for that. Or you use RPC (remote procedure calls).
Examples for socket programming can be found here.
RPC is explained here.
The easiest (not to say "the easy", but still) way I can imagine would be to:
Write wrapper (proxy) versions of the libraries you want to make remote.
Write a server program that listens to calls, performs them using the real local libraries, and sends the result back.
Preload the proxy library before running any application where you want to do this.
Of course, there are many many problems with this approach:
It's not exactly trivial to define a serializing protocol for generic C function calls.
It's not exactly trivial to write the server, either.
Applications will slow down a lot, since each proxy call needs to be synchronous and now includes a network round trip.
What about security of the data on the network?
UPDATE:
As requested in a comment, I'll try to expand a bit. By "wrapper" I mean a new library, that has the same API as another one, but does not in fact contain the same code. Instead, the wrapper library will contain code to serialize the arguments, call the server, wait for a response, de-serialize the result(s), and present them to the calling program as if nothing happened.
Since this involves a lot of tedious, repetitive and error-prone code, it's probably best to abstract it by making it code-driven. The best would be to use the original library's header file to define the serialization needed, but that (of course) requires quite heavy C parsing. Failing that, you might start bottom-up and make a custom language to describe the calls, and then use that to generate the serialization, de-serialization, and proxy code.
On Linux systems, you can control the dynamic linker so that it loads your proxy library instead of the "real" library. You could of course also replace (on disk) the real library with the proxy, but that will break all applications that use it if the server is not working, which seems very risky.
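The wrapper idea described above can be sketched for a single call. Everything here is hypothetical: `real_checksum` stands in for a library routine that lives on the server, `FN_CHECKSUM` is a made-up function ID, and the "network" is replaced by a direct in-process call to `serve_request` so the example stays self-contained. A real proxy would send the request buffer over a socket and would keep the original library's exact signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical function ID agreed on by proxy and server. */
enum { FN_CHECKSUM = 1 };

/* Stand-in for the real library routine on the server machine. */
uint32_t real_checksum(const uint8_t *data, uint32_t len)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

void store_be32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

uint32_t load_be32(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* Proxy side: serialize the call as 1-byte function ID,
   4-byte big-endian argument length, then the argument bytes.
   Returns the request size, or 0 if the buffer is too small. */
size_t marshal_checksum_call(const uint8_t *data, uint32_t len,
                             uint8_t *req, size_t cap)
{
    if (cap < 5 || len > cap - 5)
        return 0;
    req[0] = FN_CHECKSUM;
    store_be32(req + 1, len);
    if (len > 0)
        memcpy(req + 5, data, len);
    return 5 + (size_t)len;
}

/* Server side: decode the request, call the real function, and
   write a 4-byte big-endian result. Returns the reply size, or 0
   if the request is malformed or names an unknown function. */
size_t serve_request(const uint8_t *req, size_t req_len,
                     uint8_t *reply, size_t cap)
{
    if (req_len < 5 || cap < 4)
        return 0;
    uint32_t len = load_be32(req + 1);
    if (len != req_len - 5 || req[0] != FN_CHECKSUM)
        return 0;
    store_be32(reply, real_checksum(req + 5, len));
    return 4;
}

/* Proxy side: what the application calls instead of the real routine.
   Returns 0 on success, -1 on failure. */
int proxy_checksum(const uint8_t *data, uint32_t len, uint32_t *result)
{
    uint8_t req[256], reply[4];
    assert(result != NULL);
    size_t n = marshal_checksum_call(data, len, req, sizeof req);
    if (n == 0)
        return -1;
    /* A real proxy would send req over a socket and block for the reply. */
    if (serve_request(req, n, reply, sizeof reply) != 4)
        return -1;
    *result = load_be32(reply);
    return 0;
}
```

Writing one such trio of functions per library call by hand is the tedious, error-prone part, which is why generating them from a description of the calls is attractive.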
So you basically have two choices, each outlined by unwind and ammoQ respectively:
(1) Write a server and do the socket/protocol work etc., yourself. You can minimize some of the pain by using solutions like Google's protocol buffers.
(2) use an existing middleware solution like (a) message queues or (b) an RPC mechanism like CORBA and its many alternatives
Either is probably more work than you anticipated. So really you have to answer this yourself. How serious is your project? How varied is your hardware? How likely is the hardware and software configuration to change in the future?
If this is more than a learning or pet project that you are going to be bored with in a month or two, then an existing middleware solution is probably the way to go. The downside is that there is a somewhat intimidating learning curve.
You can go the RPC route with CORBA, ICE, or whatever the Java solutions are these days (RMI? EJB?), and a bunch of others. This is an elegant solution since your calls to the remote encryption machine appear to your SystemA as simple function calls and the middleware handles the data issues and sockets. But you aren't going to learn them in a weekend.
Personally I would look to see if a message queue solution like AMQP would work for you first. There is less of a learning curve than RPC.