Missing characters from input stream from fastcgi request


I'm trying to develop a simple RESTful API using FastCGI (and restcgi). When I tried to implement the POST method I noticed that the input stream (representing the request body) is wrong. I did a little test, and it looks like only every other character is received when I read the stream.
Body sent: name=john&surname=smith
Received: aejh&unm=mt
I've tried more clients just to make sure it's not the client messing with the data.
My code is:
int main(int argc, char* argv[]) {
// FastCGI initialization.
FCGX_Init();
FCGX_Request request;
FCGX_InitRequest(&request, 0, 0);
while (FCGX_Accept_r(&request) >= 0) {
// FastCGI request setup.
fcgi_streambuf fisbuf(request.in);
std::istream is(&fisbuf);
fcgi_streambuf fosbuf(request.out);
std::ostream os(&fosbuf);
std::string str;
is >> str;
std::cerr << str; // this way I can see it in apache error log
// restcgi code here
}
return 0;
}
I'm using the fast_cgi module with Apache (not sure if that makes any difference).
Any idea what I am doing wrong?

The problem is in fcgio.cpp
The fcgi_streambuf class is defined using char_type, but the int underflow() method casts its return value to (unsigned char); it should cast to (char_type).

I encountered this problem as well, on an unmodified Debian install.
I found that the problem went away if I supplied a buffer to the fcgi_streambuf constructor:
const size_t LEN = ... // whatever, it doesn't have to be big.
vector<char> v (LEN);
fcgi_streambuf buf (request.in, &v[0], v.size());
iostream in (&buf);
string s;
getline(in, s); // s now holds the correct data.

After finding no answer anywhere (not even on the FastCGI mailing list), I dumped the original FastCGI libraries and tried the fastcgi++ libraries instead. The problem disappeared. There are also other benefits: C++, more features, easier to use.

Use is.read() not is >> ...
Sample from restcgi documentation:
clen = strtol(clenstr, &clenstr, 10);
if (*clenstr)
{
cerr << "can't parse \"CONTENT_LENGTH="
<< FCGX_GetParam("CONTENT_LENGTH", request->envp)
<< "\"\n";
clen = STDIN_MAX;
}
// *always* put a cap on the amount of data that will be read
if (clen > STDIN_MAX) clen = STDIN_MAX;
*content = new char[clen];
is.read(*content, clen);
clen = is.gcount();
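The sample above depends on variables from its surrounding function. As a self-contained illustration of the same idea (parse CONTENT_LENGTH, fall back to the cap when it cannot be parsed, never read more than the cap), here is a minimal C sketch. The function name capped_content_length and the cap parameter (standing in for STDIN_MAX) are assumptions made for this example, not part of FastCGI or restcgi.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a CONTENT_LENGTH value and clamp it to the range [0, cap].
 * A missing value (NULL) means "no body" and yields 0.
 * Unparseable text yields cap, mirroring the fallback in the sample above. */
long capped_content_length(const char *text, long cap)
{
    assert(cap >= 0);
    if (text == NULL)
        return 0;

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    /* Trailing junk, overflow, or a negative number: treat as unparseable. */
    if (*end != '\0' || errno == ERANGE || value < 0)
        return cap;

    /* Always cap the amount of data that will be read. */
    return value > cap ? cap : value;
}
```

The returned value can then be used as the byte count for a single read call, and the number of bytes actually read should still be checked afterwards, as the sample does with gcount().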

Related

Sending Image Data via HTTP Websockets in C

I'm currently trying to build a library similar to ExpressJS in C. I have the ability to send any text (with res.send() functionality) or textually formatted file (.html, .txt, .css, etc.).
However, sending image data seems to cause a lot more trouble! I'm trying to use pretty much the exact same process I used for reading textual files. I saw this post and answer which uses a MAXLEN variable, which I would like to avoid. First, here's how I'm reading the data in:
// fread char *, goes 64 chars at a time
char *read_64 = malloc(sizeof(char) * 64);
// the entirety of the file data is placed in full_data
int *full_data_max = malloc(sizeof(int)), full_data_index = 0;
*full_data_max = 64;
char *full_data = malloc(sizeof(char) * *full_data_max);
full_data[0] = '\0';
// start reading 64 characters at a time from the file while fread gives positive feedback
size_t fread_response_length = 0;
while ((fread_response_length = fread(read_64, sizeof(char), 64, f_pt)) > 0) {
// internal array checker to make sure full_data has enough space
full_data = resize_array(full_data, full_data_max, full_data_index + 65, sizeof(char));
// copy contents of read_64 into full_data
for (int read_data_in = 0; read_data_in < fread_response_length / sizeof(char); read_data_in++) {
full_data[full_data_index + read_data_in] = read_64[read_data_in];
}
// update the entirety data current index pointer
full_data_index += fread_response_length / sizeof(char);
}
full_data[full_data_index] = '\0';
I believe the error is related to this component. Perhaps it has something to do with how I calculate the data length from the fread() return values? I'll take you through the creation of the HTTP response as well.
I split the response sending into two components (as per the response on this question here). First I send my header, which looks good (29834 seems a bit large for image data, but that is an unjustified thought):
HTTP/1.1 200 OK
Content-Length: 29834
Content-Type: image/jpg
Connection: Keep-Alive
Access-Control-Allow-Origin: *
I send this first using the following code:
int *head_msg_len = malloc(sizeof(int));
// internal header builder that builds the aforementioned header
char *main_head_msg = create_header(status, head_msg_len, status_code, headers, data_length);
// send header
int bytes_sent = 0;
while ((bytes_sent = send(sock, main_head_msg + bytes_sent, *head_msg_len - bytes_sent / sizeof(char), 0)) < sizeof(char) * *head_msg_len);
Sending the image data (body)
Then I use a similar setup to try sending the full_data element that has the image data in it:
bytes_sent = 0;
while ((bytes_sent = send(sock, full_data + bytes_sent, full_data_index - bytes_sent, 0)) < full_data_index);
So, this all seems reasonable to me! I've even taken a look at the original file and the file after curling, and they each start and end with the exact same sequence:
Original (| implies a skip for easy reading):
�PNG
�
IHDR��X��d�IT pHYs
|
|
|
RU�X�^Q�����땵I1`��-���
#QEQEQEQEQE~��#��&IEND�B`�
Post using curl:
�PNG
�
IHDR��X��d�IT pHYs
|
|
|
RU�X�^Q�����땵I1`��-���
#QEQEQEQEQE~��#��&IEND�B`
However, trying to open the file that was created after curling results in corruption errors. Similar issues occur on the browser as well. I'm curious if this could be an off by one or something small.
Edit:
If you would like to see the full code, check out this branch on Github.
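One thing that may be worth checking in the two send loops above: send() returns the number of bytes written by that one call, so assigning the result to bytes_sent replaces the running total instead of adding to it. If a call ever sends only part of the buffer, the next call then starts from the wrong offset. A minimal sketch of a loop that accumulates instead is shown below. The name send_all is hypothetical, and refinements such as retrying on EINTR are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send exactly `len` bytes from `buf`, looping over partial sends.
 * Returns 0 on success, -1 if send() fails (the caller can inspect errno). */
int send_all(int sock, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const char *p = buf;
    size_t sent = 0;                     /* running total across all calls */

    while (sent < len) {
        ssize_t n = send(sock, p + sent, len - sent, 0);
        if (n <= 0)
            return -1;
        sent += (size_t)n;               /* accumulate, do not overwrite */
    }
    return 0;
}
```

Because the length is passed explicitly, the same function works for the text header and for binary image data that contains zero bytes.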

Use of both easy and multi interfaces in libcurl for FTP

I'm writing a program that continuously and recursively checks an FTP server for new files. When a file is detected, it is downloaded.
I wrote the whole thing using the curl easy interface, since blocking calls to curl_easy_perform() are great for the control channel and listing operations. But when it comes to downloading files, the multi interface seems a lot more appropriate. I thought about switching the entire thing to multi, but it gets very complicated for directory listing.
So here's my question, can I use both interfaces, easy and multi inside the same thread ? If so, can they share the same connection to the server ?
EDIT 1
Instead of using curl_easy_perform(), is there a way to check the status of a single transfer? Then I could use the curl_multi_* interface for all my transfers and only check the status of my LIST command right after I issue it. This would let me simulate blocking behavior without interfering with my file transfers, which would be handled and checked elsewhere.
From what I saw, curl_multi_info_read() doesn't allow this:
When you fetch a message using this function, it is removed from the internal queue so calling this function again will not return the same message again.

Does this answer your question:
When an easy handle is setup and ready for transfer, then instead of using curl_easy_perform like when using the easy interface for transfers, you should add the easy handle to the multi handle with curl_multi_add_handle. You can add more easy handles to a multi handle at any point, even if other transfers are already running.
From libcurl - multi interface overview (ONE MULTI HANDLE MANY EASY HANDLES)

can I use both interfaces, easy and multi inside the same thread ?
Yes, absolutely, but note that the easy API is mostly blocking and the multi API is mostly non-blocking, so if you combine them wrongly you might end up in a situation where your multi transfers hang or slow down because your thread is blocked in a curl_easy_* call.
If so, can they share the same connection to the server ?
Strictly speaking, yes, at least in some situations, but you really should let libcurl worry about connection-reuse details unless you're in a micro-optimization phase (and given your questions, you're absolutely not).
is there a way to check for a single transfer status
check status of a single transfer from a curl_multi list of transfers?
I don't know, to be honest. When I use curl_multi, I usually only check up on transfers once they're no longer active, as reported by curl_multi_info_read() & co. You could wrap each transfer in its own object with its own dedicated download thread, and keep track of each transfer with CURLOPT_WRITEFUNCTION & co.
this program will output
transfer #1 is 4.70178% downloaded. running: true
transfer #2 is 6.51742% downloaded. running: true
transfer #3 is 6.14288% downloaded. running: true
transfer #4 is 6.01199% downloaded. running: true
transfer #0 is 12.3027% downloaded. running: true
transfer #1 is 8.73407% downloaded. running: true
transfer #2 is 14.0515% downloaded. running: true
transfer #3 is 12.8638% downloaded. running: true
transfer #4 is 11.8516% downloaded. running: true
(...)
transfer #0 is 94.8156% downloaded. running: true
transfer #1 is 88.5291% downloaded. running: true
transfer #2 is 98.8117% downloaded. running: true
transfer #3 is 92.01% downloaded. running: true
transfer #4 is 100% downloaded. running: false
transfer #0 is 100% downloaded. running: false
transfer #1 is 100% downloaded. running: false
transfer #2 is 100% downloaded. running: false
transfer #3 is 100% downloaded. running: false
It keeps track of each individual transfer in its own thread, and the main thread can easily check up on any individual transfer, e.g. via transfers[x]->running:
#include <iostream>
#include <thread>
#include <string>
#include <string_view>
#include <atomic>
#include <chrono> // std::chrono::seconds, used in main()
#include <vector>
#include <memory>
#include <curl/curl.h>
class Curl_Transfer
{
public:
std::string url;
std::string response_headers;
std::string response_body;
CURL *ch = nullptr;
CURLcode curl_easy_perform_code = CURLcode(0);
std::atomic<bool> running{true}; // written by the worker thread, read by main()
std::thread dedicated_thread;
int64_t expected_size = 0; // << content-length reported size
Curl_Transfer(std::string url) : url(url)
{
this->dedicated_thread = std::thread([&]() -> void
{
this->ch = curl_easy_init();
curl_easy_setopt(this->ch, CURLOPT_URL, this->url.c_str());
curl_easy_setopt(this->ch, CURLOPT_WRITEDATA,
this);
curl_easy_setopt(this->ch, CURLOPT_HEADERDATA,
this);
curl_easy_setopt(this->ch, CURLOPT_WRITEFUNCTION, this->WRITEFUNCTION_cb);
curl_easy_setopt(this->ch, CURLOPT_HEADERFUNCTION, this->HEADERFUNCTION_cb);
CURLcode code=curl_easy_perform(this->ch);
//std::cout << "code: " << code << std::endl;
this->curl_easy_perform_code = code;
this->running = false;
});
}
~Curl_Transfer()
{
std::cout << "DESTRUCTING!" << std::endl;
this->dedicated_thread.join();
curl_easy_cleanup(this->ch);
}
private:
// this function need to be static to be compatible with some C->C++ calling stuff... idk, but it also need access to this, so fthis=this...
static size_t WRITEFUNCTION_cb(const char *data, size_t size, size_t nmemb, Curl_Transfer *fthis)
{
CURL *ch = fthis->ch;
fthis->response_body.append(data, size * nmemb);
//std::cout << "got body data! " << size*nmemb << "\n";
return size * nmemb;
};
// this function need to be static to be compatible with some C->C++ calling stuff... idk, but it also need access to this, so fthis=this...
static size_t HEADERFUNCTION_cb(const char *data, size_t size, size_t nmemb, Curl_Transfer *fthis)
{
CURL *ch = fthis->ch;
//std::cout << "got headers! " << size*nmemb << "\n";
fthis->response_headers.append(data, size * nmemb);
std::string_view svd(data, size * nmemb);
const std::string_view needle = "Content-Length: ";
auto clp = svd.find(needle);
if (clp != std::string::npos)
{
svd = svd.substr(clp + needle.size()); // skip past the header name, wherever it was found
std::string fck(svd);
fthis->expected_size = std::stoll(fck, nullptr, 0);
}
return size * nmemb;
};
};
int main()
{
curl_global_init(CURL_GLOBAL_ALL);
std::vector<Curl_Transfer *> transfers;
for (int i = 0; i < 5; ++i)
{
auto fck = new Curl_Transfer("http://speedtest.tele2.net/100MB.zip");
transfers.push_back((fck));
}
for (;;)
{
std::this_thread::sleep_for(std::chrono::seconds(5));
for (size_t i = 0; i < transfers.size(); ++i)
{
std::cout << "transfer #" << i << " is " << (double((transfers[i]->response_body.size()) / double(transfers[i]->expected_size))*100) << "% downloaded. running: " << (transfers[i]->running ? "true" : "false") << "\n";
}
}
}
There's probably a better way to do this, though. But until someone smarter comes along, this works at least.
Apparently I did all that threading just to avoid using the curl_multi API...
You're using C, not C++, sorry. Of course you can do all of the above in C as well, but I'm not comfortable enough with C to enjoy rewriting it (anyone is free to rewrite the code in C if they want to).
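As a small step toward a C version, the Content-Length handling from HEADERFUNCTION_cb could look roughly like the sketch below. It is only a sketch: the function name content_length_from_header is hypothetical, and it assumes the callback receives one header line at a time without a terminating NUL. It also compares the header name without regard to case, since HTTP header names are case-insensitive.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Return the Content-Length value found in one header line of `len` bytes
 * (not NUL-terminated), or -1 if the line is some other header. */
long long content_length_from_header(const char *line, size_t len)
{
    static const char name[] = "content-length:";
    const size_t name_len = sizeof name - 1;

    assert(line != NULL || len == 0);
    if (len <= name_len)
        return -1;
    for (size_t i = 0; i < name_len; ++i)
        if (tolower((unsigned char)line[i]) != name[i])
            return -1;

    /* Copy the digits out so strtoll never reads past the header line. */
    char digits[32];
    size_t n = 0;
    size_t i = name_len;
    while (i < len && (line[i] == ' ' || line[i] == '\t'))
        ++i;
    while (i < len && n < sizeof digits - 1 && isdigit((unsigned char)line[i]))
        digits[n++] = line[i++];
    digits[n] = '\0';

    return n == 0 ? -1 : strtoll(digits, NULL, 10);
}
```

A header callback would call this with its data pointer and size * nmemb, and store any result other than -1 as the expected size.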

Getting Host field from TCP packet payload

I'm writing a kernel module in C, and trying to get the Host field from a TCP packet's payload, carrying http request headers.
I've managed to do something similar with FTP (scan the payload and look for FTP commands), but I can't seem to be able to do the same and find the field.
My module is connected to the POST_ROUTING hook.
Each packet that reaches that hook with a dst port of 80 is recognized as an HTTP packet, and my module starts to parse it.
For some reason, I can't seem to get the Host line (as a matter of fact, I only see the server's HTTP 200 OK).
Do these headers always travel in packets that use port 80?
If so, what is the best way to parse those packets' payload? Going char by char seems like a lot of work. Is there a better way?
Thanks
EDIT:
Got some progress.
For every packet I get from the server, I can read the payload with no problem, but for every packet I send, it's as if the payload is empty.
I thought it was a problem with the skb pointer, but I'm getting the TCP ports fine. I just can't seem to read this damn payload.
This is how I parse it:
unsigned char* user_data = (unsigned char *)((int)tcphd + (int)(tcphd->doff * 4));
unsigned char *it;
for (it = user_data; it != tail; ++it) {
unsigned char c = *(unsigned char *)it;
http_command[http_command_index] = c;
http_command_index++;
}
where tail:
tail = skb_tail_pointer(skb);
The pointer doesn't advance at all in the loop. It's as if the payload is empty from the start, and I can't figure out why.
Help, please.

I've managed to solve this.
Using this, I've figured out how to parse all of the packet's payload.
I hope this code explains it:
int http_command_offset = iphd->ihl*4 + tcphd->doff*4;
int http_command_length = skb->len - http_command_offset;
http_command = kmalloc(http_command_length + 1, GFP_ATOMIC);
skb_copy_bits(skb, http_command_offset, (void*)http_command, http_command_length);
http_command[http_command_length] = '\0'; // terminate so the buffer can be scanned as a string
skb_copy_bits just copies the entire payload into the buffer I've created. Parsing it now is pretty simple.
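Once the payload is in a flat buffer, looking for the Host header is ordinary buffer scanning. The sketch below is plain user-space C rather than kernel code, and the function name find_host is hypothetical. It assumes the request line and headers are all in this one buffer and that the header is spelled exactly "Host:", although header names may legally use other capitalization.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find the value of the Host header in an HTTP request held in `buf`
 * (`len` bytes, not necessarily NUL-terminated). On success returns a
 * pointer into `buf` and stores the value's length in *out_len.
 * Returns NULL when no Host line is present. */
const char *find_host(const char *buf, size_t len, size_t *out_len)
{
    static const char key[] = "\r\nHost:";
    const size_t key_len = sizeof key - 1;

    assert(buf != NULL && out_len != NULL);
    for (size_t i = 0; i + key_len <= len; ++i) {
        if (memcmp(buf + i, key, key_len) != 0)
            continue;
        size_t start = i + key_len;
        while (start < len && buf[start] == ' ')
            ++start;                     /* skip optional spaces */
        size_t end = start;
        while (end < len && buf[end] != '\r' && buf[end] != '\n')
            ++end;                       /* value runs to the end of the line */
        *out_len = end - start;
        return buf + start;
    }
    return NULL;
}
```

Working with an explicit length rather than relying on a terminator keeps the scan inside the copied payload even if the data contains unexpected bytes.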

Nanopb without callbacks

I'm using Nanopb to try and send protobuf messages from a VxWorks based National Instruments Compact RIO (9025). My cross compilation works great, and I can even send a complete message with data types that don't require extra encoding. What's getting me is the callbacks. My code is cross compiled and called from LabVIEW and the callback based structure of Nanopb seems to break (error out, crash, target reboots, whatever) on the target machine. If I run it without any callbacks it works great.
Here is the code in question:
bool encode_string(pb_ostream_t *stream, const pb_field_t *field, void * const *arg)
{
char *str = "Woo hoo!";
if (!pb_encode_tag_for_field(stream, field))
return false;
return pb_encode_string(stream, (uint8_t*)str, strlen(str));
}
extern "C" uint16_t getPacket(uint8_t* packet)
{
uint8_t buffer[256];
uint16_t packetSize;
ExampleMsg msg = {};
pb_ostream_t stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
msg.name.funcs.encode = &encode_string;
msg.value = 17;
msg.number = 18;
pb_encode(&stream, ExampleMsg_fields, &msg);
packetSize = stream.bytes_written;
memcpy(packet, buffer, 256);
return packetSize;
}
And here's the proto file:
syntax = "proto2";
message ExampleMsg {
required int32 value = 1;
required int32 number = 2;
required string name = 3;
}
I have tried making the callback an extern "C" as well and it didn't change anything. I've also tried adding a nanopb options file with a max length and either didn't understand it correctly or it didn't work either.
If I remove the string from the proto message and remove the callback, it works great. It seems like the callback structure is not going to work in this LabVIEW -> C library environment. Is there another way I can encode the message without the callback structure? Or somehow embed the callback into the getPacket() function?
Updated code:
extern "C" uint16_t getPacket(uint8_t* packet)
{
uint8_t buffer[256];
for (unsigned int i = 0; i < 256; ++i)
buffer[i] = 0;
uint16_t packetSize;
ExampleMsg msg = {};
pb_ostream_t stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
msg.name.funcs.encode = &encode_string;
msg.value = 17;
msg.number = 18;
char name[] = "Woo hoo!";
strncpy(msg.name, name, strlen(name));
pb_encode(&stream, ExampleMsg_fields, &msg);
packetSize = stream.bytes_written;
memcpy(packet, buffer, sizeof(buffer));
return packetSize;
}
Updated proto file:
syntax = "proto2";
import "nanopb.proto";
message ExampleMsg {
required int32 value = 1;
required int32 number = 2;
required string name = 3 [(nanopb).max_size = 40];
}

You can avoid callbacks by giving a maximum size for the string field using the option (nanopb).max_size = 123 in the .proto file. Then nanopb can generate a simple char array in the structure (relevant part of documentation).
Regarding why callbacks don't work: just a guess, but try adding extern "C" also to the callback function. I assume you are using C++ there, so perhaps on that platform the C and C++ calling conventions differ and that causes the crash.
Does the VxWorks serial console give any more information about the crash? I don't remember if it does that for functions called from LabView, so running some test code directly from the VxWorks shell may be worth a try also.
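To illustrate the max_size suggestion: with that option the name field becomes a plain character array, so filling it is an ordinary bounded string copy and no encode callback is assigned. The sketch below uses a hand-written stand-in struct, ExampleMsgSketch, because the real definition comes from the header that nanopb generates; the helper set_name is also hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the struct nanopb would generate from the
 * updated .proto; the real definition lives in the generated header. */
typedef struct {
    int32_t value;
    int32_t number;
    char name[40];          /* from [(nanopb).max_size = 40] */
} ExampleMsgSketch;

/* Copy `text` into the fixed-size field, always NUL-terminated.
 * Returns false if the text had to be truncated to fit. */
bool set_name(ExampleMsgSketch *msg, const char *text)
{
    assert(msg != NULL && text != NULL);
    int needed = snprintf(msg->name, sizeof msg->name, "%s", text);
    return needed >= 0 && (size_t)needed < sizeof msg->name;
}
```

Unlike strncpy with strlen(name) as the count, this always writes the terminating NUL, which matters because the encoder has to find the end of the string in the array.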

Perhaps the first hurdle is how the code handles strings.
LabVIEW's native string representation is not null-terminated like C, but you can configure LabVIEW to use a different representation or update your code to handle LabVIEW's native format.
LabVIEW stores a string in a special format in which the first four bytes of the array of characters form a 32-bit signed integer that stores how many characters appear in the string. Thus, a string with n characters requires n + 4 bytes to store in memory.
LabVIEW Help: Using Arrays and Strings in the Call Library Function Node
http://zone.ni.com/reference/en-XX/help/371361L-01/lvexcodeconcepts/array_and_string_options/
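If the library receives a string in the native layout described above, converting it to a C string could look like the following sketch. The function name lv_to_c_string is hypothetical, and the code assumes the 32-bit length is stored in the machine's native byte order directly in front of the characters.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Convert a length-prefixed string (4-byte signed count followed by that
 * many characters, no terminator) into a newly allocated C string.
 * Returns NULL on a negative count or allocation failure.
 * The caller owns the result and must free() it. */
char *lv_to_c_string(const unsigned char *raw)
{
    int32_t count;

    assert(raw != NULL);
    memcpy(&count, raw, sizeof count);   /* assumes native byte order */
    if (count < 0)
        return NULL;

    char *out = malloc((size_t)count + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, raw + sizeof count, (size_t)count);
    out[count] = '\0';
    return out;
}
```

The alternative mentioned above, configuring the call in LabVIEW to pass a C string pointer instead, avoids this conversion entirely.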

Recursive CreateDirectory

I found many examples of creating directories recursively, but not the one I was looking for.
Here is the spec:
Given input
\\server\share\aa\bb\cc
c:\aa\bb\cc
USING helper API
CreateDirectory (char * path)
returns true, if successful
else
FALSE
Condition: There should not be any parsing to distinguish if the path is Local or Server share.
Write a routine in C, or C++

I think it's much easier than that. Here is a version that works in every Windows version:
std::string::size_type pos = 0; // must match the type of npos, or the loop may never end on 64-bit builds
do
{
pos = path.find_first_of("\\/", pos + 1);
CreateDirectory(path.substr(0, pos).c_str(), NULL);
} while (pos != std::string::npos);
Unicode:
pos = path.find_first_of(L"\\/", pos + 1);
Regards,

This might be exactly what you want.
It doesn't try to do any parsing to distinguish if the path is Local or Server share.
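bool TryCreateDirectory(char *path){
    // Try the full path first.
    if(CreateDirectory(path))
        return true;
    // No separator left: there is no higher level to fall back to.
    char *p=strrchr(path, '\\');
    if(NULL==p)
        return true;
    // Copy everything before the last separator, i.e. the parent path.
    size_t i=p-path;
    char *parent=(char *)malloc(1+i);
    if(NULL==parent)
        return false;
    memcpy(parent, path, i);
    parent[i]='\0';
    bool b=TryCreateDirectory(parent);
    free(parent);
    // Create the current level only if the parent now exists.
    return b?CreateDirectory(path):false;
}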
The algorithm is quite simple: while creating the current level fails, pass the path of the next higher level to the function recursively, until one creation succeeds or there is no higher level left. When the inner call returns successfully, create the current level. The method does not parse the path to determine whether it is local or on a server; that is left to CreateDirectory.
In WINAPI, CreateDirectory will never let you create "c:" or "\". When the path reaches that level, the method soon ends up calling itself with path="", and this fails, too. That is the reason why Microsoft defines the file sharing naming rule like this: for compatibility with the DOS path rule and to simplify the coding effort.

Totally hackish and insecure and nothing you'd ever actually want to do in production code, but...
Warning: here be code that was typed in a browser:
int createDirectory(const char * path) {
char * buffer = malloc((strlen(path) + 10) * sizeof(char));
sprintf(buffer, "mkdir -p %s", path);
int result = system(buffer);
free(buffer);
return result;
}

How about using MakeSureDirectoryPathExists()?

Just walk through each directory level in the path starting from the root, attempting to create the next level.
If any of the CreateDirectory calls fail then you can exit early, you're successful if you get to the end of the path without a failure.
This is assuming that calling CreateDirectory on a path that already exists has no ill effects.

The requirement of not parsing the pathname for server names is interesting, as it seems to concede that parsing for / is required.
Perhaps the idea is to avoid building in hackish expressions for potentially complex syntax for hosts and mount points, which can have on some systems elaborate credentials encoded.
If it's homework, I may be giving away the algorithm you are supposed to think up, but it occurs to me that one way to meet those requirements is to start trying by attempting to mkdir the full pathname. If it fails, trim off the last directory and try again, if that fails, trim off another and try again... Eventually you should reach a root directory without needing to understand the server syntax, and then you will need to start adding pathname components back and making the subdirs one by one.
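The trim-and-retry idea described above can be sketched in C as follows. To keep the example runnable anywhere, CreateDirectorySketch is a hypothetical stand-in for the question's CreateDirectory(char *path) helper, backed by a small in-memory table that starts out containing only "c:". The recursion in create_recursive is the actual point. Note that with a helper that only returns true or false, a path that already exists in full is reported as a failure, because "already exists" cannot be told apart from other errors.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the helper API from the question. Like the
 * real call, it fails when the parent is missing or the path exists. */
#define MAX_DIRS 16
static char known_dirs[MAX_DIRS][64] = { "c:" };
static int known_count = 1;

static bool dir_exists(const char *path, size_t len)
{
    for (int i = 0; i < known_count; ++i)
        if (strlen(known_dirs[i]) == len && memcmp(known_dirs[i], path, len) == 0)
            return true;
    return false;
}

bool CreateDirectorySketch(char *path)
{
    const char *sep = strrchr(path, '\\');
    size_t len = strlen(path);

    if (sep == NULL || len >= 64 || known_count == MAX_DIRS)
        return false;
    if (!dir_exists(path, (size_t)(sep - path)) || dir_exists(path, len))
        return false;
    strcpy(known_dirs[known_count++], path);
    return true;
}

/* Try the full path. On failure, trim the last component and retry, then
 * add the components back one at a time. No server or drive parsing. */
bool create_recursive(char *path)
{
    assert(path != NULL);
    if (CreateDirectorySketch(path))
        return true;

    char *sep = strrchr(path, '\\');
    if (sep == NULL || sep == path)
        return false;                /* nothing left to trim */

    *sep = '\0';                     /* temporarily cut off the last component */
    bool parent_ok = create_recursive(path);
    *sep = '\\';                     /* restore it */
    return parent_ok && CreateDirectorySketch(path);
}
```

Cutting the string in place and restoring it afterwards avoids allocating a copy of the parent path at every level, at the cost of requiring a writable path buffer.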

std::pair<bool, unsigned long> CreateDirectory(std::basic_string<_TCHAR> path)
{
_ASSERT(!path.empty());
typedef std::basic_string<_TCHAR> tstring;
tstring::size_type pos = 0;
while ((pos = path.find_first_of(_T("\\/"), pos + 1)) != tstring::npos)
{
::CreateDirectory(path.substr(0, pos + 1).c_str(), nullptr);
}
if ((pos = path.find_first_of(_T("\\/"), path.length() - 1)) == tstring::npos)
{
path.append(_T("\\"));
}
::CreateDirectory(path.c_str(), nullptr);
return std::make_pair(
::GetFileAttributes(path.c_str()) != INVALID_FILE_ATTRIBUTES,
::GetLastError()
);
}

void createFolders(const std::string &s, char delim) {
    std::stringstream ss(s);
    std::string item;
    std::string combinedName; // grows by one path component per iteration
    while (std::getline(ss, item, delim)) {
        combinedName += item;
        combinedName += delim;
        std::cout << combinedName << std::endl;
        struct stat st = {0};
        // Only create the directory if it does not exist yet.
        if (stat(combinedName.c_str(), &st) == -1)
        {
#if REDHAT
            mkdir(combinedName.c_str(), 0777);
#else
            CreateDirectory(combinedName.c_str(), NULL);
#endif
        }
    }
}
