Not seeing uploaded file in C when using form-data - c
I have a simple web server in C that prints out everything it receives from a client. It works great except when I try to upload a file using multipart/form-data. I tried this with both Postman and curl, with the same results. What could be the reason for this?
But when I choose binary then I see my uploaded file in the output.
When using form-data I get this (I don't see the uploaded file):
POST /hello-world/ HTTP/1.1
Content-Type: multipart/form-data; boundary=--------------------------257071285279776422109124
User-Agent: PostmanRuntime/7.21.0
Host: localhost:8088
Accept-Encoding: gzip, deflate
Content-Length: 753
Connection: keep-alive
----------------------------257071285279776422109124
Content-Disposition: form-data; name="filecoming2"; filename="imgage.jpg"
Content-Type: image/jpeg
80 79 83 84 32 47 104 101 108 108 111 45 119 111 114 108 100 47 32 72 84 84 80 47 49 46 49 13 10 67 111 110 116 101 110 116 45 84 121 112 101 58 32 109 117 108 116 105 112 97 114 116 47 102 111 114 109 45 100 97 116 97 59 32 98 111 117 110 100 97 114 121 61 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 50 53 55 48 55 49 50 56 53 50 55 57 55 55 54 52 50 50 49 48 57 49 50 52 13 10 85 115 101 114 45 65 103 101 110 116 58 32 80 111 115 116 109 97 110 82 117 110 116 105 109 101 47 55 46 50 49 46 48 13 10 65 99 99 101 112 116 58 32 42 47 42 13 10 67 97 99 104 101 45 67 111 110 116 114 111 108 58 32 110 111 45 99 97 99 104 101 13 10 80 111 115 116 109 97 110 45 84 111 107 101 110 58 32 99 48 50 100 55 99 49 57 45 102 100 50 52 45 52 49 97 48 45 56 52 53 99 45 100 56 54 49 101 99 55 54 48 102 56 99 13 10 72 111 115 116 58 32 108 111 99 97 108 104 111 115 116 58 56 48 56 56 13 10 65 99 99 101 112 116 45 69 110 99 111 100 105 110 103 58 32 103 122 105 112 44 32 100 101 102 108 97 116 101 13 10 67 111 110 116 101 110 116 45 76 101 110 103 116 104 58 32 55 53 51 13 10 67 111 110 110 101 99 116 105 111 110 58 32 107 101 101 112 45 97 108 105 118 101 13 10 13 10 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 50 53 55 48 55 49 50 56 53 50 55 57 55 55 54 52 50 50 49 48 57 49 50 52 13 10 67 111 110 116 101 110 116 45 68 105 115 112 111 115 105 116 105 111 110 58 32 102 111 114 109 45 100 97 116 97 59 32 110 97 109 101 61 34 102 105 108 101 99 111 109 105 110 103 50 34 59 32 102 105 108 101 110 97 109 101 61 34 105 109 103 97 103 101 46 106 112 103 34 13 10 67 111 110 116 101 110 116 45 84 121 112 101 58 32 105 109 97 103 101 47 106 112 101 103 13 10 13 10
When using binary (in Postman) I see my uploaded file!
POST /hello-world/ HTTP/1.1
Content-Type: application/x-www-form-urlencoded
User-Agent: PostmanRuntime/7.21.0
Host: localhost:8088
Accept-Encoding: gzip, deflate
Content-Length: 538
Connection: keep-alive
����
80 79 83 84 32 47 104 101 108 108 111 45 119 111 114 108 100 47 32 72 84 84 80 47 49 46 49 13 10 67 111 110 116 101 110 116 45 84 121 112 101 58 32 97 112 112 108 105 99 97 116 105 111 110 47 120 45 119 119 119 45 102 111 114 109 45 117 114 108 101 110 99 111 100 101 100 13 10 85 115 101 114 45 65 103 101 110 116 58 32 80 111 115 116 109 97 110 82 117 110 116 105 109 101 47 55 46 50 49 46 48 13 10 65 99 99 101 112 116 58 32 42 47 42 13 10 67 97 99 104 101 45 67 111 110 116 114 111 108 58 32 110 111 45 99 97 99 104 101 13 10 80 111 115 116 109 97 110 45 84 111 107 101 110 58 32 55 51 56 99 55 55 55 51 45 50 102 57 52 45 52 101 100 102 45 56 99 52 102 45 48 48 55 99 54 50 98 51 48 50 55 53 13 10 72 111 115 116 58 32 108 111 99 97 108 104 111 115 116 58 56 48 56 56 13 10 65 99 99 101 112 116 45 69 110 99 111 100 105 110 103 58 32 103 122 105 112 44 32 100 101 102 108 97 116 101 13 10 67 111 110 116 101 110 116 45 76 101 110 103 116 104 58 32 53 51 56 13 10 67 111 110 110 101 99 116 105 111 110 58 32 107 101 101 112 45 97 108 105 118 101 13 10 13 10 -1 -40 -1 -32 0 16 74 70 73 70 0 1 1 1 0 72 0 72 0 0 -1 -2 0 19 67 114 101 97 116 101 100 32 119 105 116 104 32 71 73 77 80 -1 -37 0 67 0 80 55 60 70 60 50 80 70 65 70 90 85 80 95 120 -56 -126 120 110 110 120 -11 -81 -71 -111 -56 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -37 0 67 1 85 90 90 120 105 120 -21 -126 -126 -21 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -62 0 17 8 0 1 0 1 3 1 17 0 2 17 1 3 17 1 -1 -60 0 20 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 -1 -60 0 20 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 12 3 1 0 2 16 3 16 0 0 1 3 -1 -60 0 20 16 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 8 1 1 0 1 5 2 127 -1 -60 0 20 17 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 8 1 3 1 1 63 1 127 -1 -60 0 20 17 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
-1 -38 0 8 1 2 1 1 63 1 127 -1 -60 0 20 16 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 8 1 1 0 6 63 2 127 -1 -60 0 20 16 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 8 1 1 0 1 63 33 127 -1 -38 0 12 3 1 0 2 0 3 0 0 0 16 -97 -1 -60 0 20 17 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 8 1 3 1 1 63 16 127 -1 -60 0 20 17 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 8 1 2 1 1 63 16 127 -1 -60 0 20 16 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -38 0 8 1 1 0 1 63 16 127 -1 -39
Here is my web server in C:
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdlib.h>
#include <netinet/in.h>
#include <string.h>
#include <stdbool.h>
#define PORT 8088
int main(void)
{
int server_fd, new_socket;
long valread;
struct sockaddr_in address;
int addrlen = sizeof(address);
char *hello = "HTTP/1.1 200 OK\nContent-Type: text/plain\nContent-Length: 12\n\nHello world!";
server_fd = socket(AF_INET, SOCK_STREAM, 0);
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons( PORT );
//memset(address.sin_zero, '\0', sizeof address.sin_zero);
bind(server_fd, (struct sockaddr *)&address, sizeof(address));
listen(server_fd, 10);
while(true)
{
printf("\n+++++++ Waiting for new connection ++++++++\n\n");
new_socket = accept(server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen);
char buffer[30000] = {0};
valread = read(new_socket, buffer, 30000);
printf("%s\n", buffer);
printf("\n");
for (int i = 0; i < valread; i++) { printf("%i ", buffer[i]); }
printf("\n\n");
write(new_socket, hello, strlen(hello));
printf("------------------Hello message sent-------------------\n");
close(new_socket);
}
}
TCP is stream oriented. You need to call read and write repeatedly. Considering your example, a single call to read does not guarantee to fill the buffer with the whole file. Read the HTTP header, parse the Content-Length and continue reading until your total valread sums up to header size plus Content-Length.
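As a rough illustration of the loop described above, the sketch below keeps calling read() until the blank line that ends the headers has arrived, and then until Content-Length more bytes have followed. The names read_full_request, find_header_end and parse_content_length are made up for this example. It assumes the request is framed by Content-Length (no chunked encoding) and fits in the caller's buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>    /* strncasecmp */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* read */

/* Offset just past the "\r\n\r\n" that ends the headers, or 0 if not seen yet. */
static size_t find_header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    return 0;
}

/* Content-Length value found in the header block, or 0 if the header is absent.
 * The buffer must be NUL-terminated so strtoul cannot run past its end. */
static size_t parse_content_length(const char *hdr, size_t hdr_len)
{
    const char *p = hdr, *end = hdr + hdr_len;
    while (p < end) {
        const char *eol = memchr(p, '\n', (size_t)(end - p));
        if (!eol)
            break;
        if ((size_t)(eol - p) > 15 && strncasecmp(p, "Content-Length:", 15) == 0)
            return (size_t)strtoul(p + 15, NULL, 10);
        p = eol + 1;
    }
    return 0;
}

/* Read one request (headers plus Content-Length body bytes) into buf.
 * Returns the number of bytes stored, or -1 on error, early EOF, or overflow. */
ssize_t read_full_request(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 1);
    size_t total = 0, header_len = 0, want = 0;

    while (header_len == 0 || total < want) {
        if (total + 1 >= cap)
            return -1;                       /* request does not fit in buf */
        size_t room = cap - 1 - total;
        if (header_len != 0 && want - total < room)
            room = want - total;             /* do not read past this request */
        ssize_t n = read(fd, buf + total, room);
        if (n <= 0)
            return -1;                       /* error, or peer closed early */
        total += (size_t)n;
        buf[total] = '\0';
        if (header_len == 0) {
            header_len = find_header_end(buf, total);
            if (header_len != 0)
                want = header_len + parse_content_length(buf, header_len);
        }
    }
    return (ssize_t)total;
}
```

In the question's server this would stand in for the single read() call, for example valread = read_full_request(new_socket, buffer, sizeof buffer). Note that printf("%s") will still stop at the first zero byte of a binary body, so the numeric dump is the more reliable check.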
You are not reading enough data.
read() is not guaranteed to return as many bytes as you ask for, and there is no guarantee that a complete HTTP message will arrive in a single read(). TCP is a byte stream, so you need to call read() in a loop until the complete HTTP request has been received. See RFC 2616 Section 4.4 and RFC 7230 Section 3.3.3 for the rules on how to detect the end of an HTTP message.
4.4 Message Length
The transfer-length of a message is the length of the message-body as
it appears in the message; that is, after any transfer-codings have
been applied. When a message-body is included with a message, the
transfer-length of that body is determined by one of the following
(in order of precedence):
1.Any response message which "MUST NOT" include a message-body (such
as the 1xx, 204, and 304 responses and any response to a HEAD
request) is always terminated by the first empty line after the
header fields, regardless of the entity-header fields present in
the message.
2.If a Transfer-Encoding header field (section 14.41) is present and
has any value other than "identity", then the transfer-length is
defined by use of the "chunked" transfer-coding (section 3.6),
unless the message is terminated by closing the connection.
3.If a Content-Length header field (section 14.13) is present, its
decimal value in OCTETs represents both the entity-length and the
transfer-length. The Content-Length header field MUST NOT be sent
if these two lengths are different (i.e., if a Transfer-Encoding
header field is present). If a message is received with both a
Transfer-Encoding header field and a Content-Length header field,
the latter MUST be ignored.
4.If the message uses the media type "multipart/byteranges", and the
transfer-length is not otherwise specified, then this self-
delimiting media type defines the transfer-length. This media type
MUST NOT be used unless the sender knows that the recipient can parse
it; the presence in a request of a Range header with multiple byte-
range specifiers from a 1.1 client implies that the client can parse
multipart/byteranges responses.
A range header might be forwarded by a 1.0 proxy that does not
understand multipart/byteranges; in this case the server MUST
delimit the message using methods defined in items 1,3 or 5 of
this section.
5.By the server closing the connection. (Closing the connection
cannot be used to indicate the end of a request body, since that
would leave no possibility for the server to send back a response.)
For compatibility with HTTP/1.0 applications, HTTP/1.1 requests
containing a message-body MUST include a valid Content-Length header
field unless the server is known to be HTTP/1.1 compliant. If a
request contains a message-body and a Content-Length is not given,
the server SHOULD respond with 400 (bad request) if it cannot
determine the length of the message, or with 411 (length required) if
it wishes to insist on receiving a valid Content-Length.
All HTTP/1.1 applications that receive entities MUST accept the
"chunked" transfer-coding (section 3.6), thus allowing this mechanism
to be used for messages when the message length cannot be determined
in advance.
Messages MUST NOT include both a Content-Length header field and a
non-identity transfer-coding. If the message does include a non-
identity transfer-coding, the Content-Length MUST be ignored.
When a Content-Length is given in a message where a message-body is
allowed, its field value MUST exactly match the number of OCTETs in
the message-body. HTTP/1.1 user agents MUST notify the user when an
invalid length is received and detected.
3.3.3. Message Body Length
The length of a message body is determined by one of the following
(in order of precedence):
1. Any response to a HEAD request and any response with a 1xx
(Informational), 204 (No Content), or 304 (Not Modified) status
code is always terminated by the first empty line after the
header fields, regardless of the header fields present in the
message, and thus cannot contain a message body.
2. Any 2xx (Successful) response to a CONNECT request implies that
the connection will become a tunnel immediately after the empty
line that concludes the header fields. A client MUST ignore any
Content-Length or Transfer-Encoding header fields received in
such a message.
3. If a Transfer-Encoding header field is present and the chunked
transfer coding (Section 4.1) is the final encoding, the message
body length is determined by reading and decoding the chunked
data until the transfer coding indicates the data is complete.
If a Transfer-Encoding header field is present in a response and
the chunked transfer coding is not the final encoding, the
message body length is determined by reading the connection until
it is closed by the server. If a Transfer-Encoding header field
is present in a request and the chunked transfer coding is not
the final encoding, the message body length cannot be determined
reliably; the server MUST respond with the 400 (Bad Request)
status code and then close the connection.
If a message is received with both a Transfer-Encoding and a
Content-Length header field, the Transfer-Encoding overrides the
Content-Length. Such a message might indicate an attempt to
perform request smuggling (Section 9.5) or response splitting
(Section 9.4) and ought to be handled as an error. A sender MUST
remove the received Content-Length field prior to forwarding such
a message downstream.
4. If a message is received without Transfer-Encoding and with
either multiple Content-Length header fields having differing
field-values or a single Content-Length header field having an
invalid value, then the message framing is invalid and the
recipient MUST treat it as an unrecoverable error. If this is a
request message, the server MUST respond with a 400 (Bad Request)
status code and then close the connection. If this is a response
message received by a proxy, the proxy MUST close the connection
to the server, discard the received response, and send a 502 (Bad
Gateway) response to the client. If this is a response message
received by a user agent, the user agent MUST close the
connection to the server and discard the received response.
5. If a valid Content-Length header field is present without
Transfer-Encoding, its decimal value defines the expected message
body length in octets. If the sender closes the connection or
the recipient times out before the indicated number of octets are
received, the recipient MUST consider the message to be
incomplete and close the connection.
6. If this is a request message and none of the above are true, then
the message body length is zero (no message body is present).
7. Otherwise, this is a response message without a declared message
body length, so the message body length is determined by the
number of octets received prior to the server closing the
connection.
Since there is no way to distinguish a successfully completed,
close-delimited message from a partially received message interrupted
by network failure, a server SHOULD generate encoding or
length-delimited messages whenever possible. The close-delimiting
feature exists primarily for backwards compatibility with HTTP/1.0.
A server MAY reject a request that contains a message body but not a
Content-Length by responding with 411 (Length Required).
Unless a transfer coding other than chunked has been applied, a
client that sends a request containing a message body SHOULD use a
valid Content-Length header field if the message body length is known
in advance, rather than the chunked transfer coding, since some
existing services respond to chunked with a 411 (Length Required)
status code even though they understand the chunked transfer coding.
This is typically because such services are implemented via a gateway
that requires a content-length in advance of being called and the
server is unable or unwilling to buffer the entire request before
processing.
A user agent that sends a request containing a message body MUST send
a valid Content-Length header field if it does not know the server
will handle HTTP/1.1 (or later) requests; such knowledge can be in
the form of specific user configuration or by remembering the version
of a prior received response.
If the final response to the last request on a connection has been
completely received and there remains additional data to read, a user
agent MAY discard the remaining data or attempt to determine if that
data belongs as part of the prior response body, which might be the
case if the prior message's Content-Length value is incorrect. A
client MUST NOT process, cache, or forward such extra data as a
separate response, since such behavior would be vulnerable to cache
poisoning.
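The request-side rules quoted above can be condensed into a small decision function. This is only a sketch: the enum body_framing and the function request_body_framing are hypothetical names, the arguments are assumed to be the raw header values (or NULL when the header is absent), and the "chunked" test is the same simple suffix check used in the code further down rather than a full transfer-coding parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */
#include <string.h>
#include <strings.h>  /* strncasecmp */

enum body_framing { FRAMING_NONE, FRAMING_LENGTH, FRAMING_CHUNKED, FRAMING_INVALID };

/* Decide how a request body is delimited, following RFC 7230 section 3.3.3
 * for requests: Transfer-Encoding wins over Content-Length, and a request
 * with neither header has no body. *length_out is set for FRAMING_LENGTH. */
enum body_framing request_body_framing(const char *transfer_encoding,
                                       const char *content_length,
                                       size_t *length_out)
{
    assert(length_out != NULL);
    *length_out = 0;

    if (transfer_encoding) {
        size_t len = strlen(transfer_encoding);
        while (len > 0 && isspace((unsigned char)transfer_encoding[len - 1]))
            len--;                          /* ignore trailing whitespace */
        if (len >= 7 && strncasecmp(transfer_encoding + len - 7, "chunked", 7) == 0)
            return FRAMING_CHUNKED;
        return FRAMING_INVALID;             /* chunked is not last: answer 400 */
    }

    if (content_length) {
        const char *p = content_length;
        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            return FRAMING_INVALID;
        size_t value = 0;
        for (; isdigit((unsigned char)*p); p++) {
            size_t digit = (size_t)(*p - '0');
            if (value > (SIZE_MAX - digit) / 10)
                return FRAMING_INVALID;     /* would overflow size_t */
            value = value * 10 + digit;
        }
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p != '\0')
            return FRAMING_INVALID;         /* trailing garbage: answer 400 */
        *length_out = value;
        return FRAMING_LENGTH;
    }

    return FRAMING_NONE;                    /* request without a body */
}
```

A server would call this once the headers are parsed, then either decode chunks, read exactly *length_out bytes, skip the body, or reply with 400 (Bad Request) and close the connection.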
Likewise, write() may not send as many bytes as you ask of it, so you need to call write() in a loop as well.
For example, try something more along the lines of this:
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdlib.h>
#include <netinet/in.h>
#include <string.h>
#include <strings.h>
#include <limits.h>
#include <stdbool.h>
#define PORT 8088
struct data_buffer
{
void *data;
size_t size;
size_t cap;
};
void initDataBuffer(struct data_buffer *buffer)
{
buffer->data = NULL;
buffer->size = 0;
buffer->cap = 0;
}
void freeDataBuffer(struct data_buffer *buffer)
{
if (buffer->data) free(buffer->data);
initDataBuffer(buffer);
}
int appendBytes(struct data_buffer *buffer, void *data, size_t data_size)
{
if ((buffer->cap - buffer->size) < data_size)
{
size_t new_cap = (((buffer->size + data_size) + 255) / 256) * 256;
char *new_buffer = realloc(buffer->data, new_cap);
if (!new_buffer)
return -1;
buffer->data = new_buffer;
buffer->cap = new_cap;
}
// append after the bytes already stored instead of overwriting them
memcpy((char *)buffer->data + buffer->size, data, data_size);
buffer->size += data_size;
return 0;
}
int readRaw(int fd, void *data, size_t size)
{
char *ptr = data;
ssize_t recvd;
while (size > 0)
{
recvd = read(fd, ptr, size);
if (recvd <= 0)
return -1;
printf("%.*s", (int)recvd, ptr);
ptr += recvd;
size -= recvd;
}
return 0;
}
char* readLine(int fd)
{
struct data_buffer line_buffer;
char ch;
line_buffer.data = NULL;
line_buffer.size = 0;
line_buffer.cap = 0;
while (true)
{
if (read(fd, &ch, 1) <= 0)
{
freeDataBuffer(&line_buffer);
return NULL;
}
if (ch == '\r')
{
if (read(fd, &ch, 1) <= 0)
{
freeDataBuffer(&line_buffer);
return NULL;
}
if (ch != '\n')
{
printf("%c", '\r');
if (appendBytes(&line_buffer, "\r", 1) < 0)
{
freeDataBuffer(&line_buffer);
return NULL;
}
}
}
printf("%c", ch);
if (ch == '\n')
break;
if (appendBytes(&line_buffer, &ch, 1) < 0)
{
freeDataBuffer(&line_buffer);
return NULL;
}
}
if (appendBytes(&line_buffer, "\0", 1) < 0)
{
freeDataBuffer(&line_buffer);
return NULL;
}
return line_buffer.data;
}
struct string_list
{
char **strings;
size_t count;
size_t cap;
};
void initStringList(struct string_list *list)
{
list->strings = NULL;
list->count = 0;
list->cap = 0;
}
void freeStringList(struct string_list *list)
{
if (list->strings)
{
for(size_t i = 0 ; i < list->count; ++i)
free(list->strings[i]);
free(list->strings);
}
initStringList(list);
}
int appendString(struct string_list *list, char *str)
{
if (list->count >= list->cap)
{
size_t new_cap = list->cap + 10;
char **new_strings = realloc(list->strings, sizeof(char*) * new_cap);
if (!new_strings)
return -1;
list->strings = new_strings;
list->cap = new_cap;
}
list->strings[list->count] = str;
list->count += 1;
return 0;
}
int readHeaders(int fd, struct string_list *headers)
{
char *line;
initStringList(headers);
while (true)
{
line = readLine(fd);
if (!line)
{
freeStringList(headers);
return -1;
}
if (*line == '\0')
{
free(line);
break;
}
if (appendString(headers, line) < 0)
{
free(line);
freeStringList(headers);
return -1;
}
}
return 0;
}
int sendRaw(int fd, void *data, size_t size)
{
char *ptr = data;
ssize_t sent;
while (size > 0)
{
sent = write(fd, ptr, size);
if (sent < 1)
return -1;
ptr += sent;
size -= sent;
}
return 0;
}
int sendString(int fd, char *str)
{
return sendRaw(fd, str, strlen(str));
}
int sendLine(int fd, char *line)
{
if (sendString(fd, line) < 0)
return -1;
return sendRaw(fd, "\r\n", 2);
}
int sendResponse(int fd, int code, char *status, char **headers, void *data, size_t data_size)
{
char tmp[50];
sprintf(tmp, "HTTP/1.1 %d ", code);
if (sendString(fd, tmp) < 0)
return -1;
if (sendLine(fd, status) < 0)
return -1;
if (headers)
{
for(size_t i = 0; headers[i] != NULL; ++i)
{
if (sendLine(fd, headers[i]) < 0)
return -1;
}
}
sprintf(tmp, "Content-Length: %zu", data_size);
if (sendLine(fd, tmp) < 0)
return -1;
if (sendRaw(fd, "\r\n", 2) < 0)
return -1;
return sendRaw(fd, data, data_size);
}
struct httpRequest
{
struct string_list headers;
struct data_buffer body;
};
void initHttpRequest(struct httpRequest *req)
{
initStringList(&(req->headers));
initDataBuffer(&(req->body));
}
void freeHttpRequest(struct httpRequest *req)
{
freeStringList(&(req->headers));
freeDataBuffer(&(req->body));
}
int readRequest(int fd, struct httpRequest *req)
{
char *transferEncoding = NULL;
char *contentLength = NULL;
char* error_headers[] = {"Connection: close", NULL};
initHttpRequest(req);
if (readHeaders(fd, &(req->headers)) < 0)
{
freeHttpRequest(req);
sendResponse(fd, 500, "Internal Server Error", error_headers, NULL, 0);
return -1;
}
for (size_t i = 0; i < req->headers.count; ++i)
{
char *header = req->headers.strings[i];
if (strncasecmp(header, "Transfer-Encoding:", 18) == 0)
transferEncoding = header + 18;
else if (strncasecmp(header, "Content-Length:", 15) == 0)
contentLength = header + 15;
}
if (transferEncoding)
{
size_t len = strlen(transferEncoding);
if ((len < 7) || (strncasecmp(transferEncoding + len - 7, "chunked", 7) != 0))
{
freeHttpRequest(req);
sendResponse(fd, 400, "Bad Request", error_headers, NULL, 0);
return -1;
}
char chunk_data[1024];
while (true)
{
char *chunk_line = readLine(fd);
if (!chunk_line)
{
freeHttpRequest(req);
sendResponse(fd, 500, "Internal Server Error", error_headers, NULL, 0);
return -1;
}
long chunk_size = strtol(chunk_line, NULL, 16);
free(chunk_line);
if ((chunk_size < 0) || (chunk_size >= LONG_MAX))
{
freeHttpRequest(req);
sendResponse(fd, 400, "Bad Request", error_headers, NULL, 0);
return -1;
}
if (chunk_size == 0)
break;
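while (chunk_size > 0)
{
// read at most one buffer's worth of the current chunk
size_t size = ((size_t)chunk_size < sizeof(chunk_data)) ? (size_t)chunk_size : sizeof(chunk_data);
if (readRaw(fd, chunk_data, size) < 0)
{
freeHttpRequest(req);
return -1;
}
if (appendBytes(&(req->body), chunk_data, size) < 0)
{
freeHttpRequest(req);
sendResponse(fd, 500, "Internal Server Error", error_headers, NULL, 0);
return -1;
}
chunk_size -= (long)size;
}
// the chunk data is followed by a CRLF that is not part of the data
chunk_line = readLine(fd);
if (!chunk_line)
{
freeHttpRequest(req);
return -1;
}
free(chunk_line);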
}
struct string_list trailer_headers;
if (readHeaders(fd, &trailer_headers) < 0)
{
freeHttpRequest(req);
sendResponse(fd, 500, "Internal Server Error", error_headers, NULL, 0);
return -1;
}
// update entries of req->headers with entries of trailer_headers as needed...
freeStringList(&trailer_headers);
}
else if (contentLength)
{
long content_size = strtol(contentLength, NULL, 10);
if ((content_size < 0) || (content_size >= LONG_MAX))
{
freeHttpRequest(req);
sendResponse(fd, 400, "Bad Request", error_headers, NULL, 0);
return -1;
}
if (content_size > 0)
{
req->body.data = malloc(content_size);
if (!req->body.data)
{
freeHttpRequest(req);
sendResponse(fd, 500, "Internal Server Error", error_headers, NULL, 0);
return -1;
}
req->body.cap = content_size;
if (readRaw(fd, req->body.data, content_size) < 0)
{
freeHttpRequest(req);
return -1;
}
req->body.size = content_size;
}
}
return 0;
}
int main(void)
{
int server_fd, new_socket;
struct sockaddr_in address;
struct httpRequest req;
int addrlen = sizeof(address);
server_fd = socket(AF_INET, SOCK_STREAM, 0);
if (server_fd < 0)
{
perror("socket");
return 1;
}
memset(&address, 0, sizeof address);
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons( PORT );
if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0)
{
perror("bind");
return 1;
}
if (listen(server_fd, 10) < 0)
{
perror("listen");
return 1;
}
while (true)
{
printf("\n+++++++ Waiting for new connection ++++++++\n\n");
addrlen = sizeof(address);
new_socket = accept(server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen);
if (new_socket < 0)
{
perror("accept");
return 1;
}
if (readRequest(new_socket, &req) < 0)
{
close(new_socket);
continue;
}
// use req as needed...
printf("\n");
for (size_t i = 0; i < req.body.size; i++)
printf("%02X ", (unsigned)((unsigned char *)req.body.data)[i]);
printf("\n\n");
freeHttpRequest(&req);
char* resp_headers[] = {"Connection: close", "Content-Type: text/plain", NULL};
if (sendResponse(new_socket, 200, "OK", resp_headers, "Hello world!", 12) == 0)
printf("------------------Hello message sent-------------------\n");
close(new_socket);
}
return 0;
}
Related
Strange behavior with -pg and optimization when generating hashes
I decided to make a program to find a sha3-512 hash with a certain number of zeroes at the start (like hashcash). It was working fine in my initial tests, so i decided to profile it with gprof to see if I could make it faster. After I had compiled with -pg and run it, I though i should go buy a lotto ticket. On the very first nonce I got a hash with 8 zeroes. However, I ran it again and I again got a number with 8 zeroes at the start. In fact, there were many discernible patterns in the hashes. After a few tests, I found that this only happened if I compiled with -pg and one of -O1, -O2, and -O3. Here is the program #include <tomcrypt.h> #include <string.h> #include <time.h> #include <stdbool.h> #include <stdio.h> unsigned char* randstring(size_t length) { srand(time(NULL)); unsigned char* randomString = NULL; if (length) { randomString = malloc(sizeof(char) * (length)); if (randomString) { for (int n = 0; n < length; n++) { int key = rand() % 255; randomString[n] = (unsigned char)key; } } } return randomString; } void find_nonce(int zeroes, int* nonce_ptr, unsigned char* returner) { unsigned char string[40]; unsigned char* rand_string = randstring(30); memcpy(string, rand_string, 30); free(rand_string); //string is longer than rand_string because i need romm to put the nonce in int nonce = 0; int idx; if (register_hash(&sha3_512_desc) == -1) { printf("Error registering SHA3-512.\n"); exit(1); } idx = find_hash("sha3-512"); if (idx == -1) { printf("Invalid hash name!\n"); exit(1); } int res_bool = false; unsigned char res[64]; unsigned long res_size; char nonce_str[11]; int nonce_len = 0; while (!res_bool) { //Put the nonce into a string sprintf(nonce_str, "%d", nonce); //Put the nonce string into the string as an unsigned char (hash_memory takes an unsigned char) for (int i = 0, j = 11;; ++i, ++j) { if (nonce_str[i] == '\0') { break; } string[j] = (unsigned char)nonce_str[i]; nonce_len++; } //Hash it hash_memory(idx, string, 30+nonce_len, res, &res_size); 
nonce_len = 0; //Check if the string has a sufficient number of zeroes at the start res_bool = true; for (int i = 0; i < zeroes; i++) { if ((int)res[i] != 0) { res_bool = false; break; } } nonce++; } *nonce_ptr = nonce; for (int i = 0; i < 64; i++) { returner[i] = res[i]; } } int main(int argc, char** argv) { //Getting command-line arguments int zeroes = atoi(argv[argc - 1]); int nonce; unsigned char hash[64]; //Timing the execution clock_t start, end; double cpu_time_used; start = clock(); find_nonce(zeroes, &nonce, hash); end = clock(); cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC; //Printing the output to the screen printf("Hash was "); for (int i = 0; i < 64; i++) { printf("%d ", (int)hash[i]); } printf("\nNonce to get the hash was %d\nIt took %f seconds to calculate\n", nonce, cpu_time_used); return 0; } And here is an example output from five tests: Hash was 0 0 0 0 0 0 0 0 6 203 85 177 228 127 0 0 192 128 164 212 252 127 0 0 129 219 85 177 228 127 0 0 0 235 105 177 228 127 0 0 144 128 164 212 252 127 0 0 2 0 0 0 0 0 0 0 48 130 164 212 252 127 0 0 Nonce to get the hash was 1 It took 0.000000 seconds to calculate Hash was 0 0 0 0 0 0 0 0 6 203 214 123 135 127 0 0 64 216 207 126 253 127 0 0 129 219 214 123 135 127 0 0 0 235 234 123 135 127 0 0 16 216 207 126 253 127 0 0 2 0 0 0 0 0 0 0 176 217 207 126 253 127 0 0 Nonce to get the hash was 1 It took 0.000000 seconds to calculate Hash was 0 0 0 0 0 0 0 0 6 123 219 55 192 127 0 0 144 108 17 232 252 127 0 0 129 139 219 55 192 127 0 0 0 155 239 55 192 127 0 0 96 108 17 232 252 127 0 0 2 0 0 0 0 0 0 0 0 110 17 232 252 127 0 0 Nonce to get the hash was 1 It took 0.000000 seconds to calculate Hash was 0 0 0 0 0 0 0 0 6 107 181 157 222 127 0 0 64 183 143 12 253 127 0 0 129 123 181 157 222 127 0 0 0 139 201 157 222 127 0 0 16 183 143 12 253 127 0 0 2 0 0 0 0 0 0 0 176 184 143 12 253 127 0 0 Nonce to get the hash was 1 It took 0.000000 seconds to calculate Hash was 0 0 0 0 0 0 0 0 6 139 121 81 110 127 0 0 32 
171 61 179 254 127 0 0 129 155 121 81 110 127 0 0 0 171 141 81 110 127 0 0 240 170 61 179 254 127 0 0 2 0 0 0 0 0 0 0 144 172 61 179 254 127 0 0 Nonce to get the hash was 1 It took 0.000000 seconds to calculate
res_size is not initialized. It contains garbage, and the garbage is different depending on the compiler flags. hash_memory, on the other hand, expects it to have the size of the output buffer. The first thing it does is to check that there is enough space provided, and if not it bails out. So what you see are not hashes, but initial state of your buffer. Always test the return values!
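To make the in/out length convention concrete, here is a small self-contained sketch. It does not call libtomcrypt. fake_digest is a hypothetical stand-in that only mimics the behaviour described above (it refuses to write when the caller-supplied capacity is too small), and digest_checked shows the caller pattern of setting the capacity first and testing the return value.

```c
#include <assert.h>
#include <string.h>

enum { DIGEST_OK = 0, DIGEST_BUFFER_OVERFLOW = 1, DIGEST_SIZE = 64 };

/* Hypothetical stand-in for a hashing call with an in/out length parameter:
 * on entry *outlen is the capacity of out, on success it is the digest size. */
int fake_digest(const unsigned char *in, unsigned long inlen,
                unsigned char *out, unsigned long *outlen)
{
    assert(out != NULL && outlen != NULL);
    (void)in;
    (void)inlen;
    if (*outlen < DIGEST_SIZE)
        return DIGEST_BUFFER_OVERFLOW;   /* out is left untouched */
    memset(out, 0xAB, DIGEST_SIZE);      /* placeholder bytes, not a real hash */
    *outlen = DIGEST_SIZE;
    return DIGEST_OK;
}

/* Caller pattern: initialise the capacity before the call, then check the result. */
int digest_checked(const unsigned char *in, unsigned long inlen,
                   unsigned char out[DIGEST_SIZE])
{
    unsigned long outlen = DIGEST_SIZE;  /* capacity of out, set before the call */
    int err = fake_digest(in, inlen, out, &outlen);
    if (err != DIGEST_OK)
        return err;                      /* out does not hold a digest */
    return outlen == DIGEST_SIZE ? DIGEST_OK : DIGEST_BUFFER_OVERFLOW;
}
```

In the program above, the equivalent change would be to set res_size = sizeof(res) before each hash_memory call and to compare its return value against CRYPT_OK.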
Generate Unique Values
I want to create a C program to generate numbers from 0 to 999999, keeping in mind that the number generated should not have any digits that are repetitive within it. For example, "123" is an acceptable value but not "121" as the '1' is repeated. I have sourced other program codes that check if an integer has repeated digits: Check if integer has repeating digits. No string methods or arrays What is the fastest way to check for duplicate digits of a number? However these do not really solve my problem and they are very inefficient solutions if I were to perform the check for 1,000,000 different values. Moreover, the solution provided is for int and not char[] and char *, which I use in my program. Below is my code thus far. As you can see I have no problems handling values of up to "012", however the possibilities for values with 3 digits and above are too many to list and too inefficient to code. Would appreciate some help. int i, j; char genNext[7] = "0"; printf("%s\n", genNext); // loop through to return next pass in sequence while (1) { for (i = 0; i < sizeof(genNext) / sizeof(char); i++) { if (genNext[i] == '9') { char * thisPass = strndup(genNext, sizeof(genNext)); int countDigit = (int) strlen(thisPass); switch (countDigit) { case 1: genNext = "01"; break; case 2: if (strcmp(genNext, "98")) { if (i == 0) { genNext[1] += 1; } else { genNext[0] += 1; genNext[1] == '0'; } } else { genNext = "012"; } break; case 3: if (strcmp(genNext, "987")) { // code to handle all cases } else { genNext = "0123"; } break; case 4: case 5: case 6: // insert code here } break; } else if (genNext[i] == '\0') { break; } else if (genNext[i+1] == '\0') { genNext[i] += 1; for (j = 0; j < i; j++) { if (genNext[i] == genNext[j]) { genNext[i] += 1; } } } else { continue; } } printf("%s\n", genNext); if (strcmp(genNext, "987654") == 0) { break; } } The main problem that I am facing is the cases when '9' is part of the value that is being tested. 
For example, the next value in the sequence after "897" is "901" and after "067895" comes "067912" based on the rules of non-repetitiveness as well as sequential returning of the result. A desired output would be as follows: 0 1 2 3 ... 8 9 01 02 03 ... 09 10 12 13 ... 97 98 012 013 014 ... 098 102 103 ... 985 986 987 0123 0124 ... etc etc. Any assistance is appreciated, and if any part of my question was unclear, feel free to clarify. Thanks! EDIT: How do I generate all permutations of a list of numbers? does not solve my question as the increment from "120398" to "120435" as the next "legal" value in the sequence. EDIT 2: Updated question to include desired output
There are three variant algorithms below. Adapt variant 3 to suit your requirements. Variant 1 This is one way to do it. It implements a minor variant of the initialize a table of 10 digit counts to 0; scan the digits, increment the count for each digit encountered, then check whether any of the digit counts is more than 1 algorithm I suggested in a comment. The test function returns as soon as a duplicate digit is spotted. #include <stdio.h> #include <stdbool.h> enum { MAX_ITERATION = 1000000 }; static bool duplicate_digits_1(int value) { char buffer[12]; snprintf(buffer, sizeof(buffer), "%d", value); char digits[10] = { 0 }; char *ptr = buffer; char c; while ((c = *ptr++) != '\0') { if (++digits[c - '0'] > 1) return true; } return false; } int main(void) { int count = 0; for (int i = 0; i < MAX_ITERATION; i++) { if (!duplicate_digits_1(i)) { count += printf(" %d", i); if (count > 72) { putchar('\n'); count = 0; } } } putchar('\n'); return 0; } When run, it produces 168,571 values between 0 and 1,000,000, starting: 0 1 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 20 21 23 24 25 26 27 28 29 30 31 32 34 35 36 37 38 39 40 41 42 43 45 46 47 48 49 50 51 52 53 54 56 57 58 59 60 61 62 63 64 65 67 68 69 70 71 72 73 74 75 76 78 79 80 81 82 83 84 85 86 87 89 90 91 92 93 94 95 96 97 98 102 103 104 105 106 107 108 109 120 123 124 125 126 127 128 129 130 132 134 135 136 137 138 139 140 142 143 145 146 147 148 149 150 152 153 154 156 157 158 159 160 162 163 164 165 167 168 169 170 172 173 174 175 176 178 179 180 182 183 184 185 186 187 189 190 192 193 194 195 196 197 198 201 203 204 205 206 207 208 209 210 213 214 215 216 217 218 219 230 231 234 235 236 237 238 239 240 241 243 245 246 247 248 249 250 251 253 254 256 257 258 259 260 261 263 264 265 267 268 269 270 271 273 … 987340 987341 987342 987345 987346 987350 987351 987352 987354 987356 987360 987361 987362 987364 987365 987401 987402 987403 987405 987406 987410 987412 987413 987415 987416 987420 987421 987423 987425 987426 
987430 987431 987432 987435 987436 987450 987451 987452 987453 987456 987460 987461 987462 987463 987465 987501 987502 987503 987504 987506 987510 987512 987513 987514 987516 987520 987521 987523 987524 987526 987530 987531 987532 987534 987536 987540 987541 987542 987543 987546 987560 987561 987562 987563 987564 987601 987602 987603 987604 987605 987610 987612 987613 987614 987615 987620 987621 987623 987624 987625 987630 987631 987632 987634 987635 987640 987641 987642 987643 987645 987650 987651 987652 987653 987654

Before you decide this is 'not efficient', measure it. Are you really exercising it often enough that the performance is a real problem?

Variant 2

Creating the alternative version I suggested in the comments (use strchr() iteratively, checking whether the first digit appears in the tail, and if not whether the second digit appears in the tail, and so on) is very easy to implement given the framework of the first variant:

    /* strchr() needs #include <string.h> in addition to the headers used by Variant 1 */
    static bool duplicate_digits_2(int value)
    {
        char buffer[12];
        snprintf(buffer, sizeof(buffer), "%d", value);
        char *ptr = buffer;
        char c;
        while ((c = *ptr++) != '\0')
        {
            if (strchr(ptr, c) != NULL)
                return true;
        }
        return false;
    }

When the times are compared, I got these results (ng41 uses duplicate_digits_1() and ng43 uses duplicate_digits_2()):

    $ time ng41 > /dev/null
    real 0m0.175s
    user 0m0.169s
    sys 0m0.002s
    $ time ng43 > /dev/null
    real 0m0.201s
    user 0m0.193s
    sys 0m0.003s
    $

Repeated timings generally showed similar results, but sometimes I got ng43 running faster than ng41; the timing on just one set of one million numbers isn't clear cut (so YMMV: your mileage may vary!).

Variant 3

You could also use this technique, which is analogous to 'count digits' but without the conversion to string first (so it should be quicker).
#include <stdio.h> #include <stdbool.h> #include <string.h> enum { MAX_ITERATION = 1000000 }; static bool duplicate_digits_3(int value) { char digits[10] = { 0 }; while (value > 0) { if (++digits[value % 10] > 1) return true; value /= 10; } return false; } int main(void) { int count = 0; const char *pad = ""; for (int i = 0; i < MAX_ITERATION; i++) { if (!duplicate_digits_3(i)) { count += printf("%s%d", pad, i); pad = " "; if (count > 72) { putchar('\n'); count = 0; pad = ""; } } } putchar('\n'); return 0; } Because it avoids conversions to strings, it is much faster. The slowest timing I got from a series of 3 runs was: real 0m0.063s user 0m0.060s sys 0m0.001s which is roughly three times as fast as either of the other two. Extra timing I also changed the value of MAX_ITERATION to 10,000,000 and ran timing. There are many more rejected outputs, of course. $ time ng41 >/dev/null real 0m1.721s user 0m1.707s sys 0m0.006s $ time ng43 >/dev/null real 0m1.958s user 0m1.942s sys 0m0.008s $ time ng47 >/dev/null real 0m0.463s user 0m0.454s sys 0m0.004s $ ng41 | wc 69237 712891 5495951 $ ng43 | wc 69237 712891 5495951 $ ng47 | wc 69237 712891 5495951 $ cmp <(ng41) <(ng43) $ cmp <(ng41) <(ng47) $ cmp <(ng43) <(ng47) $ These timings were more stable; variant 1 (ng41) was always quicker than variant 2 (ng43), but variant 3 (ng47) beats both by a significant margin. JFTR: testing was done on macOS Sierra 10.12.1 with GCC 6.2.0 on an old 17" MacBook Pro — Early 2011, 2.3GHz Intel Core i7 with 16 GB 1333 MHz DDR3 RAM — not that memory is an issue with this code. The program numbers are consecutive 2-digit primes, in case you're wondering. Leading zeros too This code generates the sequence of numbers you want (though it is only configured to run up to 100,000 — the change for 1,000,000 is trivial). It's fun in a masochistic sort of way. 
#include <assert.h> #include <stdbool.h> #include <stdio.h> #include <string.h> enum { MAX_ITERATIONS = 100000 }; /* lz = 1 or 0 - consider that the number has a leading zero or not */ static bool has_duplicate_digits(int value, int lz) { assert(value >= 0 && value < MAX_ITERATIONS + 1); assert(lz == 0 || lz == 1); char digits[10] = { [0] = lz }; while (value > 0) { if (++digits[value % 10] > 1) return true; value /= 10; } return false; } int main(void) { int lz = 0; int p10 = 1; int log_p10 = 0; /* log10(0) is -infinity - but 0 works better */ int linelen = 0; const char *pad = ""; /* The + 1 allows the cycle to reset for the leading zero pass */ for (int i = 0; i < MAX_ITERATIONS + 1; i++) { if (i >= 10 * p10 && lz == 0) { /* Passed through range p10 .. (10*p10-1) once without leading zeros */ /* Repeat, adding leading zeros this time */ lz = 1; i = p10; } else if (i >= 10 * p10) { /* Passed through range p10 .. (10*p10-1) without and with leading zeros */ /* Continue through next range, without leading zeros to start with */ p10 *= 10; log_p10++; lz = 0; } if (!has_duplicate_digits(i, lz)) { /* Adds a leading zero if lz == 1; otherwise, it doesn't */ linelen += printf("%s%.*d", pad, log_p10 + lz + 1, i); pad = " "; if (linelen > 72) { putchar('\n'); pad = ""; linelen = 0; } } } putchar('\n'); return 0; } Sample output (to 100,000): 0 1 2 3 4 5 6 7 8 9 01 02 03 04 05 06 07 08 09 10 12 13 14 15 16 17 18 19 20 21 23 24 25 26 27 28 29 30 31 32 34 35 36 37 38 39 40 41 42 43 45 46 47 48 49 50 51 52 53 54 56 57 58 59 60 61 62 63 64 65 67 68 69 70 71 72 73 74 75 76 78 79 80 81 82 83 84 85 86 87 89 90 91 92 93 94 95 96 97 98 012 013 014 015 016 017 018 019 021 023 024 025 026 027 028 029 031 032 034 035 036 037 038 039 041 042 043 045 046 047 048 049 051 052 053 054 056 057 058 059 061 062 063 064 065 067 068 069 071 072 073 074 075 076 078 079 081 082 083 084 085 086 087 089 091 092 093 094 095 096 097 098 102 103 104 105 106 107 108 109 120 123 124 125 126 127 128 129 
130 132 134 135 136 137 138 139 140 … 901 902 903 904 905 906 907 908 910 912 913 914 915 916 917 918 920 921 923 924 925 926 927 928 930 931 932 934 935 936 937 938 940 941 942 943 945 946 947 948 950 951 952 953 954 956 957 958 960 961 962 963 964 965 967 968 970 971 972 973 974 975 976 978 980 981 982 983 984 985 986 987 0123 0124 0125 0126 0127 0128 0129 0132 0134 0135 0136 0137 0138 0139 0142 0143 0145 0146 0147 0148 0149 0152 0153 0154 0156 0157 0158 0159 0162 0163 0164 0165 0167 … 0917 0918 0921 0923 0924 0925 0926 0927 0928 0931 0932 0934 0935 0936 0937 0938 0941 0942 0943 0945 0946 0947 0948 0951 0952 0953 0954 0956 0957 0958 0961 0962 0963 0964 0965 0967 0968 0971 0972 0973 0974 0975 0976 0978 0981 0982 0983 0984 0985 0986 0987 1023 1024 1025 1026 1027 1028 1029 1032 1034 1035 1036 1037 1038 1039 1042 1043 1045 1046 1047 1048 1049 1052 1053 1054 1056 1057 1058 1059 1062 1063 1064 1065 1067 1068 1069 1072 1073 1074 1075 … 9820 9821 9823 9824 9825 9826 9827 9830 9831 9832 9834 9835 9836 9837 9840 9841 9842 9843 9845 9846 9847 9850 9851 9852 9853 9854 9856 9857 9860 9861 9862 9863 9864 9865 9867 9870 9871 9872 9873 9874 9875 9876 01234 01235 01236 01237 01238 01239 01243 01245 01246 01247 01248 01249 01253 01254 01256 01257 01258 01259 01263 01264 01265 01267 01268 01269 01273 01274 01275 01276 01278 01279 01283 01284 01285 01286 01287 01289 01293 01294 01295 01296 01297 01298 … 09827 09831 09832 09834 09835 09836 09837 09841 09842 09843 09845 09846 09847 09851 09852 09853 09854 09856 09857 09861 09862 09863 09864 09865 09867 09871 09872 09873 09874 09875 09876 10234 10235 10236 10237 10238 10239 10243 10245 10246 10247 10248 10249 10253 10254 10256 10257 10258 10259 10263 10264 10265 … 98705 98706 98710 98712 98713 98714 98715 98716 98720 98721 98723 98724 98725 98726 98730 98731 98732 98734 98735 98736 98740 98741 98742 98743 98745 98746 98750 98751 98752 98753 98754 98756 98760 98761 98762 98763 98764 98765 012345 012346 012347 012348 012349 012354 012356 
012357 012358 012359 012364 012365 012367 012368 012369 012374 012375 012376 012378 012379 012384 012385 012386 … 098653 098654 098657 098671 098672 098673 098674 098675 098712 098713 098714 098715 098716 098721 098723 098724 098725 098726 098731 098732 098734 098735 098736 098741 098742 098743 098745 098746 098751 098752 098753 098754 098756 098761 098762 098763 098764 098765
Using a loop (from 0 to 999,999, inclusive), and rejecting values with repeating digits sounds like the most straightforward implementation to me. The reject-if-duplicate-digits function can be made to be pretty fast. Consider, for example,

    int has_duplicate_digits(unsigned int value)
    {
        unsigned int digit_mask = 0U;

        do {
            /* (value % 10U) is the value of the rightmost
               decimal digit of (value).
               1U << (value % 10U) refers to the value of
               the corresponding bit -- bit 0 to bit 9. */
            const unsigned int mask = 1U << (value % 10U);

            /* If the bit is already set in digit_mask,
               we have a duplicate digit in value. */
            if (mask & digit_mask)
                return 1;

            /* Mark this digit as seen in the digit_mask. */
            digit_mask |= mask;

            /* Drop the rightmost digit off value.
               Note that this is integer division. */
            value /= 10U;

            /* If we have additional digits, repeat loop. */
        } while (value);

        /* No duplicate digits found. */
        return 0;
    }
This is actually a classical combinatorial problem. Below is a proof-of-concept implementation using Algorithm L in TAOCP 7.2.1.2 and Algorithm T in TAOCP 7.2.1.3. There might be some errors. Refer to the algorithms for details.

Here is a bit of explanation. Let t be the number of digits. For t == 10, the problem is to generate all t! permutations of the set {0,1,2,...,9} in lexicographic order (Algorithm L). For t > 0 and t < 10, this breaks down to:

1) Generate all combinations of t digits from the 10 possible digits.
2) For each combination, generate all t! permutations.

Last, you can sort all 10!/0! + 10!/1! + 10!/2! + ... + 10!/9! results (one term per digit count t, from 10 down to 1). The sorting might look expensive at first. But first, the combinations are already generated in lexical order. Second, the permutations are also generated in lexical order. So the sequence is actually highly regular, and qsort() is not too expensive here.

    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static inline int compare_str(const void *a, const void *b)
    {
        return strcmp(a, b);
    }

    static inline int compare_char(const void *a, const void *b)
    {
        char ca = *((char *) a);
        char cb = *((char *) b);
        if (ca < cb)
            return -1;
        if (ca > cb)
            return 1;
        return 0;
    }

    // Algorithm L in TAOCP 7.2.1.2
    static inline char *algorithm_l(int n, const char *c, char *r)
    {
        char a[n + 1];
        memcpy(a, c, n);
        a[n] = '\0';
        qsort(a, n, 1, compare_char);
        while (1) {
            // L1. [Visit]
            memcpy(r, a, n + 1);
            r += n + 1;
            // L2. [Find j]
            int j = n - 1;
            while (j > 0 && a[j - 1] >= a[j])
                --j;
            if (j == 0)
                break;
            // L3. [Increase a[j - 1]]
            int l = n;
            while (l >= 0 && a[j - 1] >= a[l - 1])
                --l;
            char tmp = a[j - 1];
            a[j - 1] = a[l - 1];
            a[l - 1] = tmp;
            // L4. [Reverse a[j]...a[n-1]]
            int k = j + 1;
            l = n;
            while (k < l) {
                char tmp = a[k - 1];
                a[k - 1] = a[l - 1];
                a[l - 1] = tmp;
                ++k;
                --l;
            }
        }
        return r;
    }

    // Algorithm T in TAOCP 7.2.1.3
    static inline void algorithm_t(int t, char *r)
    {
        assert(t > 0);
        assert(t < 10);
        // T1. [Initialize]
        char c[12]; // the digits
        for (int i = 0; i < t; ++i)
            c[i] = '0' + i;
        c[t] = '9' + 1;
        c[t + 1] = '0';
        char j = t;
        char x = '0';
        while (1) {
            // T2. [Visit]
            r = algorithm_l(t, c, r);
            if (j > 0) {
                x = '0' + j;
            } else {
                // T3. [Easy case?]
                if (c[0] + 1 < c[1]) {
                    ++c[0];
                    continue;
                }
                j = 2;
                // T4. [Find j]
                while (1) {
                    c[j - 2] = '0' + j - 2;
                    x = c[j - 1] + 1;
                    if (x != c[j])
                        break;
                    ++j;
                }
                // T5. [Done?]
                if (j > t)
                    break;
            }
            // T6. [Increase c[j - 1]]
            c[j - 1] = x;
            --j;
        }
    }

    static inline void generate(int t)
    {
        assert(t >= 0 && t <= 10);
        if (t == 0)
            return;
        int n = 1;
        int k = 10;
        for (int i = 1; i <= t; ++i, --k)
            n *= k;
        char *r = (char *) malloc((t + 1) * n);
        if (t == 10) {
            algorithm_l(10, "0123456789", r);
        } else {
            algorithm_t(t, r);
        }
        // Each record is a NUL-terminated string of t digits, so strcmp() ordering works.
        qsort(r, n, t + 1, compare_str);
        for (int i = 0; i < n; ++i, r += t + 1)
            printf("%s\n", r);
    }

    int main()
    {
        for (int t = 1; t <= 10; ++t)
            generate(t);
    }

Efficiency: This implementation is not very efficient. It is a direct translation from the algorithm, for easier understanding. However, it is still a lot more efficient than iterating over 10^10 numbers. It takes about 2.5 seconds to generate all numbers from 0 to 9876543210. This includes the time of writing them to a file, a 94MB output file, with one number per line. For up to 6 digits, it takes about 0.05 seconds.

If you want these numbers to come in a particular order in your program, it might be better to generate the numbers as above to prepare a table and use the table later. Even for the table from 0 to 9876543210, there are fewer than ten million numbers, which is not a really big number for today's computers. In your case, up to six digits, there are fewer than one million numbers.
Issue with fscanf reading input file
I am trying to convert an input file from .txt to .csv. I've performed multiple tests using gdb and switched my code around. The code looks like it should work, but for some reason it doesn't. I've tried using "while (fscanf(…arguments…) != EOF)" but I always end up in a never-ending loop, even though I know that the input file does end. Is the problem the way I'm trying to read the file, or something else? I'd greatly appreciate any advice.

A sample of the input file (the full file is far too big to post; the potentiometer value is the only one that is consistently zero, and all other values are greater than zero):

time: 40 ms switch0: 1 switch1: 1 switch2: 1 switch3: 1 potentiometer: 0.00 temperature: 0.66 light: 0.23 --------------------------- time: 80 ms switch0: 1 switch1: 1 switch2: 1 switch3: 1 potentiometer: 0.00 temperature: 0.66 light: 0.23 --------------------------- time: 120 ms switch0: 1 switch1: 1 switch2: 1 switch3: 1 potentiometer: 0.00 temperature: 0.66 light: 0.23 ---------------------------

The program that converts from txt to csv:

    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        FILE *data = fopen("data.csv","w");
        FILE *arduino = fopen("arduino.txt","r");

        if(arduino == NULL)
        {
            printf("error reading file\n");
            return 1;
        }
        if(data == NULL)
        {
            printf("error writing file\n");
            return 2;
        }

        fprintf(data,"Time,Switch0,Switch1,Switch2,Switch3,Potentiometer,Temperature,Light\n");

        int num1,num2,num3,num4,num5;
        double num6,num7,num8;

        double temp1[800];

        int count1 = 0;

        while(count1<800)
        {
            fscanf(arduino,"%lf",&temp1[count1]);
            count1++;
        }

        for(count1 = 0; count1 < 800; count1++)
        {
            printf("%lf",temp1[count1]);
        }


        int count2 = 0;
        int i = 0;

        while(count2 != 800)
        {
            for(i=0 ; i <8;i++)
            {
                if(i==7)
                {
                    fprintf(data,"%lf\n",temp1[count2]);
                }

                else
                {
                    fprintf(data, "%lf,", temp1[count2]);
                }
                count2++;
            }
        }


        if(fclose(arduino)==EOF)
        {
            printf("error closing input file\n");
        }
        if(fclose(data)==EOF)
        {
            printf("error closing output file\n");
        }
    }

And here's the output:

Time,Switch0,Switch1,Switch2,Switch3,Potentiometer,Temperature,Light 2 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 3 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 4 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 5 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 6 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 7 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 8 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 9 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 (still zeros across the board) 61 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 62 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 63 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 64 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 65 0.000000,0.000000,0.000000,0.000000,-nan,0.000000,0.000000,0.000000 66 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 67 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 68 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 69 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 70 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 71 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 72 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 73 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 74 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 75 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 76
0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 77 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 78 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 79 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 80 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 81 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 82 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 83 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 84 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 85 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 86 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 87 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 88 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 89 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 90 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 91 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 92 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 93 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 94 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 95 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 96 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 97 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 98 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 99 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000 100 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,253700437344304240814662650531587413567701629019053044079803999006261210888228446189339915094527360 
92923426425002851172634311446770729875573243622981632.000000,49038509684686202755808411764574575003743211375155249005916427827780247417991687082747214451 073341675744581253991867335918252416362555908299070786942125737694751726823604090062182039519355613866611467434357822207669472484839486934106348907556279 40839424.000000 101 0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
Checking the return value from fscanf() against the expected count, not against EOF, is a good idea (@BLUEPIXY, @Peter). This will identify or solve most of OP's problems. The data file has keywords in it, and they need to be scanned too (@user3386109).

I suggest direct code that reads the entire record with one fscanf():

    #include <assert.h>
    #include <stdio.h>

    struct kk {
        int time;
        int switchn[4];
        double potentiometer, temperature, light;
    };

    /* Reads one record, keywords included; returns the number of fields converted. */
    int data_kk_read(FILE *inf, struct kk *data) {
        int cnt = fscanf(inf, " time: %d ms switch0: %d switch1: %d switch2: %d"
            " switch3: %d potentiometer: %lf temperature: %lf light: %lf"
            " --------------------------- ",
            &data->time, &data->switchn[0], &data->switchn[1],
            &data->switchn[2], &data->switchn[3],
            &data->potentiometer, &data->temperature, &data->light);
        return cnt;
    }

    int data_kk_write(FILE *outf, unsigned n, const struct kk *data) {
        int cnt = fprintf(outf, "%3u %d, %d, %d, %d, %d, %.2lf, %.2lf, %.2lf\n",
            n, data->time, data->switchn[0], data->switchn[1],
            data->switchn[2], data->switchn[3],
            data->potentiometer, data->temperature, data->light);
        return cnt;
    }

    int main(void) {
        FILE *inf = fopen("data.txt", "r");
        assert(inf);
        unsigned line = 0;
        struct kk data;
        int cnt;
        /* A complete record converts exactly 8 fields. */
        while ((cnt = data_kk_read(inf, &data)) == 8) {
            data_kk_write(stdout, ++line, &data);
        }
        fclose(inf);
        if (cnt != EOF) puts("Unexpected scanning problem");
        return 0;
    }

Output

      1 40, 1, 1, 1, 1, 0.00, 0.66, 0.23
      2 80, 1, 1, 1, 1, 0.00, 0.66, 0.23
      3 120, 1, 1, 1, 1, 0.00, 0.66, 0.23
Partial Threaded Sorting in C
I'm trying to do a partial sort with threads. My current output is:

27 12 21 48 15 28 82 69 35 91
13 82 33 35 46 5 35 28 87 95
0 10 20 22 23 30 52 80 86 96
3 8 42 53 67 70 70 71 75 79
5 8 8 18 41 43 70 79 86 88
10 51 56 60 65 84 87 91 94 99
23 25 38 39 40 44 51 56 69 75
20 21 25 29 29 38 66 71 73 96
33 50 9 6 13 27 97 21 70 22
3 4 6 6 7 15 34 59 63 70

As you can see, it is only partially sorted: some rows are left unsorted. I want my output to be this (no merging at the end):

12 15 21 27 28 35 48 69 82 91
5 13 28 33 35 35 46 82 87 95
0 10 20 22 23 30 52 80 86 96
3 8 42 53 67 70 70 71 75 79
5 8 8 18 41 43 70 79 86 88
10 51 56 60 65 84 87 91 94 99
23 25 38 39 40 44 51 56 69 75
20 21 25 29 29 38 66 71 73 96
6 9 13 21 22 27 33 50 70 97
3 4 6 6 7 15 34 59 63 70

I can get the right output if, instead of using a struct, I use &array[i] and manually input the length. This is the code I have so far:

    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <pthread.h>

    int cmpfunc(const void *a, const void *b) {
        return (*(int*)a - *(int*)b);
    }

    struct values {
        int *arrayptr;
        int length;
    };

    void *thread_fn(void *a) {
        struct values *start = a;
        qsort(start->arrayptr, start->length, sizeof(int), cmpfunc);
        return (void*)a;
    }

    int main(int argc, const char *argv[]) {
        FILE *fp = fopen(argv[3], "r");
        FILE *fp1 = fopen("numS1.dat", "w+");

        //amount of threads
        int threadAmount = atoi(argv[1]);
        //size of input
        int numberAmount = atoi(argv[2]);

        //multidimensional array
        int array[threadAmount][numberAmount / threadAmount];
        for (int i = 0; i < threadAmount; i++)
            for (int j = 0; j < numberAmount / threadAmount; j++)
                fscanf(fp, "%d", &array[i][j]);

        pthread_t threadid[threadAmount];
        for (int i = 0; i < threadAmount; ++i) {
            struct values a = { array[i], numberAmount / threadAmount };
            pthread_create(&threadid[i], NULL, thread_fn, &a);
        }
        for (int i = 0; i < threadAmount; ++i)
            pthread_join(threadid[i], NULL);

        for (int i = 0; i < threadAmount; i++) {
            if (i != 0)
                fprintf(fp1, "\n");
            for (int j = 0; j < numberAmount / threadAmount; j++)
                fprintf(fp1 ,"%d ", array[i][j]);
        }
        return 0;
    }

Do you know where I am going wrong? I think it's the struct, but everything I see online does what I'm doing.
You are passing a pointer to automatic storage to newly created threads: the struct values object becomes invalid as soon as the calling scope is exited, thus it cannot be reliably accessed by the new thread. You should allocate the struct values and pass the pointer to the allocated object as a parameter to pthread_create:

    for (int i = 0; i < threadAmount; ++i) {
        struct values *a = malloc(sizeof(*a));
        a->arrayptr = array[i];
        a->length = numberAmount / threadAmount;
        pthread_create(&threadid[i], NULL, thread_fn, a);
    }

The structure can be freed by the thread function before exiting.

Notes:

- the way you split the array into chunks only works if the length is a multiple of the number of threads.
- the comparison function does not work for large int values, you should use this instead:

    int cmpfunc(const void *a, const void *b) {
        return (*(int*)b < *(int*)a) - (*(int*)a < *(int*)b);
    }
Detect end of line in C
This is the code which reads a 10x10 matrix from a file "F1.txt":

    #include <stdio.h>

    int main( int argc, char ** argv )
    {
        FILE * fr;
        fr = fopen("F1.txt","r");
        int i, j;
        int matrix[10][10] = {0.0};
        for(i = 0; i < 10; i++)
        {
            for(j = 0; j < 10; j++)
            {
                fscanf(fr, "%d",&matrix[i][j]);
                printf("%d\n", matrix[i][j]);
            }
        }
        getchar();
        return 0;
    }

"F1.txt" looks like this:

12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 4 34 56 43 32 124 52 212 3
32 343 34 544 43 32 7 52 456 98

It works without problems, but the output is:

12
343
34
544
43
32
124
52
212
3
12
343
34
544
43
32
124
52
212
.......... etc....

I have to detect the end of each line to make my output look the same as F1.txt:

12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 343 34 544 43 32 124 52 212 3
12 4 34 56 43 32 124 52 212 3
32 343 34 544 43 32 7 52 456 98
Rewrite the loops the following way:

    for(i = 0; i < 10; i++)
    {
        for(j = 0; j < 10; j++)
        {
            fscanf(fr, "%d",&matrix[i][j]);
            printf("%3d ", matrix[i][j]);
        }
        printf( "\n" );
    }
You are reading the data correctly, but you are not printing it right. Your program prints '\n' after each number, which is why you see so many lines. Change your program like this to see the output that you expect:

    for(i = 0; i < 10; i++)
    {
        for(j = 0; j < 10; j++)
        {
            fscanf(fr, "%d",&matrix[i][j]);
            printf("%d ", matrix[i][j]); // <<== Replace \n with a single space
        }
        printf("\n"); // <<== Add this line
    }
Or, if you just want your output formatted, you can write the end of line in the outer loop on i:

    for(i = 0; i < 10; i++)
    {
        for(j = 0; j < 10; j++)
        {
            fscanf(fr, "%d",&matrix[i][j]);
            printf("%d\t", matrix[i][j]);
        }
        printf("\n");
    }
For reference, an end of line (EOL) is usually an LF character or a CR-LF pair. In C they are written \n and \r\n respectively. A possible solution is to use fgets to read a complete line at once (each fgets call reads at most one line). Then extract the integers from that string using sscanf or strtok. I suggest sscanf if you know the number of integers on every line. Otherwise, if a line can contain any number of values, you can use strtok with " " (space) as the delimiter. You can read more about these functions here: sscanf, strtok
    fscanf(fr, "%d",&matrix[i][j]);
    printf("%d", matrix[i][j]);
    if(j < 10-1)
        printf(" ");
    else
        printf("\n");