Schema.org fails to deliver application/ld+json - json-ld

In my java app I get the following error
Caused by: java.lang.RuntimeException: java.lang.RuntimeException:
com.github.jsonldjava.core.JsonLdError: loading remote context failed: http://schema.org/
With curl I currently get something like
curl -i -L -k --compressed -H "Accept: application/ld+json" https://schema.org/
HTTP/2 200
access-control-allow-credentials: true
access-control-allow-headers: Accept
access-control-allow-methods: GET
access-control-allow-origin: *
access-control-expose-headers: Link
link: </docs/jsonldcontext.jsonld>; rel="alternate"; type="application/ld+json"
....
/** Body contains HTML payload **/

Content negotiation is no longer supported for the main page.
Instead you have to follow the instructions at
https://schema.org/docs/developers.html
and use one of the linked versions:
e.g.
https://schema.org/version/latest/schemaorg-current-http.jsonld
{
  "@context" : "https://schema.org/version/latest/schemaorg-current-http.jsonld"
}
or with curl
curl -i -L https://schema.org/version/latest/schemaorg-current-http.jsonld

Redirecting to permanently moved page with response code 302

I am sending this request from my C code:
char * request = "GET / HTTP/1.1\r\n" \
"Host: www.some.com\r\n" \
"Connection: keep-alive\r\n" \
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36\r\n" \
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r\nAccept-Language: en-US,en;q=0.9\r\nAccept-Encoding: gzip, deflate\r\n\r\n";
But I get this response after sending the above request:
HTTP/1.1 302 Found
Location: https://www.some.com/?gws_rd=ssl
Cache-Control: private
Content-Type: text/html; charset=UTF-8
BFCache-Opt-In: unload
Date: Thu, 24 Feb 2022 06:17:10 GMT
Server: gws
Content-Length: 231
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2022-02-24-06; expires=Sat, 26-Mar-2022 06:17:10 GMT; path=/; domain=.some.com; Secure; SameSite=none
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
here.
</BODY></HTML>
That is the response I get. If I then follow https://www.some.com/?gws_rd=ssl I don't get any data; it's as if the request is sent but no data is received. I am sending this request to https://www.some.com/?gws_rd=ssl:
char *x = "GET / https://www.some.com/?gws_rd=ssl\r\n\r\n";
Why is that? What is wrong with my HTTP/HTTPS handling?
I am using OpenSSL.
So after sending the initial request, the server says the resource has moved to a new URL. But when I follow the new URL, nothing happens: no data comes back.
Code:
/* filename nossl.c */
#include "stdio.h"
#include "string.h"
#include "openssl/ssl.h"
#include "openssl/bio.h"
#include "openssl/err.h"
int main()
{
    BIO *bio;
    char resp[1024];
    int ret;
    //char * request = "GET /cas/login?service=https%3A%2F%2Fweb.corp.ema-tech.com%3A8888%2F HTTP/1.1\x0D\x0AHost: web.corp.ema-tech.com\x0D\x0A\x43onnection: Close\x0D\x0A\x0D\x0A";
    char *request = "GET / HTTP/1.1\r\n" \
        "Host: www.yoursite.com\r\n" \
        "Connection: keep-alive\r\n" \
        "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36\r\n" \
        "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r\nAccept-Language: en-US,en;q=0.9\r\nAccept-Encoding: gzip, deflate\r\n\r\n";
    char *x = "GET / https://www.yoursite.com/?gws_rd=ssl\r\n\r\n";

    /* Set up the library */
    ERR_load_BIO_strings();
    SSL_load_error_strings();
    OpenSSL_add_all_algorithms();

    /* Create and setup the connection */
    //bio = BIO_new_connect("web.corp.ema-tech.com:8888");
    printf("___________________________+\n");
    bio = BIO_new_connect("www.yoursite.com:80");
    if (bio == NULL) {
        printf("====___________________________-\n");
        printf("BIO is null\n");
    }
    if (BIO_do_connect(bio) <= 0) {
        printf("+++++___________________________#\n");
        BIO_free_all(bio);
    }
    printf("___________________________#^\n");

    /* Send the request */
    BIO_write(bio, request, strlen(request));
    printf("___________________________0\n");

    /* Read in the response */
    for (;;) {
        ret = BIO_read(bio, resp, 1023);
        printf("----%d\n", ret);
        if (ret <= 0) break;
        resp[ret] = 0;
        printf("%s\n", resp);
    }

    BIO_write(bio, x, sizeof("GET / https://www.yoursite.com/?gws_rd=ssl\r\n\r\n"));
    for (;;) {
        ret = BIO_read(bio, resp, 1023);
        printf("----%d\n", ret);
        if (ret <= 0) break;
        resp[ret] = 0;
        printf("%s\n", resp);
    }

    /* Close the connection and free the context */
    BIO_free_all(bio);
    return 0;
}
If your first request was HTTP (not HTTPS), then the server is essentially telling you to use HTTPS instead of HTTP. Your follow-up request would be
char * request = "GET /?gws_rd=ssl HTTP/1.1\r\n" \
"Host: www.some.com\r\n" ...
The /?gws_rd=ssl is the local resource name (/) and a query string (?gws_rd=ssl) from https://www.some.com/?gws_rd=ssl, while the host name www.some.com goes to the "Host:" header.
Some servers will only accept the connection if you use the TLS server name extension (SNI; in OpenSSL: "SSL_set_tlsext_host_name") and supply the host name there as well.
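For illustration, here is a minimal sketch of an HTTPS request over an SSL BIO with SNI (the host name is a placeholder, certificate verification is omitted and error handling is trimmed, so this is not production code):
/* Sketch: HTTPS GET over an SSL BIO with SNI.
   NOTE: certificate verification is not enabled here; a real client
   should call SSL_CTX_set_verify() and load a trust store. */
#include <stdio.h>
#include <string.h>
#include <openssl/ssl.h>
#include <openssl/bio.h>
#include <openssl/err.h>

int main(void)
{
    SSL_CTX *ctx = SSL_CTX_new(TLS_client_method());
    BIO *bio = BIO_new_ssl_connect(ctx);
    SSL *ssl = NULL;

    BIO_get_ssl(bio, &ssl);
    SSL_set_tlsext_host_name(ssl, "www.some.com");   /* SNI */
    BIO_set_conn_hostname(bio, "www.some.com:443");  /* TCP target */

    if (BIO_do_connect(bio) <= 0) {
        ERR_print_errors_fp(stderr);
        return 1;
    }

    const char *req =
        "GET /?gws_rd=ssl HTTP/1.1\r\n"
        "Host: www.some.com\r\n"
        "Connection: close\r\n\r\n";
    BIO_write(bio, req, (int)strlen(req));

    char buf[1024];
    int n;
    while ((n = BIO_read(bio, buf, sizeof(buf) - 1)) > 0) {
        buf[n] = 0;
        fputs(buf, stdout);
    }

    BIO_free_all(bio);
    SSL_CTX_free(ctx);
    return 0;
}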
You could also think about using an existing C library for an HTTPS client, for example:
libcurl (https://curl.se/libcurl/ - libcurl is one of the most used HTTP/HTTPS client libraries in C)
CivetWeb (https://github.com/civetweb/civetweb/blob/master/docs/api/mg_connect_client_secure.md - actually a server with some additional client functions; disclaimer: I am in the maintainer team of this server).
Both are open source and MIT licensed.
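For comparison, a libcurl sketch that downloads the page and follows the 302 automatically could look roughly like this (the URL is a placeholder and error handling is trimmed):
/* Sketch: fetch a page with libcurl, following redirects automatically. */
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "https://www.some.com/");
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);  /* follow the 302 */
        CURLcode res = curl_easy_perform(curl);              /* body goes to stdout by default */
        if (res != CURLE_OK)
            fprintf(stderr, "curl: %s\n", curl_easy_strerror(res));
        curl_easy_cleanup(curl);
    }
    curl_global_cleanup();
    return 0;
}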
Edit:
Actually I need to know difference between openssl and https
HTTPS is a communication protocol (HyperText Transfer Protocol Secure).
OpenSSL is a crypto library.
The protocol stack of HTTP looks like this:
HTTP:  [HTTP]
       [TCP/IP]
The stack of HTTPS looks like this:
HTTPS: [HTTP]
       [TLS (= SSL)]
       [TCP/IP]
SSL stands for Secure Sockets Layer, and TLS (Transport Layer Security) is the successor of SSL. OpenSSL implements SSL version 2 and 3 (both deprecated) as well as all versions of TLS (1.0, 1.1, 1.2 and 1.3).
OpenSSL can provide the middle part of the HTTPS stack, but you still need the top and bottom part. They are identical to HTTP, so TLS (the protocol) respectively OpenSSL (a library implementing the protocol) is inserted in the middle.
To see this live in action, try to read from www.google.com using the OpenSSL command line:
$ openssl s_client -connect www.google.com:443
The server will provide some information, in particular the server certificate. Then you type:
GET / HTTP/1.1
Host: www.google.com
Connection: close
After the empty line at the bottom, the server will send a header:
HTTP/1.1 200 OK
Date: ..
Server: gws
Connection: close
Followed by an empty line and finally a HTML page.
This OpenSSL command-line client implements the TLS layer and uses the TCP/IP layer from the operating system, but you have to provide the HTTP layer on top: the four lines of text (GET ..., Host ..., Connection ..., and the empty line at the end) are a valid HTTP request.
The full source of s_client can be found here: https://github.com/openssl/openssl/blob/master/apps/s_client.c
The source is lengthy because it provides a hundred different options.
A much smaller client example with more explanation can be found here:
https://wiki.openssl.org/index.php/SSL/TLS_Client
You will find the same four lines for the HTTP protocol in this example:
BIO_puts(web, "GET " HOST_RESOURCE " HTTP/1.1\r\n"
"Host: " HOST_NAME "\r\n"
"Connection: close\r\n\r\n");
In your code you used "Connection: keep-alive". That's perfectly fine if you want to make multiple HTTP requests over the same HTTP connection. Just make sure the last request you make uses "Connection: close". Also be aware that an HTTP server may decide to close the connection at any time by sending a "Connection: close" header.
"Connection: close" is easier to begin with.
If you only want to download a web page, these four lines of code are usually enough - unless you need a login/cookies/access token/... for a specific web site. Additional requests such as POST (e.g., submitting a web form) will require more lines on top of OpenSSL. If you need this, you should consider using an additional library instead of implementing it on your own.
The response of the server needs to be split into header (everything above the first empty line) and body (everything below). Depending on the header, it might be required to interpret the body data differently.
For example, www.google.com will send one header line "Transfer-Encoding: chunked" (instead of "Content-Length: ####"). These are two different ways a server can let the client know how long the body data is supposed to be. If you get a "Content-Length: 1234" header, you know that you have to read 1234 bytes in your HTTP protocol implementation.
If you get a "Transfer-Encoding: chunked" header, the server will first send a hex number followed by "\r\n", then as many bytes as that hex number stated, then another hex number, "\r\n" and more data, and so on. Finally, a hex number of "0" indicates the end of the data. The hex numbers and "\r\n" separators are not part of the HTML page - you need to remove them (if you keep them, you will end up with broken HTML or whatever else you wanted to download).
If a server neither sends "Content-Length:" nor "Transfer-Encoding:" then you need to read until the server closes the connection.
This is also part of the HTTP protocol that has to be implemented on top of OpenSSL for an HTTPS client. You will have to implement all three cases in an HTTP or HTTPS client, unless you need to communicate with only one server and you know it only uses "Content-Length: ####".
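As a small illustration of that header/body split and the Content-Length case (a sketch only: it assumes the whole response is already in one buffer and ignores case-insensitive header names), the parsing could start out like this; chunked decoding and read-until-close would be layered on top:
/* Sketch: locate the end of the HTTP header and honour Content-Length.
   Chunked decoding is not shown. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *buf =
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        "Content-Length: 11\r\n"
        "\r\n"
        "hello world";

    const char *body = strstr(buf, "\r\n\r\n");     /* first empty line */
    if (!body) return 1;
    body += 4;                                      /* body starts right after it */

    long content_length = -1;
    const char *cl = strstr(buf, "Content-Length:");
    if (cl && cl < body)
        content_length = strtol(cl + strlen("Content-Length:"), NULL, 10);

    if (content_length >= 0)
        printf("body (%ld bytes): %.*s\n", content_length, (int)content_length, body);
    else
        printf("no Content-Length: handle chunked encoding or read until close\n");

    return 0;
}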

Slack - How to upload a file with a mode other than snippet through API

I am attempting to upload files to a Slack channel automatically from a server via the API, but I'm running up against the 1MB limit for snippets. Is there a way to use files.upload to post files as either hosted or post?
Here's the curl command I'm testing with:
curl -F file=$FILE_PATH -F channels=$CHANNEL_ID \
-F token=$TOKEN -F filename=$SLACK_FILE_NAME \
$SLACK_ADDRESS -x $PROXY_SERVER
This successfully posts the file to the channel, but the mode is snippet which means it's not a particularly elegant looking interface in slack and subject to the very small 1MB limit.
EDIT:
When posting the file through the API here is the response that I am getting back (with potentially sensitive information sanitized):
{"ok":true,
"file":{"id":ID_NBR,
"created":1529417913,
"timestamp":1529417913,
"name":FILE_NAME,
"title":FILE_TITLE,
"mimetype":"text\/csv",
"filetype":"csv",
"pretty_type":"CSV",
"user":USER_ID,
"editable":true,
"size":74810,
"mode":"snippet", ###### This is where I see the mode snippet through APIs, in channel it appears as a snippet as well ######
"is_external":false,
"external_type":"",
"is_public":true,
"public_url_shared":false,
"display_as_bot":false,"username":"",
"url_private":URL,
"url_private_download":URL,
"permalink":URL,
"permalink_public":URL,
"edit_link":URL,
"preview":FILE_DATA,
"preview_highlight":"<div class=\"CodeMirror cm-s-default CodeMirrorServer\" oncopy=\"if(event.clipboardData){event.clipboardData.setData('text\/plain',window.getSelection().toString().replace(\/\\u200b\/g,''));event.preventDefault();event.stopPropagation();}\">\n<div class=\"CodeMirror-code\">\n<div><pre>FILE_DATA<\/pre><\/div>\n<\/div>\n<\/div>\n",
"lines":202,
"lines_more":201,
"preview_is_truncated":true,
"channels":[CHANNEL_IDS],
"groups":[],
"ims":[],
"comments_count":0}}

Curl doesn't send entire form-data in HTTP POST request

Edit:
Problem 2 and Problem 3 are solved by following @melpomene's comment, i.e., by using the number of bytes read when printing the buffer.
But I'm still stuck on Problem 1.
I have written a TCP server-client program. Later, out of curiosity, I wanted to learn about HTTP servers.
My previous question: Simple TCP server can't output to web browser
Now I'm just looking at what data is transferred to the server and how, using GET and POST (form-data and x-www-form-urlencoded for now).
I'm following How to cURL POST from the Command Line to send POST requests.
When I send x-www-form-urlencoded as:
curl -d "data=example1&data2=example2" localhost:8080
Output on Server:
POST / HTTP/1.1
Host: localhost:8080
User-Agent: curl/7.54.0
Accept: */*
Content-Length: 28
Content-Type: application/x-www-form-urlencoded
data=example1&data2=example2
This is as expected.
Problem: 1
Now comes the problem. When I try to send form-data, the output is not what I expected.
When I send form-data as:
curl -X POST -F "name=user" -F "password=test" localhost:8080
Output on server:
POST / HTTP/1.1
Host: localhost:8080
User-Agent: curl/7.54.0
Accept: */*
Content-Length: 244
Expect: 100-continue
Content-Type: multipart/form-data; boundary=------------------------78b7f8917ad1992c
I'm getting the boundary, but I'm not getting the next part, i.e. the data I'm sending.
Problem: 2
One more odd thing happens when I try to send x-www-form-urlencoded after sending form-data.
When I send x-www-form-urlencoded after form-data as:
curl -d "data=example1&data2=example2" localhost:8080
Output on server:
POST / HTTP/1.1
Host: localhost:8080
User-Agent: curl/7.54.0
Accept: */*
Content-Length: 28
Content-Type: application/x-www-form-urlencoded
data=example1&data2=example2------------78b7f8917ad1992c
Why am I getting boundary here?
Problem: 3
And also while sending GET as:
curl localhost:8080
Output on server:
GET / HTTP/1.1
Host: localhost:8080
User-Agent: curl/7.54.0
Accept: */*
ontent-Length: 28
Content-Type: application/x-www-form-urlencoded
data=example1&data2=example2------------78b7f8917ad1992c
I'm getting Content-Type and x-www-form-urlencoded data along with boundary.
What am I doing wrong?
Is something wrong with my code or with my understanding?
Server.c:
// Server side C program to demonstrate Socket programming
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdlib.h>
#include <netinet/in.h>
#include <string.h>
#define PORT 8080
int main(int argc, char const *argv[])
{
    int server_fd, new_socket; long valread;
    struct sockaddr_in address;
    int addrlen = sizeof(address);
    char buffer[1024] = {0};
    char *hello = "HTTP/1.1 200 OK\nContent-Type: text/plain\nContent-Length: 12\n\nHello world!";

    // Creating socket file descriptor
    if ((server_fd = socket(AF_INET, SOCK_STREAM, 0)) == 0)
    {
        perror("In socket");
        exit(EXIT_FAILURE);
    }

    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons( PORT );

    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0)
    {
        perror("In bind");
        exit(EXIT_FAILURE);
    }
    if (listen(server_fd, 10) < 0)
    {
        perror("In listen");
        exit(EXIT_FAILURE);
    }
    while (1)
    {
        printf("\n+++++++ Waiting for new connection ++++++++\n\n");
        if ((new_socket = accept(server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen)) < 0)
        {
            perror("In accept");
            exit(EXIT_FAILURE);
        }
        valread = read(new_socket, buffer, 1024);
        printf("%s\n", buffer);
        write(new_socket, hello, strlen(hello));
        printf("------------------Hello message sent-------------------\n");
        close(new_socket);
    }
    return 0;
}
Problem 1
I checked the request headers and found the Expect: 100-continue header. This is the first time I have seen this header.
A simple Google search shows this is what is causing the problem.
'Expect: 100-Continue' Issues and Risks (I'm just going to paste everything to avoid a dead link):
How the Expect: 100-Continue Header Works
When Expect: 100-Continue is NOT present, HTTP follows approximately the following flow (from the client's point of view):
1. The request initiates a TCP connection to the server.
2. When the connection to the server is established, the full request--which includes both the request headers and the request body--is transmitted to the server.
3. The client waits for a response from the server (comprised of response headers and a response body).
4. If HTTP keep-alives are supported, the request is optionally repeated from step 2.
When the client is using the Expect: 100-Continue feature, the following events occur:
1. The request initiates a TCP connection to the server.
2. When the connection to the server is established, the request--including the headers, the Expect: 100-Continue header, without the request body--is then transmitted to the server.
3. The client then waits for a response from the server. If the status code is a final status code, using the prior steps above the client retries the request without Expect: 100-Continue header. If the status code is 100-Continue, the request body is sent to the server.
4. The client will then wait for a response from the server (comprised of response headers and a response body).
5. If HTTP keep-alives are supported, the request is optionally repeated from step 2.
Why use Expect: 100-Continue?
API POST requests that include the Expect: 100-Continue header save bandwidth between the client and the server, because the server can reject the API request before the request body is even transmitted. For API POST requests with very large request bodies (such as file uploads), the server can, for example, check for invalid authentication and reject the request before the push body was sent, resulting in significant bandwidth savings.
Without Expect: 100-Continue:
Without the Expect: 100-Continue feature, the entire API request, including the (potentially large) push body would have to be transmitted before the server could even determine if the syntax or authentication is valid. However, since the majority of our API requests have small POST bodies, the benefits of separating the request header from the request body is negligible.
Problems when the request header and body are sent separately
Because of the high volume of requests that Urban Airship handles, many levels of complexity exist between our clients and the servers responsible for responding to API requests. This is not an abnormal phenomenon for most server configurations and strategies, but it does introduce a risk of elevated request failures to any API POST requests using the Expect: 100-Continue header. This is due to the fact that the request header and the request body are sent separately from one another, and must travel through the same connection throughout the entire API server infrastructure.
With the number of proxies, load-balancing servers, and back-end request processing servers that are implemented, requests with the Expect: 100-Continue header have an increased probability of becoming separated from one another, and hence returning with an error.
What To Expect:
We've always attempted to support Expect: 100-Continue. However, we have determined that our customers that use Expect: 100-Continue are receiving a sub-optimal quality of service due to elevated request failures.
Additionally, the majority of our API requests have small POST bodies, and as a result the benefits of separating the request header from the request body are negligible. These reasons have motivated us to disable support for Expect: 100-Continue service-wide.
Our Recommendations:
We recommend against the use of Expect: 100-Continue. If you receive an HTTP Error 417 (Expectation failed), retry the request without Expect: 100-Continue.
So, to prevent curl from sending the Expect: 100-continue header with the multipart form-data POST, include -H 'Expect:' in your curl command:
curl -X POST -F "name=user" -F "password=test" localhost:8080 -H 'Expect:'
Now you can receive your entire request in one go (just like with Postman), as you said in the comments.
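Alternatively (a sketch, not part of the code in the question), the server can honour the header instead of suppressing it on the client: when the request headers contain Expect: 100-continue, reply with an interim 100 Continue response and read again to receive the body:
/* Sketch: server-side handling of Expect: 100-continue.
   new_socket, buffer and valread come from the accept loop shown further below. */
valread = read(new_socket, buffer, sizeof(buffer) - 1);
if (valread > 0) {
    buffer[valread] = 0;
    if (strstr(buffer, "Expect: 100-continue")) {
        const char *go_on = "HTTP/1.1 100 Continue\r\n\r\n";
        write(new_socket, go_on, strlen(go_on));                 /* interim response */
        valread = read(new_socket, buffer, sizeof(buffer) - 1);  /* the body arrives now */
        if (valread > 0)
            buffer[valread] = 0;
    }
}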
Problem 2 & 3
As @melpomene said in the comments, read() doesn't append a '\0' after reading. That's why you are seeing data from previous requests.
So either use valread to limit how much of the buffer you print, or declare the buffer variable inside your while loop, as I said in the comments.
Code:
while (1)
{
    printf("\n+++++++ Waiting for new connection ++++++++\n\n");
    if ((new_socket = accept(server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen)) < 0)
    {
        perror("In accept");
        exit(EXIT_FAILURE);
    }
    char buffer[30000] = {0}; // A fresh, zeroed buffer on every iteration, so there is no need to limit the printed length with valread.
    valread = read(new_socket, buffer, 30000);
    printf("%s\n", buffer);
    write(new_socket, hello, strlen(hello));
    printf("------------------Hello message sent-------------------%ld\n", valread);
    close(new_socket);
}
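If you would rather keep a single buffer outside the loop, an equivalent option (just a sketch) is to print only the bytes that read() actually returned:
/* Print exactly valread bytes instead of relying on a terminating '\0'. */
valread = read(new_socket, buffer, sizeof(buffer));
if (valread > 0)
    printf("%.*s\n", (int)valread, buffer);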

POST JSON to url using CURL and C

I'm trying to POST JSON data to a URL from bash using:
$ curl -v -d '{xxx:200}&apikey=xxxxx' -X POST http://localhost/xxxx/input/post.json -H "Accept: application/json" -H "Content-Type:application/json"
And in C using the following:
#include <curl/curl.h>

int main(void)
{
    CURL *easyhandle;

    curl_global_init(CURL_GLOBAL_ALL);
    easyhandle = curl_easy_init();
    if (easyhandle) {
        char *data = "json={xxx:200}&apikey=xxxxx";
        curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, data);
        curl_easy_setopt(easyhandle, CURLOPT_URL, "http://localhost/xxxx/input/post.json");
        curl_easy_perform(easyhandle);
        curl_easy_cleanup(easyhandle);
    }
    curl_global_cleanup();
    return 0;
}
This is what I'm actually trying to achieve:
http://localhost/xxxx/input/post.json?json={xxx:200}&apikey=xxxxx
It doesn't seem to work. :(
I'm a complete novice to curl. Please help.
Thanks!
Fortunately, the server I was sending the data to handled both POST and GET requests, so the code in the question sufficed.
Others with a similar problem can use a simple workaround (if your code doesn't have real-time constraints and is not performance intensive): you may fork a shell process using system() from C, which saves you the trouble of URL-encoding.
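For example (a sketch of that workaround, reusing the placeholder URL and API key from above), the curl command line can do the URL encoding for you with --data-urlencode:
/* Sketch: hand the encoding problem to the curl command line via system(). */
#include <stdlib.h>

int main(void)
{
    /* --data-urlencode percent-encodes the JSON value; both values are sent as one POST body. */
    system("curl --data-urlencode 'json={\"xxx\":200}' "
           "--data 'apikey=xxxxx' "
           "http://localhost/xxxx/input/post.json");
    return 0;
}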
What you are trying to do is not a POST request but effectively a GET request. However, I'm not sure this is a good idea, since GET parameters are limited in length (to something like 2 kB or so), and - as others have already mentioned - they need to be encoded and decoded, and all that funky stuff is a pain in the neck.
The URL pointing to localhost suggests that you have control over the server code. If you used POST parameters instead of GET ones, you could use your current code as-is (it already sets the POST body of the request, which is probably the right thing to do), so you wouldn't have to change your client code, only the server code.
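If the server really expects a JSON body in a POST, a libcurl sketch with the same headers as the bash command above (URL and payload are placeholders) could look like this:
/* Sketch: POST a JSON body with explicit Accept and Content-Type headers. */
#include <curl/curl.h>

int main(void)
{
    curl_global_init(CURL_GLOBAL_ALL);
    CURL *curl = curl_easy_init();
    if (curl) {
        struct curl_slist *headers = NULL;
        headers = curl_slist_append(headers, "Accept: application/json");
        headers = curl_slist_append(headers, "Content-Type: application/json");

        curl_easy_setopt(curl, CURLOPT_URL, "http://localhost/xxxx/input/post.json");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "{\"xxx\":200}");  /* sent as the POST body */

        curl_easy_perform(curl);

        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
    }
    curl_global_cleanup();
    return 0;
}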

Meaning of libcurl messages and execution process

I am using the libcurl library to fetch an abc-1.tar file from a server. I want to know the meaning of the messages that are displayed and how libcurl comes to display them.
For example: I provide some messages below. Of these I know the basic meanings, like Content-Length being the length of the file that is downloaded, etc.
I want the meaning of all the messages, particularly the messages which start with * (e.g. Connection #0 to host (nil) left intact).
* Re-using existing connection! (#0) with host (nil)
* Connected to (nil) (182.72.67.14) port 65101 (#0)
> GET /...... HTTP/1.1
> Host: 182.72.67.14:65101
> Accept: */*
> Connection:keep-alive
< HTTP/1.1 200 OK
< Cache-Control: private
< Content-Length: 186368
< Content-Type: application/x-tar
< Server: Microsoft-IIS/7.5
< Content-Disposition: attachment; filename=abc-1.tar
< X-AspNet-Version: 4.0.30319
< X-Powered-By: ASP.NET
< Date: Tue, 01 Oct 2013 06:29:00 GMT
<
* Connection #0 to host (nil) left intact
cURL's Man Page specifies three types of "special" verbose output:
A line starting with '>' means "header data" sent by curl, '<' means "header data" received by curl that is hidden in normal cases, and a line starting with '*' means additional info provided by curl.
You can read about HTTP header fields in the official HTTP specification. Any other output lines displayed by cURL belong to the HTTP body carried by the corresponding message.
So what is the actual meaning of these informationals starting with *, you ask? They inform you about the status of the transfer's TCP connection with the host. For instance:
"Connected to (nil) (182.72.67.14) port 65101 (#0)" means that a TCP connection is established with the server side (in your case: 182.72.67.14). The #0 is the TCP session number (which is used only by cURL). The nil indicates that the host name couldn't be resolved via DNS (had it been resolved, the it would've appeared instead of nil).
"Connection #0 to host (nil) left intact" means that although the transfer is over, the TCP session itself is still open (i.e no FIN/ACK exchanges have been made), allowing you to keep reusing the same TCP connection for multiple transfers (which could be useful if you don't want to sacrifice time on opening a new TCP connection).
The message "Re-using existing connection! (#0) with host (nil)" supports that, indicating that cURL does indeed that, riding an existing TCP connection (from a previous transfer).
Lines marked with < are HTTP headers. You can read in detail about HTTP headers and their meanings in the HTTP specification.
Lines marked with * are additional verbose information provided by curl, which is printed on stderr.
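If you want to see where these markers come from in your own libcurl program, a small sketch (the URL is a placeholder) is to enable CURLOPT_VERBOSE and install a debug callback; the curl_infotype value tells you whether a chunk is informational text (*), an outgoing header (>) or an incoming header (<):
/* Sketch: reproduce curl's *, > and < verbose markers from libcurl. */
#include <stdio.h>
#include <curl/curl.h>

static int debug_cb(CURL *handle, curl_infotype type,
                    char *data, size_t size, void *userptr)
{
    (void)handle; (void)userptr;                              /* unused */
    const char *prefix = NULL;
    if (type == CURLINFO_TEXT)            prefix = "* ";      /* additional info */
    else if (type == CURLINFO_HEADER_OUT) prefix = "> ";      /* header data sent by curl */
    else if (type == CURLINFO_HEADER_IN)  prefix = "< ";      /* header data received */
    if (prefix)
        fprintf(stderr, "%s%.*s", prefix, (int)size, data);   /* data is not NUL-terminated */
    return 0;
}

int main(void)
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/abc-1.tar");
        curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
        curl_easy_setopt(curl, CURLOPT_DEBUGFUNCTION, debug_cb);
        curl_easy_perform(curl);
        curl_easy_cleanup(curl);
    }
    curl_global_cleanup();
    return 0;
}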
