Best practice for large site - json-ld

I am working on a large site and want to implement JSON-LD. The site has a large social media following and a lot of artist profiles and articles.
This is what I currently have (the following code is from Google's guidelines):
Front page
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Organization",
  "name": "Organization name",
  "url": "http://www.your-site.com",
  "sameAs": [
    "http://www.facebook.com/your-profile",
    "http://instagram.com/yourProfile",
    "http://www.linkedin.com/in/yourprofile",
    "http://plus.google.com/your_profile"
  ]
}
</script>
Content pages
<script type='application/ld+json'>
{
  "@context": "http://www.schema.org",
  "@type": "WebSite",
  "name": "About us",
  "url": "http://www.your-site.com/about-us"
}
</script>
Profile pages of each artist:
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "NewsArticle",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://google.com/article"
  },
  "headline": "Article headline",
  "image": [
    "https://example.com/photos/1x1/photo.jpg",
    "https://example.com/photos/4x3/photo.jpg",
    "https://example.com/photos/16x9/photo.jpg"
  ],
  "datePublished": "2015-02-05T08:00:00+08:00",
  "dateModified": "2015-02-05T09:20:00+08:00",
  "author": {
    "@type": "Person",
    "name": "John Doe"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Google",
    "logo": {
      "@type": "ImageObject",
      "url": "https://google.com/logo.jpg"
    }
  },
  "description": "A most wonderful article"
}
</script>
Do I add one script tag per page, or do I put all the JSON-LD under one script tag? On the front page I have the "Organization" type with the social media links. Do I add this on all pages?

You may have multiple JSON-LD script data blocks on a page, but using a single script element makes it easier to connect the structured data entities: you can nest entities instead of having to reference their URIs.
What to connect? Your NewsArticle can
provide the WebPage¹ entity as value for the mainEntityOfPage property, and
provide the Organization entity as value for the publisher property.
This is only one possibility. Another one: You could provide the WebPage entity as top-level item and provide the NewsArticle entity as value for the mainEntity property.
If you have to duplicate data (for example, because the Organization is author and publisher, or because it’s the publisher of both the WebPage and the NewsArticle), you can mix nesting and referencing. Give each entity an @id and wherever you provide this entity as value, also provide its @id.
¹ You are using WebSite, but you probably mean WebPage. Also note that the @context should be http://schema.org, not http://www.schema.org.
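As a rough illustration of mixing nesting and referencing, the sketch below defines the Organization once with an @id, nests it in full as the article's publisher, and points to it by @id only from the WebPage. The `/#organization` identifier, the article URL, and the helper names `buildArticleJsonLd` and `toScriptTag` are assumptions made for this example, not something the question or Google's guidelines prescribe.

```javascript
// Hypothetical URLs and names; replace them with your site's real values.
const SITE = 'http://www.your-site.com';

// The Organization is defined once, with an @id other entities can point to.
const organization = {
  '@type': 'Organization',
  '@id': `${SITE}/#organization`,
  name: 'Organization name',
  url: SITE,
};

// Builds the JSON-LD for one article page. The publisher is nested in full,
// while the WebPage refers to the same Organization by @id only.
function buildArticleJsonLd(pageUrl, headline) {
  return {
    '@context': 'http://schema.org',
    '@type': 'NewsArticle',
    headline,
    publisher: organization,
    mainEntityOfPage: {
      '@type': 'WebPage',
      '@id': pageUrl,
      publisher: { '@id': organization['@id'] },
    },
  };
}

// Serializes the data into a single script element for the page.
function toScriptTag(data) {
  return `<script type="application/ld+json">\n${JSON.stringify(data, null, 2)}\n</script>`;
}

const articleJsonLd = buildArticleJsonLd(`${SITE}/articles/example`, 'Article headline');
console.log(toScriptTag(articleJsonLd));
```

Because both occurrences carry the same @id, a consumer can treat them as one entity even though the full description appears only once.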

Related

HTML Email sent with a Google's Event Schema is not adding event to Google Calendar

I am sending an HTML email to my customers after they book a reservation on my restaurant's website. I wanted the email to automatically add an event to the customer's calendar, so I added the following script after the opening HTML body tag to test it (https://developers.google.com/gmail/markup/reference/restaurant-reservation):
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "FoodEstablishmentReservation",
  "reservationNumber": "OT12345",
  "reservationStatus": "http://schema.org/Confirmed",
  "underName": {
    "@type": "Person",
    "name": "John Smith"
  },
  "reservationFor": {
    "@type": "FoodEstablishment",
    "name": "Wagamama",
    "address": {
      "@type": "PostalAddress",
      "streetAddress": "1 Tavistock Street",
      "addressLocality": "London",
      "addressRegion": "Greater London",
      "postalCode": "WC2E 7PG",
      "addressCountry": "United Kingdom"
    }
  },
  "startTime": "2022-10-01T08:00:00+00:00",
  "partySize": "2"
}
</script>
But the event is not being added to the calendar...
To test this, I am using Gmail to send emails to myself (from x@gmail.com to x@gmail.com) (skip Google authentication). I am using the Restaurant reservation schema (supported schemas).
When I inspect the received email's HTML, that script is gone. I do see it if I check the original message (the "Show original" option in Gmail). Is this normal? Did Google/Gmail already "take care of it" before displaying the HTML, or should it still be there for Google/Gmail to be able to "recognize it"? I want to at least understand at which step the problem is happening.
Any ideas?

Need JSON-LD Structured Data Example for Multiple Courses

I have referred to the following example given by Google:
<html>
  <head>
    <title>Introduction to Computer Science and Programming</title>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Course",
      "name": "Introduction to Computer Science and Programming",
      "description": "Introductory CS course laying out the basics.",
      "provider": {
        "@type": "Organization",
        "name": "University of Technology - Eureka",
        "sameAs": "http://www.ut-eureka.edu"
      }
    }
    </script>
  </head>
  <body>
  </body>
</html>
But I have a page with a list of soft skill courses. Google also mentions something called ItemList, but gives no example of how to combine it with Course. How can I specify structured data for multiple Courses in JSON-LD? Thanks!

I had the same issue. It is enough to open one more script like this:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Course",
  "name": "Introduction to Computer Science and Programming",
  "description": "Introductory CS course laying out the basics.",
  "provider": {
    "@type": "Organization",
    "name": "University of Technology - Eureka",
    "sameAs": "http://www.ut-eureka.edu"
  }
}
</script>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Course",
  "name": "Introduction to Computer Science and Programming",
  "description": "Introductory CS course laying out the basics.",
  "provider": {
    "@type": "Organization",
    "name": "University of Technology - Eureka",
    "sameAs": "http://www.ut-eureka.edu"
  }
}
</script>
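If the course list comes from data rather than hand-written markup, the "one script per course" approach above could be generated along these lines. This is only a sketch: the two course entries and the helper names `courseToJsonLd` and `renderCourseScripts` are made up for illustration, and the provider values are copied from the example above.

```javascript
// Hypothetical course data; in practice this would come from your CMS or database.
const courses = [
  { name: 'Effective Communication', description: 'Basics of clear spoken and written communication.' },
  { name: 'Time Management', description: 'Techniques for planning and prioritizing work.' },
];

// Provider values are taken from the example above.
const provider = {
  '@type': 'Organization',
  name: 'University of Technology - Eureka',
  sameAs: 'http://www.ut-eureka.edu',
};

// Maps one plain course record to a Course entity.
function courseToJsonLd(course) {
  return {
    '@context': 'https://schema.org',
    '@type': 'Course',
    name: course.name,
    description: course.description,
    provider,
  };
}

// Renders one JSON-LD script element per course, joined by newlines.
function renderCourseScripts(list) {
  return list
    .map((course) => JSON.stringify(courseToJsonLd(course), null, 2))
    .map((json) => `<script type="application/ld+json">\n${json}\n</script>`)
    .join('\n');
}

console.log(renderCourseScripts(courses));
```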

What JSON-LD structured data to use for a multi-paragraph, multi-image blogpost?

I have created the following JSON-LD for a blogpost in my blog:
{
  "@context": "http://schema.org",
  "@type": "BlogPosting",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://www.example.com"
  },
  "headline": "My Headline",
  "articleBody": "blablabla",
  "articleSection": "bla",
  "description": "Article description",
  "inLanguage": "en",
  "image": "https://www.example.com/myimage.jpg",
  "dateCreated": "2019-01-01T08:00:00+08:00",
  "datePublished": "2019-01-01T08:00:00+08:00",
  "dateModified": "2019-01-01T08:00:00+08:00",
  "author": {
    "@type": "Organization",
    "name": "My Organization",
    "logo": {
      "@type": "ImageObject",
      "url": "https://www.example.com/logo.jpg"
    }
  },
  "publisher": {
    "@type": "Organization",
    "name": "My Organization",
    "logo": {
      "@type": "ImageObject",
      "url": "https://www.example.com/mylogo.jpg"
    }
  }
}
Now, I have some blog posts that contain multiple paragraphs, and each paragraph is accompanied by an image. Any ideas on how I can depict such a structure with JSON-LD?
Background
I have created a simple blog which uses a JSON file for two purposes: (a) to feed the blog with posts instead of using a DB (by using XMLHttpRequest and JSON.parse) and (b) to add JSON-LD structured data to the code for SEO purposes.
When I read the JSON file I have to know which image belongs to which paragraph of the text in order to display it correctly.

Note: As you seem to need this only for internal purposes, and as there is typically no need to publicly provide data about this kind of structure, I think it would be best not to provide public Schema.org data about it. So you could, for example, use it to build the page, and then remove it again (or whatever works for your case). Then it would also be possible to use a custom vocabulary (under your own domain) for this, if it better fits your needs.
You could use the hasPart property to add a WebPageElement for each paragraph+image block.
Each WebPageElement can have text and image (and, again, hasPart, if you need to nest them).
Note that JSON-LD arrays are unordered by default. You can use @list to make it ordered.
"hasPart": {
  "@list": [
    {
      "@type": "WebPageElement",
      "text": "plain text",
      "image": "image-1.png"
    },
    {
      "@type": "WebPageElement",
      "text": "plain text",
      "image": "image-2.png"
    }
  ]
}
For the blog posting’s header/footer, you could use the more specific WPHeader/WPFooter instead of WebPageElement.
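Since the blog is already built from a JSON file, the ordered hasPart structure could be derived from the same data. The sketch below assumes a hypothetical post shape in which each entry of `blocks` pairs one paragraph with its image; the field names `blocks`, `text`, and `image` and the function `buildBlogPosting` are illustrative, not part of Schema.org or of the question's file format.

```javascript
// Hypothetical post data: each block pairs one paragraph with its image.
const post = {
  headline: 'My Headline',
  blocks: [
    { text: 'First paragraph.', image: 'image-1.png' },
    { text: 'Second paragraph.', image: 'image-2.png' },
  ],
};

// Builds a BlogPosting whose hasPart is an ordered @list of WebPageElement
// items, so the paragraph/image pairing and the block order are both kept.
function buildBlogPosting(source) {
  return {
    '@context': 'http://schema.org',
    '@type': 'BlogPosting',
    headline: source.headline,
    hasPart: {
      '@list': source.blocks.map((block) => ({
        '@type': 'WebPageElement',
        text: block.text,
        image: block.image,
      })),
    },
  };
}

const blogPosting = buildBlogPosting(post);
console.log(JSON.stringify(blogPosting, null, 2));
```

The page renderer can walk the same `blocks` array, so the pairing of text and image is defined in one place.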

Annotating nested structures/values in JSON-LD

Say I have a JSON object with some properties in a nested object.
{
"title": "My Blog Post",
"meta": {
"publishedAt": "2016-08-01T00:00:00Z"
}
}
Is there an easy way I can just add a @context to my top-level object to reach
these properties (i.e. just "pass through" the meta object)? Something along
these lines:
{
  "@context": {
    "title": "schema:name",
    "meta.publishedAt": {
      "@type": "xsd:date",
      "@id": "schema:datePublished"
    }
  },
  "@id": "/my-article",
  "title": "My Blog Post",
  "meta": {
    "publishedAt": "2016-08-01T00:00:00Z"
  }
}
I would like to avoid having to add (duplicate) @id to the nested object, which is how I would otherwise have solved it:
{
  "@context": {
    "title": "schema:name",
    "meta": { "@id": "_:meta", "@container": "@set" },
    "publishedAt": {
      "@type": "xsd:date",
      "@id": "schema:datePublished"
    }
  },
  "@id": "/my-article",
  "title": "My Blog Post",
  "meta": {
    "@id": "/my-article",
    "publishedAt": "2016-08-01T00:00:00Z"
  }
}
This solution works, but requires duplication, and comes from ethanresnick's
comments on GitHub about annotating JSON API. He noted in another issue that @context is not "quite expressive enough to annotate the JSON API structure". I was hoping to prove him wrong at least with regard to this issue.

I just discovered that the latest JSON-LD spec includes a new section on nested properties. Defining your context like this should result in the desired output:
{
  "@context": {
    "title": "schema:name",
    "meta": "@nest",
    "publishedAt": {
      "@type": "xsd:date",
      "@id": "schema:datePublished",
      "@nest": "meta"
    }
  },
  ...
}

If what you're trying to do is eat the meta element, then no, this can't be done in JSON-LD.
There have been discussions about doing an inverse-index that could do something like this, but I don't see an issue. You might create one at https://github.com/json-ld/json-ld.org/issues. At some point the CG, or a newly formed WG will start looking at feature requests for a new version.
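If the processor in use has no support for nested properties, one possible workaround (not a JSON-LD feature) is to hoist the nested properties to the top level before attaching the context. The sketch below assumes this preprocessing step is acceptable for the application; `hoistNested` is a hypothetical helper, and the `schema` and `xsd` prefix definitions are added here so the compact IRIs from the question resolve.

```javascript
// Returns a copy of `doc` with the properties of doc[key] moved to the top
// level and the `key` wrapper itself removed.
function hoistNested(doc, key) {
  const { [key]: nested = {}, ...rest } = doc;
  return { ...rest, ...nested };
}

// The plain JSON object from the question.
const apiResponse = {
  title: 'My Blog Post',
  meta: { publishedAt: '2016-08-01T00:00:00Z' },
};

// Context terms follow the question; the two prefixes are assumptions.
const context = {
  schema: 'http://schema.org/',
  xsd: 'http://www.w3.org/2001/XMLSchema#',
  title: 'schema:name',
  publishedAt: { '@id': 'schema:datePublished', '@type': 'xsd:date' },
};

const annotated = {
  '@context': context,
  '@id': '/my-article',
  ...hoistNested(apiResponse, 'meta'),
};

console.log(JSON.stringify(annotated, null, 2));
```

Note that a hoisted property overwrites a top-level property of the same name, so this only fits documents where the nested keys are unique.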

Is there an open-source version of Facebook's Linter?

When you post a link to Facebook, it grabs the article title, description and relevant images. Most major sites have the required OG tags, making it easy to grab this info, but FB is also able to handle websites that don't have them (you can try it here).
Clearly they've got a system in place for grabbing this info in the absence of OG tags. Does anyone know if there's an open-source version?
I'm thinking it would need (in order of preference for each section):
Title:
Check for og:title tag.
Check for regular meta "title" tag.
Check for h1 tag.
Description:
Check for og:description tag.
Check for regular meta "description" tag.
Check for div or p tags with sufficient content to indicate a body paragraph
Images:
Check for og:image tags
Check for images over a certain size (say 100x100) and give priority to those that come first.
Thanks a lot!
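The order of preference listed above can be sketched as a chain of fallbacks. This is a deliberately naive illustration and not how Facebook does it: it uses regular expressions instead of an HTML parser, assumes the `property`/`name` attribute appears before `content`, and will miss values that contain quote characters. The function names `firstMatch`, `extractTitle`, and `extractDescription` are made up for this example.

```javascript
// Returns the first non-empty capture group among the patterns, or null.
function firstMatch(html, patterns) {
  for (const pattern of patterns) {
    const match = pattern.exec(html);
    if (match && match[1].trim()) return match[1].trim();
  }
  return null;
}

// Title: og:title, then the <title> element, then the first <h1>.
function extractTitle(html) {
  return firstMatch(html, [
    /<meta[^>]+property=["']og:title["'][^>]+content=["']([^"']*)["']/i,
    /<title[^>]*>([^<]*)<\/title>/i,
    /<h1[^>]*>([^<]*)<\/h1>/i,
  ]);
}

// Description: og:description, then the regular meta description.
function extractDescription(html) {
  return firstMatch(html, [
    /<meta[^>]+property=["']og:description["'][^>]+content=["']([^"']*)["']/i,
    /<meta[^>]+name=["']description["'][^>]+content=["']([^"']*)["']/i,
  ]);
}

const sampleHtml =
  '<html><head><title>Page Title</title>' +
  '<meta name="description" content="Plain description"></head>' +
  '<body><h1>Heading</h1></body></html>';

console.log(extractTitle(sampleHtml), '|', extractDescription(sampleHtml));
```

A real implementation would use a proper HTML parser and add the body-paragraph and image-size heuristics from the list.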

https://github.com/Anonyfox/node-htmlcarve
The htmlcarve module for Node.js does most of what you're after. Here's the output generated from this page:
const htmlcarve = require('htmlcarve');
// Fetch the page and print the metadata htmlcarve extracted from it.
htmlcarve.fromUrl('https://scotch.io/tutorials/using-mongoosejs-in-node-js-and-mongodb-applications', function (error, data) {
  if (error) return console.error(error);
  console.log(JSON.stringify(data, null, 2));
});
This produces:
{
"source": {
"html_meta": {
"title": "Easily Develop Node.js and MongoDB Apps with Mongoose ♥ Scotch",
"summary": "",
"image": "/wp-content/themes/thirty/img/scotch-logo.png",
"language": "en-US",
"feed": "https://scotch.io/feed",
"favicon": "https://scotch.io/wp-content/themes/thirty/img/icons/favicon-57.png",
"author": "Chris Sevilleja"
},
"open_graph": {
"title": "Easily Develop Node.js and MongoDB Apps with Mongoose",
"summary": "",
"image": "https://scotch.io/wp-content/uploads/2014/11/mongoosejs-node-mongodb-applications.png"
},
"twitter_card": {
"title": "Easily Develop Node.js and MongoDB Apps with Mongoose",
"summary": "",
"author": "sevilayha"
}
},
"result": {
"title": "Easily Develop Node.js and MongoDB Apps with Mongoose",
"summary": "",
"image": "https://scotch.io/wp-content/uploads/2014/11/mongoosejs-node-mongodb-applications.png",
"author": "sevilayha",
"language": "en-US",
"feed": "https://scotch.io/feed",
"favicon": "https://scotch.io/wp-content/themes/thirty/img/icons/favicon-57.png"
},
"links": {
"deep": "https://scotch.io/tutorials/using-mongoosejs-in-node-js-and-mongodb-applications",
"shallow": "https://scotch.io/tutorials/using-mongoosejs-in-node-js-and-mongodb-applications",
"base": "https://scotch.io"
}
}
If you've got Node.js installed, then install it using
npm i -g htmlcarve
and you can run it from the command line directly.