Translate PDF file using Google Translate API - file

I want to use Google Translate in my project. I have completed all the formalities with Google and have an API key. With this key I can easily translate any word with JavaScript. But how can I translate a PDF file, as the Google Translate site does? I found this:
http://translate.google.com/translate?hl=fr&sl=auto&tl=en&u=http://www.example.com/PDF.pdf
But here I cannot use my key, and it takes a long time to translate. So I want to use my key to translate a PDF file.
My approach is like this:
1. I have one HTML page.
2. It has a browse button for the PDF.
3. The file is uploaded.
4. The PDF is translated with the Google API and shown in the HTML page.
I searched for a way to translate PDFs like this but did not find anything. Please help me out.

TL;DR: Use a headless browser to render a PDF from Google's PDF translation service.
PDF is a complex format, and text can live in many of its components. I will describe solutions from the easiest to the more advanced.
Translate raw text
If you only need the translation without the visual output, you can extract the text and give it to Google Translate.
Since you did not provide information on your project (language, environment, ...), I will refer you to this thread on how to extract text.
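Once the text is extracted, sending it to the API could look roughly like the sketch below. It assumes the Cloud Translation v2 REST endpoint with an API key; the helper names and the idea of splitting on blank lines are illustrative choices, not something taken from the question.

```javascript
// Sketch: prepare extracted PDF text for the Cloud Translation v2 REST endpoint.
// The helper names and the chunking strategy are assumptions for illustration.
const ENDPOINT = 'https://translation.googleapis.com/language/translate/v2';

// Group paragraphs (separated by blank lines) into chunks of at most maxChars.
// A single paragraph longer than maxChars is kept whole rather than cut mid-sentence.
function chunkText(text, maxChars) {
  const chunks = [];
  let current = '';
  for (const paragraph of text.split(/\n{2,}/)) {
    const candidate = current ? current + '\n\n' + paragraph : paragraph;
    if (candidate.length > maxChars && current) {
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Build the URL and fetch options for one chunk; nothing is sent here.
function buildTranslateRequest(chunk, target, apiKey) {
  return {
    url: ENDPOINT + '?key=' + encodeURIComponent(apiKey),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ q: chunk, target: target, format: 'text' }),
    },
  };
}
```

Each request can then be sent with fetch(request.url, request.options); the translated string should come back under data.translations[0].translatedText in the JSON response.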
Translate all text
If you need the text from everything in your PDF, that is pretty hard. To (partially) avoid the headache you can convert the PDF to an image (using ImageMagick tools or similar), and then you have three options:
1. OCR the text from the image, then give it to Google. Again, you are losing the original layout.
2. OCR the text while saving its position (some libraries can do that; again, since you did not specify your project information, see these links: #1, #2, #3, #4). Then translate it with the Google API and write the result onto the image. For good results you need to account for the text font, color, and background color. Pretty difficult, but feasible.
3. Translate the image using Google Translate's image service. Unfortunately this feature is not available in the public API, so short of some reverse engineering, this is not possible.
Translate using Google's PDF translation service
The solution you found, using the translate site, can be automated quite easily. The reason it is slow is that it is a heavy process, and you probably won't beat Google.
Using a headless browser, you can load the translation page with your PDF, observe that the translated content sits in an iframe, open that iframe, and finally print it to PDF.
Here is a short example using SlimerJS (it should be compatible with PhantomJS):
    var page = require("webpage").create();
    // here you may want to set up page size and options

    // get the translation page
    page.open('https://translate.google.fr/translate?hl=fr&sl=en&u=http://example.com/pdf-sample.pdf', function(status) {
        if (status !== 'success') {
            console.log('Unable to access network');
            phantom.exit(1); // exit here, otherwise the script never terminates
            return;
        }
        // find the iframe holding the translated content
        var iframe_src = page.evaluate(function() {
            var iframe = document.querySelector('#contentframe iframe');
            return iframe ? iframe.src : null;
        });
        if (!iframe_src) {
            console.log('Translation iframe not found');
            phantom.exit(1);
            return;
        }
        console.log('Found iframe: ' + iframe_src);
        // open the iframe on its own so it can be rendered
        page.open(iframe_src, function(status) {
            if (status !== 'success') {
                console.log('Unable to load the translated content');
                phantom.exit(1);
                return;
            }
            // wait a bit for JavaScript to translate
            // this can be optimized to be triggered in JavaScript when translation is done
            setTimeout(function() {
                // print the page into a PDF
                page.render('/tmp/test.pdf', { format: 'pdf' });
                phantom.exit(0);
            }, 2000);
        });
    });
Given this file: http://www.cbu.edu.zm/downloads/pdf-sample.pdf
It produces this result (translated into French). I posted a screenshot since I cannot embed a PDF ;)
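The page URL passed to page.open can be assembled from its parts, which also takes care of encoding the PDF address. This is a minimal sketch: the parameter names (sl, tl, u) are the ones visible in the URLs quoted in this thread, the function name is made up, and whether the service accepts exactly this combination is an assumption.

```javascript
// Sketch: build the translate-site URL for a given PDF.
// sl = source language, tl = target language, u = address of the document.
// buildTranslatePageUrl is a hypothetical helper, not part of any library.
function buildTranslatePageUrl(pdfUrl, sourceLang, targetLang) {
  var params = [
    'sl=' + encodeURIComponent(sourceLang),
    'tl=' + encodeURIComponent(targetLang),
    'u=' + encodeURIComponent(pdfUrl)
  ];
  return 'https://translate.google.com/translate?' + params.join('&');
}
```

Encoding the u parameter matters as soon as the PDF address carries its own query string, since an unencoded & would otherwise be read as a parameter of the translate page.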

Use Apache Tika to extract the text content of the PDF file (you will have to write the necessary Java code), then use whatever API you like to translate it. But, as has been mentioned above, Google Translate is a paid service.

Related

How do you use Axios to create a preview of a page in React?

I have never used Axios before and I am new to the Fetch API. Could I have some advice on how to display a URL preview within a website?
The closest I have to the code I would like is below.
In this non-working code, I used the Fetch API to try to set the coverImageUrl state to an image of the link that will be clicked (/boards), but it does not work.
    useEffect(() => {
        setDigit({
            id: mapper.length + 1,
            text: JSON.stringify(myRef.current.innerHTML)
        });
        fetch("/boards")
            // vvvv
            .then(response => response.blob())
            .then(images => {
                setCoverImageUrl(URL.createObjectURL(images));
                console.log(coverImageUrl)
            })
    }, [bool])
It sounds like you want a screenshot of a webpage.
response.blob() will not give you that. It is simply a Blob representation of whatever comes over the wire. In the case of HTML, converting HTML text to a Blob does not create an image of any kind. Not only that, but most webpages have additional content that a browser will request to properly render the page (css/images/APIs). A single fetch to the page URL will get you none of that content, unless it is all inlined. The fetch API couldn't assemble anything resembling your webpage even if it did make all those requests.
There are link previewer libraries out there (search Google/GitHub/npm or check out this link), but those just give some metadata and an image taken from the page. I am sure you have run into these link previews in the wild.
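To give an idea of what such a previewer works with, here is a minimal sketch that pulls Open Graph metadata (og:title, og:image, and so on) out of an HTML string. It assumes the page declares its metadata with <meta property="og:..."> tags; the function name is made up, and a real library handles many more cases than this regex-based version.

```javascript
// Sketch: collect Open Graph properties from raw HTML.
// extractPreview is a hypothetical helper; it only understands
// <meta property="og:..." content="..."> tags with quoted attributes.
function extractPreview(html) {
  const preview = {};
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const property = /\bproperty=(["'])og:(.*?)\1/i.exec(tag);
    const content = /\bcontent=(["'])(.*?)\1/i.exec(tag);
    if (property && content) {
      preview[property[2]] = content[2];
    }
  }
  return preview;
}
```

With the HTML text from response.text() instead of response.blob(), the image field of the result (when present) is a URL that can go straight into setCoverImageUrl. For pages on another origin the fetch usually has to happen on a server, because the browser will block it unless that site allows cross-origin requests.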
If you must show a screenshot of a webpage, you can try something like Puppeteer, but you will need a backend server for that, and the latency will be nowhere near a real-time user experience.
Edit: if the content is mostly static (changes infrequently) and you only need previews of local links, you can have your build server generate these images using something like Puppeteer and save them as assets. Depending on your needs, that may be an option.

Serving dynamically generated files from GitHub Pages

I'm a few months into web development so I apologize if I misunderstood anything.
What I did
I created a react-random-shapes package that would draw out random shapes as a React component. You can see an example here on my site or in the project page. Each time you refresh the page you'd get a new image. (Note: these pages use React.)
What I want to do next
The result I'm aiming for is an API (GET-only) on GitHub Pages that returns the dynamically generated SVG file, so you can do something like
<img src="https://github.com/artt/react-random-shapes/blob?size=300&fill=red">
which would return a random blob for anyone interested in using it. Alternatively, this API could return the SVG path so the user could do whatever they want with it (e.g. animation).
The problem I have
Right now I know how to output an HTML page with the SVG file, but I'm not quite sure how to return just the SVG (or JSON, etc.) part of it.
Thanks!
I am trying to do the same thing. GitHub Pages only serves static files, so it cannot generate a response per request. I think your best bet would be to run a web server on another platform such as Heroku or, another good option, Replit.
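On such a platform, the endpoint could be as small as the sketch below. It is not the react-random-shapes algorithm: the shape is a plain polygon with randomized radii, and the function name, defaults, and port are all assumptions made for the example. Only Node's built-in http module is used.

```javascript
// Sketch: a GET endpoint that returns a random SVG shape.
// randomBlobSvg is a stand-in for the package's real shape generator.
const http = require('http');

// Escape characters that would break out of an XML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Eight points around the center, each at a random distance from it.
function randomBlobSvg(size, fill, random) {
  random = random || Math.random;
  const center = size / 2;
  const count = 8;
  const points = [];
  for (let i = 0; i < count; i++) {
    const angle = (2 * Math.PI * i) / count;
    const radius = center * (0.6 + 0.4 * random());
    points.push(
      (center + radius * Math.cos(angle)).toFixed(1) + ',' +
      (center + radius * Math.sin(angle)).toFixed(1)
    );
  }
  return '<svg xmlns="http://www.w3.org/2000/svg" width="' + size + '" height="' + size + '">' +
    '<polygon points="' + points.join(' ') + '" fill="' + escapeAttr(fill) + '"/></svg>';
}

const server = http.createServer(function (req, res) {
  const url = new URL(req.url, 'http://localhost');
  const size = Number(url.searchParams.get('size')) || 300;
  const fill = url.searchParams.get('fill') || 'black';
  // The content type is what lets an <img> tag treat the response as an image.
  res.writeHead(200, { 'Content-Type': 'image/svg+xml' });
  res.end(randomBlobSvg(size, fill));
});

// server.listen(3000); // uncomment to serve on port 3000
```

An <img src="https://your-host/blob?size=300&fill=red"> would then receive a fresh shape on every request. Returning JSON instead only requires a different Content-Type and body.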

How do I save rich text editor data to DRF - postgresql database and display in React

I want to develop a blog application with a Django backend and a React frontend. I will be using PostgreSQL.
I want to use a rich text editor like Quill to write the blog articles. My questions:
1. I heard that an article written in a rich text editor needs to be converted to HTML before being saved in the database. If so, how do I do this in Django Rest Framework?
2. How do I present the article from the database in the frontend with the same style and formatting?
3. Say I include multiple photos in the article. How do I save all the photos in the database, i.e. what should the schema be?
I want to have my doubts clear before I jump in.
I'm also doing the same thing at the moment. To answer your questions:
In DRF, the simplest way to store the data is a TextField in your model. The rich text (with its HTML tags) will be stored in Postgres. In the admin page or the DRF API you'll see something like this:
Then, to render it again in the frontend, you can use any HTML parser library. For example, I'm using "react-html-parser", which simply converts the rich text into the defined styling.
As for images, this is a bit tricky, and I haven't done this part myself, but what I can think of right now is to create another model and endpoints to store the images.
When sending the POST request to Django, you would convert the base file path/URL from the frontend one to the backend one. Example:
original > http://localhost:3000/image/efewf23r.jpg
new (django) > http://localhost:8000/media/img/img_model/efewf23r.jpg
Then do a second POST request for the image itself, and make sure Django renames the file to match what we set above.
Let me know if you found a better solution.
It's been a long time since I posted this question. Since then, I have gained enough working knowledge to make Quill.js (the rich text editor I'm using) work with React.js or, in my case, Next.js. So this answer focuses on Quill.js only. The Quill npm package specific to React is react-quill. I am presenting it in as beginner-friendly a way as possible.
react-quill provides a built-in function on the editor object: editor.getHTML(). editor is the current editor instance, where one types the content. This method returns the innerHTML of the content prepared in the editor (with plain Quill, quill.root.innerHTML gives the same markup).
To save it to the database, simply POST it to your backend. But you must sanitize this innerHTML before passing it to the database. I can't say about the server side, but I had to do this sanitization on the client side. One good package is DOMPurify. You need to save this HTML to the database if you want to present the content exactly as it was typed in the browser.
The first point also provides the solution to my 2nd question. But one important point: the content one writes in the Quill editor is also available in a JSON-like format called quill-delta. You can get the delta with the function editor.getContents(). You need to POST it to the database as well if you want to edit the content at a later time.
To edit, you need to get this delta from the database and then initialize the Quill editor with this delta in the value attribute.
For example, the text in orange is the delta representation of the text in the editor:
codepen source.
There is another function editor.getText() which extracts all the text from the editor.
Photos. Generally in Quill, you simply put the photo in the editor and Quill stores it in the delta as a base64-encoded data URL. It's this easy. You don't need to worry about separate image fields.
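To make the delta format concrete, here is a small sketch that works on a delta as returned by editor.getContents(): it lists the embedded images and bundles the HTML and the delta into one POST body. The payload field names (html, delta, imageCount) are hypothetical and would have to match whatever the DRF serializer expects.

```javascript
// Sketch: inspect a Quill delta and prepare a payload for the backend.
// A delta is { ops: [...] }; text is inserted as strings and images as
// { insert: { image: 'data:image/...;base64,...' } }.
function extractImages(delta) {
  return delta.ops
    .map(function (op) { return op.insert; })
    .filter(function (insert) {
      return insert !== null && typeof insert === 'object' && typeof insert.image === 'string';
    })
    .map(function (insert) { return insert.image; });
}

// Field names below are assumptions, not a DRF convention.
function buildArticlePayload(html, delta) {
  return {
    html: html,                    // sanitized HTML used for display
    delta: JSON.stringify(delta),  // kept so the article can be edited later
    imageCount: extractImages(delta).length
  };
}
```

Because every photo travels inside the delta as base64, articles with many large images make for large rows; extractImages is the kind of hook one would use to upload the images separately if that becomes a problem.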

LinkedIn share links to PDF documents

I am trying to create buttons on a web page that allow users to share links to PDF documents on LinkedIn. LinkedIn loads a window without any errors but offers no link or preview of the PDF or any indication of what is being shared.
Here are the two methods I have tried. First the plugin method.
<script type="in/share" data-url="http://example.net/DocumentDownload.aspx?Command=Core_Download&entryID=114"></script>
And, secondly with a custom url.
TEST
Encoding the url makes no difference.
The above links are direct document links from a DNN web site using Document Exchange. If I change the urls to any html page it works fine and LinkedIn seems to be able to extract the useful information right from the page and use that for the share details.
Can LinkedIn handle this kind of thing? There is nothing to guide me on the type of links that can be shared. I can't find any information about it. There are no errors in the web console.
Not sure, but you should try giving LinkedIn a link that has .pdf at the end, like http://example.com/documents/file1.pdf. I guess LinkedIn just checks whether the URL ends in .pdf to decide if it is a PDF document or not.
I have no problem sharing PDFs on LinkedIn. Check it out...
https://www.linkedin.com/sharing/share-offsite/?url=https://www.revoltlib.com/anarchism/the-conquest-of-bread/view.pdf
This works perfectly fine. Note that view.pdf is a script, not a file, so LinkedIn is not looking for a PDF file to analyze so much as for headers indicating that a PDF is available. So the script behind the link (DocumentDownload.aspx in your case) needs to send a PDF content type. In PHP, we would do:
header('Content-type: application/pdf; charset=utf-8');
This header lets the sharing app know that it can analyze the document as a PDF file and extract useful information from it, as you can see from the screenshot.
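For the custom button, the share link itself can be generated from the document URL. A minimal sketch, using the share-offsite address shown above; the function name is made up:

```javascript
// Sketch: build a LinkedIn share link for any document URL.
// Encoding keeps the document's own query string (Command, entryID)
// from being read as parameters of the LinkedIn page.
function buildLinkedInShareUrl(documentUrl) {
  return 'https://www.linkedin.com/sharing/share-offsite/?url=' +
    encodeURIComponent(documentUrl);
}
```

This only produces a well-formed link. Whether LinkedIn shows a preview still depends on the headers the document URL responds with.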

Converting Blob object to html in google app

I have stored user-uploaded documents (.doc, .pdf) as Blob objects in the datastore.
Instead of allowing the user to download a document, I would like to present it as an HTML page for viewing. How do I convert a Blob into HTML? Does Google App Engine provide any ready-made API for this?
There is no ready-made API in App Engine to convert .doc or .pdf (or other types of) files to HTML. You would need to find a library for your preferred language to parse the blob into its parts, structured as an object model (like a DOM). Then you would need to write code to convert the individual parts of the object model to HTML, unless you are lucky enough to find another library that does it. And no, Stack Overflow is not a good place to ask "what library is there...".
No. App Engine itself does not provide any file format conversion tools. You might want to look into the Google Drive API, which might, to some extent, do the format conversion for you.
You can embed a PDF reader in a web page by using pdf.js.
Most browsers already have a built-in PDF viewer. If you provide a link to a PDF file, many browsers will automatically display the document when users click on it. Browsers that do not support this will offer to download the file to the user's hard drive.
This is the easiest solution - you don't have to do anything at all.
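For the built-in viewer to kick in, the handler that serves the blob has to label it as a PDF and ask for inline display. The sketch below only shows the two response headers involved; it is written in JavaScript for illustration, the function name is made up, and it applies to the .pdf uploads only, not to .doc files.

```javascript
// Sketch: response headers that let a browser show a stored PDF inline
// instead of downloading it. pdfInlineHeaders is a hypothetical helper.
function pdfInlineHeaders(filename) {
  // Drop characters that could break the quoted filename or the header line.
  const safeName = filename.replace(/["\\\r\n]/g, '_');
  return {
    'Content-Type': 'application/pdf',
    'Content-Disposition': 'inline; filename="' + safeName + '"'
  };
}
```

The same two header values can be set from whichever language the App Engine handler is written in; using attachment instead of inline would force a download.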

Resources