How To Circumvent 504 Errors - reactjs

I am working in ReactJS, and one of the main features of our project is the ability to upload a scorecard and have all of its results parsed and placed into objects. However, these PDFs contain a lot of information, averaging 12-14 pages.
Most of the information is irrelevant; I usually only need pages 5-7, but users will be users, and they upload all of them.
I am using the pdfParser API, which is very good, and we're not looking to replace it. However, because the file is so large, I get hit with a 504 error whenever I'm somewhere with only a half-decent connection, since the process takes so long. With a good to great connection there's no issue.
That being said, I have two questions:
Is there a way to extend the amount of time that needs to elapse before my computer gives up on the process?
Is there a way to parse only SOME of the pages that get submitted?
The relevant code is shown below...
// Assumes: const request = require('request'); const fs = require('fs');
var url = 'https://pdftables.com/api?key=770oukvvx1wl&format=xlsx-single';

const pdfToExcel = (pdfFile) => {
  var req = request.post({ encoding: null, url: url }, async function (err, resp, body) {
    if (!err && resp.statusCode == 200) {
      fs.writeFile(`${pdfFile.path}.xlsx`, body, function (err) {
        if (err) {
          console.log('error writing file');
        }
      });
    } else {
      console.log('error retrieving URL');
    }
  });
  var form = req.form();
  form.append('file', fs.createReadStream(`./${pdfFile.path}`));
}
const parseExcel = async (file) => {
  let workSheetsFromFile;
  if (file.path.search(".xlsx") === -1) {
    const filePath = await path.resolve(`./${file.path}.xlsx`)
    workSheetsFromFile = await xlsx.parse(`./${file.path}.xlsx`);
    await fs.unlinkSync(`./${file.path}`)
    await fs.unlinkSync(filePath)
    return workSheetsFromFile[0].data
  }
  if (file.path.search(".xlsx") !== -1) {
    const filePath = await path.resolve(`./${file.path}`)
    workSheetsFromFile = await xlsx.parse(`./${file.path}`);
    await fs.unlinkSync(filePath)
    return workSheetsFromFile[0].data
  }
}
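One possible way to tackle the second question (not part of the code above) is to trim the PDF before it is ever sent to the parsing service, so only the pages you actually need are uploaded. A minimal sketch, assuming the pdf-lib package and that pages 5-7 are the ones needed (page indices are zero-based):

const fs = require('fs');
const { PDFDocument } = require('pdf-lib');

// Copy only pages 5-7 of the uploaded file into a new, much smaller PDF.
async function extractPages(inputPath, outputPath, pageIndices = [4, 5, 6]) {
  const srcDoc = await PDFDocument.load(fs.readFileSync(inputPath));
  const trimmed = await PDFDocument.create();
  const pages = await trimmed.copyPages(srcDoc, pageIndices);
  pages.forEach((p) => trimmed.addPage(p));
  fs.writeFileSync(outputPath, await trimmed.save());
}

On the first question: the request library does accept a timeout option (in milliseconds), but a 504 is the gateway giving up on its upstream rather than your machine giving up, so shrinking the upload is usually the more reliable fix than raising a client-side timeout.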

Related

Restful API failing sporadically but "catch" statement is always called, even when it does not fail

We are using a restful API to retrieve information about esports matches being played. From time to time the page simply loads, with no info being returned from the API.
I am fairly confident that the issue is with the API itself, but wanted to double-check that we are not doing anything wrong. Please see our code below:
const proxyurl = "https://cors-anywhere.herokuapp.com/";
const url = "http://datafeed.bet/en/esports.json ";
fetch(proxyurl + url)
  .then(response => response.json())
  .then(data => {
    const list = data;
    const games =
      list &&
      list.Sport &&
      list.Sport.Events &&
      list.Sport.Events.map((match) =>
        match.Name.substr(0, match.Name.indexOf(","))
      );
    const uniqueGames = [...new Set(games)];
    let combinedMatches = [];
    data &&
      data.Sport &&
      data.Sport.Events &&
      data.Sport.Events.map((game) => {
        game.Matches.map((match) => {
          match.Logo = game.Logo;
          match.TournamentName = game.Name;
          match.CategoryID = game.CategoryID;
          match.ID = game.ID;
        });
        combinedMatches = combinedMatches.concat(game.Matches);
      });
    this.setState({
      gameData: data,
      games: games,
      uniqueGames: uniqueGames,
      preloading: false,
      filteredGames: combinedMatches,
      allMatches: combinedMatches,
      count: Math.ceil(combinedMatches.length / this.state.pageSize),
    });
    var i;
    let allMatches = 0;
    let temp;
    for (i = 0; i < this.state.filteredGames.length; i++) {
      temp = allMatches =
        allMatches + this.state.filteredGames[i].Matches.length;
    }
    this.setState({ allMatches: allMatches });
  })
  .catch(console.log('error'));
Something that confuses me is that whether the data is returned or not, the "catch" statement gets called, outputting "error" to the console. I would like to build in some workaround for when the data is not returned. Would this be placed in the "catch" statement? If so, how do I only let the catch run if the operation actually fails?
When you do this:
.catch(console.log('error'))
You immediately invoke console.log('error') and pass its result (which is undefined) to the catch. In this case it's invoking it right away, before the AJAX operation is even performed.
What you want to pass to catch is a function which would invoke that operation if/when it needs to:
.catch(e => console.log('error'))
As an aside, you'd probably also want to log the error itself so you can see what happened, as opposed to just the string 'error':
.catch(e => console.log('error', e))
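As for a workaround when the data never comes back, that fallback logic can indeed live in the catch. A rough sketch, assuming the fetch lives in a component method named loadMatches and that the retries and loadError state fields exist (all three names are illustrative, not from the original code):

.catch(e => {
  console.log('Failed to load esports data', e);
  // Retry a limited number of times, then fall back to an empty list
  // and surface the failure to the UI.
  if (this.state.retries < 3) {
    this.setState(
      { retries: this.state.retries + 1 },
      () => setTimeout(() => this.loadMatches(), 2000)
    );
  } else {
    this.setState({ preloading: false, loadError: true, filteredGames: [] });
  }
});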

React Chrome extension and Promises

I am writing a Chrome extension in ReactJS.
I am looping through an array of URLs and trying to get the HTML content of those pages.
this.state.advertData.map(function(e, i) {
  common.updateTabUrl(e.url).then((tab) => {
    common.requestHTML(tab).then((response) => {
      console.log(response.content);
    })
  });
})
common.js:
let requestHTML = function(tab) {
  return new Promise(function(resolve, reject) {
    chrome.tabs.query({active: true, currentWindow: true}, function(tabs) {
      chrome.tabs.sendMessage(tab.id, {'req': 'source-code'}, function (response) {
        resolve(response)
      })
    })
  })
}
let updateTabUrl = function(url) {
  return new Promise(function(resolve, reject) {
    let update = chrome.tabs.update({
      url: url
    }, function(tab) {
      chrome.tabs.onUpdated.addListener(function listener (tabId, info) {
        if (info.status === 'complete' && tabId === tab.id) {
          chrome.tabs.onUpdated.removeListener(listener);
          resolve(tab);
        }
      });
    })
  })
}
content_script.js
chrome.runtime.onMessage.addListener(function (request, sender, sendResponse) {
  let response = '';
  if (request.req === 'source-code') {
    response = document.documentElement.innerHTML;
  }
  sendResponse({content: response});
});
My issue is that the response.content always seems to be the same. More importantly, the tab that updates seems to only ever display the last url in my array. I think it is a problem with the way I am handling Promises.
Any help is appreciated.
The problem with your code is that it doesn't wait for the previous URL to load before proceeding to the next one, so only the last one actually gets loaded in a tab.
I suggest using 1) Mozilla's WebExtension polyfill, 2) async/await syntax, 3) executeScript, which by default runs only once a tab is complete, and 4) a literal code string in executeScript, so you need neither a separate file nor a content script declared in manifest.json.
async function getUrlSourceForArray({urls, tabId = null}) {
  const results = [];
  for (const url of urls) {
    await browser.tabs.update(tabId, {url});
    const [html] = await browser.tabs.executeScript(tabId, {
      code: 'document.documentElement.innerHTML',
    });
    results.push(html);
  }
  return results;
}
Invoking inside an async function:
const allHtmls = await getUrlSourceForArray({
  urls: this.state.advertData.map(d => d.url),
  tabId: null, // active tab
});
P.S. You can also open all the URLs at once in a background window, assuming there won't be more than, say, 10 URLs; otherwise you risk exhausting the user's RAM.
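A rough sketch of that "all at once" variant, assuming the same WebExtension polyfill and host permissions for the pages involved (the function names here are made up for illustration):

async function getUrlSourceAllAtOnce(urls) {
  // Open every URL in its own tab inside one unfocused window.
  const win = await browser.windows.create({ url: urls, focused: false });
  const tabs = await browser.tabs.query({ windowId: win.id });
  const htmls = await Promise.all(tabs.map(async (tab) => {
    await waitForTabComplete(tab.id);
    const [html] = await browser.tabs.executeScript(tab.id, {
      code: 'document.documentElement.innerHTML',
    });
    return html;
  }));
  await browser.windows.remove(win.id);
  return htmls;
}

function waitForTabComplete(tabId) {
  return browser.tabs.get(tabId).then((tab) => {
    if (tab.status === 'complete') return;
    return new Promise((resolve) => {
      browser.tabs.onUpdated.addListener(function listener(id, info) {
        if (id === tabId && info.status === 'complete') {
          browser.tabs.onUpdated.removeListener(listener);
          resolve();
        }
      });
    });
  });
}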

How to for loop all documents in a collection - Azure CosmosDB - Nodejs

I have looked around at a few answers/questions regarding this issue but yet to find a solution.
I have a collection with documents (simplified) as such:
{
  "id": 123,
  "stuff": "abc",
  "array": [
    {
      "id2": 456,
      "properties": [
        {
          "id3": 789,
          "important": true
        }
      ]
    }
  ]
}
For each document in my collection, I want to check each object within array, and each entry in its properties, for whether it has important: true (for example), and then return:
"id": 123
"id2": 456
"id3": 789
I have tried using:
client.queryDocuments(self.collection._self, querySpec).toArray(function (err, results) {
  if (err) {
    callback(err);
  } else {
    callback(null, results[0]);
  }
});
But the issue is an array has a maximum character limit. If my collection has millions of documents, this would presumably be exceeded. (Javascript Increase max array size)
Or am I misunderstanding that question? Is it talking about the number of objects in an array (where each object can be of unlimited length)?
Thus I am looking for a for-loop-esque solution, where each document is returned, I do my analysis, then move on to the next (or do them in parallel).
Any insight would be greatly appreciated.
But the issue is an array has a maximum character limit. If my collection has millions of documents, this would presumably be exceeded. (Javascript Increase max array size)
Based on my research, the longest possible array in JS can have 2^32 - 1 = 4,294,967,295 ≈ 4.29 billion elements, so it easily covers your millions of documents. However, you certainly can't pull such a huge volume of data back in a single query. Whether because of throughput constraints (RU settings) or query efficiency, you should batch large volumes of data anyway.
Thus I am looking for a for-loop-esque solution, where each document is returned, I do my analysis, then move on to the next (or do them in parallel).
Maybe you could use the v2 JS SDK for the Cosmos DB SQL API. Please refer to the sample code:
const cosmos = require('@azure/cosmos');
const CosmosClient = cosmos.CosmosClient;

const endpoint = "https://***.documents.azure.com:443/"; // Add your endpoint
const masterKey = "***"; // Add the masterkey of the endpoint
const client = new CosmosClient({ endpoint, auth: { masterKey } });
const databaseId = "db";
const containerId = "coll";

async function run() {
  const { container, database } = await init();
  const querySpec = {
    query: "SELECT r.id,r._ts FROM root r"
  };
  const queryOptions = {
    maxItemCount: -1
  };
  const queryIterator = await container.items.query(querySpec, queryOptions);
  while (queryIterator.hasMoreResults()) {
    const { result: results, headers } = await queryIterator.executeNext();
    console.log(results);
    console.log(headers);
    // do what you want to do
    if (results === undefined) {
      // no more results
      break;
    }
  }
}

async function init() {
  const { database } = await client.databases.createIfNotExists({ id: databaseId });
  const { container } = await database.containers.createIfNotExists({ id: containerId });
  return { database, container };
}

run().catch(err => {
  console.error(err);
});
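To pull out only the id / id2 / id3 triples where important is true, as described in the question, the querySpec above could be adapted with Cosmos DB's JOIN syntax. A sketch, assuming the property names from the sample document (the "array" property is bracket-escaped since it is not a plain identifier):

const querySpec = {
  // One row per properties entry that has important = true, shaped as { id, id2, id3 }.
  query: `SELECT c.id, a.id2, p.id3
          FROM c
          JOIN a IN c["array"]
          JOIN p IN a.properties
          WHERE p.important = true`
};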
For more details about the continuation token, please refer to my previous case below. Any concerns, please let me know.
I am using the Cosmos DB SQL API Node.js library. I am unable to find the continuation token from this library so that I can return it to the client. The idea is to get it back from the client for the next pagination request.
I have working code which iterates multiple times to get all the documents. What changes will be required here to get the continuation token?
function queryCollectionPaging() {
  return new Promise((resolve, reject) => {
    function executeNextWithRetry(iterator, callback) {
      iterator.executeNext(function (err, results, responseHeaders) {
        if (err) {
          return callback(err, null);
        }
        else {
          documents = documents.concat(results);
          if (iterator.hasMoreResults()) {
            executeNextWithRetry(iterator, callback);
          }
          else {
            callback();
          }
        }
      });
    }

    let options = {
      maxItemCount: 1,
      enableCrossPartitionQuery: true
    };

    let documents = [];
    let iterator = client.queryDocuments(collectionUrl, 'SELECT r.partitionkey, r.documentid, r._ts FROM root r WHERE r.partitionkey in ("user1", "user2") ORDER BY r._ts', options);

    executeNextWithRetry(iterator, function (err, result) {
      if (err) {
        reject(err);
      }
      else {
        console.log(documents);
        resolve(documents);
      }
    });
  });
};
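For what it's worth, in the documentdb-style SDK used in this snippet the continuation token is exposed on the per-page response headers ('x-ms-continuation') and can be fed back in through the continuation feed option on the next query. A sketch of returning one page plus its token (the queryOnePage name is made up for illustration):

function queryOnePage(iterator, callback) {
  iterator.executeNext(function (err, results, responseHeaders) {
    if (err) {
      return callback(err);
    }
    // Undefined once the last page has been read; otherwise hand it back to the
    // client so it can be passed as options.continuation on the next request.
    const continuationToken = responseHeaders['x-ms-continuation'];
    callback(null, { documents: results, continuationToken: continuationToken });
  });
}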

How to Run API Calls in Parallel (Node.js)

I am trying to run some API calls in parallel, but am having problems since I am trying to call a function again before the API data has been returned.
I am thinking that I could possibly use the new command in Node, but am not sure how to structure it into this scheme. I am trying to avoid recursion, as I already have a recursive version working and it is slow.
Currently I am trying this code on the server.
loopThroughArray(req, res) {
  for (let i = 0; i < req.map.length; i++) {
    stack[i] = (callback) => {
      let data = getApi(req, res, req.map[i], callback)
    }
  }
  async.parallel(stack, (result) => {
    res.json(result)
  })
}
....
function getApi(req, res, num, cb) {
  request({
    url: 'https://example.com/api/' + num
  },
  (error, response, body) => {
    if (error) {
      // Log error
    } else {
      let i = {
        name: JSON.parse(body)['name'],
        age: '100'
      }
      console.log(body) // Returns empty value array.length > 1 (req.map[i])
      cb(i)
    }
  })
}
Is there a way to spawn new instances of the function each time it's called and accumulate the results to send back as one result to the client?
Here's an example of calling Web APIs (each with different parameters) using the Async library. We start by creating an array of N functions.
const async = require('async');
const request = require('request');

// Set whatever request options you like, see: https://github.com/request/request#requestoptions-callback
var requestArray = [
  { url: 'https://httpbin.org/get' },
  { url: 'https://httpbin.org/ip' }
];

let getApi = function (opt, callback) {
  request(opt, (err, response, body) => {
    callback(err, JSON.parse(body));
  });
};

const functionArray = requestArray.map((opt) => {
  return (callback) => getApi(opt, callback);
});

async.parallel(
  functionArray, (err, results) => {
    if (err) {
      console.error('Error: ', err);
    } else {
      console.log('Results: ', results.length, results);
    }
  });
You can easily switch the Url and Query values to match whatever you need. I'm using HttpBin here, since it's good for illustrative purposes.
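As an aside (not part of the original answer), the same fan-out can be expressed without the Async library using Promise.all. A sketch of that alternative, reusing requestArray and the request module from the snippet above:

// Fan the same requests out with native promises instead of async.parallel.
const promises = requestArray.map((opt) =>
  new Promise((resolve, reject) => {
    request(opt, (err, response, body) => {
      if (err) return reject(err);
      resolve(JSON.parse(body));
    });
  })
);

Promise.all(promises)
  .then((results) => console.log('Results: ', results.length, results))
  .catch((err) => console.error('Error: ', err));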

Serving PDF content back to browser via Node Express using pdfMake

I am making use of the pdfmake library for generating PDF documents in my node express application and want these to be sent straight back to the client to trigger the browser to automatically download the file.
As a reference point I have been using the following examples for my express middleware:
https://gist.github.com/w33ble/38c5e0220d491148de1c
https://github.com/bpampuch/pdfmake/issues/489
I have opted for sending a buffered response back, so the key part of my middleware looks like this:
function createPDFDocument(docDefinition, callback) {
  var fontDescriptors = {
    Roboto: {
      normal: './src/server/fonts/Roboto-Regular.ttf',
      bold: './src/server/fonts/Roboto-Medium.ttf',
      italics: './src/server/fonts/Roboto-Italic.ttf',
      bolditalics: './src/server/fonts/Roboto-MediumItalic.ttf'
    }
  };
  var printer = new Printer(fontDescriptors);
  var pdfDoc = printer.createPdfKitDocument(docDefinition);

  // buffer the output
  var chunks = [];
  pdfDoc.on('data', function(chunk) {
    chunks.push(chunk);
  });
  pdfDoc.on('end', function() {
    var result = Buffer.concat(chunks);
    callback(result);
  });
  pdfDoc.on('error', callback);

  // close the stream
  pdfDoc.end();
}
In my angular application I am using the $resource service and have an endpoint defined like so:
this.resource = $resource('api/document-requests/',
  null,
  <any>{
    'save': {
      method: 'POST',
      responseType: 'arraybuffer'
    }
  });
When I try this out, I don't get any browser download kicking in. (The response body and response headers I see in Chrome were attached as screenshots.)
So it seems I'm not a million miles off. I have searched around and found solutions that mention converting to a Blob, but I think that's only relevant if I were serving back a Base64-encoded string of the document.
Can anyone suggest what may be my issue here?
Thanks
Here's a router:
router.get('/get-pdf-doc', async (req, res, next) => {
  try {
    var binaryResult = await createPdf();
    res.contentType('application/pdf').send(binaryResult);
  } catch (err) {
    saveError(err);
    res.send('<h2>There was an error displaying the PDF document.</h2>' +
      'Error message: ' + err.message);
  }
});
And here's a function to return the pdf.
const PdfPrinter = require('pdfmake');
const Promise = require("bluebird");

createPdf = async () => {
  var fonts = {
    Helvetica: {
      normal: 'Helvetica',
      bold: 'Helvetica-Bold',
      italics: 'Helvetica-Oblique',
      bolditalics: 'Helvetica-BoldOblique'
    }
  };
  var printer = new PdfPrinter(fonts);
  var docDefinition = {
    content: [
      'First paragraph',
      'Another paragraph, this time a little bit longer to make sure,' +
      ' this line will be divided into at least two lines'
    ],
    defaultStyle: {
      font: 'Helvetica'
    }
  };
  var pdfDoc = printer.createPdfKitDocument(docDefinition);
  return new Promise((resolve, reject) => {
    try {
      var chunks = [];
      pdfDoc.on('data', chunk => chunks.push(chunk));
      pdfDoc.on('end', () => resolve(Buffer.concat(chunks)));
      pdfDoc.end();
    } catch (err) {
      reject(err);
    }
  });
};
Everything seems fine to me; the only thing missing is the logic to trigger the download.
Check out this CodePen as an example.
There I'm using base64-encoded data, but you can just as well use binary data; just don't forget to change the href where I mention scope.dataURL = base64....
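For completeness, a minimal sketch of that missing trigger on the client, assuming the $resource call above resolves with the raw ArrayBuffer (with $resource you may need a transformResponse to keep the body raw; the payload variable and filename are illustrative):

this.resource.save(payload).$promise.then(function (data) {
  // Wrap the binary response in a Blob and click a temporary link to download it.
  var blob = new Blob([data], { type: 'application/pdf' });
  var url = window.URL.createObjectURL(blob);
  var link = document.createElement('a');
  link.href = url;
  link.download = 'document.pdf'; // illustrative filename
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link);
  window.URL.revokeObjectURL(url);
});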
I had issues serving PDF files from Node.js as well, so I made use of PhantomJS. You can check out this repository for the full codebase and implementation.
console.log('Loading web page')
const page = require('webpage').create()
const args = require('system').args
const url = 'www.google.com'
page.viewportSize = { width: 1024, height: 768 }
page.clipRect = { top: 0, left: 0 }

page.open(url, function(status) {
  console.log('Page loaded')
  setTimeout(function() {
    page.render('docs/' + args[1] + '.pdf')
    console.log('Page rendered')
    phantom.exit()
  }, 10000)
})
