How many items can an array handle in Node.js?

I am trying to load objects into an array from a text file that has around 500,000 lines or more.
I am using require('readline') to read it, but the processing "pauses" by itself when it reaches line 470000 (for example), without errors, warnings, or notices.
This is an example of my code (the original code fills the dataRow object, and it "pauses" when it reaches line 411000):
let myList = [];
let lineReader = require('readline').createInterface({
input: require('fs').createReadStream(filePath).pipe(iconv.decodeStream('latin1'))
});
lineReader.on('line', function (line) {
// here there are a lot more fields, but I have to cut off for this example
let dataRow = JSON.parse('{"Agencia_Cobranca":"","Aliquota_ICMS":""}');
myList.push(dataRow);
//this if is only to follow what is happen
if( myList.length %10000 == 0 || myList.length>420000) {
console.log(" myList executed: ",myList.length, ' - ', JSON.stringify( myList[myList.length-1] ).length, ' - ' ,new Date() );
}
}).on('close',function(){
console.log('finished');
process.exit(0);
});
I am using this command line to run it:
node --max-old-space-size=8192 teste
Well... this is the result: the screen just stays this way when it reaches this line. It never finishes, and there are no errors :(

Your heap/RAM is probably full, and the process is failing in an unusual way. If at all possible, I would recommend making your program more memory efficient: do everything you need to do with a line as you read it, and then discard it. Holding the whole file in memory is not going to scale.
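A minimal sketch of the "process each line, then discard it" approach, assuming only a running summary is needed. The summarizeLines function and the in-memory PassThrough stream are illustrative; in the real program the input would be the decoded file stream from the question.

```javascript
const readline = require('readline');
const { PassThrough } = require('stream');

// Reads lines one at a time and keeps only a running summary,
// so memory use stays flat no matter how many lines the input has.
async function summarizeLines(input) {
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  let count = 0;
  let longest = 0;
  for await (const line of rl) {
    count += 1; // handle the line here...
    longest = Math.max(longest, line.length);
    // ...and let it go out of scope instead of pushing it to an array
  }
  return { count, longest };
}

// Usage: an in-memory stream stands in for fs.createReadStream(filePath)
const demoInput = new PassThrough();
demoInput.end('a;b\nc;d;e\nf\n');
summarizeLines(demoInput).then((summary) => console.log(summary));
```

If the rows really do have to be kept, writing each one to a file or database inside the loop keeps memory use similarly flat.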

In Node.js (and JavaScript in general), the maximum length of an array is 2^32 - 1. Try running this in a Node.js application:
console.log(new Array(4294967295))
try {
console.log(new Array(4294967296))
} catch(err){
console.log(err);
}

Consider using a database if you work with that much data. Storing it in a table and then querying only the data you need would be more efficient.

Looping through REST API calls improperly breaks at first iteration or last iteration

I am trying to send a REST API call to retrieve a lot of data. The data is returned in JSON format and is limited to 2000 records per call. However, if there are more than 2000 records, the response contains a key called nextRecordsUrl with a link to an endpoint that returns the next 2000 records.
This pattern continues until a call returns fewer than 2000 records, at which point nextRecordsUrl is undefined.
I have this loop that essentially pushes the data to an array and then calls the endpoint listed in the nextRecordsUrl key.
do {
for (var i in arrLeads.records) {
let data = arrLeads.records[i];
let createDate = new GMT(data.CreatedDate, "dd-MM-YYYY");
let fAssocDate = new GMT(data['First_Campaign_assoc_date__c'],"dd-MM-YYYY");
let lAssocDate = new GMT(data['Last_Campaign_assoc_date__c'], "dd-MM-YYYY");
let convDate = new GMT(data.ConvertedDate, "dd-MM-YYYY");
leads.push([data.FirstName, data.LastName, data.Id, data.Email, data.Title, data['hs_contact_id__c'], createDate, data.Company, data.OwnerId, data.Country, data['Region_by_manager__c'], data.Status, data.Industry, data['Reason_for_Disqualifying_Nurture__c'], data['Reason_for_Disqualifying_Dead__c'], data['Lead_Type__c'], data['First_Campaign_Name__c'], data['First_CampaignType__c'], fAssocDate, data['Last_Campaign_Name__c'], lAssocDate, data['Last_Campaign_Type__c'], convDate, data.ConvertedAccountId, data.ConvertedContactId, data.ConvertedOpportunityId]);
}
var arrLeads = getQueryWithFullEndPoint(arrLeads.nextRecordsUrl);
} while (arrLeads.nextRecordsUrl != null && arrLeads.nextRecordsUrl != undefined);
I used to use a regular while loop, but it caused obvious problems: it wouldn't run at all if the initial call had an empty nextRecordsUrl field.
But the do...while version also has an issue. While the first iteration works well, the last one does not, because it makes the call for the next batch before the loop checks the nextRecordsUrl field.
So essentially, it loops through all the records normally. But when it gets to the last batch, it has already fetched the last endpoint, and the loop exits because the nextRecordsUrl key is now empty. So the last batch is ignored.
I thought of moving the call to after the check, right after the do {, but that would cause problems for the first iteration.
I also thought about duplicating the for loop after the do, but I prefer a cleaner solution that doesn't involve duplicating code.
So how do I write this code so that it handles both the first iteration (even if there is no second iteration) and the last iteration (where the batch has records but an empty nextRecordsUrl), without having to double up the code?

I actually managed to solve this with a small tweak in the code.
My issue was that I was getting the next batch before I could test the current one, so I was always one batch short.
The solution was to define a variable at the beginning of each iteration, arrLeadsNext = arrLeads.nextRecordsUrl;
and have the while statement check this variable.
So it looks like this:
do {
var arrLeadsNext = arrLeads.nextRecordsUrl;
for (var i in arrLeads.records) {
let data = arrLeads.records[i];
let createDate = new GMT(data.CreatedDate, "dd-MM-YYYY").process();
let fAssocDate = new GMT(data['First_Campaign_assoc_date__c'], "dd-MM-YYYY").process();
let lAssocDate = new GMT(data['Last_Campaign_assoc_date__c'], "dd-MM-YYYY").process();
let convDate = new GMT(data.ConvertedDate, "dd-MM-YYYY").process();
leads.push([data.FirstName, data.LastName, data.Id, data.Email, data.Title, data['hs_contact_id__c'], createDate, data.Company, data.OwnerId, data.Country, data['Region_by_manager__c'], data.Status, data.Industry, data['Reason_for_Disqualifying_Nurture__c'], data['Reason_for_Disqualifying_Dead__c'], data['Lead_Type__c'], data['First_Campaign_Name__c'], data['First_CampaignType__c'], fAssocDate, data['Last_Campaign_Name__c'], lAssocDate, data['Last_Campaign_Type__c'], convDate, data.ConvertedAccountId, data.ConvertedContactId, data.ConvertedOpportunityId]);
}
var arrLeads = (arrLeadsNext != null) ? getQueryWithFullEndPoint(arrLeadsNext) : '';
} while (arrLeadsNext != null && arrLeadsNext != undefined);
This way, the check at the end is not affected by the API call for the next batch, because arrLeadsNext only changes once the next iteration of the loop begins.
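The same idea can also be written as a plain while loop that tests the page itself rather than its nextRecordsUrl. This is only a sketch: collectAllRecords, firstPage, and fetchPage are hypothetical names, with fetchPage standing in for getQueryWithFullEndPoint, and the fake pages exist purely for illustration.

```javascript
// Collects every record across pages. `firstPage` and `fetchPage` are
// hypothetical stand-ins for the initial response and getQueryWithFullEndPoint.
function collectAllRecords(firstPage, fetchPage) {
  const all = [];
  let page = firstPage;
  while (page) {
    for (const record of page.records) {
      all.push(record);
    }
    // Fetch the next page only if the current one points to it;
    // otherwise page becomes null and the loop ends.
    page = page.nextRecordsUrl ? fetchPage(page.nextRecordsUrl) : null;
  }
  return all;
}

// Usage with fake pages keyed by URL
const pages = {
  '/p2': { records: [3, 4], nextRecordsUrl: '/p3' },
  '/p3': { records: [5] }, // last page: no nextRecordsUrl
};
const first = { records: [1, 2], nextRecordsUrl: '/p2' };
console.log(collectAllRecords(first, (url) => pages[url]));
```

Because the loop condition is the page and not the URL, a single page with no nextRecordsUrl is still processed once, and the last page is never skipped.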

Google Script is returning the index of the array but I need the value

I have a google spreadsheet that gets data logged to it via a google form.
Each time the form is submitted, it triggers a script that gets values from a certain section using:
var tabNumsVal = sheet.getSheetValues(lastRow, tabOneCol.getColumn(), 1, 6)[0];
When I check the array, I can see that the array has the values such as:
0: 12
1: 24
2: 26W
3: 0
4: 0
5: 0
However when I use the following command, it puts the index numbers (0 to 5) into the array instead of the values in the array.
var tabNumsFinal = [];
for (var tabard in tabNumsVal) {
if (tabard !== "") {
tabNumsFinal.push(tabard);
}
}
It used to work, but I had to upgrade my code to the Google Apps Script V8 runtime, and it seems that this has broken the code.
I had to change the 'for each' block to a 'for' block, and it seems this handles the values differently.
I am quite sure this is simple for many people, but I really only touch Google Script once a year. I have tried using Logger.log(tabard) to output the data to the execution log, but the code just runs through and doesn't output anything. I figured this might be because of the !== "" check, so I placed the log call above the if statement but still inside the for statement, and it still outputs nothing.
I tried using Logger.log(tabNumsVal) and Logger.log(tabNumsFinal), and again nothing was output.
To recap:
The data from the form arrives correctly in the columns of the spreadsheet, so it shows up inside the array properly. It's just that the index numbers are being output instead of the values from the array.
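
Since you're using a for...in loop, tabard is the index here, not the value. (The old for each...in statement iterated over values, which is why the code behaved differently before.) Use the index to look up the value: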
var tabNumsFinal = [];
for (var i in tabNumsVal) {
let val = tabNumsVal[i];
if (val !== "") {
tabNumsFinal.push(val);
}
}
Reference: for...in loop
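As an alternative sketch, for...of and Array.prototype.filter both work on the values directly, so no index lookup is needed. The sample row below is made up to resemble the values listed in the question.

```javascript
// Sample row shaped like the values logged in the question (illustrative data)
const tabNumsVal = [12, 24, '26W', 0, 0, ''];

// for...in yields the keys (as strings): '0', '1', '2', ...
const keys = [];
for (const key in tabNumsVal) keys.push(key);

// for...of yields the values themselves
const values = [];
for (const val of tabNumsVal) {
  if (val !== '') values.push(val);
}

// filter expresses the same thing in one line
const filtered = tabNumsVal.filter((val) => val !== '');

console.log(keys);
console.log(values);
console.log(filtered);
```

Note that the strict comparison keeps numeric 0 and drops only empty strings, matching the !== "" check in the original code.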

Mongoose Model, cleaning/parsing an Array efficiently

I am having some issues with a Mongoose array. It's likely due to my lack of understanding of the library, and I can't find the exact answer I'm looking for in the docs.
For starters I have my schema and model declarations:
const gConfig = new Schema({ aList: Array, maxChanLimit: Number }), globalConfiguration = mongoose.model('globalConfig', gConfig);
And I have my command which fetches the array, parses out _id, then pushes the new item to the array, and overwrites the existing one in the database.
if((message.author.id === g.ownerID) && (g.id.toString() === tocGuild) && message.content.startsWith("!updatealist"))
{
let mc = message.content.replace("!updatealist ", "");
globalConfiguration.findOneAndUpdate({},{$push: {"aList":mc }}, { upsert: true }, function(err, data) {
if (err) return console.log(err);
var str = JSON.stringify(data); str = str.replace(RegExp(/"_id"|"__v"|'/g),""); var arr = str.split(`","`);
});
}
I feel like there has to be a better way to do this. Based on what I've read, I tried something like this:
globalConfiguration.findOneAndUpdate({},{$push: {"-_id aList":mc }}
However, this did not remove _id from the array. The way I'm doing it works, but I know it isn't efficient or dynamic. It is also bulky in terms of code and could probably be streamlined using the library.
In practice, what is the best way to properly read an array from a model with Mongoose? How do you read from the array without the additional objects Mongoose adds by default? What is the best way to add an item to an existing model?
Any help is appreciated, thank you.

If you want more control over the update, the Mongoose documentation suggests first querying the document you want to update. Once you have the document, you can change it directly: if it contains an array, you can push to it, pop from it, and so on. Then save it again:
if((message.author.id === g.ownerID) && (g.id.toString() === tocGuild) && message.content.startsWith("!updatealist"))
{
  let mc = message.content.replace("!updatealist ", "");
  globalConfiguration.findOne({ /* your query */ }, function(err, data) {
    if (err) return console.log(err);
    if (!data) return console.log("no matching document"); // findOne yields null when nothing matches
    data.aList.push(mc); // aList is the array field defined in the schema
    data.save(); // save it again with the update
    var str = JSON.stringify(data); str = str.replace(RegExp(/"_id"|"__v"|'/g),""); var arr = str.split(`","`);
  });
}
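On reading the array without _id and __v: those fields belong to the document, not to the array, so reading the aList field directly avoids the stringify-and-replace step. The sketch below assumes a Mongoose version where queries can be awaited; appendToAList is a hypothetical helper, and the model is passed in as a parameter so the example does not depend on a live connection.

```javascript
// `Model` is assumed to be the globalConfiguration model from the question.
async function appendToAList(Model, item) {
  const doc = await Model.findOneAndUpdate(
    {},                         // match the single config document
    { $push: { aList: item } }, // append on the server side
    { upsert: true, new: true } // create if missing, return the updated doc
  );
  // _id and __v live on the document, not in the array,
  // so reading the field directly gives just the list items.
  return Array.from(doc.aList);
}

// Usage inside an async handler (not executed here):
//   const list = await appendToAList(globalConfiguration, mc);
```

The $push update keeps the append as a single operation, while the find-then-save approach above is more convenient when several fields change together.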

Google apps script can't read the 1001'th row from google spreadsheets

The function below returns the first empty row in column A.
The sheet became full, so I extended it with another 9000 rows. I then ran main manually and got an error: "TypeError: Cannot read property "0" from undefined".
The problem seems to be that the 1001st row cannot be read: values[1001] returns undefined. Am I missing something, or am I limited to 1000 rows of data?
Thank you for reading. Here is the code:
function getLastRowNumber(sheetName){
// sheetName: string; returns integer (the last row number based on column A)
var sheet = getSheet(sheetName);
var column = sheet.getRange('A1:A'); // THIS IS THE PROBLEM
var values = column.getValues(); // get all data in one call
// DEBUG
Logger.log(values[1001][0])
var ct = 0;
while (values[ct][0] != "") {
ct++;
}
return (ct);
}
EDIT:
Solution: use .getRange('A:A'); instead of the 'A1:A' notation.
Thank you @tehhowch for the solution.

Posting this as an answer so people can see it; the solution was provided by @tehhowch.
Using "A:A" as the argument of getRange fixes the problem.
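Independently of the range notation, the while loop in the question reads past the end of values when every row in the range is filled: values[ct] is then undefined, and indexing it throws the same kind of TypeError. A bounds-checked sketch follows; firstEmptyIndex is a hypothetical name, and the arrays are illustrative stand-ins for the result of getValues().

```javascript
// `values` is a 2D array shaped like the result of getValues(),
// e.g. [['a'], ['b'], ['']]
function firstEmptyIndex(values) {
  let ct = 0;
  // Stop at the end of the array as well as at the first empty cell.
  // The strict comparison treats a numeric 0 as a filled cell.
  while (ct < values.length && values[ct][0] !== '') {
    ct++;
  }
  return ct;
}

console.log(firstEmptyIndex([['a'], ['b'], [''], ['c']]));
console.log(firstEmptyIndex([['a'], ['b']])); // full column: no out-of-range read
```

When the column is completely filled, the function returns values.length instead of throwing.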

I am trying to concatenate a string result to a variable in an Express router, but it returns an empty string

I am trying to iterate through the array, get each value, search the database, and then concatenate the database result to the string translate:
app.get('/translate',function(req,res) {
let translate = '';
['hello','love'].forEach(async (word) => {
let trans = await
NaijaLang.find({"engword": word, "naijalang": "yoruba"});
translate +=" " + trans[0].translation;
//Returns values
console.log(translate)
});
//Returns Empty String;
console.log(translate)
res.send(translate);
});

This happens because the lookups are asynchronous, but you send the value synchronously. The code runs in this order:
1. Run let translate='';
2. Run ['hello','love'].forEach(...)
3. Start NaijaLang.find(...) for word=hello; the callback suspends at await
4. Start NaijaLang.find(...) for word=love; the callback suspends at await
5. Run console.log(translate) and res.send(translate);
6. The first NaijaLang.find(...) resolves: this is when translate is updated for the first time (for word=hello or word=love, whichever finishes first)
7. The second NaijaLang.find(...) resolves: this is when translate is updated for the second time. But the value was already sent in step 5.
You can find a more detailed explanation here: https://blog.lavrton.com/javascript-loops-how-to-handle-async-await-6252dd3c795
It also explains how to fix it. You can use a for...of loop instead of forEach (the route handler must be declared async so that await is allowed inside it):
app.get('/translate', async function(req, res) {
  let translate = '';
  for (let word of ['hello','love']) {
    let trans = await NaijaLang.find({"engword": word, "naijalang": "yoruba"});
    translate += " " + trans[0].translation;
    // Logs the partial result after each word
    console.log(translate)
  }
  // Now logs the full string, because the loop has finished
  console.log(translate)
  res.send(translate);
});
This time, the code executes in the order you probably want. First, find is called for word=hello; after it finishes, find is called for word=love; and finally, after both calls have finished, res.send is called.
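If the lookups do not depend on each other, they can also be started together and awaited with Promise.all, which keeps the results in input order. The sketch below compares both approaches; findTranslation is a hypothetical stand-in for NaijaLang.find, and the placeholder strings are not real translations.

```javascript
// Hypothetical stand-in for NaijaLang.find: resolves after a short delay.
// The values are placeholders, not real translations.
const fakeTranslations = { hello: 'T(hello)', love: 'T(love)' };
function findTranslation(word) {
  return new Promise((resolve) => {
    setTimeout(() => resolve([{ translation: fakeTranslations[word] }]), 10);
  });
}

// Sequential: each lookup waits for the previous one (the for...of approach)
async function translateSequential(words) {
  let translate = '';
  for (const word of words) {
    const trans = await findTranslation(word);
    translate += ' ' + trans[0].translation;
  }
  return translate;
}

// Parallel: start all lookups at once; Promise.all preserves input order
async function translateParallel(words) {
  const results = await Promise.all(words.map((word) => findTranslation(word)));
  return results.map((trans) => ' ' + trans[0].translation).join('');
}

translateSequential(['hello', 'love']).then((s) => console.log(s));
translateParallel(['hello', 'love']).then((s) => console.log(s));
```

Both functions build the same string; the parallel version simply avoids waiting for one query to finish before starting the next.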
