get_by_id() not returning values - google-app-engine

I am writing an application that shows the user a number of elements, from which they have to select a few to process. When they do so, the application queries the DB for the rest of the data on those elements and stacks them, with their full data, on the next page.
I made an HTML form loop with a checkbox next to each element, and then in Python I check for this checkbox's value to get the data.
Even when I'm just trying to query the data, ndb doesn't return anything.
pitemkeys are the ids for the elements to be queried. inpochecks is the checkbox variable.
preqitems is the dict to save the items after getting the data.
The next page queries nothing and is blank.
The commented-out lines are my originally intended code, which produced lots of errors because nothing was being queried.
request_code = self.request.get_all('rcode')
pitemkeys = self.request.get_all('pitemkey')
inpochecks = self.request.get_all('inpo')
preqitems = {}

#idx = 0
#for ix, pitemkey in enumerate(pitemkeys):
#    if inpochecks[ix] == 'on':
#        preqitems[idx] = Preqitems.get_by_id(pitemkey)
#        preqitems[idx].rcode = request_code[ix]
#        idx += 1

for ix, pitemkey in enumerate(pitemkeys):
    preqitems[ix] = Preqitems.get_by_id(pitemkey)
    #preqitems[ix].rcode = request_code[ix]
Update: When trying
preqitems = ndb.get_multi([ndb.Key(Preqitems, k) for k in pitemkeys])
preqitems returns a list full of None values, as if the DB couldn't find data for these keys. I checked the keys, and for some reason they are unicode strings; could that be the reason? They look like this:
[u'T-SQ-00301-0002-0001', u'U-T-MT-00334-0007-0002', u'U-T-MT-00334-0008-0001']

You probably need to do int(pitemkey) or str(pitemkey), depending on whether you are using integer or string ids.
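A minimal sketch of that fix, assuming the entities were stored with string ids (use int() instead if they have integer ids):

from google.appengine.ext import ndb

# Coerce the unicode ids to str before building the keys (assumes string ids).
keys = [ndb.Key(Preqitems, str(k)) for k in pitemkeys]
preqitems = ndb.get_multi(keys)

# If the entities were stored with integer ids instead:
# keys = [ndb.Key(Preqitems, int(k)) for k in pitemkeys]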


Google Script is returning the index of the array but I need the value

I have a google spreadsheet that gets data logged to it via a google form.
Each time the form is submitted, it triggers a script that gets values from a certain section using:
var tabNumsVal = sheet.getSheetValues(lastRow, tabOneCol.getColumn(), 1, 6)[0];
When I check the array, I can see that the array has the values such as:
0: 12
1: 24
2: 26W
3: 0
4: 0
5: 0
However when I use the following command, it puts the index numbers (0 to 5) into the array instead of the values in the array.
var tabNumsFinal = [];
for (var tabard in tabNumsVal) {
  if (tabard !== "") {
    tabNumsFinal.push(tabard);
  }
}
It used to work but I have had to upgrade my code to Google Script v8 and it seems that this has broken the code.
I had to alter the 'for each' code block to a 'for' code block and it seems this is handling the values differently.
I am quite sure this is simple for many people, but I really only touch Google Script once a year. I have tried using Logger.log(tabard) to output the data to the execution log, but the run just traverses the code and doesn't output anything. I figured this might be because of the !== "" check, so I placed the logging above the if statement, still inside the for loop, and it still outputs nothing.
I tried using Logger.log(tabNumsVal) and Logger.log(tabNumsFinal) and again it output nothing.
To recap:
The data from the form is returning correctly into the columns of the spreadsheet, hence it is showing inside the array properly. It's just that the index numbers are being output instead of the values from the array.
Since you're using a for...in loop, tabard is the index here, not the value.
var tabNumsFinal = [];
for (var i in tabNumsVal) {
  let val = tabNumsVal[i];
  if (val !== "") {
    tabNumsFinal.push(val);
  }
}
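As a side note (an editorial sketch, not from the original answer): the V8 runtime also supports for...of, which yields the values directly, so the index lookup can be skipped entirely:

var tabNumsFinal = [];
// for...of iterates the array's values, not its indexes
for (const val of tabNumsVal) {
  if (val !== "") {
    tabNumsFinal.push(val);
  }
}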
For in loop

Loop function to fill Crosselling template

I need to create a file containing information on cross-selling for my webshop.
In my example file you'll see the tab "Basic Data", this is the data available to me already. Column A contains certain products, column B shows the assigned (product)categories.
The Tab "Starting Point" shows how the final file will be structured and how the function should come into play. It would need to do the following:
Copy the first product from the unique product list (D2) to A2 (already done here).
Paste a filter function into B2 (already done here).
This filter function lists all products that belong to the same category as Product 1, except Product 1 itself (a sketch of such a formula follows this list).
Apply a numerical position tag in tens in column C to the whole range of products related to Product 1 (in this case B2:B4), starting from 10 (then 20, 30, and so on), and ideally randomize it (already done here).
Fill down from A2, i.e. paste "Product 1" into all cells below A2 until the end of the filter function's results in column B is reached (already done here).
Continue the loop by pasting "Product 2" into A5, pasting the filter function into B5, and so on.
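For illustration only, the filter function in B2 might look something like this (a hypothetical sketch; it assumes products in 'Basic Data'!A and categories in 'Basic Data'!B, as described above):

=FILTER('Basic Data'!$A$2:$A,
        'Basic Data'!$B$2:$B = VLOOKUP($A2, 'Basic Data'!$A:$B, 2, FALSE),
        'Basic Data'!$A$2:$A <> $A2)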
In "Desired Result" you can see how the end result should look like in this example. There are only 8 products, but I'd need to be able to do this for hundreds of products, that's why a function is needed.
I hope somebody is able to help me here.
Answer
You can get your desired result using Google Apps Script as I suggested.
How to use it:
Open Apps Script by clicking Tools > Script editor and you will see the script editor. It is based on JavaScript and it allows you to create, access, and modify Google Sheets files with a service called the Spreadsheet Service.
Paste the following code and click on Run
Code
function main() {
  const ss = SpreadsheetApp.getActiveSpreadsheet()
  const sheetResult = ss.getSheetByName('Desired Result')
  const sheetBasic = ss.getSheetByName('Basic Data')
  // write unique products
  const val = "=ARRAYFORMULA(UNIQUE('Basic Data'!A2:A))"
  sheetResult.getRange('D2').setValue(val)
  // get basic data (lastRow = number of non-empty cells in column A, header included)
  const lastRow = sheetBasic.getRange('A:A').getValues().filter(String).length
  const cat = sheetBasic.getRange('A2:B' + lastRow).getValues() // product | category
  const productGroup = [] // array to store data
  // loop over each product
  for (var j = 0; j < cat.length; j++) {
    const p1 = cat[j]
    var k = 1
    for (var i = 0; i < cat.length; i++) {
      // same category, but not the product itself
      if (cat[i][1] == p1[1] && cat[i][0] != p1[0]) {
        const val = [p1[0], cat[i][0], k * 10]
        k = k + 1
        productGroup.push(val)
      }
    }
  }
  var n = productGroup.length + 1
  sheetResult.getRange('A2:C' + n).setValues(productGroup)
}
Some comments
This solution does not randomize the position value. If you need that I can help you with it, but first I want to check that this solution fits your needs.
Let me know if you can obtain your desired result. Keep in mind that this solution uses the name of the sheets.
Reference
Google Apps Script
Spreadsheet Service
JavaScript

Why does Flink emit duplicate records on a DataStream join + Global window?

I'm learning/experimenting with Flink, and I'm observing some unexpected behavior with the DataStream join, and would like to understand what is happening...
Let's say I have two streams with 10 records each, which I want to join on an id field. Let's assume that each record in one stream has a matching one in the other, and that the IDs are unique within each stream. Let's also say I have to use a global window (requirement).
Join using DataStream API (my simplified code in Scala):
val stream1 = ... // from a Kafka topic on my local machine (I tried with and without .keyBy)
val stream2 = ...

stream1
  .join(stream2)
  .where(_.id).equalTo(_.id)
  .window(GlobalWindows.create()) // assume this is a requirement
  .trigger(CountTrigger.of(1))
  .apply {
    (row1, row2) => // ...
  }
  .print()
Result:
Everything is printed as expected, each record from the first stream joined with a record from the second one.
However:
If I re-send one of the records (say, with an updated field) to its stream, two duplicate join events get emitted 😞
If I repeat that operation (with or without an updated field), I get 3 emitted events, then 4, 5, etc... 😞
Could someone in the Flink community explain why this is happening? I would have expected only 1 event emitted each time. Is it possible to achieve this with a global window?
In comparison, the Flink Table API behaves as expected in that same scenario, but for my project I'm more interested in the DataStream API.
Example with Table API, which worked as expected:
tableEnv
  .sqlQuery(
    """
      |SELECT *
      | FROM stream1
      | JOIN stream2
      | ON stream1.id = stream2.id
    """.stripMargin)
  .toRetractStream[Row]
  .filter(_._1) // just keep the inserts
  .map(...)
  .print() // works as expected, after re-sending updated records
Thank you,
Nicolas
The issue is that records are never removed from your global window. So the join operation on the global window is triggered whenever a new record arrives, but the old records are still present.
Thus, to get it running in your case, you'd need to implement a custom evictor. I expanded your example into a minimal working example and added the evictor, which I will explain after the snippet.
val data1 = List(
  (1L, "myId-1"),
  (2L, "myId-2"),
  (5L, "myId-1"),
  (9L, "myId-1"))
val data2 = List(
  (3L, "myId-1", "myValue-A"))

val stream1 = env.fromCollection(data1)
val stream2 = env.fromCollection(data2)

stream1.join(stream2)
  .where(_._2).equalTo(_._2)
  .window(GlobalWindows.create()) // assume this is a requirement
  .trigger(CountTrigger.of(1))
  .evictor(new Evictor[CoGroupedStreams.TaggedUnion[(Long, String), (Long, String, String)], GlobalWindow]() {
    override def evictBefore(elements: lang.Iterable[TimestampedValue[CoGroupedStreams.TaggedUnion[(Long, String), (Long, String, String)]]], size: Int, window: GlobalWindow, evictorContext: Evictor.EvictorContext): Unit = {}

    override def evictAfter(elements: lang.Iterable[TimestampedValue[CoGroupedStreams.TaggedUnion[(Long, String), (Long, String, String)]]], size: Int, window: GlobalWindow, evictorContext: Evictor.EvictorContext): Unit = {
      import scala.collection.JavaConverters._
      // find the position of the last element coming from the second input
      val lastInputTwoIndex = elements.asScala.zipWithIndex.filter(e => e._1.getValue.isTwo).lastOption.map(_._2).getOrElse(-1)
      if (lastInputTwoIndex == -1) {
        println("Waiting for the lookup value before evicting")
        return
      }
      // evict everything except that element
      val iterator = elements.iterator()
      for (index <- 0 until size) {
        val cur = iterator.next()
        if (index != lastInputTwoIndex) {
          println(s"evicting ${cur.getValue.getOne}/${cur.getValue.getTwo}")
          iterator.remove()
        }
      }
    }
  })
  .apply((r, l) => (r, l))
  .print()
The evictor is applied after the window function (the join in this case) has run. It's not entirely clear how your use case should work when there are multiple entries in the second input, but for now the evictor only handles a single entry.
Whenever a new element comes into the window, the window function is immediately triggered (count = 1). The join is then evaluated over all elements with the same key. Afterwards, to avoid duplicate outputs, we remove all entries from the first input from the current window. Since the second input may arrive after the first inputs, no eviction is performed while the second input is empty. Note that my Scala is quite rusty; you will be able to write it in a much nicer way. The output of a run is:
Waiting for the lookup value before evicting
Waiting for the lookup value before evicting
Waiting for the lookup value before evicting
Waiting for the lookup value before evicting
4> ((1,myId-1),(3,myId-1,myValue-A))
4> ((5,myId-1),(3,myId-1,myValue-A))
4> ((9,myId-1),(3,myId-1,myValue-A))
evicting (1,myId-1)/null
evicting (5,myId-1)/null
evicting (9,myId-1)/null
A final remark: if the Table API already offers a concise way of doing what you want, I'd stick with it and then convert the result to a DataStream when needed.
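For instance, a rough sketch of that approach, reusing the question's Table API query (the names and the old toRetractStream bridge are assumed from the question):

// Sketch only: join in the Table API, then continue in the DataStream API.
val joined = tableEnv.sqlQuery(
  """
    |SELECT *
    | FROM stream1
    | JOIN stream2
    | ON stream1.id = stream2.id
  """.stripMargin)

// toRetractStream yields (isInsert, row): keep the inserts and carry on as a DataStream
val resultStream: DataStream[Row] = tableEnv
  .toRetractStream[Row](joined)
  .filter(_._1)
  .map(_._2)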

AppScript: 'number of columns in the data does not match the number of columns in the range.' setValues method not reading array correctly?

I'm trying to automate the collection of phone numbers from an API into a Google Sheet with app script. I can get the data and place it in an array with the following code:
const options = {
  method: 'GET',
  headers: {
    Authorization: 'Bearer XXXXXXXXXXXXXXX',
    Accept: 'Application/JSON',
  }
};
var serviceUrl = "dummyurl.com/?params";
var data = UrlFetchApp.fetch(serviceUrl, options);
if (data.getResponseCode() == 200) {
  var response = JSON.parse(data.getContentText());
  if (response !== null) {
    var keys = Object.keys(response.call).length;
    var phoneArray = [];
    for (i = 0; i < keys; i++) {
      phoneArray.push(response.call[i].caller.caller_id);
    }
This works as expected - it grabs yesterday's caller ID values from a particular marketing campaign from my API. Next, I want to import this data into a column in my spreadsheet. To do this, I use the setValues method like so:
    Logger.log(phoneArray);
    var arrayWrapper = [];
    arrayWrapper.push(phoneArray);
    Logger.log(arrayWrapper);
    for (i = 0; i < keys; i++) {
      var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
      var cell = sheet.getRange("A8");
      cell.setValues(arrayWrapper);
    }
  }
}
I am aware that I need my array length to equal the length of the selected range of cells in my sheet. However, I get conflicting errors depending on the length I set for my getRange method. If I set it to a single cell, as you see above, the error I get is:
The number of columns in the data does not match the number of columns in the range. The data has 8 but the range has 1.
However, if I set the length of my range to 8 (or any value except 1), I get the error:
The number of columns in the data does not match the number of columns in the range. The data has 1 but the range has 8.
As you see, the error swaps values. Now I have the appropriate number of columns in the range, but my script only finds 1 cell of data. When I check the log, I see that my 2D array looks normal in both cases - 8 phone numbers in an array wrapped in another array.
What is causing this error? I cannot find reference to similar errors on SO or elsewhere.
Also, please note that I'm aware this code is a little wonky (weird variables and two for loops where one would do). I've been troubleshooting this for a couple hours and was originally using setValue instead of setValues. While trying to debug it, things got split up and moved around a lot.
The dimension of your range is one row and several columns
If you push an array into another array, the dimension will be [[...],[...],[...]] - i.e. you have one column and multiple rows
What you want instead is one row and multiple columns: [[...,...,...]]
To achieve this you need to create a two-dimensional array and push all entries into the first row of your array: phoneArray[0]=[]; phoneArray[0].push(...);
Sample:
var phoneArray = [];
phoneArray[0] = [];
for (i = 0; i < keys; i++) {
  var phoneNumber = response.call[i].caller.caller_id;
  phoneNumber = phoneNumber.replace(/-/g, '');
  phoneArray[0].push(phoneNumber);
}
var range = sheet.getRange(1, 8, 1, keys);
range.setValues(phoneArray);
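As a general note (an editorial sketch, not part of the original answer): setValues() takes a 2-D array in which the outer array holds rows and each inner array holds one row's column values, so the two possible shapes map to ranges like this:

var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
// one row, three columns (A8:C8)
sheet.getRange(8, 1, 1, 3).setValues([['a', 'b', 'c']]);
// three rows, one column (A8:A10)
sheet.getRange(8, 1, 3, 1).setValues([['a'], ['b'], ['c']]);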
So I figured out how to make this work, though I can't speak to why the error is occurring, or rather why one receives reversed error messages depending on the getRange value.
Rather than pushing the whole list of values from the API into phoneArray, I structured my first for loop to reset phoneArray on each iteration and push a single-value array into my arrayWrapper, like so:
for (i = 0; i < keys; i++) {
  var phoneArray = [];
  var phoneNumber = response.call[i].caller.caller_id;
  phoneNumber = phoneNumber.replace(/-/g, '');
  phoneArray.push(phoneNumber);
  arrayWrapper.push(phoneArray);
}
Note that I also edited the formatting of the phone numbers to suit my needs, so I pulled each value into a variable to make replacing a character simple. What this new for loop results in is a 2D array like so:
[[1235556789],[0987776543],[0009872345]]
Rather than what I had before, which was like this:
[[1235556789,0987776543,0009872345]]
It would appear that this is how the setValues method wants its data structured, although the documentation suggests otherwise.
Regardless, if anyone were to run into similar issues, this is the gist of what must be done to fix it, or at least the method I found worked. I'm sure there are far more performant and elegant solutions than mine, but I will be dealing with dozens of rows of data, not thousands or millions. Performance isn't a big concern for me.
var correct = [[data],[data]]
is the data structure that is required for setValues(); therefore:
?.setValues(correct)

Saving users and items features to HDFS in Spark Collaborative filtering RDD

I want to extract the user and item features (latent factors) from the result of collaborative filtering using ALS in Spark. The code I have so far:
import org.apache.spark.mllib.recommendation.ALS
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel
import org.apache.spark.mllib.recommendation.Rating
// Load and parse the data
val data = sc.textFile("myhdfs/inputdirectory/als.data")
val ratings = data.map(_.split(',') match { case Array(user, item, rate) =>
  Rating(user.toInt, item.toInt, rate.toDouble)
})
// Build the recommendation model using ALS
val rank = 10
val numIterations = 10
val model = ALS.train(ratings, rank, numIterations, 0.01)
// extract users latent factors
val users = model.userFeatures
// extract items latent factors
val items = model.productFeatures
// save to HDFS
users.saveAsTextFile("myhdfs/outputdirectory/users") // does not work as expected
items.saveAsTextFile("myhdfs/outputdirectory/items") // does not work as expected
However, what gets written to HDFS is not what I expect. I expected each line to have a tuple (userId, Array_of_doubles). Instead I see the following:
[myname#host dir]$ hadoop fs -cat myhdfs/outputdirectory/users/*
(1,[D@3c3137b5)
(3,[D@505d9755)
(4,[D@241a409a)
(2,[D@c8c56dd)
.
.
It is dumping the default toString of each array (its type tag and hash code) instead of the array's contents. I did the following to print the desired values:
for (user <- users) {
  val (userId, lf) = user
  val str = "user:" + userId + "\t" + lf.mkString(" ")
  println(str)
}
This does print what I want but I can't then write to HDFS (this prints on the console).
What should I do to get the complete array written to HDFS properly?
Spark version is 1.2.1.
@JohnTitusJungao is right, and the following lines also work as expected:
users.saveAsTextFile("myhdfs/outputdirectory/users")
items.saveAsTextFile("myhdfs/outputdirectory/items")
And here is the reason: userFeatures returns an RDD[(Int, Array[Double])]. The arrays are rendered as the symbols you see in the output, e.g. [D@3c3137b5: [D stands for "array of double", followed by @ and a hex hash code, which is what the default Java toString produces for this type of object. More on that here.
val users: RDD[(Int, Array[Double])] = model.userFeatures
To solve that, you'll need to turn the array into a string:
val users: RDD[(Int, String)] = model.userFeatures.mapValues(_.mkString(","))
The same goes for items.
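Putting it together, a minimal sketch (it reuses the question's output paths; the tab-separated line format is an assumption):

// Sketch: render each (id, features) pair as "id<TAB>f1,f2,..." before saving.
val userLines = model.userFeatures
  .mapValues(_.mkString(","))            // RDD[(Int, String)]
  .map { case (id, lf) => s"$id\t$lf" }  // RDD[String]
userLines.saveAsTextFile("myhdfs/outputdirectory/users")

val itemLines = model.productFeatures
  .mapValues(_.mkString(","))
  .map { case (id, lf) => s"$id\t$lf" }
itemLines.saveAsTextFile("myhdfs/outputdirectory/items")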
