GAE/Go: datastore iterator too slow - google-app-engine

Iterating over datastore query results in GAE/Go is very slow.
q := datastore.NewQuery("MyStruct")
gaeLog.Infof(ctx, "run") // (1)
it := client.Run(ctx, q)
list := make([]MyStruct, 0, 10000)
gaeLog.Infof(ctx, "start mapping") // (2)
for {
var m MyStruct
_, err := it.Next(&m)
if err == iterator.Done {
break
}
if err != nil {
gaeLog.Errorf(ctx, "datastore read error : %s ", err.Error())
<some error handling>
break
}
list = append(list , m)
}
gaeLog.Infof(ctx, "end mapping. count : %d", len(list)) // (3)
The result is below.
18:02:11.283 run // (1)
18:02:11.291 start mapping // (2)
18:02:15.741 end mapping. count : 2400 // (3)
The loop between (2) and (3) takes about 4.5 seconds for only 2400 records, which is very slow.
How can I improve performance?
[Update]
I added the query definition, q := datastore.NewQuery("MyStruct"), to the code above.
I tried to retrieve all the entities in the kind MyStruct. This kind has 2400 entities.

I was using cloud.google.com/go/datastore and found it slow, so I migrated to google.golang.org/appengine/datastore.
The result is as follows, less than 1 second.
13:57:46.216 run
13:57:46.367 start mapping
13:57:47.063 end mapping. count : 2400


How can I get the maximum value of a specific day in Gorm?

I have written the following code to get the daily maximum of a certain value with GORM.
I pass the current time and get the day's start and end.
I select all values between the day's start and end.
I order the temperatures and get the first.
My Code:
func GetDailyMaxTemperature(ts time.Time) (*Temperature, error) {
	temp := &Temperature{}
	start, end := getStartAndEndOfDay(ts)
	// Order descending so that First returns the highest temperature of the day.
	if tx := db.Where("ts BETWEEN ? AND ?", start, end).Order("temperature DESC").First(temp); tx.Error != nil {
		return temp, tx.Error
	}
	return temp, nil
}
func getStartAndEndOfDay(ts time.Time) (time.Time, time.Time) {
	dayStart := time.Date(ts.Year(), ts.Month(), ts.Day(), 0, 0, 0, 0, ts.Location())
	// The nsec argument is in nanoseconds, so 999999999 is the last nanosecond of the day.
	dayEnd := time.Date(ts.Year(), ts.Month(), ts.Day(), 23, 59, 59, 999999999, ts.Location())
	return dayStart, dayEnd
}
This code works, but I am not very satisfied with it. I wonder whether there is a more "GORM-ish" way to get the maximum value of a column, especially when dates are involved.
You could use the max operation in SQL. Maybe not a more 'GORM' way to do this, but in my opinion a more semantically correct/appealing version:
var result float64
row := db.Table("temperatures").Where("ts BETWEEN ? AND ?", start, end).Select("max(temperature)").Row()
err := row.Scan(&result)
You could make this a method of the Temperature struct like this:
func (t Temperature) MaxInRange(db *gorm.DB, start, end time.Time) (float64, error) {
	var result float64
	row := db.Table("temperatures").Where("ts BETWEEN ? AND ?", start, end).Select("max(temperature)").Row()
	err := row.Scan(&result)
	return result, err
}

How to match two string arrays - 3 conditions?

I have a logical question here. I need to print a string in the format "user@email.com group@email.com" for each user whose business title and manager fit the business title, department and manager in the group CSV. If there's more than one matching group the string should be printed as above but with another group. What would be the best approach here? Create arrays from both CSV files then do an if in the loop?
Example
GroupCSV:
Group,Job title,Department,Manager (email address)
somesalesgroup@dundermifflin.com,Senior Sales Manager,Sales,michael.scott@dundermifflin.com
anothersalesgroup@dundermifflin.com,Senior Sales Manager,Sales,michael.scott@dundermifflin.com
UserCSV:
First name,Last name,Location,Start date,Job title,Department,Manager (email address)
Jim,Halpert,Scranton,7/1/2021,Senior Sales Manager,Sales,michael.scott@dundermifflin.com
Dwight,Schrute,Scranton,7/1/2021,Assistant to the Regional Manager,Sales,michael.scott@dundermifflin.com
I would like to have an output like:
[jim.halpert@dundermifflin@takeaway.com somesalesgroup@dundermifflin.com]
[jim.halpert@dundermifflin@takeaway.com anothersalesgroup@dundermifflin.com]
At the moment I've got this
var match [][]string
for _, u := range userRows {
for _, g := range groupRows {
if u[0] == g[0] {
match = append(match, string{g, u})
}
}
}
But I'm not sure what may be wrong here (string{g, u})
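Resolved this way (the composite literal needs the slice type []string, and its elements must be individual strings rather than whole rows):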
var match [][]string
for _, u := range userRows {
for _, g := range groupRows {
if u[6] == g[1] && u[2] == g[0] {
match = append(match, []string{u[5], g[2]})
}
}
}
Another approach is to load GroupCSV into a map and then match each row of UserCSV against that map.
If reading each CSV file counts as one "for" loop, that makes two sequential loops in total instead of a nested pair.
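The map-based approach could look roughly like the sketch below. It indexes the groups by job title, department and manager, then looks each user up once. The column positions follow the sample CSV files in the question. The names key, matchGroups and readRows are hypothetical, and building the user address as first.last plus a fixed domain is an assumption, since the user CSV has no email column.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

const groupCSV = `Group,Job title,Department,Manager (email address)
somesalesgroup@dundermifflin.com,Senior Sales Manager,Sales,michael.scott@dundermifflin.com
anothersalesgroup@dundermifflin.com,Senior Sales Manager,Sales,michael.scott@dundermifflin.com
`

const userCSV = `First name,Last name,Location,Start date,Job title,Department,Manager (email address)
Jim,Halpert,Scranton,7/1/2021,Senior Sales Manager,Sales,michael.scott@dundermifflin.com
Dwight,Schrute,Scranton,7/1/2021,Assistant to the Regional Manager,Sales,michael.scott@dundermifflin.com
`

// userDomain is an assumption made for this example only.
const userDomain = "dundermifflin.com"

// key joins the three fields that must match into a single map key.
func key(title, dept, manager string) string {
	return strings.Join([]string{title, dept, manager}, "\x00")
}

// readRows parses CSV text and drops the header row.
func readRows(data string) [][]string {
	rows, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		panic(err)
	}
	return rows[1:]
}

// matchGroups returns one [user email, group] pair per matching group.
func matchGroups(groupRows, userRows [][]string) [][]string {
	// First loop: index every group by (job title, department, manager).
	groups := make(map[string][]string)
	for _, g := range groupRows {
		k := key(g[1], g[2], g[3])
		groups[k] = append(groups[k], g[0])
	}
	// Second loop: look each user up; only matching groups are visited.
	var match [][]string
	for _, u := range userRows {
		email := strings.ToLower(u[0]+"."+u[1]) + "@" + userDomain
		for _, group := range groups[key(u[4], u[5], u[6])] {
			match = append(match, []string{email, group})
		}
	}
	return match
}

func main() {
	for _, m := range matchGroups(readRows(groupCSV), readRows(userCSV)) {
		fmt.Println(m)
	}
}
```

Compared with the nested loops, the work grows with the number of users plus the number of groups rather than with their product, which only matters once the files become large.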

How to wait 1 second using Structured Text?

I am currently writing a program in which a connected light should flash on and off every second once a variable reaches a certain point. I know the light is properly hooked up, and I know that the logic to alternate between on and off works, because it toggled multiple times a second. I tried adding a wait timer to slow the flashing down.
Here is the chunk of code I am trying to add:
VAR
delay : TON;
Count : INT := 0;
END_VAR
delay(IN := TRUE, PT:= T#5S);
IF NOT (delay.Q) THEN
RETURN;
END_IF;
delay(IN := FALSE);
When I add it to my code, I get the error invalid time constant.
I'm not sure if it matters too much, but I am using Schneider Electric's EcoStruxure Machine Expert to write and execute my code.
For those that wish to see the entire program, if it would help, here it is:
IF (change < 70) THEN
Light13 := FALSE;
END_IF;
IF (change >= 70) AND (change <= 90) THEN
Light13 := TRUE;
END_IF;
IF (change > 90) THEN
WHILE change > 90 DO
IF (index MOD 2 = 0) THEN
Light13 := TRUE;
END_IF;
IF (index MOD 2 <> 0) THEN
Light13 := FALSE;
END_IF;
delay(IN := TRUE, PT:= T#5s);
IF NOT (delay.Q) THEN
RETURN;
END_IF;
delay(IN := FALSE);
index := index + 1;
END_WHILE;
END_IF;
To avoid this being treated as a duplicate of the question Timers in PLC - Structured Text, I will reiterate that I get an error when using that method.
I am not at all set on using this way if there is a better option. Thanks for the help!
Schneider Electric's EcoStruxure Machine Expert is CoDeSys based. So you have a few options.
Use BLINK in Util library
Open library manager, search for BLINK and double click it. Now you have blink block available. Use it like this.
VAR
fbBlink: BLINK;
END_VAR
fbBlink(ENABLE := TRUE, TIMELOW := T#1s, TIMEHIGH := T#300ms, OUT => bSignal);
The advantage of this method is that you can set different times for the LOW and HIGH states of your light and use different signals. For instance, a short blink once every 2 seconds for error 1 and a short blink every half second for error 2.
Create your own BLINK function, as suggested by @Filippo.
If you want to flash your light on and off each second you can use this code:
Declaration part:
FUNCTION_BLOCK FB_Flash
VAR_INPUT
tFlashTime : TIME;
END_VAR
VAR_OUTPUT
bSignal : BOOL;
END_VAR
VAR
fbTonDelay : TON;
END_VAR
Implementation part:
fbTonDelay(IN := NOT fbTonDelay.q, PT:= tFlashTime);
IF fbTonDelay.Q
THEN
bSignal := NOT bSignal;
END_IF
You can call it like this:
fbFlash(tFlashTime := T#1S, bSignal => bFlashLight);
Where bFlashLight is your hardware output.
Now if you want the light to flash when a special condition is fulfilled, you can do it like this:
IF bSpecialCondition
THEN
fbFlash(tFlashTime := T#1S, bSignal => bFlashLight);
ELSE
bFlashLight := FALSE;
END_IF
Try to reach your goals with maximum simplicity and clarity.

List of valid operations in Tensorflow

I am using (learning) Tensorflow through the eager C API, or to be even more precise through a FreePascal wrapper around it.
When I want to do e.g. a matrix multiplication, I call
TFE_Execute(Op, @OutTensH, @NumOutVals, Status);
where Op.Op_Name is 'MatMul'. I have a couple of other instructions figured out, e.g. 'Transpose', 'Softmax', 'Inv', etc., but I do not have a complete list. In particular I want to get the determinant of a matrix, but cannot find it (assume it exists). I tried to find it on the web, as well as in the source on GitHub, but no success.
In Python there is tf.linalg.det, but already in C++ API I do not find it.
Could someone direct me to a place where I can find a complete list of supported operations?
Can someone tell me how to calculate the determinant with Tensorflow?
Edit: On Gaurav's request I attach a small program. As said above, it is in Pascal, and calls the C API through a wrapper. I therefore copied also the relevant part of the wrapper here (full version: https://macpgmr.github.io/). The set-up works, the "only" question is that I do not find a list of supported operations.
// A minimal program to transpose a matrix
program test;
uses
SysUtils,
TF;
var
Tensor:TTensor;
begin
Tensor:=TTensor.CreateSingle([2,1],[1.0,2.0]);
writeln('Before transpose ',Tensor.Dim[0],' x ',Tensor.Dim[1]); // 2 x 1
Tensor:=Tensor.Temp.ExecOp('Transpose',TTensor.CreateInt32([1,0]).Temp);
writeln('After transpose ',Tensor.Dim[0],' x ',Tensor.Dim[1]); // 1 x 2
FreeAndNil(Tensor);
end.
// extract from TF.pas ( (C) Phil Hess ). It basically re-packages the operation
// and calls the relevant C TFE_Execute, with the same operation name passed on:
// in our case 'Transpose'.
// I am looking for a complete list of supported operations.
function TTensor.ExecOp(const OpName : string;
Tensor2 : TTensor = nil;
Tensor3 : TTensor = nil;
Tensor4 : TTensor = nil) : TTensor;
var
Status : TF_StatusPtr;
Op : TFE_OpPtr;
NumOutVals : cint;
OutTensH : TFE_TensorHandlePtr;
begin
Result := nil;
Status := TF_NewStatus();
Op := TFE_NewOp(Context, PAnsiChar(OpName), Status);
try
if not CheckStatus(Status) then
Exit;
{Add operation input tensors}
TFE_OpAddInput(Op, TensorH, Status);
if not CheckStatus(Status) then
Exit;
if Assigned(Tensor2) then {Operation has 2nd tensor input?}
begin
TFE_OpAddInput(Op, Tensor2.TensorH, Status);
if not CheckStatus(Status) then
Exit;
end;
if Assigned(Tensor3) then {Operation has 3rd tensor input?}
begin
TFE_OpAddInput(Op, Tensor3.TensorH, Status);
if not CheckStatus(Status) then
Exit;
end;
if Assigned(Tensor4) then {Operation has 4th tensor input?}
begin
TFE_OpAddInput(Op, Tensor4.TensorH, Status);
if not CheckStatus(Status) then
Exit;
end;
{Set operation attributes}
TFE_OpSetAttrType(Op, 'T', DataType); //typically result type same as input's
if OpName = 'MatMul' then
begin
TFE_OpSetAttrBool(Op, 'transpose_a', #0); //default (False)
TFE_OpSetAttrBool(Op, 'transpose_b', #0); //default (False)
end
else if OpName = 'Transpose' then
TFE_OpSetAttrType(Op, 'Tperm', Tensor2.DataType) //permutations type
else if OpName = 'Sum' then
begin
TFE_OpSetAttrType(Op, 'Tidx', Tensor2.DataType); //reduction_indices type
TFE_OpSetAttrBool(Op, 'keep_dims', #0); //default (False)
end
else if (OpName = 'RandomUniform') or (OpName = 'RandomStandardNormal') then
begin
TFE_OpSetAttrInt(Op, 'seed', 0); //default
TFE_OpSetAttrInt(Op, 'seed2', 0); //default
TFE_OpSetAttrType(Op, 'dtype', TF_FLOAT); //for now, use this as result type
end
else if OpName = 'OneHot' then
begin
TFE_OpSetAttrType(Op, 'T', Tensor3.DataType); //result type must be same as on/off
TFE_OpSetAttrInt(Op, 'axis', -1); //default
TFE_OpSetAttrType(Op, 'TI', DataType); //indices type
end;
NumOutVals := 1;
try
// **** THIS IS THE ACTUAL CALL TO THE C API, WHERE Op HAS THE OPNAME
TFE_Execute(Op, @OutTensH, @NumOutVals, Status);
// ***********************************************************************
except on e:Exception do
raise Exception.Create('TensorFlow unable to execute ' + OpName +
' operation: ' + e.Message);
end;
if not CheckStatus(Status) then
Exit;
Result := TTensor.CreateWithHandle(OutTensH);
finally
if Assigned(Op) then
TFE_DeleteOp(Op);
TF_DeleteStatus(Status);
{Even if exception occurred, don't want to leave any temps dangling}
if Assigned(Tensor2) and Tensor2.IsTemp then
Tensor2.Free;
if Assigned(Tensor3) and Tensor3.IsTemp then
Tensor3.Free;
if Assigned(Tensor4) and Tensor4.IsTemp then
Tensor4.Free;
if IsTemp then
Free;
end;
end;
In the meantime, I found the description file of TensorFlow at: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/ops.pbtxt. This includes all the operations and their detailed specification.
If someone is interested in a pascal interface to TF, I created one at https://github.com/zsoltszakaly/tensorflowforpascal.
Here is how you can obtain the list of valid operation names from Python:
from tensorflow.python.framework.ops import op_def_registry
registered_ops = op_def_registry.get_registered_ops()
valid_op_names = sorted(registered_ops.keys())
print(len(valid_op_names)) # Number of operation names in TensorFlow 2.0
# 1223
print(*valid_op_names, sep='\n')
# Abort
# Abs
# AccumulateNV2
# AccumulatorApplyGradient
# AccumulatorNumAccumulated
# AccumulatorSetGlobalStep
# AccumulatorTakeGradient
# Acos
# Acosh
# Add
# ...

Golang (w/gocql driver) not returning all entries in Cassandra DB

I have what appears to be a strange bug in either the gocql driver for Cassandra, or in the Cassandra database itself.
I am trying to do a simple write and then a read-all request in two separate functions. I would expect to get all entries back from the read-all request, but I am only getting the last entry in Cassandra.
Here is how I am doing the write:
util.CassSession, _ = util.CassCluster.CreateSession()
defer util.CassSession.Close()
keySpaceMeta, _ := util.CassSession.KeyspaceMetadata("platypus")
valC, exists := keySpaceMeta.Tables["cassmessage"]
if exists==true {
fmt.Println("cassmessage exists!!!")
}else{
fmt.Println("cassmessage doesnt exist!")
}
if valC!=nil{
fmt.Println("return from valC cassmessage: ", valC)
}
insertString:=`INSERT INTO cassmessage
(messagefrom, messageto, messagecontent)
VALUES('`+sendMsgReq.MessageFrom+`', '`
+sendMsgReq.MessageTo+`', '`+sendMsgReq.MessageContent+`')`
fmt.Println("insertString value: ", insertString)
err := util.CassSession.Query(insertString).Exec()
if err != nil {
fmt.Println("there was an error in appending data to cassmessage: ", err)
} else {
fmt.Println("inserted data into cassmessage successfully")
}
the terminal output from the above:
app_1 | [17:59:43][WEBSERVER] : cassmessage exists!!!
app_1 | [17:59:43][WEBSERVER] : return from valC cassmessage:
&{platypus cassmessage [] []
[0xc000400140] [] map[messagefrom:0xc0004000a0
messageto:0xc000400140 messagecontent:0xc000400000]
[messagecontent messagefrom messageto]}
app_1 | [17:59:43][WEBSERVER] : inserted data into cassmessage successfully
I am not entirely sure what the output of valC is returning, although it appears to be some sort of memory address which is a good sign. I also see that I am not getting any error on the write exec function which is hopeful.
Here is how I am doing the read:
util.CassSession, _ = util.CassCluster.CreateSession()
defer util.CassSession.Close()
keySpaceMeta, _ := util.CassSession.KeyspaceMetadata("platypus")
valC, exists := keySpaceMeta.Tables["cassmessage"]
queryString := `SELECT messageto, messagecontent, messagefrom FROM cassmessage WHERE messagefrom='`+mailReq.Email+`'`
//returns nothing, should return many rows
queryString2 := `SELECT messageto, messagecontent, messagefrom FROM cassmessage`
//returns only last entry, should return many rows
queryString3 := `SELECT * FROM cassmessage WHERE messagefrom='`+mailReq.Email+`'`
//returns nothing, should return many rows
queryAllString := `SELECT * FROM cassmessage`
//returns only last entry, should return many rows
var messageto string
var messagecontent string
var messagefrom string
iter := util.CassSession.Query(queryAllString).Iter()
for iter.Scan(&messageto, &messagecontent, &messagefrom) {
fmt.Println("Iter messageto: %v", messageto)
fmt.Println("Iter messagecontent: %v", messagecontent)
fmt.Println("Iter messagefrom: %v", messagefrom)
}
the terminal output from above:
app_1 | [18:09:54][WEBSERVER] : Iter messageto: %v xyz#xyz.com
app_1 | [18:09:54][WEBSERVER] : Iter messagecontent: %v a
app_1 | [18:09:54][WEBSERVER] : Iter messagefrom: %v abc#abc.com
This is not what I expect, as this is the output from the read, after multiple writes to the database. If you look at comments on the various queryString values I have tried 2 of them return nothing when I expect all entries to be returned, and 2 of them only return the last write entry (they are all symmetric queries to my knowledge).
Does anyone know why I cannot return multiple entries using Iter or why my four different values on the different query strings I have tried are returning different results?
Thank you.
I maybe shouldn't, but I'm going to keep this here in case someone else runs into the same problem. I wasn't making sure that my primary key in my table was unique. Doing something like this:
util.CassSession.Query("CREATE TABLE cassmessage(" +
"messageto text, messagefrom text, messagecontent text, uniqueID text, PRIMARY KEY (uniqueID))").Exec()
Managed to fix the issue.
Thanks to everyone who took a look and helped. Cheers!
