Subset with loop over an array of strings

I have my code like this:
for (i in 1:b) {
carteraR[[i]]=subset(carteraR[[i]],RUN.FONDO=="8026" | RUN.FONDO=="8036" | RUN.FONDO=="8048" | RUN.FONDO=="8057" | RUN.FONDO=="8059" | RUN.FONDO=="8072" | RUN.FONDO=="8094" |
RUN.FONDO=="8107" | RUN.FONDO=="8110" | RUN.FONDO=="8115" | RUN.FONDO=="8130" | RUN.FONDO=="8230" | RUN.FONDO=="8248" | RUN.FONDO=="8257" | RUN.FONDO=="8319")
}
Where b = length(carteraR) and each carteraR[[i]] is a data.frame. RUN.FONDO is one of the column names in these data frames. This code works fine, but I want to save some lines.
What I want is something like:
for (i in 1:b) {
for (j in 1:length(A)){
carteraR[[i]]=subset(carteraR[[i]],RUN.FONDO==A[j])
}
}
And where A = c("8026", "8036", "8048", "8057", ..., "8319"), etc.
What should the code look like?
Thanks

Like this:
carteraR <- lapply(carteraR, subset, RUN.FONDO %in% A)
Just be aware there can be risks with using subset programmatically: see "Why is `[` better than `subset`?". This usage is fine, though.
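(Note that the nested loop in the question would not do what you want anyway: after the first value of j, each successive subset keeps only rows equal to a different single value, so the result ends up empty.)
A minimal sketch, assuming A holds the fund IDs from the question as character strings:
A <- c("8026", "8036", "8048", "8057", "8059", "8072", "8094",
       "8107", "8110", "8115", "8130", "8230", "8248", "8257", "8319")
carteraR <- lapply(carteraR, subset, RUN.FONDO %in% A)
# base-R equivalent that avoids subset entirely:
carteraR <- lapply(carteraR, function(d) d[d$RUN.FONDO %in% A, ])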

Related

Solana web3 Program constructor is expecting a json as one of params

I am working with TypeScript and React and I need help with how to fix this issue. My Program constructor expects one of its arguments to be of the Idl type, which is basically a JSON generated from Solana. How do I fix this?
Yeah, there is a weird thing with TypeScript and the IdlType on args if you look into the IDL object representation.
It is related to this line:
export declare type IdlType = "bool" | "u8" | "i8" | "u16" | "i16" | "u32" | "i32" | "f32" | "u64" | "i64" | "f64" | "u128" | "i128" | "bytes" | "string" | "publicKey" | IdlTypeDefined | IdlTypeOption | IdlTypeCOption | IdlTypeVec | IdlTypeArray;
The way I fixed it is by using a workaround:
import YOUR_IDL_JSON_OBJECT from '../config/abiSolana/solanaIDL.json'

// Round-trip the imported JSON through stringify/parse: JSON.parse returns `any`,
// so TypeScript no longer checks the object against the IdlType union
const a = JSON.stringify(YOUR_IDL_JSON_OBJECT)
const b = JSON.parse(a)
return new Program(b, address, provider)
When you do this the compiler should not scream at you. If someone cares to explain what the hell is wrong with the enum there, I would be happy. :)
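A shorter alternative (my sketch, not from the original answer; it assumes the Idl type exported by @project-serum/anchor, and the cast simply tells the compiler to trust the JSON):
import { Program, Idl } from '@project-serum/anchor'
import idlJson from '../config/abiSolana/solanaIDL.json'

// cast the imported JSON to Idl (use `as unknown as Idl` if the compiler still objects)
const program = new Program(idlJson as Idl, address, provider)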

Calculate total number of orders in different statuses

I want to create a simple dashboard that shows the number of orders in different statuses. The statuses can be New/Cancelled/Finished/etc.
Where should I implement these criteria? If I add a filter in the Cube Browser, it applies to the whole dashboard. Should I do that in a KPI? Or should I add a calculated column with 1/0 values?
My expected output is something like:
--------------------------------------
| Total | New | Finished | Cancelled |
--------------------------------------
| 1000  | 100 | 800      | 100       |
--------------------------------------
I'd use measures for that, something like:
CountTotal = COUNT('Orders'[OrderID])
CountNew = CALCULATE(COUNT('Orders'[OrderID]), 'Orders'[Status] = "New")
CountFinished = CALCULATE(COUNT('Orders'[OrderID]), 'Orders'[Status] = "Finished")
CountCancelled = CALCULATE(COUNT('Orders'[OrderID]), 'Orders'[Status] = "Cancelled")
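Because these are measures rather than calculated columns, they recalculate under whatever filters are applied in the report, and all four can sit side by side in a single table to produce the layout above. An equivalent formulation (a sketch, assuming OrderID is never blank) uses COUNTROWS instead of COUNT:
CountTotal = COUNTROWS('Orders')
CountNew = CALCULATE(COUNTROWS('Orders'), 'Orders'[Status] = "New")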

Join importxml & importhtml by braces

This formula is in A1:
=
{
{"LINK DA FOTO","LINK DO PERFIL"}
;
{"LINK DA FOTO","LINK DO PERFIL"}
;
ArrayFormula(SEERRO(PROCH(1,{1;IMPORTXML('Time Casa'!B12,"//table[@class='table squad sortable']//td[@class='photo']/a/img/@src | //table[@class='table squad sortable']//td[@class='name large-link']/a/@href")},(LIN($A$1:$A$52)+1)*2-TRANSPOR(sort(LIN($A$1:$A$2)+0,1,0)))))
}
This formula is in C1:
=
{TRANSPOR(IMPORTXML(
'Time Casa'!B12,"
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th/img/@title |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[6]/span |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[5] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[3] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[4] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[1] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[2]
"))
;
IMPORTHTML('Time Casa'!B12,"table","1")
}
Result:
Combining them like this in Sheet2 works perfectly; the result is exactly the one in the image above.
=
{
Sheet3!A:B
,
Sheet3!C:S
}
But when I join them with the single formula below, it gives an error:
Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 54. Actual: 39.
=
{
{
{"LINK DA FOTO","LINK DO PERFIL"}
;
{"LINK DA FOTO","LINK DO PERFIL"}
;
ArrayFormula(SEERRO(PROCH(1,{1;IMPORTXML('Time Casa'!B12,"//table[@class='table squad sortable']//td[@class='photo']/a/img/@src | //table[@class='table squad sortable']//td[@class='name large-link']/a/@href")},(LIN($A$1:$A$52)+1)*2-TRANSPOR(sort(LIN($A$1:$A$2)+0,1,0)))))
}
,
{TRANSPOR(IMPORTXML(
'Time Casa'!B12,"
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th/img/@title |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[6]/span |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[5] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[3] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[4] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[1] |
//*[@id='page_team_1_block_team_squad_3-table']/thead/tr/th[2]
"))
;
IMPORTHTML('Time Casa'!B12,"table","1")
}
}
I would like to know what I need to adjust to make it work. I tried using =FILTER(X,X<>"") but got the same error.
Link to Spreadsheet:
https://docs.google.com/spreadsheets/d/1DNhl5hf5ofST84nawfBF6kuzhn83UMldh1lS20VTJpA/edit?usp=sharing
try:
={{{"LINK DA FOTO", "LINK DO PERFIL"};
{"LINK DA FOTO", "LINK DO PERFIL"};
ARRAYFORMULA(QUERY(IFERROR(HLOOKUP(1, {1; IMPORTXML('Time Casa'!B12,
"//table[#class='table squad sortable']//td[#class='photo']/a/img/#src |
//table[#class='table squad sortable']//td[#class='name large-link']/a/#href")},
(ROW($A$1:$A$52)+1)*2-TRANSPOSE(SORT(ROW($A$1:$A$2)+0, 1, 0)))),
"where Col1 is not null"))},
{TRANSPOSE(IMPORTXML('Time Casa'!B12,
"//*[#id='page_team_1_block_team_squad_3-table']/thead/tr/th/img/#title |
//*[#id='page_team_1_block_team_squad_3-table']/thead/tr/th[6]/span |
//*[#id='page_team_1_block_team_squad_3-table']/thead/tr/th[5] |
//*[#id='page_team_1_block_team_squad_3-table']/thead/tr/th[3] |
//*[#id='page_team_1_block_team_squad_3-table']/thead/tr/th[4] |
//*[#id='page_team_1_block_team_squad_3-table']/thead/tr/th[1] |
//*[#id='page_team_1_block_team_squad_3-table']/thead/tr/th[2]"));
IMPORTHTML('Time Casa'!B12, "table", "1")}}
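What appears to fix the row-count mismatch is the QUERY(..., "where Col1 is not null") wrapper around the HLOOKUP block: it drops the blank rows produced by the fixed 52-row range, so the left block ends up with the same number of rows as the IMPORTHTML block and the two pieces can be joined side by side.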

Scala: Delete empty array values from a Spark DataFrame

I'm a new learner of Scala. Now given a DataFrame named df as follows:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [null]| [0.0]| [0.0]| [null]|
| [IND1]| [5.0]| [6.0]| [A]|
| [IND2]| [7.0]| [8.0]| [B]|
| []| []| []| []|
+-------+-------+-------+-------+
I'd like to delete rows where all columns are empty arrays (the 4th row).
For example, I expect the result to be:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [null]| [0.0]| [0.0]| [null]|
| [IND1]| [5.0]| [6.0]| [A]|
| [IND2]| [7.0]| [8.0]| [B]|
+-------+-------+-------+-------+
I'm trying to use isNotNull, like:
val temp = df.filter(col("Column1").isNotNull && col("Column2").isNotNull && col("Column3").isNotNull && col("Column4").isNotNull).show()
but it still shows all rows.
I found a Python solution that uses a Hive UDF (from a link), but I had a hard time converting it to valid Scala code. I would like to use a Scala command similar to the following:
val query = "SELECT * FROM targetDf WHERE {0}".format(" AND ".join("SIZE({0}) > 0".format(c) for c in ["Column1", "Column2", "Column3","Column4"]))
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext.sql(query)
Any help would be appreciated. Thank you.
Using isNotNull or isNull will not work because they look for null values in the DataFrame. Your example DF does not contain null values but empty arrays; there is a difference.
One option: you could create new columns holding the length of each array and filter out rows where any of those lengths is zero.
import org.apache.spark.sql.functions.size
import spark.implicits._  // for the $"column" syntax

val dfFil = df
  .withColumn("arrayLengthColOne", size($"Column1"))
  .withColumn("arrayLengthColTwo", size($"Column2"))
  .withColumn("arrayLengthColThree", size($"Column3"))
  .withColumn("arrayLengthColFour", size($"Column4"))
  .filter($"arrayLengthColOne" =!= 0 && $"arrayLengthColTwo" =!= 0 &&
          $"arrayLengthColThree" =!= 0 && $"arrayLengthColFour" =!= 0)
  .drop("arrayLengthColOne", "arrayLengthColTwo", "arrayLengthColThree", "arrayLengthColFour")
Original DF:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [A]| [B]| [C]| [d]|
| []| []| []| []|
+-------+-------+-------+-------+
New DF:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [A]| [B]| [C]| [d]|
+-------+-------+-------+-------+
You could also create a function that will map across all the columns and do it.
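A minimal sketch of that idea (my code, not from the original answer; it assumes a SparkSession named spark and that every listed column is an array column):
import org.apache.spark.sql.functions.{col, size}

val cols = Seq("Column1", "Column2", "Column3", "Column4")

// keep rows where every array column is non-empty (same behaviour as the answer above);
// swap && for || to keep rows where at least one column is non-empty, which matches the
// question's "drop only when all columns are empty" criterion exactly
val allNonEmpty = cols.map(c => size(col(c)) > 0).reduce(_ && _)
val dfFil2 = df.filter(allNonEmpty)

// or build the SQL string the question was aiming for:
val whereClause = cols.map(c => s"SIZE($c) > 0").mkString(" AND ")
df.createOrReplaceTempView("targetDf")
val result = spark.sql(s"SELECT * FROM targetDf WHERE $whereClause")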
Another approach (in addition to accepted answer) would be using Datasets.
For example, by having a case class:
case class MyClass(col1: Seq[String],
                   col2: Seq[Double],
                   col3: Seq[Double],
                   col4: Seq[String]) {
  def isEmpty: Boolean = ...
}
You can represent your source as a typed structure:
import spark.implicits._ // needed to provide an implicit encoder/data mapper
val originalSource: DataFrame = ... // provide your source
val source: Dataset[MyClass] = originalSource.as[MyClass] // convert/map it to Dataset
So you could do the filtering like this:
source.filter(element => !element.isEmpty) // calling class's instance method
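One possible isEmpty, as an illustration (my assumption, matching the question's criterion that a row is dropped only when all four arrays are empty):
def isEmpty: Boolean = col1.isEmpty && col2.isEmpty && col3.isEmpty && col4.isEmpty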

Count unique rows of column in Array of object

Using
Ruby 1.9.3-p194
Rails 3.2.8
Here's what I need.
Count the total number of assignments (assignment_id) and divide it by the number of distinct human resources (human_resource_id).
So, the answer for the dummy-data as given below should be:
1.5 assignments per human resource
But I just don't know where to go anymore.
Here's what I tried:
Table name: Assignments
id | human_resource_id | assignment_id | assignment_start_date | assignment_expected_end_date
80101780 | 20200132 | 80101780 | 2012-10-25 | 2012-10-31
80101300 | 20200132 | 80101300 | 2012-07-07 | 2012-07-31
80101308 | 21100066 | 80101308 | 2012-07-09 | 2012-07-17
First I need to select the period to look at; this is always at most a year back.
a = Assignment.find(:all, :conditions => { :assignment_expected_end_date => (DateTime.now - 1.year)..DateTime.now })
=> [
#<Assignment id: 80101780, human_resource_id: "20200132", assignment_id: "80101780", assignment_start_date: "2012-10-25", assignment_expected_end_date: "2012-10-31">,
#<Assignment id: 80101300, human_resource_id: "20200132", assignment_id: "80101300", assignment_start_date: "2012-07-07", assignment_expected_end_date: "2012-07-31">,
#<Assignment id: 80101308, human_resource_id: "21100066", assignment_id: "80101308", assignment_start_date: "2012-07-09", assignment_expected_end_date: "2012-07-17">
]
foo = a.group_by(&:human_resource_id)
Now I've got a beautiful Hash of Arrays of objects, and I just don't know what to do next.
Can someone help me?
You can try to execute the calculation directly in SQL:
ActiveRecord::Base.connection.select_value('SELECT count(distinct assignment_id) * 1.0 / count(distinct human_resource_id) AS ratio FROM assignments')
The * 1.0 keeps the result from being truncated by integer division.
You could do something like:
human_resource_count = assignments.collect { |a| a.human_resource_id }.uniq.count
assignment_count = assignments.collect { |a| a.assignment_id }.uniq.count
result = assignment_count.to_f / human_resource_count  # => 1.5 for the sample data
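If you would rather push the distinct counts into the database, here is a sketch using the Rails 3.2 calculation API (the :distinct option is an assumption on my part; verify it against your Rails version):
range = (DateTime.now - 1.year)..DateTime.now
scope = Assignment.where(:assignment_expected_end_date => range)
assignment_count = scope.count(:assignment_id, :distinct => true)
resource_count = scope.count(:human_resource_id, :distinct => true)
ratio = assignment_count.to_f / resource_count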
