data set with 1 array & 2 nested objects yield <0 rows> (or 0-length row.names) error with tidyjson - arrays

I'm working with an AOL data set that I have passed through prettify(). The types and lengths of the data are:
> json_types(People)
document.id type
1 , 1 array
> json_lengths(People)
document.id length
1 , 1, 4
A glimpse of the data after it has gone through prettify():
{
"distinct_id": "159d26d852bc2-0218a9eedf5d02-1d326f50-13c680-159d26d852c2cc",
"time": 1485294450309,
"properties": {
"$browser": "Chrome",
"$browser_version": 55,
"$city": "San Francisco",
"$country_code": "US",
"$email": "amir.movafaghi#mixpanel.com",
"$initial_referrer": "$direct",
"$initial_referring_domain": "$direct",
"$name": "Amir MOvafaghi",
"$os": "Mac OS X",
"$region": "California",
"$timezone": "America/Los_Angeles",
"$transactions": [
{
"$amount": 0.99,
"$time": "2017-01-24T13:43:30.000Z"
}
],
"Favorite Genre": "Rock",
"Lifetime Song Play Count": 1,
"Lifetime Song Purchase Count": 1,
"Plan": "Premium"
},
"last_seen": 1485294450309,
"labels": [
]
},
I set up my transformation as such:
people_b <- People %>%
gather_array %>% # stack the user data
spread_values(
distinct_id = jstring("distinct_id"),
time_id = jnumber("time"),
last_seen = jstring("last_seen"),
label = jstring("label")) %>% # extract user data
enter_object("properties") %>% # stack the properties
spread_values(
browser = jstring("$browser"),
browser_version = jnumber("$browser_version"),
city = jstring("$city"),
country_code = jstring("$country_code"),
email = jstring("$email"),
initial_referrer = jstring("$initial_referrer"),
initial_referring_domain = jstring("$initial_referring_domain"),
name = jstring("$name"),
operating_system = jstring("$os"),
region = jstring("$region"),
timezone = jstring("$timezone"),
favorite_genre = jstring("Favorite Genre"),
first_login_date = jstring("First Login Date"),
lifetime_song_play_count = jnumber("Lifetime Song Play Count"),
lifetime_song_purchase_count = jnumber("Lifetime Song Purchase Count"),
plan = jstring("Plan")) %>% #extract the properties)
enter_object("transactions") %>% #stack the transactions
gather_array %>%
spread_values(
amount = jnumber("$amount"),
transaction_time = jstring("$time")) %>% # extract the transactions
select(distinct_id, time_id, last_seen, label, browser, browser_version, city, country_code, email, initial_referrer,
initial_referring_domain, name, operating_system, region, timezone, favorite_genre,
first_login_date,lifetime_song_play_count, lifetime_song_purchase_count, plan, amount, transaction_time)
However I receive an error code:
> people_b
[1] distinct_id time_id last_seen label
[5] browser browser_version city country_code
[9] email initial_referrer initial_referring_domain name
[13] operating_system region timezone favorite_genre
[17] first_login_date lifetime_song_play_count lifetime_song_purchase_count plan
[21] amount transaction_time
<0 rows> (or 0-length row.names)
sample output from a second data set (that I still need to tidy):
> event_b
name distinct_id label time sampling_factor browser_type
1 Page Loaded 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome
2 Page Loaded 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome
3 Sign Up 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome
4 Page Loaded 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome
5 Song Played 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome
6 Song Played 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome
7 Song Purchased 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome
8 Plan Downgraded 159f0ddf9c437c-0b4d95a6f3b9be-123a6850-13c680-159f0ddf9c525a list() 1.485776e+12 1 Chrome

It looks to me like your issue is in the enter_object('transactions') component of your pipeline. In your JSON object, you have the key $transactions, so you are using the wrong path. Changing to '$transactions' seemed to work.
...
enter_object("$transactions") %>% #stack the transactions
...
And the full example. Note that I removed gather_array since your example is only a single object.
json <- '{
"distinct_id": "159d26d852bc2-0218a9eedf5d02-1d326f50-13c680-159d26d852c2cc",
"time": 1485294450309,
"properties": {
"$browser": "Chrome",
"$browser_version": 55,
"$city": "San Francisco",
"$country_code": "US",
"$email": "amir.movafaghi#mixpanel.com",
"$initial_referrer": "$direct",
"$initial_referring_domain": "$direct",
"$name": "Amir MOvafaghi",
"$os": "Mac OS X",
"$region": "California",
"$timezone": "America/Los_Angeles",
"$transactions": [
{
"$amount": 0.99,
"$time": "2017-01-24T13:43:30.000Z"
}
],
"Favorite Genre": "Rock",
"Lifetime Song Play Count": 1,
"Lifetime Song Purchase Count": 1,
"Plan": "Premium"
},
"last_seen": 1485294450309,
"labels": [
]
}'
people_b <- json %>%
spread_values(
distinct_id = jstring("distinct_id"),
time_id = jnumber("time"),
last_seen = jstring("last_seen"),
label = jstring("label")) %>% # extract user data
enter_object("properties") %>% # stack the properties
spread_values(
browser = jstring("$browser"),
browser_version = jnumber("$browser_version"),
city = jstring("$city"),
country_code = jstring("$country_code"),
email = jstring("$email"),
initial_referrer = jstring("$initial_referrer"),
initial_referring_domain = jstring("$initial_referring_domain"),
name = jstring("$name"),
operating_system = jstring("$os"),
region = jstring("$region"),
timezone = jstring("$timezone"),
favorite_genre = jstring("Favorite Genre"),
first_login_date = jstring("First Login Date"),
lifetime_song_play_count = jnumber("Lifetime Song Play Count"),
lifetime_song_purchase_count = jnumber("Lifetime Song Purchase Count"),
plan = jstring("Plan")) %>% #extract the properties)
enter_object("$transactions") %>% #<<<--- EDITED HERE
gather_array %>%
spread_values(
amount = jnumber("$amount"),
transaction_time = jstring("$time")) %>% # extract the transactions
select(distinct_id, time_id, last_seen, label, browser, browser_version, city, country_code, email, initial_referrer,
initial_referring_domain, name, operating_system, region, timezone, favorite_genre,
first_login_date,lifetime_song_play_count, lifetime_song_purchase_count, plan, amount, transaction_time)
nrow(people_b)
## [1] 1

Related

How do I make it so the user is limited to only buy one of each item in my discord "shop"?

I have a economy discord bot which has a shop within it. I was having trouble to limit the user to only buy 1 item from the shop. This is because I have pets in the shop, and only want the user to be able the attain 1 within their inventory.
My existing code:
@client.command()
@commands.has_role("Founder")
async def adopt(ctx,item,amount = 1, member: discord.Member = None):
    if member is None:
        member = ctx.author
    await open_account(ctx.author)
    user = ctx.author
    res = await buy_pet(ctx.author,item,amount)
    if amount > 1:
        await ctx.send("You cannot buy more than one pet")
    elif not res[0]:
        if res[1]==1:
            await ctx.send("That pet is not in the shop!")
            return
        if res[1]==2:
            await ctx.send(f"**You don't have enough credits to adopt {item}")
            return
        if res[1]==3:
            await ctx.send("You already own this pet!")
Async def for buy_pet:
async def buy_pet(user,item_name,amount):
    item_name = item_name.lower()
    name_ = None
    petshop = [{"name":"Hamster","price":5,"description":"dog",},
               {"name":"pheonix","price":50,"description":"cat",},
               {"name": "basilisk","price":50,"description":"basilisk"},
               {"name": "centaur","price":50,"description":"Centaur"},
               {"name": "pegasus","price":150,"description":"pegasus"},
               {"name": "BMO","price":500,"description":"BMO"},
               {"name": "plumfrog","price":500,"description":"plumfrog"},
               {"name": "drill","price":750,"description":"drill"},
               {"name": "elf","price":1000,"description":"elf"},
               ]
    for item in petshop:
        name = item["name"].lower()
        if name == item_name:
            name_ = name
            price = item["price"]
            break
    if name_ == None:
        return [False,1]
    cost = price*amount
    users = await get_bank_data()
    bal = await update_bank(user)
    if bal[0]<cost:
        return [False,2]
    try:
        index = 0
        t = None
        for thing in users[str(user.id)]["bag"]:
            n = thing["item"]
            if n == item_name:
                old_amt = thing["amount"]
                new_amt = old_amt + amount
                users[str(user.id)]["bag"][index]["amount"] = new_amt
                t = 1
                break
            index+=1
        if t == None:
            obj = {"item":item_name , "amount" : amount}
            users[str(user.id)]["bag"].append(obj)
    except:
        obj = {"item":item_name , "amount" : amount}
        users[str(user.id)]["bag"] = [obj]
    with open("mainbank.json","w") as f:
        json.dump(users,f)
    await update_bank(user,cost*-1,"wallet")
    return [True,"Worked"]
So I basically want to make a new case within buy_pet, which makes it so the user can only buy one, and also make it so if they already have the pet, then they can't buy the same one again.
You need some sort of database that keeps track of each member's properties.
Here is an example using a json file that you could use based on the way you've formatted your shop info:
{
"member1": {
"pets": [{"name": "hamster"}, {"name": "phoenix"}]
}
}
Before you let the member purchase, check their properties in the database and see whether the pet already exists in their 'pets' array. If it does, don't allow the purchase; if it doesn't, let them purchase and add the entry to their 'pets' array.
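A minimal sketch of that check, assuming the per-member layout from the JSON example (a "pets" list of {"name": ...} entries). The helper names owns_pet and adopt_pet are hypothetical, and the in-memory dict stands in for whatever file or database the bot actually uses:

```python
def owns_pet(users, member_id, pet_name):
    """Return True if the member already has a pet with this name."""
    pets = users.get(member_id, {}).get("pets", [])
    return any(pet["name"] == pet_name.lower() for pet in pets)


def adopt_pet(users, member_id, pet_name):
    """Add the pet and return True, or return False if it is already owned."""
    if owns_pet(users, member_id, pet_name):
        return False
    # Create the member record and the "pets" list on first purchase.
    users.setdefault(member_id, {}).setdefault("pets", []).append(
        {"name": pet_name.lower()}
    )
    return True


# Hypothetical data: member1 already owns a hamster.
users = {"member1": {"pets": [{"name": "hamster"}]}}
print(adopt_pet(users, "member1", "Hamster"))  # False: already owned
print(adopt_pet(users, "member1", "phoenix"))  # True: added to the list
```

Inside buy_pet, a check like this could run before the balance is touched and return [False,3], which the adopt command already maps to "You already own this pet!".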

Funnel query with Elasticsearch

I'm trying to analyze a funnel using event data in Elasticsearch and have difficulties finding an efficient query to extract that data.
For example, in Elasticsearch I have:
timestamp action user id
--------- ------ -------
2015-05-05 12:00 homepage 1
2015-05-05 12:01 product page 1
2015-05-05 12:02 homepage 2
2015-05-05 12:03 checkout 1
I would like to extract the funnel statistics. For example:
homepage_count product_page_count checkout_count
-------------- ------------------ --------------
2 1 1
Where homepage_count represents the distinct number of users who visited the homepage, product_page_count represents the distinct number of users who visited the product page after visiting the homepage, and checkout_count represents the number of users who checked out after visiting the homepage and the product page.
What would be the best query to achieve that with Elasticsearch?
This can be achieved with a combination of a terms aggregation for the actions and then a cardinality sub-aggregation for the unique user count per action, like below. Note that I've also added a range query in case you want to restrict the period to observe:
{
"size": 0,
"query": {
"range": {
"timestamp": {
"gte": "2021-06-01",
"lte": "2021-06-07"
}
}
},
"aggs": {
"actions": {
"terms": {
"field": "action"
},
"aggs": {
"users": {
"cardinality": {
"field": "user_id"
}
}
}
}
}
}
UPDATE
This is a typical case where the scripted_metric aggregation comes in handy. The implementation is a bit naive, but it shows you the basics of implementing a funnel.
POST test/_search
{
"size": 0,
"aggs": {
"funnel": {
"scripted_metric": {
"init_script": """
state.users = new HashMap()
""",
"map_script": """
def user = doc['user'].value.toString();
def action = doc['action.keyword'].value;
if (!state.users.containsKey(user)) {
state.users[user] = [
'homepage': false,
'product': false,
'checkout': false
];
}
state.users[user][action] = true;
""",
"combine_script": """
return state.users;
""",
"reduce_script": """
def global = [
'homepage': 0,
'product': 0,
'checkout': 0
];
def res = [];
for (state in states) {
for (user in state.keySet()) {
if (state[user].homepage) global.homepage++;
if (state[user].product) global.product++;
if (state[user].checkout) global.checkout++;
}
}
return global;
"""
}
}
}
}
The above aggregation will return exactly the numbers you expect, i.e.:
"aggregations" : {
"funnel" : {
"value" : {
"product" : 1,
"checkout" : 1,
"homepage" : 2
}
}
}
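The scripted metric above flags every step a user has touched, without checking the order in which the steps happened. As a rough illustration of an order-aware variant, here is a client-side sketch in Python. It assumes the events have already been fetched as (timestamp, action, user) tuples; the function name funnel_counts and the step list are hypothetical and taken from the sample table in the question, not from any Elasticsearch API:

```python
from collections import defaultdict

# Funnel steps in the order a user must complete them (from the question's table).
STEPS = ["homepage", "product page", "checkout"]


def funnel_counts(events, steps=STEPS):
    """Count distinct users who reached each step, in order."""
    progress = defaultdict(int)  # user -> number of steps completed so far
    for _, action, user in sorted(events):  # process in timestamp order
        reached = progress[user]
        # Only advance when the event is the next step this user still needs.
        if reached < len(steps) and action == steps[reached]:
            progress[user] = reached + 1
    return {
        step: sum(1 for done in progress.values() if done > i)
        for i, step in enumerate(steps)
    }


events = [
    ("2015-05-05 12:00", "homepage", 1),
    ("2015-05-05 12:01", "product page", 1),
    ("2015-05-05 12:02", "homepage", 2),
    ("2015-05-05 12:03", "checkout", 1),
]
print(funnel_counts(events))
# {'homepage': 2, 'product page': 1, 'checkout': 1}
```

A user who checks out before ever visiting the homepage is not counted at any step here, which is the main difference from the flag-based script.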

serialize multiple related models

I am just learning django and rest-framework.
I have three models: User, UserHospital and Timeslots. A user has a time schedule for each hospital. I am requesting all of a user's details together with the related hospitals, where each hospital shows its details along with its timeslots. I want to represent the user details in the format below.
What's wrong with my code?
Is this possible using viewsets and serializers, or do I have to try another way?
{
"first_name": "abc",
"last_name": "xyz",
"mobile_number":1111111111,
"related_hospitals": [{
"id": 1,
"name": "bbbb",
"timeslot": [
{
"day": "TUE",
"start_time": "09:00:00",
"end_time": "15:00:00"
},
{
"day": "WED",
"start_time": "10:00:00",
"end_time": "20:00:00"
}
]
},
{
"id": 2,
"name": "ccc",
"timeslot": []
}]
}
created Models as below :
class Users(models.Model):
    mobile_number = models.BigIntegerField()
    first_name = models.CharField(max_length=255, null=True)
    last_name = models.CharField(max_length=255, null=True)
class TimeSlots(BaseAbstract):
    DAYS = (
        ('SUN', 'sunday'),
        ('MON', 'Monday'),
        ('TUE', 'tuesday'),
        ('WED', 'wednesday'),
        ('THU', 'thursday'),
        ('FRI', 'friday'),
        ('SAT', 'saturday'),
    )
    STATUS = (
        (1, 'HOLIDAY'),
        (2, 'ON_LEAVE'),
        (3, 'AVAILABLE'),
        (4, 'NOT_AVAILABLE')
    )
    DEFAULT_STATUS = 3
    DEFAULT_DAY = "SUN"
    day = models.CharField(default=DEFAULT_DAY, choices=DAYS, max_length=20)
    start_time = models.TimeField()
    end_time = models.TimeField()
    status = models.SmallIntegerField(default=DEFAULT_STATUS, choices=STATUS)
class UserHospital(BaseAbstract):
    user = models.ForeignKey('users.Users', on_delete=models.SET_NULL, null=True)
    name = models.CharField(max_length=255, null=True)
    timeslots = models.ManyToManyField(TimeSlots)
I have tried:
class TimeslotSerializer(serializers.ModelSerializer):
    class Meta:
        model = TimeSlots
        fields = ('day', 'start_time', 'end_time')
        read_only_fields = ('id',)
class RelatedHospitalSerializer(serializers.ModelSerializer):
    timeslot = TimeslotSerializer(many=True)
    class Meta:
        model = UserHospital
        fields = ('name', 'timeslot')
        read_only_fields = ('id',)
class UserDetailsSerializer(serializers.ModelSerializer):
    related_hospitals = serializers.SerializerMethodField()
    def get_related_hospitals(self, obj):
        hospitalData = []
        if UserHospital.objects.all().filter(user=obj).exists():
            hospitalData = UserHospital.objects.all().filter(user=obj)
        return RelatedHospitalSerializer(hospitalData).data
    class Meta:
        model = Users
        fields = ('first_name', 'last_name','mobile_number','related_hospitals')
        read_only_fields = ('id', 'related_hospitals')
class UserDetailsViewset(mixins.CreateModelMixin, mixins.RetrieveModelMixin, mixins.ListModelMixin, viewsets.GenericViewSet):
    queryset = Users.objects.all()
    serializer_class = UserDetailsSerializer
    def get_queryset(self):
        userid = self.request.query_params.get('userid')
        if userid is not None:
            userData = Users.objects.filter(user=userid)
            return userData
        else:
            return Users.objects.all()
What's wrong with my code?
I would recommend using the related_name parameter of models.ForeignKey, models.ManyToManyField, etc.
For example,
class Hospital(models.Model):
    user = models.ForeignKey(....., related_name="hospitals")
    ...
class HospitalSerializer(serializers.ModelSerializer):
    ...
class UserSerializer(serializers.HyperlinkedModelSerializer):  # or another serializer base class
    hospitals = HospitalSerializer(many=True)
    class Meta:
        ....
Note: The use of "hospitals" ....
This will automatically allow one to get the result of a
UserSerializer(userModel, context={'request':request}).data ...
in your desired format

How can I convert vectors into JSON and then post it in a service

I need to convert these two vectors into a JSON with some other fields:
x_axis <- c("Dogs","Cats","Birds")
y_axis <- c(5,9,3)
The JSON must also contain these two fields:
user_id=3
model_number=4
The JSON to post must have this format:
{
"user_id":3,
"model_number": 4,
"data": [{
"x_axis": "Dogs",
"y_axis": 5
},{
"x_axis": "Cats",
"y_axis": 9
},{
"x_axis": "Birds",
"y_axis": 3
}]
}
You can use jsonlite...
library(jsonlite)
x_axis <- c("Dogs","Cats","Birds")
y_axis <- c(5,9,3)
user_id=3
model_number=4
data <- data.frame(x_axis=x_axis,y_axis=y_axis)
toJSON(list(user_id=user_id,
model_number=model_number,
data=data),
dataframe="rows",
auto_unbox = TRUE)
{"user_id":3,
"model_number":4,
"data":[{"x_axis":"Dogs","y_axis":5},
{"x_axis":"Cats","y_axis":9},
{"x_axis":"Birds","y_axis":3}]}
You can create that output using purrr and jsonlite
library(purrr)
library(jsonlite)
toJSON(list(
user_id = user_id,
model_number = model_number,
data = map2(x_axis, y_axis, ~list(x_axis=.x, y_axis=.y))
), auto_unbox = TRUE)
Basically we just create a named list in such a way that mirrors your desired output shape. Named lists result in {} objects and unnamed lists become [] arrays.

soapUI - Parse JSON Response containing arrays

Given the following JSON response from a previous test step (request) in soapUI:
{
"AccountClosed": false,
"AccountId": 86270,
"AccountNumber": "2915",
"AccountOwner": 200000000001,
"AccountSequenceNumber": 4,
"AccountTotal": 6,
"ActiveMoneyBeltSession": true,
"CLMAccountId": "",
"CustomerName": "FRANK",
"Lines": [{
"AndWithPreviousLine": false,
"IngredientId": 10000025133,
"OrderDestinationId": 1,
"PortionTypeId": 1,
"Quantity": 1,
"QuantityAsFraction": "1",
"SentToKitchen": true,
"TariffPrice": 6,
"Id": 11258999068470003,
"OriginalRingUpTime": "2014-10-29T07:37:38",
"RingUpEmployeeName": "Andy Bean",
"Seats": []
}, {
"Amount": 6,
"Cashback": 0,
"Change": 0,
"Forfeit": 0,
"InclusiveTax": 1,
"PaymentMethodId": 1,
"ReceiptNumber": "40/1795",
"Tip": 0,
"Id": 11258999068470009,
"Seats": []
}],
"MoaOrderIdentifier": "A2915I86270",
"OutstandingBalance": 0,
"SaveAccount": false,
"ThemeDataRevision": "40",
"TrainingMode": false
}
I have the following groovy script to parse the information from the response:
import groovy.json.JsonSlurper
//Define Variables for each element on JSON Response
String AccountID, AccountClosed, AccountOwner, AccountSeqNumber, AccountTotal, ActMoneyBelt
String CLMAccountID, ThemeDataRevision, Lines, MoaOrderIdentifier, OutstandingBalance, SaveAccount, TrainingMode, CustName, AccountNumber
String AndWithPreviousLine, IngredientId, OrderDestinationId, PortionTypeId, Quantity, SentToKitchen, TariffPrice, Id, OriginalRingUpTime, RingUpEmployeeName, Seats
int AccSeqNumInt
def responseContent = testRunner.testCase.getTestStepByName("ReopenAcc").getPropertyValue("response")
// Create JsonSlurper Object to parse the response
def Response = new JsonSlurper().parseText(responseContent)
//Parse each element from the JSON Response and store in a variable
AccountClosed = Response.AccountClosed
AccountID = Response.AccountId
AccountNumber = Response.AccountNumber
AccountOwner = Response.AccountOwner
AccountSeqNumber = Response.AccountSequenceNumber
AccountTotal = Response.AccountTotal
ActMoneyBelt = Response.ActiveMoneyBeltSession
CLMAccountID = Response.CLMAccountId
CustName = Response.CustomerName
Lines = Response.Lines
/*Lines Variables*/
AndWithPreviousLine = Response.Lines.AndWithPreviousLine
IngredientId = Response.Lines.IngredientId
OrderDestinationId = Response.Lines.OrderDestinationId
PortionTypeId = Response.Lines.PortionTypeId
Quantity = Response.Lines.Quantity
SentToKitchen = Response.Lines.SentToKitchen
TariffPrice = Response.Lines.TariffPrice
Id = Response.Lines.Id
OriginalRingUpTime = Response.Lines.OriginalRingUpTime
RingUpEmployeeName = Response.Lines.RingUpEmployeeName
Seats = Response.Lines.Seats
/*End Lines Variables*/
MoaOrderIdentifier = Response.MoaOrderIdentifier
OutstandingBalance = Response.OutstandingBalance
SaveAccount = Response.SaveAccount
ThemeDataRevision = Response.ThemeDataRevision
TrainingMode = Response.TrainingMode
As you can see above there is an element (Array) called "Lines". I am looking to get the individual element value within this array. My code above when parsing the elements is returning the following:
----------------Lines Variables----------------
INFO: AndWithPreviousLine: [false, null]
INFO: IngredientId: [10000025133, null]
INFO: OrderDestinationId: [1, null]
INFO: PortionTypeId: [1, null]
INFO: Quantity: [1.0, null]
INFO: SentToKitchen: [true, null]
INFO: TariffPrice: [6.0, null]
INFO: Id: [11258999068470003, 11258999068470009]
INFO: OriginalRingUpTime: [2014-10-29T07:37:38, null]
INFO: RingUpEmployeeName: [Andy Bean, null]
INFO: Seats: [[], []]
How can I get the required element from the above? Also note that the 2 different data contracts contain the same element "ID". Is there a way to differentiate between the 2?
any help appreciated, thanks.
The "Lines" in the Response can be treated as a simple list of maps.
Response["Lines"].each { println it }
[AndWithPreviousLine:false, Id:11258999068470003, IngredientId:10000025133, OrderDestinationId:1, OriginalRingUpTime:2014-10-29T07:37:38, PortionTypeId:1, Quantity:1, QuantityAsFraction:1, RingUpEmployeeName:Andy Bean, Seats:[], SentToKitchen:true, TariffPrice:6]
[Amount:6, Cashback:0, Change:0, Forfeit:0, Id:11258999068470009, InclusiveTax:1, PaymentMethodId:1, ReceiptNumber:40/1795, Seats:[], Tip:0]
The Response is just one huge map, with the Lines element a list of maps.
Using groovysh really helps here to interactively play with the data. I ran the following commands after starting a "groovysh" console:
import groovy.json.JsonSlurper
t = new File("data.json").text
r = new JsonSlurper().parseText(t)
r
r["Lines"]
r["Lines"][0]
etc
Edit: In response to your comment, here's a better way to get the entry with a valid AndWithPreviousLine value.
myline = r["Lines"].find{it.AndWithPreviousLine != null}
myline.AndWithPreviousLine
>>> false
myline.IngredientId
>>> 10000025133
If you use findAll instead of find you'll get an array of maps and have to index them, but for this case there's only 1 and find gives you the first whose condition is true.
By using Response["Lines"].AndWithPreviousLine [0] you're assuming AndWithPreviousLine exists on each entry, which is why you're getting nulls in your lists. If the json responses had been reversed, you'd have to use a different index of [1], so using find and findAll will work better for you than assuming the index value.
