Return values in an array of hashes - arrays

I have an assignment and cannot figure out where my mistake lies. I have a large array of hashes returned by the method twitter_data. Each hash is structured as follows.
def twitter_data
[{"User"=>
{"description"=>
"Description here",
"last twenty tweets"=>
["tweets written out here"],
"number of followers"=>1000,
"number of friends"=>100,
"latest tweet"=>
"tweet written out here",
"number of tweets"=>1000,
"location"=>"Wherever, Wherever"}},]
end
Now if I wanted to, for instance, list all of the users and their descriptions, I thought the code would read as follows.
twitter_data.each do |twitter_data|
puts "#{twitter_data[:twitter_data]}: #{twitter_data[:description]}"
end
But the output for that just gives me about seven :, without the username in front of it or the description afterwards.

As you can see, the description key is nested in another hash whose key is User. I don't know which other key you want to print because the data seems incomplete, but if you want to print just the descriptions, this should work:
twitter_data.each do |user_data|
description = user_data["User"]["description"]
puts description
end

There are a couple of reasons why this does not work:
1) The twitter_data element inside the each block looks like this: { 'User' => { 'description'.... On that hash, the value stored under the :description key is nil.
2) Even if you were to refer to the correct hash via twitter_data['User'], you would still be using symbols (e.g. :description) instead of strings, so the lookup would still return nil.
3) You are referencing elements that do not seem to exist in the hash even if one were to use strings (e.g. :twitter_data). This might simply be due to the example selected.
What will work is to correctly reference the hashes:
twitter_data.each do |data|
user_hash = data['User']
puts "#{user_hash['twitter_data']}: #{user_hash['description']}"
end
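To make the symbol-versus-string point concrete, here is a minimal sketch using a cut-down version of the sample data. The values are placeholders, and the second loop assumes that "User" in the question stands in for the actual username, which the question does not confirm.

```ruby
# Cut-down stand-in for the question's twitter_data method (placeholder values).
def twitter_data
  [{ "User" => { "description" => "Description here", "number of followers" => 1000 } }]
end

twitter_data.each do |entry|
  user = entry["User"]

  # Symbol and string keys are distinct: this hash only has string keys,
  # so the symbol lookup returns nil.
  p user[:description]   # prints nil
  p user["description"]  # prints "Description here"
end

# Assumption: if the outer key is really the username, iterating over the
# key/value pairs avoids having to know it in advance.
twitter_data.each do |entry|
  entry.each do |name, details|
    puts "#{name}: #{details['description']}"
  end
end
```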

Related

Filter Array For IDs Existing in Another Array with Ruby on Rails/Mongo

I need to compare the 2 arrays declared here to return records that exist only in the filtered_apps array. I am using the contents of previous_apps array to see if an ID in the record exists in filtered_apps array. I will be outputting the results to a CSV and displaying records that exist in both arrays to the console.
My question is this: How do I get the records that only exist in filtered_apps? Easiest for me would be to put those unique records into a new array to work with on the csv.
start_date = Date.parse("2022-02-05")
end_date = Date.parse("2022-05-17")
valid_year = start_date.year
dupe_apps = []
uniq_apps = []
# Finding applications that meet my criteria:
filtered_apps = FinancialAssistance::Application.where(
:is_requesting_info_in_mail => true,
:aasm_state => "determined",
:submitted_at => {
"$exists" => true,
"$gte" => start_date,
"$lte" => end_date })
# Finding applications that I want to compare against filtered_apps
previous_apps = FinancialAssistance::Application.where(
is_requesting_info_in_mail: true,
:submitted_at => {
"$exists" => true,
"$gte" => valid_year })
# I'm using this to pull the ID that I'm using for comparison just to make the comparison lighter by only storing the family_id
previous_apps.each do |y|
previous_apps_array << y.family_id
end
# This is where I'm doing my comparison and it is not working.
filtered_apps.each do |app|
if app.family_id.in?(previous_apps_array) == false
then #non_dupe_apps << app
else "No duplicate found for application #{app.hbx_id}"
end
end
end
So what am I doing wrong in the last code section?
Let's check your original method first (I fixed the indentation to make it clearer). There are quite a few issues with it:
filtered_apps.each do |app|
if app.family_id.in?(previous_apps_array) == false
# Where is "#non_dupe_apps" declared? It isn't anywhere in your example...
# Also, "then" is not necessary unless you want a one-line if-statement
then #non_dupe_apps << app
# This doesn't do anything, it's just a string
# You need to use "p" or "puts" to output something to the console
# Note that the "else" is also only triggered when duplicates WERE found...
else "No duplicate found for application #{app.hbx_id}"
end # Extra "end" here, this will mess things up
end
end
Also, you haven't declared previous_apps_array anywhere in your example, you just start adding to it out of nowhere.
Getting the difference between 2 arrays is dead easy in Ruby: just use -!
uniq_apps = filtered_apps - previous_apps
You can also do this with ActiveRecord results, since they are just arrays of ActiveRecord objects. However, this doesn't help if you specifically need to compare results using the family_id column.
TIP: Getting the values of only a specific column/columns from your database is probably best done with the pluck or select method if you don't need to store any other data about those objects. With pluck, you only get an array of values in the result, not the full objects. select works a bit differently and returns ActiveRecord objects, but filters out everything but the selected columns. select is usually better in nested queries, since it doesn't trigger a separate query when used as a part of another query, while pluck always triggers one.
# Querying straight from the database
# This is what I would recommend, but it doesn't print the values of duplicates
uniq_apps = filtered_apps.where.not(family_id: previous_apps.select(:family_id))
I highly recommend getting really familiar with at least filter/select, and map out of the basic array methods. They make things like this way easier. The Ruby docs are a great place to learn about them and others. A very simple example of doing a similar thing to what you explained in your question with filter/select on 2 arrays would be something like this:
arr = [1, 2, 3]
full_arr = [1, 2, 3, 4, 5]
unique_numbers = full_arr.filter do |num|
if arr.include?(num)
puts "Duplicates were found for #{num}"
false
else
true
end
end
# Duplicates were found for 1
# Duplicates were found for 2
# Duplicates were found for 3
=> [4, 5]
NOTE: The OP is working with Ruby 2.5.9, where filter is not yet available as an array method (it was introduced in Ruby 2.6). However, filter is just an alias for select, which can be found on earlier versions of Ruby, so they can be used interchangeably. Personally, I prefer using filter because, as seen above, select is already used in other methods, and filter is also the more common term in other programming languages I usually work with. Of course when both are available, it doesn't really matter which one you use, as long as you keep it consistent.
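When the comparison has to go through a single field such as family_id, plain array subtraction is not enough, but the same idea can be sketched with map and partition. The App struct and the sample ids below are hypothetical stand-ins for the real records; only the method names hbx_id and family_id come from the question. Nothing here needs filter, so it should also run on Ruby 2.5.

```ruby
require "set"

# Hypothetical stand-in for the application records; only the two fields used here.
App = Struct.new(:hbx_id, :family_id)

filtered_apps = [App.new("a1", 10), App.new("a2", 20), App.new("a3", 30)]
previous_apps = [App.new("p1", 20), App.new("p2", 40)]

# Collect the comparison ids once; a Set keeps each membership check cheap.
previous_family_ids = previous_apps.map(&:family_id).to_set

# partition splits the list in one pass: matches first, everything else second.
dupe_apps, uniq_apps = filtered_apps.partition do |app|
  previous_family_ids.include?(app.family_id)
end

dupe_apps.each { |app| puts "Duplicate found for application #{app.hbx_id}" }
p uniq_apps.map(&:hbx_id)
```

Here uniq_apps ends up holding the records whose family_id does not appear in previous_apps, which is the list the question wants to send to the CSV.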
EDIT: My last answer did, in fact, not work.
Here is the code all nice and working.
It turns out the issue was that when comparing family_id from the set of records I forgot that the looped record was a part of the set, so it would return it, too. I added a check for the ID of the array to match the looped record and bob's your uncle.
I added the pass and reject arrays so I could check my work instead of downloading a csv every time. Leaving them in mostly because I'm scared to change anything else.
start_date = Date.parse(date_from)
end_date = Date.parse(date_to)
valid_year = start_date.year
date_range = (start_date)..(end_date)
comparison_apps = FinancialAssistance::Application.by_year(start_date.year).where(
aasm_state:'determined',
is_requesting_voter_registration_application_in_mail:true)
apps = FinancialAssistance::Application.where(
:is_requesting_voter_registration_application_in_mail => true,
:submitted_at => date_range).uniq{ |n| n.family_id}
#pass_array = []
#reject_array = []
apps.each do |app|
family = app.family
app_id = app.id
previous_apps = comparison_apps.where(family_id:family.id,:id.ne => app.id)
if previous_apps.count > 0
#reject_array << app
puts "\e[32mApplicant hbx id \e[31m#{app.primary_applicant.person_hbx_id}\e[32m in family ID \e[31m#{family.id}\e[32m has registered to vote in a previous application.\e[0m"
else
<csv fields here>
csv << [csv fields here]
end
end
Basically, I pulled the applications into the app variable array, then filtered them by the family_id field in each record.
I had to do this because the issue at the bottom of everything was that there were records present in app that were themselves duplicates, only submitted a few days apart. Since I went on the assumption that the initial app array would be all unique, I thought the duplicates that were included were due to the rest of the code not filtering correctly.
I then use the uniq_apps array to filter through and look for matches in uniq_apps.each do, and when it finds a duplicate, it adds it to the previous_applications array inside the loop. Since this array resets each go-round, if it ever has more than 0 records in it, the app gets called out as being submitted already. Otherwise, it goes to my csv report.
Thanks for the help on this, it really got my brain thinking in another direction that I needed to. It also helped improve the code even though the issue was at the very beginning.

Manipulating Output from an Array of Nested Hashes in Ruby

I've been pulling data from an API in JSON, and am currently stumbling over an elementary problem.
The data is on companies, like Google and Facebook, and is in an array of hashes, like so:
[
{"id"=>"1", "properties"=>{"name"=>"Google", "stock_symbol"=>GOOG, "primary_role"=>"company"}},
{"id"=>"2", "properties"=>{"name"=>"Facebook", "stock_symbol"=>FB, "primary_role"=>"company"}}
]
Below are three operations I'd like to try:
For each company, print out the name, ID, and the stock symbol (i.e. "Google - 1 - GOOG" and "Facebook - 2 - FB")
Remove "primary role" key/value from Google and Facebook
Assign a new "industry" key/value for Google and Facebook
Any ideas?
I'm a beginner in Ruby and keep running into errors (e.g. undefined method) with array and hash methods, since this looks to be an array OF hashes.
Thank you!
Ruby provides a couple of tools to help us comprehend arrays, hashes, and nested mixtures of both.
Assuming your data looks like this (I've added quotes around GOOG and FB):
data = [
{"id"=>"1", "properties"=>{"name"=>"Google", "stock_symbol"=>"GOOG", "primary_role"=>"company"}},
{"id"=>"2", "properties"=>{"name"=>"Facebook", "stock_symbol"=>"FB", "primary_role"=>"company"}}
]
You can iterate over the array using each, e.g.:
data.each do |result|
puts result["id"]
end
Digging into a hash and printing the result can be done in a couple of ways:
data.each do |result|
# method 1
puts result["properties"]["name"]
# method 2
puts result.dig("properties", "name")
end
Method #1 uses the hash[key] syntax, and because the first hash value is another hash, it can be chained to get the result you're after. The drawback of this approach is that if you have a missing properties key on one of your results, you'll get an error.
Method #2 uses dig, which accepts the nested keys as arguments (in order). It'll dig down into the nested hashes and pull out the value, but if any step is missing, it will return nil, which can be a bit safer if you're handling data from an external source.
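The difference between the two methods shows up as soon as a record lacks the properties key. A small sketch, where the second record is a made-up example added only to trigger the failure:

```ruby
data = [
  { "id" => "1", "properties" => { "name" => "Google" } },
  { "id" => "3" } # hypothetical record with no "properties" key
]

data.each do |result|
  # dig returns nil instead of raising when a level is missing.
  name = result.dig("properties", "name")
  puts "#{result['id']}: #{name.inspect}"
end

begin
  # Chained [] calls nil["name"] on the second record and raises.
  data.last["properties"]["name"]
rescue NoMethodError => e
  puts "chained [] failed: #{e.class}"
end
```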
Removing elements from a hash
Your second question is a little more involved. You've got two options:
Remove the primary_role keys from the nested hashes, or
Create a new object which contains all the data except the primary_role keys.
I'd generally go for the latter, and recommend reading up on immutability and immutable data structures.
However, to achieve [1] you can do an in-place delete of the key:
data.each do |company|
company["properties"].delete("primary_role")
end
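For the second option, building a new object rather than deleting in place, a minimal sketch could use map together with reject and merge. The variable name cleaned is only illustrative.

```ruby
data = [
  { "id" => "1", "properties" => { "name" => "Google", "stock_symbol" => "GOOG", "primary_role" => "company" } },
  { "id" => "2", "properties" => { "name" => "Facebook", "stock_symbol" => "FB", "primary_role" => "company" } }
]

# Build new hashes instead of editing the originals.
cleaned = data.map do |company|
  # reject returns a new hash without the unwanted key.
  properties = company["properties"].reject { |key, _value| key == "primary_role" }
  # merge returns a new outer hash with the trimmed properties.
  company.merge("properties" => properties)
end

p cleaned.first["properties"].key?("primary_role") # prints false
p data.first["properties"].key?("primary_role")    # prints true, the original is untouched
```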
Adding elements to a hash
You assign new hash values simply with hash[key] = value, so you can set the industry with something like:
data.each do |company|
company["properties"]["industry"] = "Advertising/Privacy Invasion"
end
which would leave you with something like:
[
{
"id"=>"1",
"properties"=>{
"name"=>"Google",
"stock_symbol"=>"GOOG",
"industry"=>"Advertising/Privacy Invasion"
}
},
{
"id"=>"2",
"properties"=>{
"name"=>"Facebook",
"stock_symbol"=>"FB",
"industry"=>"Advertising/Privacy Invasion"
}
}
]
To achieve the first operation, you can iterate through the array of companies and access the relevant information for each company. Here's an example in Ruby:
companies = [ {"id"=>"1", "properties"=>{"name"=>"Google", "stock_symbol"=>"GOOG", "primary_role"=>"company"}}, {"id"=>"2", "properties"=>{"name"=>"Facebook", "stock_symbol"=>"FB", "primary_role"=>"company"}}]
companies.each do |company|
name = company['properties']['name']
id = company['id']
stock_symbol = company['properties']['stock_symbol']
puts "#{name} - #{id} - #{stock_symbol}"
end
This will print out the name, ID, and stock symbol for each company.
To remove the "primary role" key/value, you can use the delete method on the properties hash. For example:
companies.each do |company|
company['properties'].delete('primary_role')
end
To add a new "industry" key/value, you can use the []= operator to add a new key/value pair to the properties hash. For example:
companies.each do |company|
company['properties']['industry'] = 'Technology'
end
This will add a new key/value pair with the key "industry" and the value "Technology" to the properties hash for each company.

AT NEW with substring access?

I have a solution that includes a LOOP which I would like to avoid, so I wonder whether you know a better way to do this.
My goal is to loop through an internal, alphabetically sorted standard table. This table has two columns: a name and a table, let's call it subtable. For every subtable I want to do some stuff (open an xml page in my xml framework).
Now, every subtable has a corresponding name. I want to group the subtables according to the first letter of this name (meaning, put the pages of these subtables on one main page -one main page for every character-). By grouping of subtables I mean, while looping through the table, I want to deal with the subtables differently according to the first letter of their name.
So far I came up with the following solution:
TYPES: BEGIN OF l_str_tables_extra,
first_letter(1) TYPE c,
name TYPE string,
subtable TYPE REF TO if_table,
END OF l_str_tables_extra.
DATA: ls_tables_extra TYPE l_str_tables_extra.
DATA: lt_tables_extra TYPE TABLE OF l_str_tables_extra.
FIELD-SYMBOLS: <ls_tables> TYPE str_table."Like LINE OF lt_tables.
FIELD-SYMBOLS: <ls_tables_extra> TYPE l_str_tables_extra.
*"--- PROCESSING LOGIC ------------------------------------------------
SORT lt_tables ASCENDING BY name.
"Add first letter column in order to use 'at new' later on
"This is the loop I would like to spare
LOOP AT lt_tables ASSIGNING <ls_tables>.
ls_tables_extra-first_letter = <ls_tables>-name+0(1). "new column
ls_tables_extra-name = <ls_tables>-name.
ls_tables_extra-subtable = <ls_tables>-subtable.
APPEND ls_tables_extra TO lt_tables_extra.
ENDLOOP.
LOOP AT lt_tables_extra ASSIGNING <ls_tables_extra>.
AT NEW first_letter.
"Do something with subtables with same first_letter.
ENDAT.
ENDLOOP.
I wish I could use
AT NEW name+0(1)
instead of
AT NEW first_letter
, but offsets and lengths are not allowed.
You see, I have to include this first loop to add another column to my table, which is kind of unnecessary because no new information is gained.
In addition, I am interested in other solutions because I get into trouble with the framework later on for different reasons. A different way to do this might help me out there, too.
I am happy to hear any thoughts about this! I could not find anything related to this here on stackoverflow, but I might not have used the best search terms ;)
Maybe the GROUP BY addition on LOOP could help you in this case:
LOOP AT i_tables
INTO DATA(wa_line)
" group lines by condition
GROUP BY (
" substring() because normal offset would be evaluated immediately
name = substring( val = wa_line-name len = 1 )
) INTO DATA(o_group).
" begin of loop over all tables starting with o_group-name(1)
" loop over group object which contains
LOOP AT GROUP o_group
ASSIGNING FIELD-SYMBOL(<fs_table>).
" <fs_table> contains your table
ENDLOOP.
" end of loop
ENDLOOP.
Why not use an IF comparison?
data: lf_prev_first_letter(1) type c.
loop at lt_table assigning <ls_table>.
if <ls_table>-name(1) <> lf_prev_first_letter. "=AT NEW
"do something
lf_prev_first_letter = <ls_table>-name(1).
endif.
endloop.

Django Check if Integer Exists in Database Field Array

I have a database with a field that is a "pseudo" array. This array holds integer values. My implementation is as follows:
attendees = models.TextField(null=True) # declaring the integer array
When I say pseudo, I mean that I am using json to make it into an array.
attendees=json.dumps(members)
Now the attendees column will contain something like this ["1", "2", "3"]
So I want to check if attendees will contain the value "1" for example. Essentially, I want something like this:
eventList = Events.objects.all().filter(user_id in Event.attendees) # I know this isn't the correct syntax
Any ideas on how to do this as efficiently as possible?
You'll need to use __contains
.filter(attendees__contains='"{}"'.format(user_id))
Although the question remains why this isn't a separate model or JSONField/ArrayField...

Parsing structured text data

I extracted blob field out of mysql table in text format:
CAST(orders AS CHAR(10000) CHARACTER SET utf8)
Now each field looks like this:
a:2:{s:4:"Cart";a:5:{s:4:"cart";a:2:{i:398;a:7:{s:2:"id";s:3:"398";s:4:"name";s:14:"Some product 1";s:5:"price";i:780;s:3:"uid";s:5:"FN-02";s:3:"num";s:1:"1";s:6:"weight";s:1:"0";s:4:"user";s:1:"4";}i:379;a:7:{s:2:"id";s:3:"379";s:4:"name";s:14:"Some product 2";s:5:"price";i:750;s:3:"uid";s:5:"FR-01";s:3:"num";s:1:"1";s:6:"weight";s:1:"0";s:4:"user";s:1:"4";}}s:3:"num";i:2;s:3:"sum";s:7:"1530.00";s:6:"weight";i:160;s:8:"dostavka";s:3:"180";}s:6:"Person";a:17:{s:4:"ouid";s:6:"103-47";s:4:"data";s:10:"1278090513";s:4:"time";s:8:"21:33 pm";s:4:"mail";s:15:"mail#mailer.com";s:11:"name_person";s:8:"John Doe";s:8:"org_name";s:13:"John Doe Inc.";s:7:"org_inn";s:12:"667110804509";s:7:"org_kpp";s:0:"";s:8:"tel_code";s:3:"343";s:8:"tel_name";s:7:"2670039";s:8:"adr_name";s:26:"London, 221b, Baker street";s:14:"dostavka_metod";s:1:"8";s:8:"discount";s:0:"";s:7:"user_id";s:2:"13";s:6:"dos_ot";s:0:"";s:6:"dos_do";s:0:"";s:11:"order_metod";s:1:"1";}}
What I can notice is that this text goes in order: [type]:[length]:[data];, where [type]: s stands for string and a stands for array (or dictionary in Python). It also has i:'number': groups without [length]:.
I don't see a better solution than parsing it with regex in several passes, though I don't clearly understand how to parse nested dictionaries (in Python terminology).
The question: is it a standard data structure that already has a parser?
This looks like the output from the PHP serialize function (you need to unserialize it):
http://php.net/manual/en/function.serialize.php
If you are working in python, there is a port of the serialize and unserialize functions here:
https://pypi.python.org/pypi/phpserialize
Anatomy of a serialize()'ed value:
String
s:size:value;
Integer
i:value;
Boolean
b:value; (does not store "true" or "false", does store '1' or '0')
Null
N;
Array
a:size:{key definition;value definition;(repeated per element)}
Object
O:strlen(object name):object name:object size:{s:strlen(property name):property name:property definition;(repeated per property)}
String values are always in double quotes
Array keys are always integers or strings
"null => 'value'" equates to 's:0:"";s:5:"value";',
"true => 'value'" equates to 'i:1;s:5:"value";',
"false => 'value'" equates to 'i:0;s:5:"value";',
"array(whatever the contents) => 'value'" equates to an "illegal offset type" warning because you can't use an array as a key; however, if you use a variable containing an array as a key, it will equate to 's:5:"Array";s:5:"value";',
and attempting to use an object as a key will result in the same behavior as using an array will.
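The grammar above is small enough that a reader for it can be sketched by hand. The following Ruby sketch is illustrative only and is not the phpserialize package linked above: parse_php_value is a hypothetical helper name, it covers only strings, integers, booleans, null and arrays, and it assumes the string lengths count bytes. For real data, an existing unserialize implementation is the safer choice.

```ruby
require "strscan"

# Minimal reader for a subset of PHP's serialize() format:
# strings (s), integers (i), booleans (b), null (N) and arrays (a).
# Objects, floats and references are deliberately not handled.
def parse_php_value(scanner)
  if scanner.scan(/N;/)
    nil
  elsif scanner.scan(/i:(-?\d+);/)
    scanner[1].to_i
  elsif scanner.scan(/b:([01]);/)
    scanner[1] == "1"
  elsif scanner.scan(/s:(\d+):"/)
    # The declared size says how many bytes to take, so quotes or
    # semicolons inside the value cannot confuse the reader.
    length = scanner[1].to_i
    value = scanner.string.byteslice(scanner.pos, length)
    scanner.pos += length
    scanner.scan(/";/) or raise ArgumentError, "unterminated string at byte #{scanner.pos}"
    value
  elsif scanner.scan(/a:(\d+):\{/)
    # An array is a run of key/value pairs; both are parsed recursively,
    # which is what handles the nesting.
    count = scanner[1].to_i
    result = {}
    count.times do
      key = parse_php_value(scanner)
      result[key] = parse_php_value(scanner)
    end
    scanner.scan(/\}/) or raise ArgumentError, "expected } at byte #{scanner.pos}"
    result
  else
    raise ArgumentError, "unsupported token at byte #{scanner.pos}"
  end
end

sample = 'a:1:{s:4:"cart";a:1:{i:398;a:2:{s:4:"name";s:14:"Some product 1";s:5:"price";i:780;}}}'
parsed = parse_php_value(StringScanner.new(sample))
puts parsed["cart"][398]["name"]
```

Because every string carries its own length, no regex passes over the nested content are needed; the recursion follows the a:size:{...} structure directly.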
