I am trying to swap the named columns of a numpy array as shown below, but the function is not behaving as I anticipated. I see that the original 'data' is being changed even when I use deepcopy from the copy module. Is there something that I am missing?
import copy
import numpy as np
data = np.array([(1.0, 2), (3.0, 4)], dtype=[('x', float), ('y',float)])
def rot(data, i):
    rotdata = copy.deepcopy(data)
    print(data['x'])
    if i == 0:
        pass
    if i == 1:
        rotdata['x'] = 5 - rotdata['x']
    if i == 2:
        rotdata.dtype.names = ['y', 'x']
    if i == 3:
        rotdata.dtype.names = ['y', 'x']
        rotdata['x'] = 5 - rotdata['x']
    if i == 4:
        rotdata['x'] = 5 - rotdata['x']
        rotdata.dtype.names = ['y', 'x']
    if i == 5:
        rotdata['x'] = 5 - rotdata['x']
        rotdata.dtype.names = ['y', 'x']
        rotdata['x'] = 5 - rotdata['x']
    return rotdata
data1 = rot(data,5)
data2 = rot(data,5)
print(data1)
print(data2)
The result is,
[1. 3.]
[2. 4.]
[(4., 3.) (2., 1.)]
[(1., 2.) (3., 4.)]
This is contrary to my intentions.
Apparently copy.deepcopy() does not make a deep copy of the dtype object attached to the numpy array. The data inside the array was copied, but you were swapping the names 'x' and 'y' in the shared data.dtype. That is why printing data['x'] gave a different result, as did the second call data2 = rot(data,5).
You can solve it by deep-copying the dtype as well:
rotdata = copy.deepcopy(data)
rotdata.dtype = copy.deepcopy(data.dtype)
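To see the sharing concretely, here is a minimal standalone sketch (not the original rot function) showing that the deep-copied array still carries the same dtype object, and that deep-copying the dtype isolates the rename:

```python
import copy
import numpy as np

data = np.array([(1.0, 2.0), (3.0, 4.0)], dtype=[('x', float), ('y', float)])

# copy.deepcopy duplicates the array's buffer, but the dtype object is shared
clone = copy.deepcopy(data)
print(clone.dtype is data.dtype)  # the shared dtype is the culprit

# give the clone its own dtype before renaming the fields
clone.dtype = copy.deepcopy(data.dtype)
clone.dtype.names = ('y', 'x')
print(data.dtype.names)  # the original keeps ('x', 'y')
```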
I am attempting to use data from a JSON file which has information stored by the user, but when I print from it, it displays all the information stored in it. How do I extract only one part? Here is the code:
@client.command(aliases=["shib", "shibaku", "Shib"])
async def Shibaku(ctx, int = 0):
    if int == 1:
        with open('Shibaku1.json') as f:
            coins_data = json.load(f)
        for oslink in coins_data[str(ctx.author.id)]:
            await ctx.send(oslink)
Here is the code for storing information in "Shibaku1.json"
@client.command()
async def shibaku1(ctx, coin1, coin2, coin3, coin4, coin5, coin6, shibakunumber, oslink):
    await ctx.message.delete()
    with open('Shibaku10.json', 'r') as f:
        coins_data = json.load(f)
    coins_data[str(ctx.author.id)] = (coin1, coin2, coin3, coin4, coin5, coin6, shibakunumber, oslink)
    with open('Shibaku10.json', 'w') as f:
        json.dump(coins_data, f)
Sample JSON file:
{"331971067788787733": ["\ud83d\ude04", "\ud83d\ude06", "\ud83d\ude00", "\ud83d\ude01", "\ud83d\ude05", "\ud83e\udd0f", "1", "1"]}
I want to display only the "oslink" part.
@client.command(aliases=["shib", "shibaku", "Shib"])
async def Shibaku(ctx, int = 0):
    if int == 1:
        with open('Shibaku1.json') as f:
            coins_data = json.load(f)
        for oslink in coins_data[str(ctx.author.id)]:
            await ctx.send(oslink)
In this for loop block, you are iterating over every single element of the value of coins_data[author_id].
And from what I can tell, your coins_data is structured like:
{
    author_id1: (list of coins, shibakunumber, oslink),
    author_id2: (list of coins, shibakunumber, oslink),
    ...
}
Because the value for the author_id key is just a flat list, you are sending all of the coins, the shibakunumber, and the oslink.
If you only want to send the oslink, you need to structure the data so that you can specifically call for the oslink.
For example, a nested dictionary will work:
{
    author_id1: {
        coin1: value,
        coin2: value,
        ...,
        oslink: somevalue
    },
    author_id2: {
        coin1: value,
        coin2: value,
        ...,
        oslink: somevalue
    },
    ...
}
This way, you can look it up specifically, like coins_data[str(ctx.author.id)]["oslink"].
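A minimal sketch of that shape (plain dicts with hypothetical values; the "coins" key, the user id, and the link are placeholders, not from the original file):

```python
import json

# hypothetical nested layout: one dict per user instead of a flat list
coins_data = {
    "331971067788787733": {
        "coins": ["c1", "c2", "c3", "c4", "c5", "c6"],
        "shibakunumber": "1",
        "oslink": "https://example.com/profile",  # placeholder link
    }
}

# the layout round-trips through JSON unchanged
coins_data = json.loads(json.dumps(coins_data))

author_id = "331971067788787733"
print(coins_data[author_id]["oslink"])  # prints only the link
```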
So you want every coin to come one by one; here's my answer:
@client.command(aliases=["shib", "shibaku", "Shib"])
async def Shibaku(ctx, int = 0):
    if int == 1:
        with open('Shibaku1.json') as f:
            coins_data = json.load(f)
        user_coin_list = coins_data[str(ctx.author.id)]
        for oslink in user_coin_list:
            await ctx.send(oslink)
The above works because coins_data[str(ctx.author.id)] returns a list; you then run the for loop over that list, rather than looping over the JSON data directly.
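If restructuring the file is not an option, note that shibaku1 stores oslink as the last element of the tuple, so you can index it directly instead of looping (a sketch using the question's layout, with made-up values):

```python
import json

# same layout as the sample file: six coins, shibakunumber, then oslink
raw = '{"331971067788787733": ["c1", "c2", "c3", "c4", "c5", "c6", "1", "https://example.com"]}'
coins_data = json.loads(raw)

entry = coins_data["331971067788787733"]
oslink = entry[-1]  # the link was stored last
print(oslink)  # https://example.com
```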
I am trying to fetch data from a db database in batches and copy it to an ndb database, using a cursor. My code does this successfully for the first batch, but does not fetch any further records. I did not find much information on cursors; please help me here.
Here is my code snippet:
def post(self):
    a = 0
    chunk_size = 2
    next_cursor = self.request.get("cursor")
    query = db.GqlQuery("select * from BooksPost")
    while a == 0:
        if next_cursor:
            query.with_cursor(start_cursor = next_cursor)
        else:
            a = 1
        results = query.fetch(chunk_size)
        for result in results:
            nbook1 = result.bookname
            nauthor1 = result.authorname
            nbook1 = nBooksPost(nbookname = nbook1, nauthorname = nauthor1)
            nbook1.put()
        next_cursor = self.request.get("cursor")
Basically, how do I set the next cursor to iterate over?
def post(self):
    chunk_size = 10
    has_more_results = True
    query = db.GqlQuery("select * from Post")
    cursor = self.request.get('cursor', None)
    #cursor = query.cursor()
    if cursor:
        query.with_cursor(cursor)
    while has_more_results == True:
        results = query.fetch(chunk_size)
        new_cursor = query.cursor()
        print("count: %d, results %d" % (query.count(), len(results)))
        if query.count(1) == 1:
            has_more_results = True
        else:
            has_more_results = False
        for result in results:
            # do this
            pass
        query.with_cursor(new_cursor)
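The essential point is that the next cursor comes from the query itself after each fetch (query.cursor()), not from the request again. Here is the same pagination pattern sketched in plain Python with a stand-in fetch, so the loop shape is clear (fetch_page and its integer cursor are illustrative, not part of the App Engine API):

```python
def fetch_page(items, cursor, limit):
    # stand-in for query.fetch(limit): returns one page plus the new cursor
    start = cursor or 0
    page = items[start:start + limit]
    return page, start + len(page)

def fetch_all(items, limit=2):
    results, cursor = [], None
    while True:
        page, cursor = fetch_page(items, cursor, limit)  # advance the cursor each batch
        if not page:  # an empty page means there is nothing left
            break
        results.extend(page)
    return results

print(fetch_all(list(range(5))))  # [0, 1, 2, 3, 4]
```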
Say I have a movie database where you can search by title.
I have a Movie model that looks like the following (simplified)
class Movie(ndb.Model):
    name = ndb.StringProperty(required=True)
    queryName = ndb.ComputedProperty(lambda self: [w.lower() for w in self.name.split()], repeated=True)

    @staticmethod
    def parent_key():
        return ndb.Key(Movie, 'parent')
The queryName is just a lowercase list of the words in Movie.name. The parent_key() is basically just there for the query.
If I was searching for the movie Forest Gump, I would want it to show up for the following search terms (and more, these are just examples)
'fo' - 'forest' starts with 'fo'
'gu' - 'gump' starts with 'gu'
'gu fo' - 'forest' starts with 'fo' and 'gump' starts with 'gu'
I can get the first two easily with a query similar to the following:
movies = Movie\
    .query(ancestor=Movie.parent_key())\
    .filter(Movie.queryName >= x)\
    .filter(Movie.queryName < x + u'\ufffd')\
    .fetch(10)
where x is 'fo' or 'gu'. Again, this is simply a query that works, not my actual code; that comes later. If I expand a bit on the above query to look for two words, I thought I could do something like the following; however, it doesn't work.
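The >= x / < x + u'\ufffd' pair works because it keeps exactly the strings that sort between the prefix itself and the prefix followed by a very high code point, which for ordinary text is the same as a startswith test. A quick plain-Python check of that range logic:

```python
# prefix-range trick: for ordinary text, p <= w < p + '\ufffd' matches w.startswith(p)
words = ['forest', 'forever', 'gump', 'go', 'fog']
prefix = 'fo'
hits = [w for w in words if prefix <= w < prefix + '\ufffd']
print(sorted(hits))  # ['fog', 'forest', 'forever']
```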
movies = Movie\
    .query(ancestor=Movie.parent_key())\
    .filter(Movie.queryName >= 'fo')\
    .filter(Movie.queryName < 'fo' + u'\ufffd')\
    .filter(Movie.queryName >= 'gu')\
    .filter(Movie.queryName < 'gu' + u'\ufffd')\
    .fetch(10)
Now, this doesn't work because it is looking in queryName for any single item which starts with 'fo' and starts with 'gu'. Since that can never be true for a single item in the list, the query returns nothing.
The question is how do you query for Movies which have a queryName with an item that starts with 'fo' AND an item that starts with 'gu'?
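Written out in plain Python, the per-movie check being asked for looks like this; it is exactly the condition a single datastore query cannot express over one repeated property:

```python
def matches(query_name, terms):
    # every search term must be a prefix of at least one word in queryName
    return all(any(word.startswith(t) for word in query_name) for t in terms)

print(matches(['forest', 'gump'], ['fo', 'gu']))  # True
print(matches(['forest', 'gump'], ['fo', 'zz']))  # False
```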
Actual Code:
class MovieSearchHandler(BaseHandler):
    def get(self):
        q = self.request.get('q')
        if q:
            q = q.replace('&amp;', '&').lower()
            filters = self.create_filter(*q.split())
            if filters:
                movies = Movie\
                    .query(ancestor=Movie.parent_key())\
                    .filter(*filters)\
                    .fetch(10)
                return self.write_json([{'id': m.movieId, 'name': m.name} for m in movies])
        return self.write_json([])

    def create_filter(self, *args):
        filters = []
        if args:
            for prefix in args:
                filters.append(Movie.queryName >= prefix)
                filters.append(Movie.queryName < prefix + u'\ufffd')
        return filters
Update:
My current solution is
class MovieSearchHandler(BaseHandler):
    def get(self):
        q = self.request.get('q')
        if q:
            q = q.replace('&amp;', '&').lower().split()
            movieFilter, reducable = self.create_filter(*q)
            if movieFilter:
                movies = Movie\
                    .query(ancestor=Movie.parent_key())\
                    .filter(movieFilter)\
                    .fetch(None if reducable else 10)
                if reducable:
                    movies = self.reduce(movies, q)
                return self.write_json([{'id': m.movieId, 'name': m.name} for m in movies])
        return self.write_json([])

    def create_filter(self, *args):
        if args:
            if len(args) == 1:
                prefix = args[0]
                return ndb.AND(Movie.queryName >= prefix, Movie.queryName < prefix + u'\ufffd'), False
            ands = [ndb.AND(Movie.queryName >= prefix, Movie.queryName < prefix + u'\ufffd')
                    for prefix in args]
            return ndb.OR(*ands), True
        return None, False

    def reduce(self, movies, terms):
        reducedMovies = []
        for m in movies:
            if len(reducedMovies) >= 10:
                return reducedMovies
            if all(any(n.startswith(t) for n in m.queryName) for t in terms):
                reducedMovies.append(m)
        return reducedMovies
Still looking for something better though
Thanks
I have two arrays of hashes with the format:
hash1
[{:root => root_value, :child1 => child1_value, :subchild1 => subchild1_value, :bases => [hit1, hit2, hit3]}...]
hash2
[{:path => root_value/child1_value/subchild1_value, :hit1_exist => 't', :hit2_exist => 't', :hit3_exist => 'f'}...]
If I do this:
def sample
  results = nil
  project = Project.find(params[:project_id])
  testrun_query = "SELECT root_name, suite_name, case_name, ic_name, executed_platforms FROM testrun_caches WHERE start_date >= '#{params[:start_date]}' AND start_date < '#{params[:end_date]}' AND project_id = #{params[:project_id]} AND result <> 'SKIP' AND result <> 'N/A'"
  if !params[:platform].nil? && params[:platform] != [""]
    #yell_and_log "platform not nil"
    platform_query = nil
    params[:platform].each do |platform|
      if platform_query.nil?
        platform_query = " AND (executed_platforms LIKE '%#{platform.to_s},%'"
      else
        platform_query += " OR executed_platforms LIKE '%#{platform.to_s},%'"
      end
    end
    testrun_query += platform_query + ")"
  end
  if !params[:location].nil? && !params[:location].empty?
    #yell_and_log "location not nil"
    testrun_query += " AND location LIKE '#{params[:location].to_s}%'"
  end
  testrun_query += " GROUP BY root_name, suite_name, case_name, ic_name, executed_platforms ORDER BY root_name, suite_name, case_name, ic_name"
  ic_query = "SELECT ics.path, memberships.pts8210, memberships.sv6, memberships.sv7, memberships.pts14k, memberships.pts22k, memberships.pts24k, memberships.spb32, memberships.spb64, memberships.sde, projects.name FROM ics INNER JOIN memberships on memberships.ic_id = ics.id INNER JOIN test_groups ON test_groups.id = memberships.test_group_id INNER JOIN projects ON test_groups.project_id = projects.id WHERE deleted = 'false' AND (memberships.pts8210 = true OR memberships.sv6 = true OR memberships.sv7 = true OR memberships.pts14k = true OR memberships.pts22k = true OR memberships.pts24k = true OR memberships.spb32 = true OR memberships.spb64 = true OR memberships.sde = true) AND projects.name = '#{project.name}' GROUP BY path, memberships.pts8210, memberships.sv6, memberships.sv7, memberships.pts14k, memberships.pts22k, memberships.pts24k, memberships.spb32, memberships.spb64, memberships.sde, projects.name ORDER BY ics.path"
  if params[:ic_type] == "never_run"
    runtest = TestrunCache.connection.select_all(testrun_query)
    alltest = TrsIc.connection.select_all(ic_query)
    (alltest.length).times do |i|
      #exec_pltfrm = test['executed_platforms'].split(",")
      unfinishedtest = comparison(runtest[i], alltest[i])
      yell_and_log("test = #{unfinishedtest}")
      yell_and_log("#{runtest[i]}")
      yell_and_log("#{alltest[i]}")
    end
  end
end
I get in my log:
test = true
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"cli", "case_name"=>"functional", "ic_name"=>"cli_sanity_test", "executed_platforms"=>"pts22k,pts24k,sv7,"}
array of hash 2 = {"path"=>"BSDPLATFORM/cli/functional/cli_sanity_test", "pts8210"=>"f", "sv6"=>"f", "sv7"=>"t", "pts14k"=>nil, "pts22k"=>"t", "pts24k"=>"t", "spb32"=>nil, "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
test = false
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"infrastructure", "case_name"=>"bypass_pts14k_copper", "ic_name"=>"ic_packet_9", "executed_platforms"=>"sv6,"}
array of hash 2 = {"path"=>"BSDPLATFORM/infrastructure/build/copyrights", "pts8210"=>"f", "sv6"=>"t", "sv7"=>"t", "pts14k"=>"f", "pts22k"=>"t", "pts24k"=>"t", "spb32"=>"f", "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
test = false
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"infrastructure", "case_name"=>"bypass_pts14k_copper", "ic_name"=>"ic_status_1", "executed_platforms"=>"sv6,"}
array of hash 2 = {"path"=>"BSDPLATFORM/infrastructure/build/ic_1", "pts8210"=>"f", "sv6"=>"t", "sv7"=>"t", "pts14k"=>"f", "pts22k"=>"t", "pts24k"=>"t", "spb32"=>"f", "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
test = false
array of hash 1 = {"root_name"=>"BSDPLATFORM", "suite_name"=>"infrastructure", "case_name"=>"bypass_pts14k_copper", "ic_name"=>"ic_status_2", "executed_platforms"=>"sv6,"}
array of hash 2 = {"path"=>"BSDPLATFORM/infrastructure/build/ic_files", "pts8210"=>"f", "sv6"=>"t", "sv7"=>"f", "pts14k"=>"f", "pts22k"=>"t", "pts24k"=>"t", "spb32"=>"f", "spb64"=>nil, "sde"=>nil, "name"=>"pts_6_20"}
So only the first pair matches; after that the two arrays fall out of alignment, and I get one result instead of 4230.
I would like some way to match by path against root/suite/case/ic, and then compare the executed platforms listed in array of hashes 1 against the platforms set to true in array of hashes 2.
Not sure if this is fastest, and I wrote this based on your original question, which didn't provide sample code, but:
def compare(h1, h2)
  (h2[:path] == "#{h1[:root]}/#{h1[:child1]}/#{h1[:subchild1]}") && \
  (h2[:hit1_exist] == ((h1[:bases][0] == nil) ? 'f' : 't')) && \
  (h2[:hit2_exist] == ((h1[:bases][1] == nil) ? 'f' : 't')) && \
  (h2[:hit3_exist] == ((h1[:bases][2] == nil) ? 'f' : 't'))
end

def compare_arr(h1a, h2a)
  (h1a.length).times do |i|
    compare(h1a[i], h2a[i])
  end
end
Test:
require "benchmark"
h1a = []
h2a = []
def rstr
# from http://stackoverflow.com/a/88341/178651
(0...2).map{65.+(rand(26)).chr}.join
end
def rnil
rand(2) > 0 ? '' : nil
end
10000.times do
h1a << {:root => rstr(), :child1 => rstr(), :subchild1 => rstr(), :bases => [rnil,rnil,rnil]}
h2a << {:path => '#{rstr()}/#{rstr()}/#{rstr()}', :hit1_exist => 't', :hit2_exist => 't', :hit3_exist => 'f'}
end
Benchmark.measure do
compare_arr(h1a,h2a)
end
Results:
=> 0.020000 0.000000 0.020000 ( 0.024039)
Now that I'm looking at your code, I think it could be optimized by removing the array creations and the splits and joins, which produce intermediate arrays and strings that need to be garbage collected; that also slows things down, though not by as much as you mention.
Your database queries may be slow. Run explain/analyze or similar on them to see why each is slow, optimize/reduce your queries, add indexes where needed, etc. Also, check cpu and memory utilization, etc. It might not just be the code.
But, there are some definite things that need to be fixed. You also have several risks of SQL injection attack, e.g.:
... start_date >= '#{params[:start_date]}' AND start_date < '#{params[:end_date]}' AND project_id = #{params[:project_id]} ...
Anywhere that params and variables are put directly into the SQL may be a danger. You'll want to make sure to use prepared statements or at least SQL escape the values. Read this all the way through: http://guides.rubyonrails.org/active_record_querying.html
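The safe pattern is the same in any client: hand values to the driver as bind parameters instead of interpolating them into the SQL string. A minimal sqlite3 sketch (Python here purely for illustration; the table and values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testrun_caches (start_date TEXT, project_id INTEGER)")
conn.execute("INSERT INTO testrun_caches VALUES ('2013-06-01', 7)")

# values are passed separately as bind parameters, so quoting is the driver's job
rows = conn.execute(
    "SELECT * FROM testrun_caches WHERE start_date >= ? AND project_id = ?",
    ("2013-01-01", 7),
).fetchall()
print(rows)  # [('2013-06-01', 7)]
```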
([element_being_tested].each do |el|
  hash_array_1.zip(hash_array_2).reject do |x, y|
    x[el] == y[el]
  end
end).each {|x, y| puts (x[:bases] | y[:bases])}
Enumerate the hash elements to test.
[element_being_tested].each do |el|
Then iterate through the hash arrays themselves, comparing the given hashes by the elements of the given comparison defined by the outer loop, rejecting those not appropriately equal. (The == may actually need to be != but you can figure that much out)
hash_array_1.zip(hash_array_2).reject do |x, y|
  x[el] == y[el]
end
Finally, you again compare the hashes taking the set union of their elements.
.each {|x, y| puts (x[:bases] | y[:bases])}
You may need to test the code. It's not meant for production so much as demonstration because I wasn't sure I read your code right. Please post a larger sample of the source including the data structures in question if this answer is unsatisfactory.
Regarding speed: if you're iterating through a large data set and making multiple comparisons, there's probably not much you can do. Perhaps you can invert the loops I presented and make the hash arrays the outer loop. You're not going to get lightning speed here in Ruby (or really any language) if the data structure is large.