Datatype of list - moscow ml - ml

I would like to represent a list of catalogs and a list of books.
datatype catalog = KAT of string*catalog list| KIS of string*book list | EMPTY;
and I would like to count the books. I am trying to do something like this.
fun count([]) = 0
| count(book) = LIST.length(book)
| count(x1::xs) = count(x1) + count(xs);
and I receive a type error. What can I do to count the books?

I think you have to be more careful with the definition of functions. For example, it seems that you want to count all the books and the catalogs inside with the same function. This is not possible in ML: although a function can have several clauses for pattern matching, every clause has to match a value of the same type.
In your code, count() is applied to a single catalog, to a list of books and to a list of catalogs, which are three different types. Instead, write one function that counts the books in a list of catalogs and another that counts the books in a single catalog, and let them call each other. The length function works as expected, so the following functions may work:
fun countcatalogs ([]) = 0
| countcatalogs(cat::rest) = countbooks(cat) + countcatalogs(rest)
and
countbooks (EMPTY) = 0
| countbooks (KIS(_, l)) = length(l)
| countbooks (KAT(_, cats)) = countcatalogs(cats);

Related

FAME: How do I combine 2 data objects (each a list of series)?

I am not sure if anyone here is familiar with FAME as it is not commonly used.
But here is my code
DB="C:\..."
A_DATA="Profit.20?.FIRM_A"
B_DATA="Revenue.20?.FIRM_XX"
EXEC "OPEN <AC RE> """ + DB + """ AS OPENEDDATABASE" -- BASE STORING RAW DATA SERIES
NLIST = WILDLIST(OPENEDDATABASE,A_DATA)
NLIST2 = WILDLIST(OPENEDDATABASE,B_DATA)
I want to combine NLIST and NLIST2 into a single variable, but there doesn't seem to be any FAME function that allows me to do so.
Put another way, I need to have all the series named "Profit.20?.FIRM_A" and "Revenue.20?.FIRM_XX" in one variable.
Thank you
Unless I misunderstand something, namelists in FAME can simply be added together with the '+' operator:
fruits = {apples,oranges}
vegetables = {cabbage,carrots}
food = fruits + vegetables
type !food
apples,oranges,cabbage,carrots

Cypher statement with distinct match conditions is returning the same result

I am using Neo4j as a database to store voting information related to another database object.
I have a Vote object which has fields:
type:String with values of UP or DOWN.
argId:String which is a string ID value linking to a unique argument object
I am trying to query the number of votes assigned to a given argId using the following queries:
MATCH (v:Vote) WHERE v.argId = '214' AND v.type='DOWN'
RETURN {downvotes: COUNT(v)} AS votes
UNION
MATCH (v:Vote) WHERE v.argId = '214' AND v.type='UP'
RETURN {upvotes: COUNT(v)} AS votes
Note that the above Cypher works and returns the expected result, like so:
[
{
"downvotes": 1
},
{
"upvotes": 10
}
]
But I feel like the query could be a bit neater and want to write something like this:
MATCH (v:Vote) WHERE v.argId = '214' AND v.type='UP'
MATCH (b:Vote) WHERE b.argId = '214' AND b.type='DOWN'
RETURN {upvotes: COUNT(v), downvotes: COUNT(b)}
Just reading it through, I think it makes sense, b and v are declared as separate variables, so all should be good (so I thought).
But running it gives me this:
{
"upvotes": 10,
"downvotes": 10
}
But it should be what I have above.
Why is this?
I'm kinda new to neo4j and cypher so I've probably not understood how cypher works fully.
Can anyone shine any light?
Thank you!
p.s. I'm using Neo4j 3.5.6 and running the queries via the Desktop web browser app.
I think if you run this query you will get a clearer picture of what is happening. Your query produces a cartesian product of the upvotes (10) and the downvotes (1). The product is a result set of 10 rows. When they are subsequently counted, there are ten of each.
MATCH (v:Vote) WHERE v.argId = '214' AND v.type='UP'
MATCH (b:Vote) WHERE b.argId = '214' AND b.type='DOWN'
RETURN v.type, b.type
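The row multiplication can also be reproduced outside Neo4j. The following is a minimal Python sketch, not Cypher: the lists ups and downs are made-up stand-ins for the matched nodes, and itertools.product plays the role of the two independent MATCH clauses.

```python
from itertools import product

# Hypothetical stand-ins for the matched Vote nodes (10 UP, 1 DOWN).
ups = [f"up{i}" for i in range(10)]
downs = ["down0"]

# Two independent MATCH clauses pair every v with every b.
rows = list(product(ups, downs))

# COUNT(v) and COUNT(b) both count rows, not distinct nodes.
count_v = sum(1 for v, b in rows if v is not None)
count_b = sum(1 for v, b in rows if b is not None)

# Counting distinct nodes recovers the real totals.
distinct_v = len({v for v, _ in rows})
distinct_b = len({b for _, b in rows})

print(count_v, count_b)        # row counts
print(distinct_v, distinct_b)  # distinct node counts
```

In Cypher the analogous change would be COUNT(DISTINCT v) and COUNT(DISTINCT b), although an empty match on either side would still yield zero rows and therefore zero for both counts.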
In order to get the result you want you need to filter the values and count them individually.
Rather than have two match statements, have a single match statement that retrieves all of the values of interest and then use a conditional statement to filter them into upvotes and downvotes buckets.
Something like this may suit you.
MATCH (v:Vote {argId: '214'})
WHERE v.type IN ['UP', 'DOWN']
RETURN {
upvotes: count(CASE WHEN v.type = 'UP' THEN 1 END),
downvotes: count(CASE WHEN v.type = 'DOWN' THEN 1 END)
} AS vote_result
Using APOC you could do something like this whereby you use the type values themselves to aggregate the counts and then use APOC to convert it to a map with the types as the keys in the map.
MATCH (v:Vote {argId: '214'})
WHERE v.type IN ['UP', 'DOWN']
WITH [v.type, count(*)] AS vote_pair
RETURN apoc.map.fromPairs(collect(vote_pair)) AS votes
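The idea of grouping by the type value and turning the pairs into a map can be sketched in Python as well. This is only an illustration of the aggregation, assuming a made-up in-memory list named votes whose field names mirror the question; it does not talk to Neo4j.

```python
from collections import Counter

# Hypothetical vote records for one argument (10 UP, 1 DOWN).
votes = [{"argId": "214", "type": "UP"}] * 10 + [{"argId": "214", "type": "DOWN"}]

# One pass over the data, grouped by the type value itself.
counts = Counter(v["type"] for v in votes if v["type"] in ("UP", "DOWN"))

# Shape the result like the map asked for in the question.
result = {"upvotes": counts["UP"], "downvotes": counts["DOWN"]}
print(result)
```

Because every vote is visited exactly once, no rows are multiplied, which is the property the single-MATCH queries above rely on.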

Csv file to a Lua table and access the lines as new table or function()

Currently my code has simple tables containing the data needed for each object, like this:
infantry = {class = "army", type = "human", power = 2}
cavalry = {class = "panzer", type = "motorized", power = 12}
battleship = {class = "navy", type = "motorized", power = 256}
I use the table names as identifiers in various functions, which access and process their values one by one.
Now I want to have this data stored in a spreadsheet (csv file) instead that looks something like this:
Name        class   type       power
Infantry    army    human      2
Cavalry     panzer  motorized  12
Battleship  navy    motorized  256
The spreadsheet will not have more than 50 lines and I want to be able to increase columns in the future.
I tried a couple of approaches from similar situations I found here, but due to my lack of skill I failed to access any values from the nested table. I think this is because I don't fully understand how the table is structured after each line of the csv file has been read into it, and I therefore fail to print any values at all.
If there is a way to get the name, class, type and power from the table and use that line just like my old simple tables, I would appreciate an educational example. Another approach could be to declare new tables from the csv, line by line, that behave exactly like my old simple tables. I don't know if this is doable.
Using Lua 5.1
You can read the csv file in as a string. I will use a multi-line string here to represent the csv.
gmatch with the pattern [^\n]+ will return each row of the csv.
gmatch with the pattern [^,]+ will return the value of each column from a given row. Note that [^,]+ skips empty fields, so this assumes every cell has a value.
If more rows or columns are added, or if the columns are moved around, we will still reliably convert the information as long as the first row has the header information.
The only column that cannot move is the first one, the Name column. If it is moved, it will change the key used to store the row in the table.
The comments in the following code explain each step:
local csv = [[
Name,class,type,power
Infantry,army,human,2
Cavalry,panzer,motorized,12
Battleship,navy,motorized,256
]]
local items = {} -- rows keyed by the value in the first column
local headers = {} -- column names taken from the first line
local first = true
for line in csv:gmatch("[^\n]+") do
  if first then -- this is to handle the first line and capture our headers.
    local count = 1
    for header in line:gmatch("[^,]+") do
      headers[count] = header
      count = count + 1
    end
    first = false -- set first to false to switch off the header block
  else
    local name
    local i = 2 -- start at 2 because headers[1] is the name column, which is used as the key
    for field in line:gmatch("[^,]+") do
      name = name or field -- check if we know the name of our row
      if items[name] then -- if the name is already in the items table then this is a field
        items[name][headers[i]] = field -- assign our value at the header in the table with the given name.
        i = i + 1
      else -- if the name is not in the table we create a new index for it
        items[name] = {}
      end
    end
  end
end
Here is how you can load a csv using the I/O library:
-- Example of how to load the csv.
path = "some\\path\\to\\file.csv"
local f = assert(io.open(path))
local csv = f:read("*all")
f:close()
Alternatively, you can use io.lines(path), which would take the place of csv:gmatch("[^\n]+") in the for loop.
Here is an example of using the resulting table:
-- print table out
print("items = {")
for name, item in pairs(items) do
print(" " .. name .. " = { ")
for field, value in pairs(item) do
print(" " .. field .. " = ".. value .. ",")
end
print(" },")
end
print("}")
The output:
items = {
Infantry = {
type = human,
class = army,
power = 2,
},
Battleship = {
type = motorized,
class = navy,
power = 256,
},
Cavalry = {
type = motorized,
class = panzer,
power = 12,
},
}

Linq - how get the minimum, if value = 0, get the next value

I have a test database which logs data from when a store logs onto a store portal and how long it stays logged on.
Example:
(just for visualizing purposes - not actual database)
Stores
Id Description Address City
1 Candy shop 43 Oxford Str. London
2 Icecream shop 45 Side Lane Huddersfield
Connections
Id Store_Ref Start End
1 2 2011-02-11 09:12:34.123 2011-02-11 09:12:34.123
2 2 2011-02-11 09:12:36.123 2011-02-11 09:14:58.125
3 1 2011-02-14 08:42:10.855 2011-02-14 08:42:10.855
4 1 2011-02-14 08:42:12.345 2011-02-14 08:50:45.987
5 1 2011-02-15 08:35:19.345 2011-02-15 08:38:20.123
6 2 2011-02-19 09:08:55.555 2011-02-19 09:12:46.789
I need to get various data from the database. I've already gotten the max and average connection duration. So, probably self-evidently, I also need some information about which connection was the shortest. I of course immediately thought of the Min() function of Linq, but as you can see, the database also includes connections that started and ended instantly. Therefore, that data isn't actually "valid" for data analysis.
So my question is how to get the minimum value, but if the value = 0, get the next value that is the lowest.
My linq query so far (which implements the Min() function):
var min = from connections in Connections
join stores in Stores
on connections.Store_Ref equals stores.Id
group connections
by stores.Description into groupedStores
select new
{
Store_Description = groupedStores.Key,
Connection_Duration = groupedStores.Min(connections =>
(SqlMethods.DateDiffSecond(connections.Start, connections.End)))
};
I know that it's possible to get the valid values through multiple queries and/or statements though, but I was wondering if it's possible to do it all in just one query, since my program expects linq queries to be returned and my preference goes to keeping the program as "light" as possible.
If you have a great/simple method to do so, please share it. Your contribution is very appreciated! :)
What if you add, before the group clause (where connections is still in scope), a let clause for the duration of the connection with something like:
let duration = SqlMethods.DateDiffSecond(connections.Start, connections.End)
And then add a where clause
where duration != 0
var min = from connections in Connections.Where(connections => (SqlMethods.DateDiffSecond(connections.Start, connections.End) > 0))
join stores in Stores
on connections.Store_Ref equals stores.Id
group connections
by stores.Description into groupedStores
select new
{
Store_Description = groupedStores.Key,
Connection_Duration = groupedStores.Min(connections =>
(SqlMethods.DateDiffSecond(connections.Start, connections.End)))
};
Try this. By filtering out the "0" values you will get the right result, at least that is my thought.
Include a where clause before calculating the Min value.
groupedStores.Where(conn => SqlMethods.DateDiffSecond(conn.Start, conn.End) > 0)
.Min(conn => SqlMethods.DateDiffSecond(conn.Start, conn.End))
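The filter-then-minimum logic shared by these answers can be checked against the sample rows from the question. The following is a rough Python sketch, not LINQ: the function name min_nonzero_durations and the row layout are made up for illustration, and durations are fractional seconds, which may differ slightly from what DateDiffSecond reports.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# (store description, start, end) rows copied from the sample tables in the question.
connections = [
    ("Icecream shop", "2011-02-11 09:12:34.123", "2011-02-11 09:12:34.123"),
    ("Icecream shop", "2011-02-11 09:12:36.123", "2011-02-11 09:14:58.125"),
    ("Candy shop", "2011-02-14 08:42:10.855", "2011-02-14 08:42:10.855"),
    ("Candy shop", "2011-02-14 08:42:12.345", "2011-02-14 08:50:45.987"),
    ("Candy shop", "2011-02-15 08:35:19.345", "2011-02-15 08:38:20.123"),
    ("Icecream shop", "2011-02-19 09:08:55.555", "2011-02-19 09:12:46.789"),
]

def min_nonzero_durations(rows):
    """Return the shortest non-zero duration in seconds for each store."""
    durations = defaultdict(list)
    for store, start, end in rows:
        delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
        seconds = delta.total_seconds()
        if seconds > 0:  # drop connections that ended instantly
            durations[store].append(seconds)
    return {store: min(values) for store, values in durations.items()}

result = min_nonzero_durations(connections)
print(result)
```

A store whose connections all have zero duration simply disappears from the result, which is also what filtering before the join or group would do in the LINQ versions.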

Between query equivalent on App Engine datastore?

I have a model containing ranges of IP addresses, similar to this:
class Country(db.Model):
    begin_ipnum = db.IntegerProperty()
    end_ipnum = db.IntegerProperty()
On a SQL database, I would be able to find rows which contained an IP in a certain range like this:
SELECT * FROM Country WHERE ipnum BETWEEN begin_ipnum AND end_ipnum
or this:
SELECT * FROM Country WHERE begin_ipnum < ipnum AND end_ipnum > ipnum
Sadly, GQL only allows inequality filters on one property, and doesn't support the BETWEEN syntax. How can I work around this and construct a query equivalent to these on App Engine?
Also, can a ListProperty be 'live' or does it have to be computed when the record is created?
Question updated with a first stab at a solution:
So based on David's answer below and articles such as these:
http://appengine-cookbook.appspot.com/recipe/custom-model-properties-are-cute/
I'm trying to add a custom field to my model like so:
class IpRangeProperty(db.Property):
    def __init__(self, begin=None, end=None, **kwargs):
        if not isinstance(begin, db.IntegerProperty) or not isinstance(end, db.IntegerProperty):
            raise TypeError('Begin and End must be Integers.')
        self.begin = begin
        self.end = end
        super(IpRangeProperty, self).__init__(self.begin, self.end, **kwargs)
    def get_value_for_datastore(self, model_instance):
        begin = self.begin.get_value_for_datastore(model_instance)
        end = self.end.get_value_for_datastore(model_instance)
        if begin is not None and end is not None:
            return range(begin, end)
class Country(db.Model):
    begin_ipnum = db.IntegerProperty()
    end_ipnum = db.IntegerProperty()
    ip_range = IpRangeProperty(begin=begin_ipnum, end=end_ipnum)
The thinking is that after I add the custom property I can just import my dataset as is and then run queries based on the ListProperty like so:
q = Country.gql('WHERE ip_range = :1', my_num_ipaddress)
When I try to insert new Country objects this fails though, complaining about not being able to create the name:
...
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/db/__init__.py", line 619, in _attr_name
return '_' + self.name
TypeError: cannot concatenate 'str' and 'IntegerProperty' objects
I tried defining an attr_name method for the new property or just setting self.name but that does not seem to help. Hopelessly stuck or heading in the right direction?
Short answer: Between queries aren't really supported at the moment. However, if you know a priori that your range is going to be relatively small, then you can fake it: just store a list on the entity with every number in the range. Then you can use a simple equality filter to get entities whose ranges contain a particular value. Obviously this won't work if your range is large. But here's how it would work:
class M(db.Model):
    r = db.ListProperty(int)
# create an instance of M which has a range from `begin` to `end` (inclusive)
M(r=range(begin, end+1)).put()
# query to find instances of M which contain a value `v`
q = M.gql('WHERE r = :1', v)
The better solution (eventually; for now the following only works on the development server due to a bug, see issue 798): in theory, you can work around the limitations you mentioned and perform a range query by taking advantage of how db.ListProperty is queried. The idea is to store both the start and end of your range in a list (in your case, integers representing IP addresses). Then to get entities whose ranges contain some value v (i.e., between the two values in your list), you simply perform a query with two inequality filters on the list: one to ensure that v is at least as big as the smallest element in the list, and one to ensure that v is no bigger than the biggest element in the list.
Here's a simple example of how to implement this technique:
class M(db.Model):
    r = db.ListProperty(int)
# create an instance of M which has a range from `begin` to `end` (inclusive)
M(r=[begin, end]).put()
# query to find instances of M which contain a value `v`
q = M.gql('WHERE r >= :1 AND r <= :1', v)
My solution doesn't follow the pattern you have requested, but I think it would work well on app engine. I'm using a list of strings of CIDR ranges to define the IP blocks instead of specific begin and end numbers.
from google.appengine.ext import db
class Country(db.Model):
    subnets = db.StringListProperty()
    country_code = db.StringProperty()
c = Country()
c.subnets = ['1.2.3.0/24', '1.2.0.0/16', '1.3.4.0/24']
c.country_code = 'US'
c.put()
c = Country()
c.subnets = ['2.2.3.0/24', '2.2.0.0/16', '2.3.4.0/24']
c.country_code = 'CA'
c.put()
# Search for 1.2.4.5 starting with most specific block and then expanding until found
result = Country.all().filter('subnets =', '1.2.4.5/32').fetch(1)
result = Country.all().filter('subnets =', '1.2.4.4/31').fetch(1)
result = Country.all().filter('subnets =', '1.2.4.4/30').fetch(1)
result = Country.all().filter('subnets =', '1.2.4.0/29').fetch(1)
# ... repeat until found
# optimize by starting with the largest routing prefix actually found in your data (probably not 32)
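The "repeat until found" step can be automated by generating the candidate blocks instead of typing them out. The following is a minimal sketch, assuming a Python 3 environment with the standard ipaddress module (the db code above targets the older App Engine runtime); the helper name candidate_subnets and its min_prefix parameter are hypothetical.

```python
import ipaddress

def candidate_subnets(ip, min_prefix=8):
    """List the CIDR blocks containing ip, most specific first."""
    blocks = []
    for prefix in range(32, min_prefix - 1, -1):
        # strict=False masks off the host bits, e.g. 1.2.4.5/31 -> 1.2.4.4/31
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        blocks.append(str(network))
    return blocks

candidates = candidate_subnets("1.2.4.5")
print(candidates[:4])
```

Each string in candidates could then be passed in turn to the 'subnets =' filter shown above, stopping at the first query that returns a result.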
