Get column values from a Haskell Database


Here's the problem. Let it be known that I'm very new to Haskell, and the declarative style is totally different from what I'm used to. I've made a database of sorts, and the user can input commands like "Add (User "Name")" or "Create (Table "Funding")". I'm trying to write a function that takes as parameters a list of commands, a User, a Table, and a Column name (as a String), and returns a list containing the values in that column if the user has access to them (i.e. somewhere in the list of commands there is one that matches "Allow (User name, Table "Funds")"). We can assume the table exists.
module Database where

type Column = String
data User = User String deriving (Eq, Show)
data Table = Table String deriving (Eq, Show)
data Command =
    Add User
  | Create Table
  | Allow (User, Table)
  | Insert (Table, [(Column, Integer)])
  deriving (Eq, Show)

-- Useful function for retrieving a value from a list
-- of (label, value) pairs.
lookup' :: Column -> [(Column, Integer)] -> Integer
lookup' c' ((c,i):cvs) = if c == c' then i else lookup' c' cvs

lookupColumn :: [(Column, Integer)] -> [Integer]
lookupColumn ((c, i):cvs) = if null cvs then [i] else [i] ++ lookupColumn cvs

select :: [Command] -> User -> Table -> Column -> Maybe [Integer]
select a b c d = if not (elem (b, c) [(g, h) | Allow (g, h) <- a])
    then Nothing
    else Just (lookupColumn [(d, x) | Insert (c, [ (d, x ), _ ]) <- a])
I have gotten it to work, but only in very select cases. Right now, the format of the input has to be such that the column we want the values from must be the first column in the table. Example input is below. Running: select example (User "Alice") (Table "Revenue") "Day" returns Just [1,2,3] like it should, but replacing Day with Amount doesn't work.
example = [
    Add (User "Alice"),
    Add (User "Bob"),
    Create (Table "Revenue"),
    Insert (Table "Revenue", [("Day", 1), ("Amount", 2400)]),
    Insert (Table "Revenue", [("Day", 2), ("Amount", 1700)]),
    Insert (Table "Revenue", [("Day", 3), ("Amount", 3100)]),
    Allow (User "Alice", Table "Revenue")
  ]
A bit of explanation about the functions. select is the function which should return the list of integers in that column. Right now, it's only matching the first column, but I'd like it to work with any number of columns, not knowing which column the user wants ahead of time.
[(d, x) | Insert (c, [ (d, x ), _ ]) <- a] returns a list of tuples that match only the first tuple in each list of (Column, Integer) tuples.
lookupColumn takes in a list of tuples and returns a list of the integers within it. Unlike lookup', we know that the list this takes in has only the correct column's (Column, Integer) tuples within it. lookup' can take in a list of any number of tuples, but must check if the column names match first.
Any help at all would be greatly appreciated.

There are a couple strange things in your code; for example:
lookupColumn :: [(Column, Integer)] -> [Integer]
lookupColumn ((c, i):cvs) = if null cvs then [i] else [i] ++ lookupColumn cvs
is much longer to type than the equivalent (and probably faster) map snd, and unlike map snd it crashes on an empty list.
Furthermore when you're defining your own data structures often tuples are superfluous; you could just write:
data Command = Add User
             | Create Table
             | Allow User Table
             | Insert Table [(Column, Integer)]
             deriving (Eq, Show)
The actual problem is the _ in your select statement which explicitly tells Haskell to throw away the second value of the tuple. Instead you want something which grabs all (Column, Integer) pairs that are associated with a table:
getCells :: [Command] -> Table -> [(Column, Integer)]
getCells db t = concat [cis | Insert t' cis <- filter isInsert db, t == t']
  where isInsert (Insert _ _) = True
        isInsert _ = False
(note that this is using the un-tupled version of Insert that I wrote above). With this the algorithm becomes much easier:
select :: [Command] -> User -> Table -> Column -> Maybe [Integer]
select db user table col
    | Allow user table `elem` db = Just [i | (c, i) <- getCells db table, col == c]
    | otherwise = Nothing
What's doing the majority of the "work" here? Actually it's just the concat :: [[a]] -> [a] that we used in getCells. By concatenating together all of the (Column, Integer) pairs for all of the rows/cols in the table, we have a really easy time of pulling out only the column that we need.
Todo: stop this code from doing something unexpected when someone says Insert (Table "Revenue") [("Amount", 1), ("Amount", 2400)], which will appear in the output as two rows even though it only comes from one row. You can either normalize-on-input, which will do pretty well, or return [Maybe Integer], giving nulls for the rows which do not have a value (lookup in the standard Prelude will take the place of concat in doing your work for you).
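The row-wise variant mentioned in the todo can be sketched roughly as follows. This is written in Python rather than Haskell, purely to illustrate the idea of "one entry per inserted row, with a null where the column is missing". The tuple encoding of the commands and the name select_rows are assumptions made for this sketch, not part of the code above.

```python
from typing import Optional

# Hypothetical encoding of the commands as plain tuples:
#   ("Allow", user, table) and ("Insert", table, [(column, value), ...])
def select_rows(db, user, table, col) -> Optional[list]:
    """Return one entry per inserted row: the value of `col`, or None if the row lacks it."""
    if ("Allow", user, table) not in db:
        return None  # the user has no access to this table
    result = []
    for cmd in db:
        if cmd[0] == "Insert" and cmd[1] == table:
            # dict() keeps the last value if a column name is repeated within one row
            result.append(dict(cmd[2]).get(col))
    return result

example = [
    ("Create", "Revenue"),
    ("Insert", "Revenue", [("Day", 1), ("Amount", 2400)]),
    ("Insert", "Revenue", [("Day", 2)]),
    ("Allow", "Alice", "Revenue"),
]
print(select_rows(example, "Alice", "Revenue", "Amount"))  # [2400, None]
print(select_rows(example, "Bob", "Revenue", "Amount"))    # None
```

Because each row contributes exactly one entry, a duplicated column name inside a single Insert can no longer show up as two rows.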

Related

Convert string to variable name in Lua

In Lua, I have a set of tables:
Column01 = {}
Column02 = {}
Column03 = {}
ColumnN = {}
I am trying to access these tables dynamically depending on a value. So, later on in the programme, I am creating a variable like so:
local currentColumn = "Column" .. variable
Where variable is a number 01 to N.
I then try to do something to all elements in my array like so:
for i = 1, #currentColumn do
    currentColumn[i] = *do something*
end
But this doesn't work as currentColumn is a string and not the name of the table. How can I convert the string into the name of the table?

If I understand correctly, you're saying that you'd like to access a variable based on its name as a string? I think what you're looking for is the global variable, _G.
Recall that in a table, you can use strings as keys. Think of _G as one giant table where each global table or variable you make is stored under its name as a key. Local variables are not in _G.
Column1 = {"A", "B"}
string1 = "Column".."1" --concatenate column and 1. You might switch out the 1 for a variable. If you use a variable, make sure to use tostring, like so:
var = 1
string2 = "Column"..tostring(var) --becomes "Column1"
print(_G[string2]) --prints the location of the table. You can index it like any other table, like so:
print(_G[string2][1]) --prints the 1st item of the table. (A)
So if you wanted to loop through 5 tables called Column1,Column2 etc, you could use a for loop to create the string then access that string.
C1 = {"A"} --I shortened the names to just C for ease of typing this example.
C2 = {"B"}
C3 = {"C"}
C4 = {"D"}
C5 = {"E"}
for i=1, 5 do
    local v = "C"..tostring(i)
    print(_G[v][1])
end
Output
A
B
C
D
E
Edit: I'm a doofus and I overcomplicated everything. There's a much simpler solution. If you only want to access the columns within a loop instead of accessing individual columns at certain points, the easier solution here for you might just be to put all your columns into a bigger table then index over that.
columns = {{"A", "1"},{"B", "R"}} --each anonymous table is a column. If it has a key attached to it like "column1 = {"A"}" it can't be numerically iterated over.
--You could also insert on the fly.
column3 = {"C"}
table.insert(columns, column3)
for i,v in ipairs(columns) do
    print(i, v[1]) --i is the index and v is the table. This will print which column you're on, and get the 1st item in the table.
end
Output:
1 A
2 B
3 C
To future readers: If you want a general solution to getting tables by their name as a string, the first solution with _G is what you want. If you have a situation like the asker, the second solution should be fine.

Find valid combinations based on matrix

I have the following matrix in Calc: the first row (1) contains employee numbers, the first column (A) contains product codes.
Wherever there is an X, that product was sold by the corresponding employee above.
     | 0302 | 0303 | 0304 | 0402 |
1625 |  X   |      |  X   |  X   |
1643 |      |  X   |  X   |      |
...
We see that product 1643 was sold by employees 0303 and 0304
What I would like to see is a list of what product was sold by which employees but formatted like this:
1625 | 0302, 0304, 0402 |
1643 | 0303, 0304 |
The reason for this is that we ultimately need this matrix imported into a SQL Server table. We have no access to the origins of this matrix. It contains about 50 employees and 9000+ products.
Thanks for thinking with us!

Try something like this:
;with data as
(
    SELECT *
    FROM ( VALUES (1625,'X',NULL,'X','X'),
                  (1643,NULL,'X','X',NULL))
         cs (col1, [0302], [0303], [0304], [0402])
),cte
AS (SELECT col1,
           col
    FROM data
    CROSS APPLY (VALUES ('0302',[0302]),
                        ('0303',[0303]),
                        ('0304',[0304]),
                        ('0402',[0402])) cs (col, val)
    WHERE val IS NOT NULL)
SELECT col1,
       LEFT(cs.col, Len(cs.col) - 1) AS col
FROM cte a
CROSS APPLY (SELECT col + ','
             FROM cte b
             WHERE a.col1 = b.col1
             FOR XML PATH('')) cs (col)
GROUP BY col1,
         LEFT(cs.col, Len(cs.col) - 1)

I think there are two problems to solve:
1. get the employee numbers for the X marks;
2. concatenate them into a single, comma-separated string.
I can't offer a solution for both issues in one step, but you may handle both issues separately.
1. To replace the X marks with the respective employee numbers (taken from the header row), you could use an array formula to create a second table (matrix). To do so, create a new sheet, copy the first column and first row, and enter the following formula in cell B2:
=IF($B2:$E3="X";$B$1:$E$1;"")
You'll have to adapt the formula, so it covers your complete input data (If your last data cell is Z9999, it would be =IF($B2:$Z9999="X";$B$1:$Z$1;"")). My example just covers two rows and four columns.
After modifying it, confirm with CTRL+SHIFT+ENTER to apply it as array formula.
2. Now you'll have to concatenate the employee numbers. LO Calc lacks a feature to concatenate an array, but you could use a simple user-defined function. For such a string-join function, see this answer. Just create a new macro with the StarBasic code provided there and save it. Now you have a STRJOIN() function at hand that accepts an array and concatenates its values, leaving empty values out.
You could add that function using a helper column on the second sheet and apply it by dragging it down. Finally, to get rid of the cells with the single employee numbers, copy the complete second sheet and paste special into a third sheet, pasting only the values. Now you can remove all columns except the first one (product codes) and the last one (with the concatenated employee numbers).
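Outside of Calc, the same two steps (pick the header value for every X, then join per row) can be sketched in a few lines of Python. This assumes the sheet has been exported to CSV with the employee numbers in the first row and the product codes in the first column; the SAMPLE data and the name sellers_by_product are made up for illustration.

```python
import csv
import io

# Hypothetical CSV export of the sheet: first row = employee numbers,
# first column = product codes, "X" marks a sale.
SAMPLE = """\
,0302,0303,0304,0402
1625,X,,X,X
1643,,X,X,
"""

def sellers_by_product(csv_text):
    """Map each product code to a comma-separated string of the employees marked with X."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    employees = rows[0][1:]  # header row without the empty corner cell
    result = {}
    for row in rows[1:]:
        product, marks = row[0], row[1:]
        # Step 1: keep the header value wherever the cell holds an X
        sold_by = [emp for emp, mark in zip(employees, marks) if mark.strip().upper() == "X"]
        # Step 2: join them into one string
        result[product] = ", ".join(sold_by)
    return result

for product, sold_by in sellers_by_product(SAMPLE).items():
    print(product, "|", sold_by)
```

For the two sample rows this prints "1625 | 0302, 0304, 0402" and "1643 | 0303, 0304". Keeping the sold_by list unjoined would give one (product, employee) pair per X instead, which is usually the easier shape to load into a database table.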

I created a table in sql for holding the data:
CREATE TABLE [dbo].[mydata](
[prod_code] [nvarchar](8) NULL,
[0100] [nvarchar](10) NULL,
[0101] [nvarchar](10) NULL,
[and so on...]
I created the list of columns in Calc by copying and pasting them transposed. After that I used the CONCATENATE function to build the column list plus data type for the CREATE TABLE statement.
I cleaned up the worksheet and imported it into this table using SQL Server's import wizard. Cleaning meant removing unnecessary rows and columns. Since the column names were identical, the mapping was done correctly for 99% of them.
Now I had the data in SQL Server.
I adapted the code MM93 suggested a bit:
;with data as
(
SELECT *
FROM dbo.mydata <-- here i simply referenced the whole table
),cte
and in the next part I used the same 'worksheet' trick to list and format all the column names and pasted them in.
),cte
AS (SELECT prod_code, <-- had to replace col1 with 'prod_code'
col
FROM data
CROSS apply (VALUES ('0100',[0100]),
('0101', [0101] ),
(and so on... ),
The result of this query was inserted into a new table, and my colleagues and I are querying our hearts out :)
PS: removing the 'FOR XML' clause resulted in a table with two columns:
prodcode | employee
which contains all the unique combinations of prodcode + employee number, and is a lot faster and much more practical to query.

Web2py: Write SQL query: "SELECT x FROM y" using DAL, when x and y are variables, and convert the results to a list?

My action passes a list of values from a column x in table y to the view. How do I write the following SQL: SELECT x FROM y, using the DAL "language", when x and y are variables given by the view? Here it is, using executesql().
def myAction():
    x = request.args(0, cast=str)
    y = request.args(1, cast=str)
    myrows = db.executesql('SELECT ' + x + ' FROM ' + y)
    # Let's convert it to a list:
    mylist = []
    for row in myrows:
        value = row  # this line doesn't work
        mylist.append(value)
    return dict(mylist=mylist)
Also, is there a more convenient way to convert that data to a list?

First, note that you must create table definitions for any tables you want to access (i.e., db.define_table('mytable', ...)). Assuming you have done that and that y is the name of a single table and x is the name of a single field in that table, you would do:
myrows = db().select(db[y][x])
mylist = [r[x] for r in myrows]
Note, if any records are returned, .select() always produces a Rows object, which comprises a set of Row objects (even if only a single field was selected). So, to extract the individual values into a list, you have to iterate over the Rows object and extract the relevant field from each Row object. The above code does so via a list comprehension.
Also, you might want to add some code to check whether db[y] and db[y][x] exist.
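A minimal sketch of such a check, assuming db is a DAL instance whose tables have already been defined. The helper name column_values is hypothetical; db.tables and the table's .fields list are used here to look up the known names.

```python
def column_values(db, table_name, field_name):
    """Return all values of one field as a list, or None if the table or field is unknown."""
    # db.tables lists the names of the defined tables
    if table_name not in db.tables:
        return None
    # each table exposes the names of its fields via .fields
    if field_name not in db[table_name].fields:
        return None
    rows = db().select(db[table_name][field_name])
    return [row[field_name] for row in rows]
```

In the action you could then call column_values(db, y, x) and return an error or an empty list to the view when it yields None. Checking the names against the defined tables and fields also avoids passing arbitrary user input into a query.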

Loop over data values in neo4j

I have a movies.csv file, which has a feature vector per line (e.g. id|Name|0|1|1|0|0|0|1 has 2 features for id and name, and 7 features for genre classification).
I want a node m from class Movies to have a relationship [:HAS_GENRE] with nodes g from class Genres. For that, I need to loop over all the '|' separated features and only make a relationship if the value is 1.
In essence, I want to have:
x = a // where a is the index of the first genre feature
while (x < lim) // lim is the last index of the feature vector
{
    if line[x] is 1:
        (m{id:toInt(line[0])})-[:HAS_GENRE]->(g{id=line[x]})
}
How do I do that?

Try this:
WITH ["Genre1","Genre2",...] AS genres
LOAD CSV FROM "file:movies.csv" AS row FIELDTERMINATOR "|"
MERGE (m:Movie {id:row[0]}) ON CREATE SET m.title = row[1]
FOREACH (idx IN filter(i IN range(0,size(genres)-1) WHERE row[2+i]="1") |
    MERGE (g:Genre {name:genres[idx]})
    CREATE (m)-[:HAS_GENRE]->(g)
)
It loads each row of the file as a collection.
The first two elements are used to create a movie.
Then the potential indexes range(0,size(genres)-1) are filtered by the presence of a "1" at the matching position in the input row.
The resulting list of indexes is then used to look up the genre name or id
and connect the movie with the genre.
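The index-filtering step is the core of the query, so here is the same logic as a small Python sketch. It only shows which genres would be connected for one input line; the genre names and the function name genres_for are placeholders invented for this example.

```python
GENRES = ["Action", "Comedy", "Drama"]  # hypothetical genre names, in column order

def genres_for(line, genres=GENRES):
    """Split one '|'-separated line into (movie id, title, names of the genres flagged "1")."""
    row = line.rstrip("\n").split("|")
    movie_id, title, flags = row[0], row[1], row[2:]
    # keep only the indexes whose flag is "1", then look up the genre name
    picked = [genres[idx] for idx in range(len(genres)) if flags[idx] == "1"]
    return movie_id, title, picked

print(genres_for("7|Some Movie|0|1|1"))  # ('7', 'Some Movie', ['Comedy', 'Drama'])
```

Note that the flags are compared as strings, just like row[2+i]="1" in the query, because values read from a CSV file are strings unless converted.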

Find first non-null value along a path (array of nodes) in a hierarchical table

I have been trying fruitlessly for several hours to write a function that filters the subscripts of an array based on a criterion applied to the array's elements, and then returns an array of those subscripts.
The data structure I am dealing with is similar to the following sample (except with many more columns to compare and more complicated rules and mixed data types):
id  hierarchy  abbreviation1  abbreviation2
1   {1}        SB             GL
2   {2,1}      NULL           NULL
3   {3,2,1}    NULL           TC
4   {4,2,1}    NULL           NULL
I need to run a query that takes the next non-null value closest to the parent for abbreviation1 and abbreviation2 and compares them based upon the hierarchical distance from the current record in order to get a single value for an abbreviation. So, for example, if the first non-null values of abbreviation1 and abbreviation2 are both on the same record level, abbreviation1 would take priority; on the other hand, if the first non-null abbreviation2 is closer to the current record than the corresponding non-null value for abbreviation1, then abbreviation2 would be used.
Thus the described query on the above sample table would yield:
id abbreviation
1 SB
2 SB
3 TC
4 SB
To accomplish this task I need to generate a filtered array of array subscripts (after doing an array_agg() on the abbreviation columns) which only contain subscripts where the value in an abbreviation column is not null.
The following function, based on all the logic in my tired mind, should work but does not
CREATE OR REPLACE FUNCTION filter_array_subscripts(rawarray anyarray,criteria anynonarray,dimension integer, reverse boolean DEFAULT False)
RETURNS integer[] as
$$
DECLARE
    outarray integer[] := ARRAY[]::integer[];
    x integer;
BEGIN
    for i in array_lower(rawarray,dimension)..array_upper(rawarray,dimension) LOOP
        IF NOT criteria IS NULL THEN
            IF NOT rawarray[i] IS NULL THEN
                IF NOT rawarray[i] = criteria THEN
                    IF reverse = False THEN
                        outarray := array_append(outarray,i);
                    ELSE
                        outarray := array_prepend(i,outarray);
                    END IF;
                ELSE
                    IF reverse = False THEN
                        outarray := array_append(outarray,i);
                    ELSE
                        outarray := array_prepend(i,outarray);
                    END IF;
                END IF;
            END IF;
        ELSE
            IF NOT rawarray[i] is NULL THEN
                IF reverse = False THEN
                    outarray := array_append(outarray,i);
                ELSE
                    outarray := array_prepend(i,outarray);
                END IF;
            END IF;
        END IF;
    END LOOP;
    RETURN outarray;
END;
$$ LANGUAGE plpgsql;
For example, the below query returns {5,3,1} when it should return {5,4,2,1}
select filter_array_subscripts(array['This',NULL,'is',NULL,'insane!']::text[]
,'is',1,True);
I have no idea why this does not work, I have tried using the foreach array iteration syntax but I cannot figure out how to cast the iteration value to the scalar type contained within the anyarray.
What can be done to fix this?

You can largely simplify this whole endeavor with the use of a RECURSIVE CTE, available in PostgreSQL 8.4 or later:
Test table (makes it easier for everyone to provide test data in a form like this):
CREATE TEMP TABLE tbl (
  id int
, hierarchy int[]
, abbreviation1 text
, abbreviation2 text
);
INSERT INTO tbl VALUES
  (1, '{1}', 'SB', 'GL')
 ,(2, '{2,1}', NULL, NULL)
 ,(3, '{3,2,1}', NULL, 'TC')
 ,(4, '{4,2,1}', NULL, NULL);
Query:
WITH RECURSIVE x AS (
   SELECT id
        , COALESCE(abbreviation1, abbreviation2) AS abbr
        , hierarchy[2] AS parent_id
   FROM   tbl
   UNION ALL
   SELECT x.id
        , COALESCE(parent.abbreviation1, parent.abbreviation2) AS abbr
        , parent.hierarchy[2] AS parent_id
   FROM   x
   JOIN   tbl AS parent ON parent.id = x.parent_id
   WHERE  x.abbr IS NULL -- stop at non-NULL value
   )
SELECT id, abbr
FROM   x
WHERE  abbr IS NOT NULL -- discard intermediary NULLs
ORDER  BY id
Returns:
id | abbr
---+-----
1 | SB
2 | SB
3 | TC
4 | SB
This presumes that there is a non-null value on every path, or such rows will be dropped from the result.
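To make the lookup rule explicit, here is the same logic as a small Python sketch over the sample data. It walks the hierarchy array directly instead of following parent links one step at a time as the recursive CTE does; the dictionary layout and the name first_abbr are assumptions for illustration only.

```python
# Rows of the sample table: id -> (hierarchy, abbreviation1, abbreviation2)
TBL = {
    1: ([1], "SB", "GL"),
    2: ([2, 1], None, None),
    3: ([3, 2, 1], None, "TC"),
    4: ([4, 2, 1], None, None),
}

def first_abbr(row_id, tbl=TBL):
    """Walk from the row up through its ancestors; return the first non-null abbreviation."""
    hierarchy = tbl[row_id][0]  # the row itself first, then parent, grandparent, ...
    for node_id in hierarchy:
        _, abbr1, abbr2 = tbl[node_id]
        # same as COALESCE(abbreviation1, abbreviation2): abbreviation1 wins on the same level
        abbr = abbr1 if abbr1 is not None else abbr2
        if abbr is not None:
            return abbr
    return None  # no non-null value anywhere on the path

print({row_id: first_abbr(row_id) for row_id in TBL})
# {1: 'SB', 2: 'SB', 3: 'TC', 4: 'SB'}
```

Unlike the SQL query, this sketch returns None for a path without any non-null value instead of dropping the row.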