Filesystem (file, folder) with display order - database

I want to design something like an OS file system,
where each item has a display order (sequence) that can be updated.
Files and folders can sit on the same layer;
a file doesn't have to be inside a folder.
But in the design below, if a file is not in any folder, I don't know where to save its sequence.
Any suggestions will be appreciated.
data example
folder(id:1) top layer: sequence: 0
    file(id:1) sequence_in_folder: 0
    file(id:2) sequence_in_folder: 1
folder(id:2) top layer: sequence: 1
    file(id:3) sequence_in_folder: 0
file(id:4) top layer: sequence: 2 << **sequence save in which table ??**
file(id:5) top layer: sequence: 3 << **sequence save in which table ??**
folder
id | sequence | parent_folder_id
1 | 0 |
2 | 1 |
file
id | sequence_in_folder | folder_id
1 | 0 | 1
2 | 1 | 1
3 | 0 | 2
4 | ?????
5 | ????
schema
CREATE TABLE IF NOT EXISTS "folder"(
"id" SERIAL NOT NULL,
"sequence" integer NOT NULL,
"parent_folder_id" integer Default NULL,
PRIMARY KEY ("id")
);
CREATE TABLE IF NOT EXISTS "file"(
"id" SERIAL NOT NULL,
"sequence_in_folder" integer Default NULL,
"parent_folder_id" integer NOT NULL,
PRIMARY KEY ("id")
);
UPDATE
Based on @Laurenz Albe's answer, there is no need to change the table design;
just create a root folder.
But how do I sort rows by a field that exists in two tables?
The sequence exists in both the folder table and the file table; how do I sort them together?
query
SELECT * FROM folder fo
LEFT JOIN file fi ON fi.parent_folder_id = fo.id
WHERE fo.parent_folder_id = $1 AND fi.parent_folder_id = $1
ORDER BY fo.sequence fi.sequence ?? ;
data example
folder
id | sequence | parent_folder_id | name
1 | 0 | | root
2 | 0 | 1 |
3 | 2 | 1 |
file
id | sequence | parent_folder_id |
1 | 1 | 1 |
output
folder(id:1, sequence:0 name:root)
folder(id:2, sequence:0)
file(id:1, sequence:1)
folder(id:3 sequence:2)

Two suggestions:
Introduce an “anonymous” top folder that contains all the top level elements.
Rename the sequence column of bookmark_folder to max_sequence or so to avoid confusion with bookmark.sequence.
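With a root folder in place, one way to sort the children of a folder across both tables is a UNION ALL with a single ORDER BY. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 instead of PostgreSQL, the table and column names come from the question's second data example, and the `kind` column and the `list_children` helper are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE folder (id INTEGER PRIMARY KEY, sequence INTEGER NOT NULL,
                     parent_folder_id INTEGER, name TEXT);
CREATE TABLE file (id INTEGER PRIMARY KEY, sequence INTEGER NOT NULL,
                   parent_folder_id INTEGER NOT NULL);
-- Data from the question: folder 1 is the root, everything else lives in it.
INSERT INTO folder VALUES (1, 0, NULL, 'root'), (2, 0, 1, NULL), (3, 2, 1, NULL);
INSERT INTO file VALUES (1, 1, 1);
""")


def list_children(parent_id):
    """Return (kind, id, sequence) for all folders and files directly under parent_id."""
    return conn.execute(
        """
        SELECT 'folder' AS kind, id, sequence FROM folder WHERE parent_folder_id = ?
        UNION ALL
        SELECT 'file' AS kind, id, sequence FROM file WHERE parent_folder_id = ?
        ORDER BY sequence
        """,
        (parent_id, parent_id),
    ).fetchall()


children = list_children(1)
for kind, node_id, sequence in children:
    print(f"{kind}(id:{node_id}, sequence:{sequence})")
```

The ORDER BY applies to the combined result, so folders and files share one sequence space per parent folder. The same UNION ALL query should carry over to PostgreSQL with `$1` placeholders.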

Supplemental to Laurenz's answer:
Unify your bookmark and folder tables into a single table, maybe bookmark_node, and require that every row have a parent that is not a bookmark. Something like:
CREATE TABLE IF NOT EXISTS fsnode(
"id" SERIAL NOT NULL,
"name" text,
"is_folder" bool,
"parent_is_folder" bool not null,
"sequence" integer NOT NULL,
"parent_folder_id" integer Default NULL,
CHECK (parent_is_folder),
PRIMARY KEY ("id"),
UNIQUE(id, is_folder), -- needed for fkey below
FOREIGN KEY (parent_folder_id, parent_is_folder) REFERENCES fsnode (id, is_folder)
);
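To see how the composite foreign key behaves, here is a rough sketch of the same single-table idea using Python's sqlite3 (an assumption; the answer targets PostgreSQL, and SQLite stores booleans as integers). The node names are hypothetical. A row with a NULL parent_folder_id acts as the root, because a foreign key with a NULL column is not checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.execute("""
CREATE TABLE fsnode (
    id INTEGER PRIMARY KEY,
    name TEXT,
    is_folder INTEGER NOT NULL,
    parent_is_folder INTEGER NOT NULL DEFAULT 1 CHECK (parent_is_folder = 1),
    sequence INTEGER NOT NULL,
    parent_folder_id INTEGER,
    UNIQUE (id, is_folder),
    FOREIGN KEY (parent_folder_id, parent_is_folder) REFERENCES fsnode (id, is_folder)
)
""")

# Hypothetical data: a root folder holding two folders and one file.
conn.executemany(
    "INSERT INTO fsnode (id, name, is_folder, sequence, parent_folder_id) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "root", 1, 0, None),
        (2, "docs", 1, 0, 1),
        (3, "notes.txt", 0, 1, 1),
        (4, "pics", 1, 2, 1),
    ],
)

# Files and folders now sort together with a plain ORDER BY.
child_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM fsnode WHERE parent_folder_id = ? ORDER BY sequence", (1,)
    )
]

# Node 3 is a file, so (3, 1) does not exist as (id, is_folder) and the insert fails.
try:
    conn.execute(
        "INSERT INTO fsnode (id, name, is_folder, sequence, parent_folder_id) "
        "VALUES (5, 'orphan.txt', 0, 0, 3)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(child_names, rejected)
```

The CHECK forces parent_is_folder to be true, and the foreign key then only matches parents whose is_folder is also true, which is what keeps a file from becoming a parent.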

Related

Store a list of values as a string when creating a table in snowflake

I am trying to create a table with 5 columns. Column #2 (PROGRESS) is a comma-separated list (e.g. 1,2,3,4) but when I try to create this column as a string, variant or varchar, Snowflake refuses to allow it. Any advice on how I can load a comma-separated list from a CSV? I tried importing the data as a TSV, XML, and JSON file, but with no success.
create or replace TABLE AD_HOC.TEMP.NEW_DATA (
VISITOR_ID VARCHAR(16777216),
PROGRESS VARCHAR(16777216),
DATE DATETIME,
ROLE VARCHAR(16777216),
FIRST_VISIT DATETIME
)COMMENT='Interaction data'
;
Goal:
VISITOR_ID | PROGRESS | DATE | ROLE | FIRST_VISIT
111 | [1,2,3] | 1/1/2022 | OWNER | 1/1/2021
123 | [1] | 1/2/2022 | ADMIN | 2/2/2021
23321 | [1,2,3,4] | 2/22/2022 | USER | 3/12/2021
I encoded the column in python and loaded the data in Snowflake!
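import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# doc_data is the DataFrame loaded from the CSV; PROGRESS must hold a list of labels per row
mlb = MultiLabelBinarizer()
# One 0/1 indicator column per distinct PROGRESS value, joined back onto the remaining columns
df = doc_data.join(pd.DataFrame(mlb.fit_transform(doc_data.pop('PROGRESS')),
                                columns=mlb.classes_,
                                index=doc_data.index))
df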
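For readers unfamiliar with this kind of encoding, the following dependency-free sketch shows roughly what the binarization step produces for the PROGRESS values in the goal table. It is not the scikit-learn implementation, and the `multi_hot` helper is a hypothetical name used only here.

```python
def multi_hot(rows):
    """Return the sorted distinct labels and one 0/1 indicator row per input row."""
    classes = sorted({label for labels in rows for label in labels})
    encoded = [[1 if label in labels else 0 for label in classes] for labels in rows]
    return classes, encoded


# PROGRESS values from the goal table, already parsed into lists.
progress = [[1, 2, 3], [1], [1, 2, 3, 4]]
classes, encoded = multi_hot(progress)

print(classes)
for row in encoded:
    print(row)
```

Each distinct value becomes its own column, so no column has to hold a comma-separated list.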

How can I update many rows with different values for the same column?

I have a table with a column containing a path to a file. The path is an absolute path, and values for this column look like this: C:\CI\Media\animal.jpg.
The table looks like so, except there are many rows so editing by hand is not practical:
+----+-----------------------------------+
| ID | Path                              |
+----+-----------------------------------+
| 1  | C:\CI\Media\sushi.jpg             |
| 2  | C:\CI\Media\animal.jpg            |
| 3  | C:\CI\Media\Tuscany Trip\pisa.png |
+----+-----------------------------------+
Path is an nvarchar(260)
What I'd like to do is run a query that updates each record so that C:\CI\ in its path is replaced with C:\CI\Net\, ending up with a table that looks like this:
+----+---------------------------------------+
| ID | Path                                  |
+----+---------------------------------------+
| 1  | C:\CI\Net\Media\sushi.jpg             |
| 2  | C:\CI\Net\Media\animal.jpg            |
| 3  | C:\CI\Net\Media\Tuscany Trip\pisa.png |
+----+---------------------------------------+
Is there a way to write a query that updates every record based on its existing value (replacing the C:\CI portion with C:\CI\Net while keeping the rest of the value), instead of setting every row to the same value like a normal Update table set column = value?
Gosh you almost wrote the code yourself.
Update YourTable
set path = replace(path, 'C:\CI', 'C:\CI\Net')
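One caveat: run as written, the statement would rewrite already-migrated rows again on a second run (C:\CI\Net\Net\...). A guarded variant is sketched below with Python's sqlite3, which is an assumption since the question appears to be about SQL Server; the `media` table name and the `move_paths` helper are hypothetical.

```python
import sqlite3

OLD_PREFIX = "C:\\CI\\"        # the literal text C:\CI\
NEW_PREFIX = "C:\\CI\\Net\\"   # the literal text C:\CI\Net\

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (id INTEGER PRIMARY KEY, path TEXT)")
conn.executemany(
    "INSERT INTO media (id, path) VALUES (?, ?)",
    [
        (1, r"C:\CI\Media\sushi.jpg"),
        (2, r"C:\CI\Media\animal.jpg"),
        (3, r"C:\CI\Media\Tuscany Trip\pisa.png"),
    ],
)


def move_paths():
    """Rewrite the prefix, skipping rows that already carry the new prefix."""
    cursor = conn.execute(
        "UPDATE media SET path = replace(path, ?, ?) "
        "WHERE path LIKE ? AND path NOT LIKE ?",
        (OLD_PREFIX, NEW_PREFIX, OLD_PREFIX + "%", NEW_PREFIX + "%"),
    )
    return cursor.rowcount


first_run = move_paths()   # rewrites all three rows
second_run = move_paths()  # finds nothing left to rewrite
paths = [row[0] for row in conn.execute("SELECT path FROM media ORDER BY id")]
print(first_run, second_run, paths)
```

The WHERE clause limits the update to rows that start with the old prefix and have not been moved yet, which makes the statement safe to re-run.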

How to write additional data to a table when using CakePHP 2.x Tree Behaviour

I'm working with a CakePHP 2.x application and using the Tree Behaviour which comes with it - https://book.cakephp.org/2.0/en/core-libraries/behaviors/tree.html
My table is called navigations and has the following schema which is provided in the docs:
mysql> DESCRIBE navigations;
+-----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| parent_id | int(10) | YES | | NULL | |
| lft | int(10) | YES | | NULL | |
| rght | int(10) | YES | | NULL | |
| name | varchar(255) | YES | | | |
+-----------+------------------+------+-----+---------+----------------+
5 rows in set (0.02 sec)
My Model, Navigation.php is configured to use the Tree behaviour:
class Navigation extends AppModel {
public $actsAs = array('Tree');
}
I can write data into my table as per the documentation, e.g.
$data['Navigation']['parent_id'] = 3;
$data['Navigation']['name'] = 'United Kingdom';
$this->Navigation->save($data);
However... I have a requirement to store some additional data in the table.
If I add a field to my table, jstree_data of type VARCHAR(255), I cannot write it to the table with the following:
$data['Navigation']['parent_id'] = 3;
$data['Navigation']['name'] = 'United Kingdom';
$data['Navigation']['jstree_data'] = 'foo'; // Attempt to write to 'jstree_data' field in database
$this->Navigation->save($data);
Does anyone know if it's possible to do this? It seems that by using the Tree Behaviour (public $actsAs = array('Tree') in the model) you lose the normal functionality of models and the ability to save other data.
Background info:
Why am I trying to do this? I'm using jstree on a project. I need to delete all of the existing records before resaving the tree. However, because the id column is an auto_increment I need to store a reference for the "old ID" in jstree_data because that's what's present in the JSON data that comes from jstree. Based on this I can then look up the "new" ID (the auto increment value) and assign that with $data['Navigation']['parent_id'] when resaving the new child elements.

Check a value in an array inside a object json in PostgreSQL 9.5

I have a JSON object containing an array and other properties.
I need to check the first value of the array for each row of my table.
Here is an example of the JSON:
{"objectID2":342,"objectID1":46,"objectType":["Demand","Entity"]}
So I need, for example, to get all rows with objectType[0] = 'Demand' and objectID1 = 46.
These are the table columns:
id | relationName | content
The content column contains the JSON.
Just query them? Like:
t=# with table_name(id, rn, content) as (values(1,null,'{"objectID2":342,"objectID1":46,"objectType":["Demand","Entity"]}'::json))
select * From table_name
where content->'objectType'->>0 = 'Demand' and content->>'objectID1' = '46';
id | rn | content
----+----+-------------------------------------------------------------------
1 | | {"objectID2":342,"objectID1":46,"objectType":["Demand","Entity"]}
(1 row)
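To make the two operators concrete, here is a small sketch of the same predicate evaluated client-side in Python. It only illustrates what the query extracts and is not a replacement for filtering in the database; the second row and the `matches` helper are hypothetical.

```python
import json

rows = [
    (1, None, '{"objectID2":342,"objectID1":46,"objectType":["Demand","Entity"]}'),
    # Hypothetical second row: "Demand" is present but not the first element.
    (2, None, '{"objectID2":7,"objectID1":46,"objectType":["Entity","Demand"]}'),
]


def matches(content):
    doc = json.loads(content)
    object_type = doc.get("objectType") or []
    # content->'objectType'->>0 : first array element, returned as text
    first_type = object_type[0] if object_type else None
    # content->>'objectID1' : value returned as text, hence the comparison with '46'
    return first_type == "Demand" and str(doc.get("objectID1")) == "46"


matching_ids = [row_id for row_id, _, content in rows if matches(content)]
print(matching_ids)
```

Only the first array element is tested, so a row that has "Demand" at any other position is not returned.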

Hive query, better option to self join

So I am working with a hive table that is set up as so:
id (Int), mapper (String), mapperId (Int)
Basically a single Id can have multiple mapperIds, one per mapper such as an example below:
ID (1) mapper(MAP1) mapperId(123)
ID (1) mapper(MAP2) mapperId(1234)
ID (1) mapper(MAP3) mapperId(12345)
ID (2) mapper(MAP2) mapperId(10)
ID (2) mapper(MAP3) mapperId(12)
I want to return the list of mapperIds associated to each unique ID. So for the above example I would want the below returned as a single row.
1, 123, 1234, 12345
2, null, 10, 12
The mapper Strings are known, so I was thinking of doing a self join for every mapper string I am interested in, but I was wondering if there was a more optimal solution?
If the assumption that the mapper column is distinct with respect to a given ID is correct, you could collect the mapper column and the mapperid column to a Map using brickhouse collect. You can clone the repo from that link and build the jar with Maven.
Query:
add jar /complete/path/to/jar/brickhouse-0.7.0-SNAPSHOT.jar;
create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';
select id
,id_map['MAP1'] as mapper1
,id_map['MAP2'] as mapper2
,id_map['MAP3'] as mapper3
from (
select id
,collect(mapper, mapperid) as id_map
from some_table
group by id
) x
Output:
| id | mapper1 | mapper2 | mapper3 |
------------------------------------
| 1  | 123     | 1234    | 12345   |
| 2  | NULL    | 10      | 12      |
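If adding a UDF jar is not an option, a different technique that also avoids the self joins is conditional aggregation: one MAX(CASE ...) per known mapper string. The sketch below demonstrates the idea with Python's sqlite3, which is an assumption made so the example is runnable; the SELECT itself uses only standard SQL and should be close to what Hive accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE some_table (id INTEGER, mapper TEXT, mapperId INTEGER);
-- Sample rows from the question.
INSERT INTO some_table VALUES
    (1, 'MAP1', 123), (1, 'MAP2', 1234), (1, 'MAP3', 12345),
    (2, 'MAP2', 10), (2, 'MAP3', 12);
""")

# One pass over the table: each CASE keeps the mapperId for its mapper and
# yields NULL otherwise, and MAX collapses the group to the single non-NULL value.
pivot_sql = """
SELECT id,
       MAX(CASE WHEN mapper = 'MAP1' THEN mapperId END) AS mapper1,
       MAX(CASE WHEN mapper = 'MAP2' THEN mapperId END) AS mapper2,
       MAX(CASE WHEN mapper = 'MAP3' THEN mapperId END) AS mapper3
FROM some_table
GROUP BY id
ORDER BY id
"""

pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

Like the collect approach, this relies on each mapper appearing at most once per id; with duplicates, MAX would silently pick the largest mapperId.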
