Virtual fields in CakePHP 2.0 using multiple tables

I seem to be really struggling to find any information that actually covers the use of virtual fields in CakePHP. Yes, I know there is official documentation on the CakePHP site, but it does not cover virtual fields that draw on separate tables.
For example:
Table: Products
ID | PRODUCT | PRICE | QUANTITY
1 | Butter | 2.50 | 250
2 | Flour | 6.00 | 16000
3 | Egg | 0.99 | 6
Table: Products_Recipes
Product_ID | Recipe_ID | Quantity | COST ("VIRTUAL FIELD")
1 | 1 | 200 | = SUM (Products.Price/Products.Quantity) * Products_Recipes.Quantity
2 | 1 | 400
3 | 1 | 3
Table: Recipes
ID | RECIPE | COST ("Virtual Field")
1 | Pastry | Sum of Costs where Recipe_id is 1
I'm a bit of a newbie to MySQL, but I think this is the way I should be doing it. How do I then use virtual fields to access this information? I can get it to work within one model, but not when it needs to access other models.
Doug.

I assume your tables are as follows:
Products (id, product, price, quantity)
Products_Recipes (product_id, recipe_id, quantity)
Recipes (id, recipe)
(The missing fields are the ones you are trying to create.)
I find it easier to build the query in MySQL first and then, once it works, decide whether MySQL or CakePHP is the right place for the production implementation. (Also, you need a bit more complexity than the CakePHP examples show.)
To create a working solution in MySQL:
Create a select view for products_recipes:
CREATE VIEW vw_recipes_products AS
SELECT products_recipes.product_id, products_recipes.recipe_id, products_recipes.quantity,
(products.price / products.quantity) * products_recipes.quantity AS cost
FROM products_recipes
JOIN products
ON products_recipes.product_id = products.id
(I don't think you need the SUM operator here, since there is no need for a GROUP BY clause.)
Then create a view for the recipes:
CREATE VIEW vw_recipes_with_cost AS
SELECT recipes.id, recipes.recipe, SUM(vw_recipes_products.cost) AS cost
FROM recipes
JOIN vw_recipes_products
ON recipes.id = vw_recipes_products.recipe_id
GROUP BY recipes.id, recipes.recipe
Ultimately, I think you should implement the solution in MySQL because of the GROUP BY clause and the intermediary view. (Obviously there are other solutions, but take advantage of MySQL. I find CakePHP's virtual fields best suited to simple cases, e.g. concatenating two fields.) Once you create the views, you will need to create new models for them, or change the useTable variable of the existing models to point at the views instead.
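To sanity-check the two views, here is a minimal sketch that loads the sample rows from the question and queries both views. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs without a server), and the table aliases pr, p, r and v are just shorthand. The view definitions are otherwise the ones described above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product TEXT, price REAL, quantity REAL);
CREATE TABLE recipes (id INTEGER PRIMARY KEY, recipe TEXT);
CREATE TABLE products_recipes (product_id INTEGER, recipe_id INTEGER, quantity REAL);

-- Sample data from the question.
INSERT INTO products VALUES (1, 'Butter', 2.50, 250), (2, 'Flour', 6.00, 16000), (3, 'Egg', 0.99, 6);
INSERT INTO recipes VALUES (1, 'Pastry');
INSERT INTO products_recipes VALUES (1, 1, 200), (2, 1, 400), (3, 1, 3);

-- Cost of each ingredient line: unit price times the quantity used.
CREATE VIEW vw_recipes_products AS
SELECT pr.product_id, pr.recipe_id, pr.quantity,
       (p.price / p.quantity) * pr.quantity AS cost
FROM products_recipes pr
JOIN products p ON pr.product_id = p.id;

-- Total cost per recipe: sum of its ingredient line costs.
CREATE VIEW vw_recipes_with_cost AS
SELECT r.id, r.recipe, SUM(v.cost) AS cost
FROM recipes r
JOIN vw_recipes_products v ON r.id = v.recipe_id
GROUP BY r.id, r.recipe;
""")

line_costs = conn.execute(
    "SELECT product_id, cost FROM vw_recipes_products ORDER BY product_id"
).fetchall()
recipe_costs = conn.execute(
    "SELECT recipe, cost FROM vw_recipes_with_cost"
).fetchall()

for recipe, cost in recipe_costs:
    print(recipe, round(cost, 3))
```

With the sample data, the butter line works out to 2.50 / 250 * 200 = 2.00, and the Pastry total is the sum of the three lines. A CakePHP model pointed at vw_recipes_with_cost would then see cost as an ordinary column.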

Related

Mapping Tables of Database in NiFi

Here is my requirement.
I have a big table in Vertica say base_table as follows.
base_table
ID | path | service | experience
20 | /abc/xyz | trz | moderate
22 | /wer/cmz | brd | professional
Mapping Tables
map_table1
path_id | path
1 | /abc/xyz
map_table2
exp_id | experience
1 | beginner
Final Table
ID | path_id | service | exp_id
20 | 1 | trz | -
22 | - | brd | 2
In the first case, I need to get the ID 1, because the path value is present in map_table1 as well as in the base table, and insert that record into the final table.
In the second case, I need to insert the ID 2 into map_table2, because the experience value professional is not yet present in that table, and also insert it into the final table.
Which processors should I use, or what should the flow look like in NiFi?
I am not sure if I understand your question correctly, but if I generalize the situation, you want to insert a record if it does not exist, and then get the value of the corresponding ID (which may or may not have existed before).
The good news is that NiFi can easily work with a database like Vertica, have a look at the QueryDatabaseTable processor.
The challenge, however, is that NiFi is designed to handle many individual messages efficiently, and is therefore deliberately not very context-aware. For your use case you would probably want a tool that is built to work with tables. In general the solution for this would be Spark, or perhaps it could be built into your database with some procedures.
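Whichever tool ends up running it, the "insert if missing, then return the ID" logic itself is small. Below is a rough sketch of it, using Python's sqlite3 module in place of Vertica; this is not a NiFi flow, and the helper name get_or_create_exp_id is hypothetical. The table and column names come from the question.

```python
import sqlite3

# SQLite stands in for Vertica here (assumption); only the lookup logic matters.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE map_table2 (exp_id INTEGER PRIMARY KEY, experience TEXT UNIQUE);
INSERT INTO map_table2 VALUES (1, 'beginner');
""")

def get_or_create_exp_id(conn, experience):
    """Return the exp_id for `experience`, inserting a new mapping row if needed."""
    row = conn.execute(
        "SELECT exp_id FROM map_table2 WHERE experience = ?", (experience,)
    ).fetchone()
    if row is not None:
        return row[0]
    # Not mapped yet: insert it and let the database assign the next ID.
    cur = conn.execute(
        "INSERT INTO map_table2 (experience) VALUES (?)", (experience,)
    )
    return cur.lastrowid

existing_id = get_or_create_exp_id(conn, "beginner")      # already mapped
new_id = get_or_create_exp_id(conn, "professional")       # inserted on first use
repeat_id = get_or_create_exp_id(conn, "professional")    # found on second use
print(existing_id, new_id, repeat_id)
```

Note that a check followed by an insert is only safe with a single writer; with concurrent loads you would need a transaction or a merge-style statement on the database side.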

Friendship Website Database Design

I'm trying to create a database for a friendship website I'm building. I want to store multiple attributes about the user such as gender, education, pets etc.
Solution #1 - User table:
id | age | birth day | City | Gender | Education | fav Pet | fav hobby. . .
--------------------------------------------------------------------------
0 | 38 | 1985 | New York | Female | University | Dog | Ping Pong
The problem I'm having is that the list of attributes goes on and on, and right now my user table has twenty-something columns.
I feel I could normalize this by creating another table for each attribute (see below). However, this would create many joins, and I'm still left with a lot of columns in the user table.
Solution #2 - User table:
id | age | birth day | City | Gender | Education | fav Pet | fav hobbies
--------------------------------------------------------------------------
0 | 38 | 1985 | New York | 0 | 0 | 0 | 0
Pets table:
id | Pet Type
---------------
0 | Dog
Does anyone have any ideas on how to approach this problem? It feels like both answers are wrong. What is the proper table design for this database?
There is more to this than meets the eye. First of all, if you have tons of attributes, many of which will likely be null for any specific row, and with a very dynamic selection of attributes (i.e. new attributes will appear quite frequently during the code's lifecycle), you might want to ask yourself whether an RDBMS is the best way to materialize this ... essentially non-schema. Maybe a document store would be a better fit?
If you do want to stay in the RDBMS world, the canonical answer is to have either one or one-per-datatype property table plus a table of properties:
Users.id | .name | .birthdate | .Gender | .someotherfixedattribute
----------------------------------------------------------
1743 | Me. | 01/01/1970 | M | indeed
Propertytypes.id | .name
------------------------
234 | pet
235 | hobby
Properties.uid | .pid | .content
-----------------------------
1743 | 234 | Husky dog
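As a rough illustration of how the property tables above are used, here is a small sketch with Python's sqlite3 module. The snake_case table names and the helper user_properties are my own naming (assumptions), and dates are stored as ISO text only to keep the demo simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Fixed attributes stay as ordinary columns on the user row.
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, birthdate TEXT, gender TEXT);
-- One row per kind of property (pet, hobby, ...).
CREATE TABLE property_types (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- One row per (user, property) value.
CREATE TABLE properties (
    uid INTEGER REFERENCES users(id),
    pid INTEGER REFERENCES property_types(id),
    content TEXT
);
INSERT INTO users VALUES (1743, 'Me', '1970-01-01', 'M');
INSERT INTO property_types VALUES (234, 'pet'), (235, 'hobby');
INSERT INTO properties VALUES (1743, 234, 'Husky dog');
""")

def user_properties(conn, user_id):
    """Return {property name: content} for one user.

    If a user has several values of the same type, only one survives in the
    dict; return a list of pairs instead if that matters.
    """
    rows = conn.execute(
        """
        SELECT t.name, p.content
        FROM properties p
        JOIN property_types t ON p.pid = t.id
        WHERE p.uid = ?
        ORDER BY t.name
        """,
        (user_id,),
    ).fetchall()
    return dict(rows)

before = user_properties(conn, 1743)
# Recording a new kind of attribute is a row insert, not a schema change.
conn.execute("INSERT INTO properties VALUES (1743, 235, 'Ping Pong')")
after = user_properties(conn, 1743)
print(before, after)
```

The trade-off is visible even here: every read needs a join, and content is untyped text, which is why this layout mostly pays off when the set of attributes really is dynamic.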
You have a comment and an answer that recommend (or at least suggest) an Entity-Attribute-Value (EAV) model.
There is nothing wrong with using EAV if your attributes need to be dynamic, and your system needs to allow adding new attributes post-deployment.
That said, if your columns and relationships are all known up front, and they don't need to be dynamic, you are much better off creating an explicit model. It will (generally) perform better and will be much easier to maintain.
Instead of a wide table with a field per attribute, or many attribute tables, you could make a skinny table with many rows, something like:
Attributes (id,user_id,attribute_type,attribute_value)
Ultimately the best solution depends greatly on how the data will be used. People can only have one DOB, but maybe you want to allow for multiple addresses (billing/mailing/etc.), so addresses might deserve a separate table.

Database design - storing a sequence

Imagine the following: there is a "recipe" table and a "recipe-step" table. The idea is to allow different recipe-steps to be reused in different recipes. The problem I'm having relates to the fact that in the recipe context, the order in which the recipe-steps show up is important, even if it does not follow the recipe-step table primary-key order, because this order will be set by the user.
I was thinking of doing something like:
recipe-step table:
id | stepName | stepDescription
-------------------------------
1 | step1 | description1
2 | step2 | description2
3 | step3 | description3
...
recipe table:
recipeId | step
---------------
1 | 1
1 | 2
1 | 3
...
This way, the order in which the steps show up in the step column is the order I need to maintain.
My concerns with this approach are:
if I have to add a new step between two existing steps, how do I query it? What if I just need to switch the order of two steps already in the sequence?
how do I make sure the order maintains its consistency? If I just insert or update something in the recipe table, it will pop up at the end of the table, right?
Is there any other way you would think of doing this? I also thought of having a previous-step and a next-step column in the recipe-step table, but I think it would be more difficult to make the recipe-steps reusable that way.
In SQL, tables are not ordered.
Unless you are using an ORDER BY clause, database engines are allowed to return records in any order they feel is fastest (for example, a covering index might have the data in a different order, and sometimes even SQLite creates temporary covering indexes automatically).
If the steps have a specific order in a specific recipe, then you have to store this information in the database.
I'd suggest adding a stepOrder column to the recipe table:
recipeId | step | stepOrder
---------------------------
1 | 1 | 1
1 | 2 | 2
1 | 3 | 3
2 | 4 | 1
2 | 2 | 2
Note:
The recipe table stores the relationship between recipes and steps, so it should be called recipe-step.
The recipe-step table is independent of recipes, so it should be called step.
You probably need a table that stores recipe information that is independent of steps; this table should be called recipe.
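To address the two concerns from the question (reading steps in order, and inserting a step between two existing ones), here is a minimal sketch using Python's sqlite3 module. The table is named recipe_step with an underscore because a hyphen is not valid in an unquoted identifier; the helper names ordered_steps and insert_step_at and the extra step4 row are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE step (id INTEGER PRIMARY KEY, stepName TEXT);
CREATE TABLE recipe_step (
    recipeId INTEGER,
    step INTEGER REFERENCES step(id),
    stepOrder INTEGER
);
INSERT INTO step VALUES (1, 'step1'), (2, 'step2'), (3, 'step3'), (4, 'step4');
INSERT INTO recipe_step VALUES (1, 1, 1), (1, 2, 2), (1, 3, 3), (2, 4, 1), (2, 2, 2);
""")

def ordered_steps(conn, recipe_id):
    """Step names for one recipe, in the order the user defined."""
    rows = conn.execute(
        """
        SELECT s.stepName
        FROM recipe_step rs
        JOIN step s ON s.id = rs.step
        WHERE rs.recipeId = ?
        ORDER BY rs.stepOrder
        """,
        (recipe_id,),
    ).fetchall()
    return [r[0] for r in rows]

def insert_step_at(conn, recipe_id, step_id, position):
    """Insert a step at `position`, shifting later steps down by one."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE recipe_step SET stepOrder = stepOrder + 1 "
            "WHERE recipeId = ? AND stepOrder >= ?",
            (recipe_id, position),
        )
        conn.execute(
            "INSERT INTO recipe_step VALUES (?, ?, ?)",
            (recipe_id, step_id, position),
        )

before = ordered_steps(conn, 1)
insert_step_at(conn, 1, 4, 2)   # put step4 between step1 and step2
after = ordered_steps(conn, 1)
print(before, after)
```

Swapping two steps is just two UPDATEs of stepOrder inside one transaction. The sketch deliberately leaves out a UNIQUE constraint on (recipeId, stepOrder), because the shifting UPDATE can trip over such a constraint partway through in some engines.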

Normalizing a Table

I'm putting together a database that I need to normalize and I've run into an issue that I don't really know how to handle.
I've put together a simplified example of my problem to illustrate it:
Item ID___Mass___Procurement__Currency__________Amount
0__________2kg___inherited____null________________null
1_________13kg___bought_______US dollars_________47.20
2__________5kg___bought_______British Pounds______3.10
3_________11kg___inherited____null________________null
4__________9kg___bought_______US dollars__________1.32
(My apologies for the awkward table; new users aren't allowed to paste images)
In the table above I have a property (Amount) which is functionally dependent on the Item ID (I think), but which does not exist for every Item ID (since inherited items have no monetary cost). I'm relatively new to databases, but I can't find a similar issue to this addressed in any beginner tutorials or literature. Any help would be appreciated.
I would just create two new tables ItemProcurement and Currencies.
If I'm not wrong, as per the data presented, the amount is part of the procurement of the item itself (when the item has not been inherited), for that reason I would group the Amount and CurrencyID fields in the new entity ItemProcurement.
As you can see, an inherited item wouldn't have an entry in the ItemProcurement table.
Concerning the main Item table, if you expect just two different values for the kind of procurement, then I would use a char(1) column (with values B => bought, I => inherited).
The data would then look like this:
TABLE Items
+-------+-------+--------------------+
| ID | Mass | ProcurementMethod |
|-------+-------+--------------------+
| 0 | 2 | I |
+-------+-------+--------------------+
| 1 | 13 | B |
+-------+-------+--------------------+
| 2 | 5 | B |
+-------+-------+--------------------+
TABLE ItemProcurement
+--------+-------------+------------+
| ItemID | CurrencyID | Amount |
|--------+-------------+------------+
| 1 | 840 | 47.20 |
+--------+-------------+------------+
| 2 | 826 | 3.10 |
+--------+-------------+------------+
TABLE Currencies
+------------+---------+-----------------+
| CurrencyID | ISOCode | Description |
|------------+---------+-----------------+
| 840 | USD | US dollars |
+------------+---------+-----------------+
| 826 | GBP | British Pounds |
+------------+---------+-----------------+
Not only Amount, everything is dependent on ItemID, as this seems to be a candidate key.
The dependence you have is that Currency and Amount are NULL (I guess this means Unknown/Invalid) when the Procurement is 'inherited' (or 0 cost, as pointed out by @XIVsolutions and as you mention: "inherited items have no monetary cost").
In other words, items are divided into two types (of procurement), and items of one of the two types do not have all attributes.
This can be solved with a supertype/subtype split. You have a supertype table (Item) and two subtype tables (ItemBought and ItemInherited), where each one of them has a 1::0..1 relationship with the supertype table. The attributes common to all items go in the supertype table and every other attribute in the respective subtype table:
Item
----------------------------
ItemID Mass Procurement
0 2kg inherited
1 13kg bought
2 5kg bought
3 11kg inherited
4 9kg bought
ItemBought
---------------------------------
ItemID Currency Amount
1 US dollars 47.20
2 British Pounds 3.10
4 US dollars 1.32
ItemInherited
-------------
ItemID
0
3
If there is no attribute that only inherited items have, you can even skip the ItemInherited table altogether.
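Here is a small sketch of that supertype/subtype layout, without the ItemInherited table, using Python's sqlite3 module. The column name MassKg and the CHECK constraint are my additions (assumptions), and Amount is stored as REAL only for brevity; a real schema would use a decimal or money type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Supertype: attributes every item has.
CREATE TABLE Item (
    ItemID INTEGER PRIMARY KEY,
    MassKg REAL NOT NULL,
    Procurement TEXT NOT NULL CHECK (Procurement IN ('bought', 'inherited'))
);
-- Subtype: attributes only bought items have; shares the primary key.
CREATE TABLE ItemBought (
    ItemID INTEGER PRIMARY KEY REFERENCES Item(ItemID),
    Currency TEXT NOT NULL,
    Amount REAL NOT NULL
);
INSERT INTO Item VALUES
    (0, 2, 'inherited'), (1, 13, 'bought'), (2, 5, 'bought'),
    (3, 11, 'inherited'), (4, 9, 'bought');
INSERT INTO ItemBought VALUES
    (1, 'US dollars', 47.20), (2, 'British Pounds', 3.10), (4, 'US dollars', 1.32);
""")

# A LEFT JOIN rebuilds the original wide table: inherited items come back
# with NULL (None in Python) for Currency and Amount.
all_items = conn.execute(
    """
    SELECT i.ItemID, i.MassKg, i.Procurement, b.Currency, b.Amount
    FROM Item i
    LEFT JOIN ItemBought b ON b.ItemID = i.ItemID
    ORDER BY i.ItemID
    """
).fetchall()

# Queries on common attributes only need the supertype table.
total_mass = conn.execute("SELECT SUM(MassKg) FROM Item").fetchone()[0]

for row in all_items:
    print(row)
print(total_mass)
```

No stored column is nullable any more; the NULLs only appear in the joined result, where they mean "this item has no purchase record".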
For other questions relating to this pattern, look up the tag Class-Table-Inheritance. While you're at it, look up Shared-Primary-Key as well. For a more conceptual treatment, search for "ER Specialization".
Here is my off-the-cuff suggestion:
UPDATE: Mass would be a Float/Decimal/Double depending upon your Db, Cost would be whatever the optimal type is for handling money (in SQL Server 2008, it is "Money" but these things vary).
ANOTHER UPDATE: The cost of an inherited item should be zero, not null (and in fact, there sometimes IS an indirect cost, in the form of taxes, but I digress . . .). Therefore, your Item table should require a value for cost, even if that cost is zero. It should not be null.
Let me know if you have questions . . .
Why do you need to normalise it?
I can see some data integrity challenges, but no obvious structural problems.
The implicit dependency between "procurement" and the presence or not of the value/currency is tricky, but has nothing to do with the keys and so is not a big deal, practically.
If we are to be purists (e.g. this is for homework purposes), then we are dealing with two types of item, inherited items and bought items. Since they are not the same type of thing, they should be modelled as two separate entities i.e. InheritedItem and BoughtItem, with only the columns they need.
In order to get a combined view of all items (e.g. to get a total weight), you would use a view, or a UNION sql query.
If we are looking to model objects in the database, then we can factor out the common supertype (Item) and model the subtypes (InheritedItem, BoughtItem) with foreign keys to the supertype table (ypercube's explanation is very good), but this is more complicated and less future-proof than modelling only the subtypes.
This last point is the subject of much argument, but practically, in my experience, modelling concrete supertypes in the database leads to more pain later than leaving them abstract. Okay, that's probably waaay beyond what you wanted :).

Which is a better database schema for a tracking tool?

I have to generate a view that shows tracking across each month. The ultimate view will be something like this:
| Person | Task | Jan | Feb | Mar| Apr | May | June . . .
| Joe | Roof Work | 100% | 50% | 50% | 25% |
| Joe | Basement Work | 0% | 50% | 50% | 75% |
| Tom | Basement Work | 100% | 100% | 100% | 100% |
I already have the following tables:
Person
Task
I am now creating a new table with foreign keys into the above 2 tables, and I am trying to figure out the pros and cons of creating 1 or 2 tables.
Option 1:
Create a new table with the following Columns:
Id
PersonId
TaskId
Jan2012
Feb2012
Mar2012
Apr2012
or
Option 2:
Have 2 separate tables
One table for just
Id
PersonId
TaskId
and another table for just the following columns
Id
PersonTaskId (the id from table above)
MonthYearKey
MonthYearValue
So an example record would be
| 1 | 13 | Jan2011 | 100% |
where 13 would represent a specific unique Person and Task combination. This second way would avoid having to create new columns as time goes on (which seems right), but I also want to avoid overkill.
Which would be the more scalable schema? Also, any other suggestions or more elegant ways of doing this would be welcome.
You can have an m2m table with data columns. I don't see a reason why you can't just put MonthYearKey and MonthYearValue on the same table as PersonId and TaskId:
Id
TaskId
PersonId
MonthYearKey
MonthYearValue
It's also possible that you would want to move the MonthYearKey values out into their own table; it really just comes down to your common queries and what this data is used for.
I would note that you never want to design a schema where you add columns as time passes. The first option would require constant maintenance and would also become very difficult to query.
Option 2 is definitely more scalable and is not overkill.
Option 1 would require you to add a new column every month and simple date based queries of your data would not be possible, e.g. Show me all people who worked at least 90% in any month last year.
The ultimate view would be generated from a particular query or view of your data.
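As a sketch of how that final view can be produced from Option 2, the example below pivots the rows into month columns with conditional aggregation, using Python's sqlite3 module. For brevity person_task holds names directly instead of PersonId and TaskId, months are stored as 'YYYY-MM' text rather than keys like Jan2012, and monthly_view is a hypothetical helper; all of these are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person_task (id INTEGER PRIMARY KEY, person TEXT, task TEXT);
CREATE TABLE tracking (
    id INTEGER PRIMARY KEY,
    person_task_id INTEGER REFERENCES person_task(id),
    month TEXT,       -- 'YYYY-MM' so that months sort and compare correctly
    percent INTEGER
);
INSERT INTO person_task VALUES
    (1, 'Joe', 'Roof Work'), (2, 'Joe', 'Basement Work'), (3, 'Tom', 'Basement Work');
INSERT INTO tracking (person_task_id, month, percent) VALUES
    (1, '2012-01', 100), (1, '2012-02', 50),
    (2, '2012-01', 0),   (2, '2012-02', 50),
    (3, '2012-01', 100), (3, '2012-02', 100);
""")

def monthly_view(conn, months):
    """One row per person/task, with one percent column per requested month."""
    # One aggregate column per month; month values are bound as parameters.
    cols = ", ".join(
        "MAX(CASE WHEN t.month = ? THEN t.percent END)" for _ in months
    )
    sql = f"""
        SELECT pt.person, pt.task, {cols}
        FROM person_task pt
        JOIN tracking t ON t.person_task_id = pt.id
        GROUP BY pt.id, pt.person, pt.task
        ORDER BY pt.id
    """
    return conn.execute(sql, months).fetchall()

report = monthly_view(conn, ["2012-01", "2012-02"])

# The date-based query that Option 1 makes awkward is a plain WHERE clause here.
busy = [
    r[0]
    for r in conn.execute(
        """
        SELECT DISTINCT pt.person
        FROM person_task pt
        JOIN tracking t ON t.person_task_id = pt.id
        WHERE t.percent >= 90 AND t.month BETWEEN '2012-01' AND '2012-12'
        ORDER BY pt.person
        """
    )
]

for row in report:
    print(row)
print(busy)
```

Adding another month to the report means passing one more value to monthly_view; the tables themselves never change.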

Resources