Database Structure and Querying - database

Trying to figure out how to change a structure from what I currently have which is this:
tblHaulLogs
intLogID
intHaulType
intSerial
intOriginSource
intOrigin
intDestinationSource
intDestination
dtmHaulDate
ccyLogPay
intHauler
txtLogNotes
intInvoiceID
In this table, what I am doing is using the origin and destination source fields to determine which table the fk for the origin and destination comes from. This feels very wrong to me.
tblHaulTypes
intHaulTypeID
chrHaulType
intOriginSourceType
intDestinationSourceType
Data in the Haul Types Table:
LOT, 1, 1
DEL, 1, 2
RPO, 2, 1
Now let me explain:
The first type happens when an item goes from a sales lot to another sales lot.
The second type happens when an item goes from a sales lot to a customer(sale gets delivered).
The third type happens when an item returns from the customer back to the sales lot.
Then the Item can be resold/returned/resold/returned(rent-to-own system).
Now, here are the problems I have:
An Haul Log's origin will always be the destination of the last move. Therefore I thought that the origin field is redundant. However, it's the relation between the destination of the last move and the destination of the new move that defines what the shipper gets paid and what type of haul it is.
In other words, even though the first type and the third type technically have the same fields, the type of move is not the same because of the previous move type. What do I need to do here? Am I totally missing the boat on what the structure should be?
The questions I need to answer based on this data is:
How many Items do I have on my sales lots that are new inventory(have never been sold).
How many Items do I have that have been sold and returned(doesn't matter how many times).

I'm guessing at the relationship between the various fields and tables.
Your tblHaulTypes table looks fine.
intHaulTypeID
chrHaulType
intOriginSourceType
intDestinationSourceType
You're missing a haul type that accounts for deliveries from suppliers to your lots.
There has to be some table that lists your lots. I'd call it tblHaulLot.
intLotNumber
txtLotName
...
I'd make a tblHaulTransaction table that looks like this.
intTransactionID
intHaulTypeID
intHauler
intOriginOrganizationID
intDestinationOrganizationID
intOriginLot (null if origin is supplier)
intDestinationLot (null if destination is customer)
dtmHaulDate
txtLogNotes
Now, we need an tblOrganization.
intOrganizationID
txtOrganizationName
txtOrganizationAddress
...
The organization at ID 0 is your organization. Suppliers and customers would fill the rest of the table.
I'd make a tblHaulInvoice table that looks like this.
intInvoiceID
intTransactionID
ccyTransactionPay
dtmDateInvoiced
AmountInvoiced
The amount invoiced (and amount paid) have to be accounted for in some table. I don't know what ccy stands for, and I don't know your 3 letter code for a decimal (money) field.
How many Items do I have on my sales lots that are new inventory(have never been sold). How many Items do I have that have been sold and returned(doesn't matter how many times).
Nowhere in your data model is there any kind of inventory table. I'd need to know a lot more about your business to create one or more inventory tables.

Related

How to stop Heap Analytics grouping assets into "OTHER" Category

I think this might be very simple.
I wrote a query in heap to tell me which users were part of an event and how many times they engaged in it during the year.
The result is a simple table with username and number of occurrences.
It worked. However, Heap has this weird behavior of choosing multiple results (maybe at random?) and throwing them into a single "Other (X other results)" category. Where x is a number of others.
So i end up with a table of 20 maybe 30 users and occurences, and one row of "Other (X other results)".
I shrunk the query to see results from a smaller subset of dates and the "Other" category disappeared.
I really need to see every individual row in my query results! Even if it's paginated.
Help! Thank you
You can export the result as a CSV. The downloaded file will contain all the results (all single entries without the grouped OTHER).
Inte the current UI, you can find Export to CSV at the top of the report view.

Recipe Database Design

I am trying to create a database to store my recipes. However, I am not sure how to implement it. I looked at other questions like this but they do not have the same focus as I.
I assume any dish is actually just an ingredient, which can then be used in other dishes, or in this case in other ingredients. Any ingredient may have multiple recipes. For now, each recipe indicates how much of each ingredient is needed, but I also want to know how these ingredients are combined without having a long text description of it.
For example, in text, I would describe one (very bad) scrambled eggs recipes like this:
Scrambled eggs:
Cooked for 5 minutes(
1g Butter,
Whisked(
1g Salt,
1g Pepper,
2 Eggs
)
and then Scrambled eggs could be used in another recipe as an ingredient.
But how would that translate in a database? I don't need that database to be SQL based since this is a personal project, but I don't know any other kind of databases so far.
I thought about defining an Ingredient, as having an optional Technique associated with it but that means Whisked(1g salt, 1g pepper, 2 eggs) would have to be an Ingredient. Which I guess could work and I could also make the name of ingredients optional, but it seems awkward.
I also thought about defining a Recipe as having multiple TransformedIngredients which would contain a Technique applied to many Ingredients but sometimes a Recipe contains raw, untransformed, Ingredients and sometimes TransformedIngredients would need to be applied to TransformedIngredient. From what I know of databases that wouldn't work.
PS: I stumbled onto a functional programming Tiramisu recipe which, though very much focused on the techniques, displays fairly well what I'm trying to implement for my database.
I think what's confusing is that there are two different things to think about with a recipe, 'Items' and 'Steps'.
One database structure that comes to mind for this is a Star Schema structure which separates these ideas nicely (into Dimension and Fact tables, respectively).
A quick description of each:
Dimension
"The state of something" i.e. a record is merely there to describe what the thing is. A customer's address table would be an example of a dimension table.
Fact
"Things changing over time" i.e. each record relates to a dimension table, but has changing values. An example would be shipped purchases from a website to a customer's address. The address stays the same, but the shipments are getting constantly added to the table.
This isn't to say that Dimension tables don't change, too; obviously new users sign up for websites all the time. In the above address example, if a customer were to change his address, a new primary key value would be added for the new address.
Now on to your recipe examples:
Imagine you're cooking something. I would put anything that you hold in your hands in a "dimension" table. For example: DIM_INGREDIENT (with columns such as INDREDIENT_ID, INGREDIENT_NAME), and DIM_AMOUNT (AMOUNT_ID, AMOUNT, UNITS) to describe the amounts. And DIM_ACTION (ACTION_ID, TYPE, LENGTH, UNITS) to describe the action. There are more you can come up with; these are a few to get started.
Any steps I'd be taking could go in a FACT_RECIPE_STEPS table that would map to all the dimension tables. Any step that doesn't have a logical step would have a null value (i.e. stir for 5 minutes would have null for INGREDIENT_ID).
The FACT_RECIPE_STEPS could look like this:
RECIPE_ID, RECIPE_STEP, ACTION_STEP_ID, INGREDIENT_ID, AMOUNT_ID, ACTION_ID
What gets confusing is the "substep" of whisking the stuff together. I put that in another FACT table called FCT_ACTION_STEP since "whisking" is one action in the recipe list, but to perform the action you actually need to do three things.
I think the following is what some of the tables would look like with your data:
DIM_INGREDIENT
INGREDIENT_ID: 1
INGREDIENT_NAME: 'Scrambled eggs'
INGREDIENT_ID: 2
INGREDIENT_NAME: 'Salt'
INGREDIENT_ID: 3
INGREDIENT_NAME: 'Pepper'
INGREDIENT_ID: 4
INGREDIENT_NAME: 'Eggs'
INGREDIENT_ID: 5
INGREDIENT_NAME: 'Butter'
DIM_ACTION
ACTION_ID: 1
TYPE: 'Cook'
LENGTH: 5
UNITS: 'minutes'
ACTION_ID: 2
TYPE: 'Whisk'
LENGTH: null
UNITS: null
FCT_ACTION_STEP
STEP_ID: 1
ACTION_ID: 2
DIM_AMOUNT
AMOUNT_ID: 1
AMOUNT: 1
UNITS: 'grams'
AMOUNT_ID: 2
AMOUNT: 2
UNITS: null
FACT_RECIPE_STEPS
RECIPE_ID, RECIPE_STEP, ACTION_STEP_ID, INGREDIENT_ID, AMOUNT_ID, ACTION_ID
EDIT:
I was a bit unsure myself as to how to do the "Whisked" part of the recipe and thought that, when you add the whisked mixture to the final result, it's like adding in one ingredient to the recipe. However, you need to prepare the mixture before and it has three steps. It's basically like it's own little recipe, and the FACT_ACTION_STEP takes that other 'recipe' into account to be able to add the result one row in the FACT_RECIPE_STEPS table.
Now that I think about it a bit more, it might be better to just assign "Whisked" as its own recipe in FACT_RECIPE_STEPS and DIM_INGREDIENT (called something like "Whisked spices for eggs") +and get rid of the FACT_ACTION_STEP table altogether. That way you can easily make more complex recipes, such as "Eggs and Pancake Breakfast" where the Eggs part is the result of this recipe.
You can add some other fields to tables but I believe this schema works for you.
recipe
------------
r_id PK
recipe_name
cooking_time
recipe_of_recipes
-----------------
ror_id PK
ror_name
recipe_ror (table for many to many relation-> defining a recipe as an ingredient)
-------------
r_ror_id PK
r_id FK
ror_id FK
ingredients
-------------
i_id PK
t_id FK
r_id FK
ror_id FK (added later)
ingredient_name
quantity
technique
-------------
t_id PK
technique_name
EDIT
Let's say you want to store a recipe (X) which is a combination of x and y recipes plus z ingredient.
To prepare X recipe (big X),
in recipe,ingredients and technique tables you store
the x recipe and w,t,r ingredients with technique of p
the y recipe and b,n,m ingredients with technique of v
also z ingredient with technique of f (for this I forgot to add field ror_id as a FK in ingredients table)
You can define 2 different recipes (x and y) as ingredients of a recipe (X) using the recipe_ror table. This table relates to different recipes as one.(many to many relationship between tables recipe and recipe_of_recipes)
If you also want to store the technique for X,x or y recipes(like cook in your example), you can also add t_id field as FK to recipe and recipe_of_recipes table.

modeling correct star schema for ssas tabular

I'm using ssas tabular (powerpivot) and need to design a data-model and write some DAX.
I have 4 tables in my relational database-model:
Orders(order_id, order_name, order_type)
Spots (spot_id,order_id, spot_name, spot_time, spot_price)
SpotDiscount (spot_id, discount_id, discount_value)
Discounts (discount_id, discount_name)
One order can include multiple spots but one spot (spot_id 1) can only belong to one order.
One spot can include different discounts and every discount have one discount_value.
Ex:
Order_1 has spot_1 (spot_price 10), spot_2 (spot_price 20)
Spot_1 has discount_name_1(discount_value 10) and discount_name_2 (discount_value 20)
Spot_2 has discount_name_1(discount_value 15) and discount_name_3 (discount_value 30)
I need to write two measures: price(sum) and discount_value(average)
How do I correctly design a star schema with fact table (or maybe two fact tables) so that I in my powerpivot cube can get:
If i choose discount_name_1 I should get
order_1 with spot_1 and spot_2 and price on order_1 level will have value 50 and discount_value = 12,5
If I choose discount_name_3 I should get
order_1 with only spot_2 and price on order level = 20 and discount_value = 30
Fact(OrderKey, SpotKey, DiscountKey, DateKey, TimeKey Spot_Price, Discount_Value,...)
DimOrder, DimSpot, DimDiscount, etc....
TotalPrice:=
SUMX(
SUMMARIZE(
Fact
,Fact[OrderKey]
,Fact[SpotKey]
,Fact[Spot_Price]
)
,Fact[Spot_Price]
)
AverageDiscount:=
AVERAGE(Fact[Discount_Value])
Fact table is denormalized and you end up with the simplest star schema you can have.
First measure deserves some explanation. [Spot_Price] is duplicated for any spot with multiple discounts, and we would get wrong results with a simple SUM(). SUMMARIZE() does a group by on all the columns passed to it, following relationships (if necessary, we're looking at a single table here so nothing to follow).
SUMX() iterates over this table and accumulates the value of the expression in its second argument. The SUMMARIZE() has removed our duplicate [Spot_Price]s so we accumulate the unique ones (per unique combination of [OrderKey] and [SpotKey]) in a sum.
You say
One order can include multiple spots but one spot (spot_id 1) can only
belong to one order.
That's is not supported in the table definitions you give just above that statement. In the table definitions, one order has only one spot but (unless you've added a unique index to Orders on spot_id) each Spot can have multiple orders. Each Spot can also have multiple discounts.
If you want to have the relationship described in your words, the table definitions should be:
Orders(order_id, order_name, order_type)
OrderSpot(order_id, spot_id) -- with a Unique index on spot_id)
Spots (spot_id, spot_name, spot_time, price)
or:
Orders(order_id, order_name, order_type)
Spots (spot_id, spot_name, spot_time, order_id, price)
You can create the ssas cube with Order as the fact table, with one dimention in the Spot Table. If you then add the SpotDiscount and Discount tables with their relations (SpotDiscount to Spot, Discount to SpotDiscount) you have a 1 dimentional.
EDIT as per comments
Well, the Fact table would have order_id, order_name, order_type
The Dimension would be made up of the other 3 tables and have the columns you're interested in: probably spot_name, spot_time, spot_price, discount_name, discount_value.

How to Select a record in Access if it is not a duplicate value

Ok, from the title it seems to be impossible to understand, I'll try to be as clear as possible.
Basically, I have a table, let's call it 'records'. In this table I have some products, of which I store 'id', 'codex' (which is a unique identifier for a certain product in the whole database), 'price' and 'situation'. This last one is a string which tells me wether the product has just entered the store (in that case it is set to 'IN'), or it has already been sold ('OUT' in this case).
The database was not created by us, I HAVE to work with that although it is horribly structured... The guy who originally projected the database decided to register when a product's situation passes from 'IN' to 'OUT' in the following way: instead of UPDATEing the corresponding value in the table, he used to take the row of data with 'IN' as situation, and to DUPLICATE it setting, that time, 'OUT' as situation.
Just to sum up: if a product has not been sold yet, it will have one row of dedicated data; otherwise those rows will be two, identical except for the 'situation' field.
What I need to do is: select a product if (and ONLY if) there is no duplicate for it. Basically, I can (and should) look for a 'codex', and if I my Count(codex) ends up being >1, I do not select the row.
I hope the explanation of the process is clear enough...
I tryed many alternative (no, SELECT DISTINCT is not a solution): des anyone have an idea of how to do that? Because really, none of us three could come up with a good solution!
Here is the schema for the table, I hope it is sufficiently clear, and if not do not hesitate asking for more details.
Just as a reminder: the project is in (sigh...) VB.net, the database is in Microsoft Access (mdb).
I could not find a solution on StackOverFlow, I hope this is not a duplicate question! Thanks in advance for the help.
id codex price situation
1 1 2.50 IN
2 1 2.50 OUT
3 2 3.45 IN
4 3 21.50 IN
5 2 3.45 OUT
6 4 1.50 IN
To check if I understand what your problem is... In your example table you just want to get the lines with ID 4 a 6, right?
If is that what you want, and If you want only the not sold ones try this command
SELECT
*
FROM
records
WHERE
codex
not in
(
SELECT
codex
FROM
records
WHERE
situation ='OUT'
)

Is there any contradiction against too many values in a table (database)?

I was wondering if there's any contradiction or futur problems against a table in a database which contains about 80 columns. There will be only VARCHARs, few INT and maybe 1 or 2 MESSAGE. I did some research on the net but there's nothing really talking about that kind of problem...In other terms, is this okay or even 'normal' to put that much of values inside a table??
Thanks in advance!
You shouldn't have any real problems if the fields are mostly integers. Most DBMSes have a limit on row length, so a bunch of long columns can cause issues...but unless the varchar columns are very long, you're probably OK.
I've honestly never even needed to think about that, though -- with a properly normalized database, it's quite rare to ever need that many columns in a table.
More columns you have, more memory server needs to process the records.
I recomend to use the "multiple to one" relation scheme in this case.
Example of tables:
customer
id
name
email
...
ins_app_form (Insurance application form)
id
customer_id (relation with customer)
date
... (here comes some other data if you need)
ins_app_item (Insurance application form items/fields)
id
ins_app_form_id (relation with Insurance application form)
question (the name of a question in application form)
answer (customer's answer)
So to show the application form with this scheme you will need to run a query:
SELECT
iaf.id AS application_id,
iaf.date AS `date`,
iai.question,
iai.answer
FROM ins_app_form AS iaf
LEFT JOIN ins_app_item AS iai ON iai.ins_app_form_id=iaf.id
WHERE iaf.customer_id=<ID of a customer>
This query will bring you something like this:
id date question answer
1 2014-03-31 "Year" "2008"
1 2014-03-31 "Car make" "Audi"
1 2014-03-31 "Car model" "Q7"
...

Resources