Activity list ala SO - database

We are building a set of features for our application, one of which is a list of recent user activities, à la SO. I'm having a little trouble finding the best way to design the table for these activities.
Currently we have an Activities table with the following columns
UserId (Id of the user the activity is for)
Type (Type of activity - i.e. PostedInForum, RepliedInForum, WroteOnWall - it's a tinyint with values taken from an enumerator in C#)
TargetObjectId (An id of the target of the activity. For PostedInForum this will be the Post ID, for WroteOnWall this will be the ID of the User whose wall was written on)
CreatedAtUtc (Creation date)
My problem is that the TargetObjectId column doesn't feel right. It's a soft link: there are no foreign keys, and only knowledge of the Type tells you what this column really contains.
Does anyone have a suggestion for an alternative or better way of storing a list of user activities?
I should also mention that the site will be multilingual, so you should be able to see the activity list in a range of languages - that's why we haven't chosen for instance to just put the activity text/html in the table.
Thanks

You can place all content in a single table with a discriminator column and then just select top 20 ... from ... order by CreatedAtUtc desc.
Alternatively, if you store different types of content in different tables, you can try something like this (not sure about the exact syntax):
select top 20 * from (
select * from (select top 20 ID, CreatedAtUtc, 'PostedToForum' as Type from ForumPosts order by CreatedAtUtc desc) f
union all
select * from (select top 20 ID, CreatedAtUtc, 'WroteOnWall' as Type from WallPosts order by CreatedAtUtc desc) w) t
order by t.CreatedAtUtc desc
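As a rough, runnable sketch of the UNION ALL approach above, here is the same idea using Python's built-in sqlite3 module. This is only an illustration under assumptions: SQLite uses LIMIT rather than TOP, the two tables are reduced to the columns the query needs, and the sample rows and the recent_activity helper are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ForumPosts (ID INTEGER PRIMARY KEY, CreatedAtUtc TEXT);
CREATE TABLE WallPosts  (ID INTEGER PRIMARY KEY, CreatedAtUtc TEXT);
INSERT INTO ForumPosts VALUES (1, '2024-01-01T10:00:00'), (2, '2024-01-03T09:00:00');
INSERT INTO WallPosts  VALUES (1, '2024-01-02T12:00:00');
""")

def recent_activity(conn, limit=20):
    """Return the newest `limit` activities across both tables (hypothetical helper)."""
    sql = """
    SELECT ID, CreatedAtUtc, Type FROM (
        -- Take the newest rows from each source first, so the union stays small.
        SELECT * FROM (SELECT ID, CreatedAtUtc, 'PostedToForum' AS Type
                       FROM ForumPosts ORDER BY CreatedAtUtc DESC LIMIT :n)
        UNION ALL
        SELECT * FROM (SELECT ID, CreatedAtUtc, 'WroteOnWall' AS Type
                       FROM WallPosts ORDER BY CreatedAtUtc DESC LIMIT :n)
    )
    ORDER BY CreatedAtUtc DESC
    LIMIT :n
    """
    return conn.execute(sql, {"n": limit}).fetchall()

feed = recent_activity(conn)
for row in feed:
    print(row)
```

The literal in the third column plays the role of the discriminator, so the application can pick a localized template per activity type instead of storing rendered text.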

You might want to check out http://activitystrea.ms/ for inspiration, especially the schema definition. If you look at that spec you'll see that there is also the concept of a "Target" object. I have recently done something very similar but I had to create my own database to encapsulate all of the activity data and feed data into it because I was collecting activity data from multiple applications with disparate data sources in different databases.
Max

Related

Is it ok to create a new database table for each metric?

My engineering team builds a machine and records various metrics related to the machine such as battery voltage, name of the machine, number of times used etc. My current database structure has the following columns in one table
ID
Name of the machine
# time used
battery voltage
.
.
The team keeps changing the names of the machines or the metrics, and they suggest that every time a name changes, a new table should automatically be created to avoid breaking any code. E.g., if the initial name was A1/BatteryVoltage, the table would be
Id
A1/BatteryVoltage
Later if they change it to A1/Battery_Voltage, they want a new table to be created with following columns
ID
A1/Battery_Voltage
I have a sense that this doesn't make sense, as it can flood the database with a huge number of tables. But my manager is asking me to be more concrete about why. I know that the cost of creating tables is low, but I also know that with this structure I cannot filter on things such as machine name when pulling metrics, and that I will need multiple joins to get a single metric. Is there anything else that can help me convince my team, or convince myself that what my team wants is right?
Create one table for everything, adding columns to discriminate metrics. Something like:
create table metric (
id int, -- eg auto increment
created timestamp,
machine_name text, -- eg 'A1'
attribute text, -- eg 'Battery_Voltage'
value text -- eg '9'
)
Now you never have to do anything to cater for new attributes, or attribute name changes.
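To make that concrete, here is a small sketch using Python's sqlite3 module. The column types are SQLite's, and the sample machines and readings are invented for illustration. The point it tries to show is that a metric rename becomes an UPDATE on data rather than a schema change, and that filtering by machine or attribute is an ordinary WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE metric (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        created TEXT DEFAULT CURRENT_TIMESTAMP,
        machine_name TEXT NOT NULL,   -- e.g. 'A1'
        attribute TEXT NOT NULL,      -- e.g. 'BatteryVoltage'
        value TEXT                    -- stored as text, as in the answer above
    )
""")

# Made-up sample readings.
rows = [
    ("A1", "BatteryVoltage", "9.1"),
    ("A1", "TimesUsed", "42"),
    ("B2", "BatteryVoltage", "8.7"),
]
conn.executemany(
    "INSERT INTO metric (machine_name, attribute, value) VALUES (?, ?, ?)", rows
)

# The team renames the metric: this is a data change, not a new table.
conn.execute(
    "UPDATE metric SET attribute = 'Battery_Voltage' WHERE attribute = 'BatteryVoltage'"
)

# Filtering by attribute (or machine) needs no joins.
voltages = conn.execute(
    "SELECT machine_name, value FROM metric WHERE attribute = ? ORDER BY machine_name",
    ("Battery_Voltage",),
).fetchall()
print(voltages)
```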

Oracle APEX - Data Modeling & Primary Keys

I'm creating a rather large APEX application which allows managers to go in and record statistics for associates in the company. Currently we have a database in Oracle with data from AD which holds all the associates' information: Name, Manager, Employee ID, etc.
Now I'm responsible for creating and modeling a table that will house all the stats for each employee. The table I have created has more than 90 columns. Some contain data such as:
Documents Processed
Calls Received
Amount of Doc 1 Processed
Amount of Doc 2 Processed
and the list goes on for well over 90 attributes. So here is my question:
When creating this table in my application with so many different columns how would I go about choosing a primary key that's appropriate? Should I link it to our employee table using the employees identification which is unique (each have a associate number)?
Secondly, how can I create these tables (and possibly form) to allow me to associate the statistic I am entering for an individual to the actual individual?
I have ordered two books from amazon on data modeling since I am new to APEX and DBA design. Not a fresh chicken, but new enough to need some guidance. An additional problem I am running into is that each form can have only 60 fields to it. So I had thought about creating tables for different functions out of my 90+ I have.
Thanks
APEX 4.2 allows for 200 items per page (see the Oracle APEX component limits).
A couple of questions come to mind:
Are you sure that the employee IDs are not recyclable? If these IDs are unique and never recycled, you've found yourself a good primary key.
What do you plan on doing when you decide to add a new metric? Seems like you might have to add a new column to your rather large and likely not normalized table.
I'd recommend a vertical table for your metrics. You can use Oracle's PIVOT function to make your data appear more like a horizontal table.
If you went this route, you would store your employee ID in one column, your metric key in another, and the value in a third.
I'd recommend that you create a metric table consisting of a primary key, a metric label, an active indicator, creation timestamp, creation user id, modified timestamp, modified user id.
This metric table will allow you to add new metrics, change the name of the metric, deactivate a metric, and determine who changed what and when.
This would be a much more flexible approach in my opinion. You may also want to think about audit logs.
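A minimal sketch of the vertical layout described above, using Python's sqlite3 module. Several things here are assumptions for illustration: the table and column names (metric, employee_stat), the trimmed-down audit columns, the one-value-per-employee-per-metric key (a real table would probably also carry a date or period), and the sample numbers. SQLite has no PIVOT, so the wide view is produced with conditional aggregation instead; in Oracle the PIVOT clause would play the same role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Metric definitions: adding or renaming a metric is a row change.
CREATE TABLE metric (
    metric_id INTEGER PRIMARY KEY,
    label     TEXT NOT NULL UNIQUE,
    active    INTEGER NOT NULL DEFAULT 1
);
-- One row per employee per metric (the "vertical" table).
CREATE TABLE employee_stat (
    employee_id INTEGER NOT NULL,
    metric_id   INTEGER NOT NULL REFERENCES metric(metric_id),
    value       NUMERIC NOT NULL,
    PRIMARY KEY (employee_id, metric_id)
);
INSERT INTO metric (metric_id, label) VALUES
    (1, 'Documents Processed'), (2, 'Calls Received');
INSERT INTO employee_stat VALUES (100, 1, 35), (100, 2, 12), (101, 1, 20);
""")

# Conditional aggregation turns metric rows back into columns.
pivot_sql = """
SELECT s.employee_id,
       MAX(CASE WHEN m.label = 'Documents Processed' THEN s.value END) AS documents_processed,
       MAX(CASE WHEN m.label = 'Calls Received'      THEN s.value END) AS calls_received
FROM employee_stat s
JOIN metric m ON m.metric_id = s.metric_id
GROUP BY s.employee_id
ORDER BY s.employee_id
"""
wide = conn.execute(pivot_sql).fetchall()
print(wide)
```

An employee with no value for a metric simply shows NULL in that column, which is what a missing row means in the vertical table.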

What is a recommended schema / database design to store custom report settings in my sql database?

I am building a tool to allow people to create customized reports. My question revolves around getting the right database schema and design to support some custom report settings.
In terms of design, I have various Slides and each Slide has a bunch of settings (like date range, etc). A Report would basically be an ordered list of slides
The requirements are:
A user can create a report by putting together a list of "Slides" in any order they wish
A user can include the same slide twice in a report with different settings
So I was thinking of having the following tables:
Report Table: Id, Name, Description
Slide Table: Id, Description
ReportSlide Table: ReportId, SlideId, Order, SlideSettings
My 2 main questions are:
Order: Is this the best way to manage the fact that a user can order their slides on any given report?
SlideSettings: Since every slide has a different set of settings (inputs), I was thinking of storing this as just a JSON blob and then parsing it out on the front end. Does anyone think this is the wrong design? Is there a better way to store this information (again, each slide has different inputs, and you could have the same slide listed twice in a report, each with different settings)?
Order: Is this the best way to manage
It is the correct way.
SlideSettings: ... storing this as just a json blob
If you never intend to query these values, then that's fine.
You may want to rename ReportSlide to SlideInReport. A relationship should not just list the referenced tables, but the nature of the relationship.
Some (me) prefer to give PK-columns and FK-columns the same name. Then you cannot get away with just Id, but you need to call them sld_id, rep_id.
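Here is a rough sketch of the three-table design with an order column and a JSON settings blob, using Python's sqlite3 and json modules. It is an illustration under assumptions: the relationship table gets its own Id because the same slide may appear twice in a report (so ReportId plus SlideId cannot be the key), the column is called SlideOrder because ORDER is a reserved word in SQL, and the helper names and sample settings are made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Report (Id INTEGER PRIMARY KEY, Name TEXT, Description TEXT);
CREATE TABLE Slide  (Id INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE ReportSlide (
    Id            INTEGER PRIMARY KEY,
    ReportId      INTEGER NOT NULL REFERENCES Report(Id),
    SlideId       INTEGER NOT NULL REFERENCES Slide(Id),
    SlideOrder    INTEGER NOT NULL,
    SlideSettings TEXT NOT NULL,          -- JSON blob, opaque to the database
    UNIQUE (ReportId, SlideOrder)         -- one slide per position in a report
);
INSERT INTO Report VALUES (1, 'Quarterly', 'Demo report');
INSERT INTO Slide  VALUES (10, 'Sales chart'), (11, 'Summary');
""")

def add_slide(conn, report_id, slide_id, order, settings):
    """Attach a slide to a report at a position, serializing its settings."""
    conn.execute(
        "INSERT INTO ReportSlide (ReportId, SlideId, SlideOrder, SlideSettings) "
        "VALUES (?, ?, ?, ?)",
        (report_id, slide_id, order, json.dumps(settings)),
    )

def load_report(conn, report_id):
    """Return (SlideId, settings dict) pairs in display order."""
    rows = conn.execute(
        "SELECT SlideId, SlideSettings FROM ReportSlide "
        "WHERE ReportId = ? ORDER BY SlideOrder",
        (report_id,),
    )
    return [(slide_id, json.loads(settings)) for slide_id, settings in rows]

# The same slide (10) is used twice with different settings.
add_slide(conn, 1, 10, 1, {"from": "2024-01-01", "to": "2024-03-31"})
add_slide(conn, 1, 11, 2, {})
add_slide(conn, 1, 10, 3, {"from": "2023-01-01", "to": "2023-03-31"})

slides = load_report(conn, 1)
print(slides)
```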
Maybe you should have a Settings table. You may also need a ValueTypes table to define which setting can take what kind of values (such as Date Range). Then let the list of setting IDs be stored against a slide.
Needless to say, the "best way" will depend on the type and amount of data being stored, etc. I'm a novice with JSON, but from what I've read it's generally not a good idea to keep JSON strings in database fields, though that isn't a hard rule.
I think, from a high level view, your schema will work. However, you might consider revising some of the table structure. For example:
Settings
Rather than a JSON blob, it may be best to add a column for each setting to the ReportSlide table. Depending on what inputs you allow, give a column to each. For example, your date range will need StartDate/EndDate columns, and other inputs will need integer columns, text columns, etc.
What purpose does the Slide Table serve? If your database allows a many-to-many relationship between Slides and Reports, then the ReportSlide table will hold all your settings. Will your Slide Table have attributes? If not, then perhaps Report Slides are all you need. For example:
Report Table: ReportID | DateCreated | UserID | Description
ReportSlides Table: ReportSlideID | ReportID | SlideOrder | StartDate | EndDate | Description...
Unless your Slide table is going to hold specific attributes that will be common across every report, you don't need the extra joins or space.
Depending on the tool, you may also want to have a DateCreated, UserID, FolderID, etc. Attributes that allow people to organize their reports.
If the Slides are dependent on each other, you will want to add constraints so Slide 2 cannot be deleted if Slide 3 depends on it.
Order
Regarding order, having a SlideOrder column will work. Because each ReportSlideID will have a corresponding Report, the SlideOrder can still be changed. That way, if ReportSlideID = 1 belongs to ReportID = 1 and has specific settings, it can be ordered 7th or 3rd and still work.
Be aware of your naming convention. If the Order column is directly referencing Slide Order, then go ahead and name it SlideOrder.
I'm sure there are a million other ways to make it efficient. That's my initial idea based on what you've provided.
Report Table: ID (Primary Key), Name, Description,....
Slide Table: ID (PK), Name, Description,...
Slide_x_report Table: ID(PK), ReportID (FK), SlideID (FK), order
Slide_settings Table: ID(PK), NameSetting, DescriptionSettings, SlideXReportID (FK),...
I think that you should have a structure like this; the Slide_settings table will hold the settings of the different slides per report.
The Slide_settings table could hold the dynamic form values for a specific slide of a report. That way everything is properly stored, and the Slide_settings table has only the columns needed to define one element of a slide.

Getting records structured the same way only partially

While surfing through 9gag.com, an idea (problem) came to my mind. Let's say that I want to create a website where users can add different kinds of entries. Each entry is of a different type and needs different/additional columns.
Let's say that we can add:
a youtube video
a quote, which requires the author's first and last name
a flash game which requires additional game category, description, genre etc.
an image which requires the link
All of the above are entries and have some columns in common (like id, add_date, adding_user_id, etc.) and some different/additional ones (for example, only a flash game needs a description, and only an image needs a plus_18 column). The question is: how should I organize the DB/code to handle all of the above together as entries? I might want to order them, or search entries by add_date, etc.
The ideas that came to my mind:
Add a "type" column that specifies what kind of entry it is, plus every possible column, with NULL allowed in the columns that don't apply to that type. But this is really nasty, and there is no data integrity.
Add a column with serialized data for the additional fields, but that makes any filtering total hell.
Create a master (parent) table for entries and separate tables for the concrete entry types (their additional columns/info). But here I don't even know how I'm supposed to select the data properly, and it seems nasty as well.
So what's the best way to solve this problem?
The parent table seems like the best option.
// This is the parent table
Entry
ID PK
Common fields
Video
ID PK
EntryID FK
Unique fields
Game
ID PK
EntryID FK
Unique fields
...
What the queries will look like will largely depend on the type of query. To, for example, get all games ordered by a certain date, the query will look something like:
SELECT *
FROM Game
JOIN Entry ON Game.EntryID = Entry.ID
ORDER BY Entry.AddDate
Getting all content ordered by date will be somewhat messy. For example:
SELECT *
FROM Entry
LEFT JOIN Game ON Game.EntryID = Entry.ID
LEFT JOIN Video ON Video.EntryID = Entry.ID
...
ORDER BY Entry.AddDate
If you want to run queries like the one above, I suggest you give unique names to your primary key fields (i.e. VideoID and GameID) so you can easily identify which type of entry you're dealing with (by checking GameID IS NOT NULL for example).
Or you could add a Type field in Entry.
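The parent-table layout and the "all content by date" query can be tried out with a small sketch in Python's sqlite3 module. The specific columns (Url, Genre), the Type values and the sample rows are assumptions for illustration; the child tables use distinct key names (VideoID, GameID) as suggested above, and Entry also carries a Type field so either way of telling the types apart is available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Parent table: columns shared by every kind of entry.
CREATE TABLE Entry (
    ID      INTEGER PRIMARY KEY,
    Type    TEXT NOT NULL,
    AddDate TEXT NOT NULL
);
-- Child tables: one row per entry of that type, holding only the extra columns.
CREATE TABLE Video (
    VideoID INTEGER PRIMARY KEY,
    EntryID INTEGER NOT NULL UNIQUE REFERENCES Entry(ID),
    Url     TEXT
);
CREATE TABLE Game (
    GameID  INTEGER PRIMARY KEY,
    EntryID INTEGER NOT NULL UNIQUE REFERENCES Entry(ID),
    Genre   TEXT
);
INSERT INTO Entry VALUES (1, 'video', '2024-01-01'), (2, 'game', '2024-01-02');
INSERT INTO Video VALUES (1, 1, 'https://example.com/v');
INSERT INTO Game  VALUES (1, 2, 'puzzle');
""")

# All entries by date; columns from the non-matching child tables come back as NULL.
rows = conn.execute("""
    SELECT Entry.ID, Entry.Type, Entry.AddDate, Video.Url, Game.Genre
    FROM Entry
    LEFT JOIN Video ON Video.EntryID = Entry.ID
    LEFT JOIN Game  ON Game.EntryID  = Entry.ID
    ORDER BY Entry.AddDate
""").fetchall()
for row in rows:
    print(row)
```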

How do you manage "pick lists" in a database

I have an application with multiple "pick list" entities, such as those used to populate the choices of dropdown selection boxes. These entities need to be stored in the database. How does one persist these entities in the database?
Should I create a new table for each pick list? Is there a better solution?
In the past I've created a table that has the name of the list and the acceptable values, then queried it to display the list. I also include an underlying value, so you can return a display value for the list, and a bound value that may be much uglier (a small int for normalized data, for instance).
CREATE TABLE PickList(
ListName varchar(15),
Value varchar(15),
Display varchar(15),
Primary Key (ListName, Display)
)
You could also add a sortOrder field if you want to manually define the order to display them in.
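As a quick sketch of this single-table layout with the optional sort column, using Python's sqlite3 module (the list names, values and the options helper are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PickList (
    ListName  VARCHAR(15) NOT NULL,
    Value     VARCHAR(15) NOT NULL,   -- bound value stored by the application
    Display   VARCHAR(15) NOT NULL,   -- text shown to the user
    SortOrder INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (ListName, Display)
);
INSERT INTO PickList VALUES
    ('Priority', '3', 'High',   1),
    ('Priority', '2', 'Medium', 2),
    ('Priority', '1', 'Low',    3),
    ('Status',   'O', 'Open',   1);
""")

def options(conn, list_name):
    """Return (Value, Display) pairs for one list, in the manually defined order."""
    return conn.execute(
        "SELECT Value, Display FROM PickList "
        "WHERE ListName = ? ORDER BY SortOrder, Display",
        (list_name,),
    ).fetchall()

priority = options(conn, "Priority")
print(priority)
```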
It depends on various things:
if they are immutable and non relational (think "names of US States") an argument could be made that they should not be in the database at all: after all they are simply formatting of something simpler (like the two character code assigned). This has the added advantage that you don't need a round trip to the db to fetch something that never changes in order to populate the combo box.
You can then use an Enum in code and a constraint in the DB. If the display is localized, so that you need different formatting for each culture, you can use XML files or other resources to store the literals.
if they are relational (think "states - capitals") I am not very convinced either way... but lately I've been using XML files, database constraints and javascript to populate. It works quite well and it's easy on the DB.
if they are not read-only but rarely change (i.e. typically cannot be changed by the end user but only by some editor or daily batch), then I would still consider the opportunity of not storing them in the DB... it would depend on the particular case.
in other cases, storing in the DB is the way (think of the tags of StackOverflow... they are "lookup" but can also be changed by the end user) -- possibly with some caching if needed. It requires some careful locking, but it would work well enough.
Well, you could do something like this:
PickListContent
IdList IdPick Text
1 1 Apples
1 2 Oranges
1 3 Pears
2 1 Dogs
2 2 Cats
and optionally..
PickList
Id Description
1 Fruit
2 Pets
I've found that creating individual tables is the best idea.
I've been down the road of trying to create one master table of all pick lists and then filtering based on type. While it works, it has invariably created headaches down the line. For example, you may find that something you presumed to be a simple pick list is not so simple and requires an extra field. Do you now split this data into an additional table, or extend your master list?
From a database perspective, having individual tables makes it much easier to manage your relational integrity and it makes it easier to interpret the data in the database when you're not using the application
We have followed the pattern of a new table for each pick list. For example:
Table FRUIT has columns ID, NAME, and DESCRIPTION.
Values might include:
15000, Apple, Red fruit
15001, Banana, yellow and yummy
...
If you have a need to reference FRUIT in another table, you would call the column FRUIT_ID and reference the ID value of the row in the FRUIT table.
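One thing the table-per-list pattern buys you is a real foreign key. Below is a minimal sketch with Python's sqlite3 module; the BASKET_ITEM table and the try_add helper are hypothetical, and note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON (most other databases enforce them by default).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
CREATE TABLE FRUIT (
    ID          INTEGER PRIMARY KEY,
    NAME        TEXT NOT NULL,
    DESCRIPTION TEXT
);
-- Hypothetical table that references the pick list.
CREATE TABLE BASKET_ITEM (
    ID       INTEGER PRIMARY KEY,
    FRUIT_ID INTEGER NOT NULL REFERENCES FRUIT(ID)
);
INSERT INTO FRUIT VALUES
    (15000, 'Apple', 'Red fruit'),
    (15001, 'Banana', 'yellow and yummy');
INSERT INTO BASKET_ITEM (FRUIT_ID) VALUES (15000);
""")

def try_add(conn, fruit_id):
    """Insert a basket item; return False if the fruit ID is not in the pick list."""
    try:
        conn.execute("INSERT INTO BASKET_ITEM (FRUIT_ID) VALUES (?)", (fruit_id,))
        return True
    except sqlite3.IntegrityError:
        return False

print(try_add(conn, 15001))  # a valid FRUIT row
print(try_add(conn, 99999))  # no such FRUIT row
```

With a single shared pick-list table, the database cannot check that a FRUIT_ID column points at a fruit rather than at an option from some other list.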
Create one table for lists and one table for list_options.
# Put in the name of the list
insert into lists (id, name) values (1, "Country in North America");
# Put in the values of the list
insert into list_options (id, list_id, value_text) values
(1, 1, "Canada"),
(2, 1, "United States of America"),
(3, 1, "Mexico");
To answer the second question first: yes, I would create a separate table for each pick list in most cases. Especially if they are for completely different types of values (e.g. states and cities). The general table format I use is as follows:
id - identity or UUID field (I actually call the field xxx_id where xxx is the name of the table).
name - display name of the item
display_order - small int of order to display. Default this value to something greater than 1
If you want you could add a separate 'value' field but I just usually use the id field as the select box value.
I generally use a select that orders first by display order, then by name, so you can order something alphabetically while still adding your own exceptions. For example, let's say you have a list of countries that you want in alpha order but have the US first and Canada second you could say "SELECT id, name FROM theTable ORDER BY display_order, name" and set the display_order value for the US as 1, Canada as 2 and all other countries as 9.
You can get fancier, such as having an 'active' flag so you can activate or deactivate options, or setting a 'x_type' field so you can group options, description column for use in tooltips, etc. But the basic table works well for most circumstances.
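The display_order-then-name ordering from this answer can be sketched with Python's sqlite3 module. The country table, its default of 9 and the sample rows are assumptions chosen to mirror the example in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (
    country_id    INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    display_order INTEGER NOT NULL DEFAULT 9   -- default sorts after the exceptions
);
-- Pinned entries get an explicit display_order.
INSERT INTO country (name, display_order) VALUES ('United States', 1), ('Canada', 2);
-- Everything else takes the default and falls back to alphabetical order.
INSERT INTO country (name) VALUES ('Mexico'), ('Brazil'), ('Argentina');
""")

names = [
    row[0]
    for row in conn.execute("SELECT name FROM country ORDER BY display_order, name")
]
print(names)
```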
Two tables. If you try to cram everything into one table then you break normalization (if you care about that). Here are examples:
LIST
---------------
LIST_ID (PK)
NAME
DESCR
LIST_OPTION
----------------------------
LIST_OPTION_ID (PK)
LIST_ID (FK)
OPTION_NAME
OPTION_VALUE
MANUAL_SORT
The list table simply describes a pick list. The list_option table describes each option in a given list. So your queries will always start with knowing which pick list you'd like to populate (either by name or ID), which you join to the list_option table to pull all the options. The manual_sort column is there just in case you want to enforce a particular order other than by name or value.
The query would look something like:
select
b.option_name,
b.option_value
from
list a,
list_option b
where
a.name = 'States'
and
a.list_id = b.list_id
order by
b.manual_sort asc
You'll also want to create an index on list.name if you think you'll ever use it in a where clause. The pk and fk columns will typically automatically be indexed.
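Here is a small runnable version of the two-table layout and its lookup query, using Python's sqlite3 module. The query is written with an explicit JOIN, which is equivalent to the comma-style join above. Assumptions: the sample rows, the load_options helper and the index name are made up; the UNIQUE constraint on list.name is one way to get an index on that column; and in SQLite foreign-key columns are not indexed automatically, so list_option.list_id gets an explicit index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE list (
    list_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE,    -- UNIQUE also creates an index on name
    descr   TEXT
);
CREATE TABLE list_option (
    list_option_id INTEGER PRIMARY KEY,
    list_id        INTEGER NOT NULL REFERENCES list(list_id),
    option_name    TEXT NOT NULL,
    option_value   TEXT NOT NULL,
    manual_sort    INTEGER
);
CREATE INDEX idx_list_option_list_id ON list_option(list_id);

INSERT INTO list VALUES (1, 'States', 'US states');
INSERT INTO list_option (list_id, option_name, option_value, manual_sort) VALUES
    (1, 'Texas', 'TX', 2),
    (1, 'Alaska', 'AK', 1);
""")

def load_options(conn, list_name):
    """Return (option_name, option_value) pairs for the named pick list."""
    return conn.execute(
        """
        SELECT b.option_name, b.option_value
        FROM list a
        JOIN list_option b ON b.list_id = a.list_id
        WHERE a.name = ?
        ORDER BY b.manual_sort ASC
        """,
        (list_name,),
    ).fetchall()

states = load_options(conn, "States")
print(states)
```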
And please don't create a new table for each pick list unless you're putting in "relationally relevant" data that will be used elsewhere by the app. You'd be circumventing exactly the relational functionality that a database provides. You'd be better off statically defining pick lists as constants somewhere in a base class or a properties file (your choice on how to model the name-value pair).
Depending on your needs, you can just have an options table that has a list identifier and a list value as the primary key.
select optionDesc from Options where 'MyList' = optionList
You can then extend it with an order column, etc. If you have an ID field, that is how you can reference your answers back... or if it is often changing, you can just copy the answer value to the answer table.
If you don't mind using strings for the actual values, you can simply give each list a different list_id value and populate a single table with:
item_id: int
list_id: int
text: varchar(50)
Seems easiest unless you need multiple things per list item
We actually created entities to handle simple pick lists. We created a Lookup table, that holds all the available pick lists, and a LookupValue table that contains all the name/value records for the Lookup.
Works great for us when we need it to be simple.
I've done this in two different ways:
1) unique tables per list
2) a master table for the list, with views to give specific ones
I tend to prefer the first option as it makes updating lists easier (at least in my opinion).
Try turning the question around. Why do you need to pull it from the database? Isn't the data really part of your model, which you then want to persist in the database? You could use an OR mapper like linq2sql or nhibernate (assuming you're in the .net world), or depending on the data you could store it manually in a table each. There are situations where it would make good sense to put it all in the same table, but do consider this only if you feel it makes really good sense. Normally putting different data in different tables makes it a lot easier to (later) understand what is going on.
There are several approaches here.
1) Create one table per pick list. Each of the tables would have the ID and Name columns; the value that was picked by the user would be stored based on the ID of the item that was selected.
2) Create a single table with all pick lists. Columns: ID; list ID (or list type); Name. When you need to populate a list, do a query "select all items where list ID = ...". Advantage of this approach: really easy to add pick lists; disadvantage: a little more difficult to write group-by style queries (for example, "give me the number of records that picked value X").
I personally prefer option 1, it seems "cleaner" to me.
You can use either a separate table for each (my preferred), or a common picklist table that has a type column you can use to filter on from your application. I'm not sure that one has a great benefit over the other generally speaking.
If you have more than 25 or so, organizationally it might be easier to use the single table solution so you don't have several picklist tables cluttering up your database.
Performance might be a hair better using separate tables for each if your lists are very long, but this is probably negligible provided your indexes and such are set up properly.
I like using separate tables so that if something changes in a picklist (it needs an additional attribute, for instance) you can change just that picklist table with little effect on the rest of your schema. In the single table solution, you will either have to denormalize your picklist data, pull that picklist out into a separate table, etc. Constraints are also easier to enforce in the separate table solution.
This has served us well:
SQL> desc aux_values;
Name Type
----------------------------------------- ------------
VARIABLE_ID VARCHAR2(20)
VALUE_SEQ NUMBER
DESCRIPTION VARCHAR2(80)
INTEGER_VALUE NUMBER
CHAR_VALUE VARCHAR2(40)
FLOAT_VALUE FLOAT(126)
ACTIVE_FLAG VARCHAR2(1)
The "Variable ID" indicates the kind of data, like "Customer Status" or "Defect Code" or whatever you need. Then you have several entries, each one with the appropriate data type column filled in. So for a status, you'd have several entries with the "CHAR_VALUE" filled in.
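A rough sketch of how such a table might be used, with Python's sqlite3 module. The column types are SQLite equivalents of the Oracle ones, and several details are guesses for illustration only: the (variable_id, value_seq) primary key, 'Y'/'N' as the flag values, the sample rows and the active_values helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aux_values (
    variable_id   TEXT NOT NULL,      -- which list, e.g. 'Customer Status'
    value_seq     INTEGER NOT NULL,   -- position within the list
    description   TEXT,
    integer_value INTEGER,            -- only one of the three value
    char_value    TEXT,               -- columns is filled in per row
    float_value   REAL,
    active_flag   TEXT NOT NULL DEFAULT 'Y',
    PRIMARY KEY (variable_id, value_seq)
);
INSERT INTO aux_values (variable_id, value_seq, description, char_value) VALUES
    ('Customer Status', 1, 'Active customer', 'ACTIVE'),
    ('Customer Status', 2, 'On hold', 'HOLD');
INSERT INTO aux_values (variable_id, value_seq, description, integer_value, active_flag) VALUES
    ('Defect Code', 1, 'Scratch', 101, 'Y'),
    ('Defect Code', 2, 'Retired code', 102, 'N');
""")

def active_values(conn, variable_id):
    """Return (description, value) for active entries, whichever value column is set."""
    return conn.execute(
        """
        SELECT description, COALESCE(char_value, integer_value, float_value)
        FROM aux_values
        WHERE variable_id = ? AND active_flag = 'Y'
        ORDER BY value_seq
        """,
        (variable_id,),
    ).fetchall()

print(active_values(conn, "Customer Status"))
print(active_values(conn, "Defect Code"))
```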

Resources