MEAN Stack: Storing functions in Database to achieve Excel-like behaviour

I want to implement Excel-like behaviour in my MEAN application. By Excel-like behaviour I mean having cells, columns and rows (I already have that in Angular); now I want to let users select, e.g., two columns and run a self-defined calculation over them.
Storing user-defined functions in MongoDB sounds like a very bad idea, though, since a user could drop the database.
The Stack
[User Input]
[Calculations/Functions]
[Controller]
[Middleware]
[API+Server]
[DB]
I discussed how to achieve this with MEAN here: Convert JSON String to Objects. However, I highly doubt this will be secure. Any idea how to achieve this in a solid fashion?
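One possible direction, sketched here in Python rather than in the MEAN stack itself: store the user's formula as a plain string and evaluate it with a whitelist of arithmetic operations instead of executing it as code. The names evaluate_formula and apply_to_rows and the column names A and B are hypothetical and only serve the illustration; this is a minimal sketch, not a hardened implementation.

```python
import ast
import operator

# Whitelisted operators; any syntax not listed here is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_formula(formula, row):
    """Evaluate an arithmetic formula string against one row (dict of column values)."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Only plain numbers are accepted as literals (no strings, no booleans).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        # A bare name is treated as a column reference.
        if isinstance(node, ast.Name):
            if node.id not in row:
                raise ValueError(f"unknown column: {node.id}")
            return row[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Calls, attribute access, subscripts, etc. all end up here.
        raise ValueError(f"disallowed expression: {type(node).__name__}")

    return walk(ast.parse(formula, mode="eval"))


def apply_to_rows(formula, rows):
    """Run the same formula over every row, as in a spreadsheet column."""
    return [evaluate_formula(formula, row) for row in rows]


rows = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
results = apply_to_rows("(A + B) * 2", rows)
```

Because the formula is only ever parsed and walked, never executed, the database just holds a string, and a call such as a function invocation in the formula is rejected with a ValueError.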

Related

Database designing for query builder project

I have a project in which users can create queries through a web-based UI that lists the columns and the applicable operators. Users can also edit these queries later, so I need to store them in the database.
I could store a query in a table as a simple string, but then editing would not be possible, so I need to store it some other way. I have managed to design it the following way.
So let's say I have to store this query:
C1 > 8 AND (C2 <= 7 OR (C4 LIKE '%all%' AND (C1 > 15 OR C2 <= 3)))
where: C denotes some column
If I have to store it in the DB as shown in the image:
I would group each condition and store it in the sub_operand table,
then there would be a recursive mapping entry in the op_master table for each entry in the sub_operand table,
and finally there would be a master entry in op_master.
But it seems too complicated to handle inserts and updates. Can someone help me with this? I am very much stuck here.
UPDATE: I think I am missing something in the schema. It won't work the way I thought. I will update the question as soon as I can correct it.

I'm not quite sure about the way you will use your data structure to represent the tree structure of a formula. See my answer to Logical Expressions rules in relational datamodel for this aspect. (But your question is not a duplicate of that one.)
I don't see the complication in inserting and updating. The only complicated aspect I can see is the GUI for your users to enter and edit these recursive formulas. It's somewhat complicated as, due to the unbounded width and depth of the formula, you can't just define one set of drop-down fields for column and operand selection, but the count of GUI elements will need to grow as the user increases the width and depth.
Once this is solved, you will have the following architecture:
Formula GUI --buildFormula--> Formula --storeFormula--> Database
            <----display-----         <---readFormula---
This means you have some abstract representation of a Formula in your domain layer, again some tree, that you use to actually evaluate these formulas. And you need to persist Formulas in the database. The operations propagating formulas from the GUI to the domain and further to the database, and the other way round, are also shown.
As I said, the GUI is the most complex part. Having a formula representation on the GUI, sending it to the domain and building its identically structured counterpart in your programming language isn't a problem. If the formula got edited, i.e. if it had a previous structure that now got modified by the user, I wouldn't try to incrementally update the domain object, but just throw it away and build it up from scratch. The same holds for storing in the database: Delete all parts of it and store it as a whole.
Reading is straightforward too, again with some effort in building the GUI representation.
By the way, it isn't strictly necessary to represent all subterms of your formula as records in the database. If you will never query on those subterms, but just store and read the formula as a whole, and if you never have a query like select all formulas using a specific column, it would be sufficient to store a formula as a single string.
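To make the "formula as a tree, stored as a whole" idea concrete, here is a minimal sketch in Python. The class names Condition and Group and the helpers to_dict, from_dict and render are hypothetical, not part of any library; the example formula is the one from the question. The tree is serialized to a single JSON string for storage and rebuilt from scratch on read.

```python
import json
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Condition:
    column: str
    operator: str
    value: Union[int, str]


@dataclass
class Group:
    connective: str  # "AND" or "OR"
    children: List[Union["Group", Condition]]


def to_dict(node):
    """Turn the tree into plain dicts and lists so it can be dumped as JSON."""
    if isinstance(node, Condition):
        return {"column": node.column, "operator": node.operator, "value": node.value}
    return {"connective": node.connective, "children": [to_dict(c) for c in node.children]}


def from_dict(data):
    """Rebuild the whole tree from scratch, as suggested for reads and edits."""
    if "connective" in data:
        return Group(data["connective"], [from_dict(c) for c in data["children"]])
    return Condition(data["column"], data["operator"], data["value"])


def render(node, top=True):
    """Render the tree as text for display only (no SQL escaping is done here)."""
    if isinstance(node, Condition):
        value = f"'{node.value}'" if isinstance(node.value, str) else str(node.value)
        return f"{node.column} {node.operator} {value}"
    text = f" {node.connective} ".join(render(c, top=False) for c in node.children)
    return text if top else f"({text})"


formula = Group("AND", [
    Condition("C1", ">", 8),
    Group("OR", [
        Condition("C2", "<=", 7),
        Group("AND", [
            Condition("C4", "LIKE", "%all%"),
            Group("OR", [Condition("C1", ">", 15), Condition("C2", "<=", 3)]),
        ]),
    ]),
])

stored = json.dumps(to_dict(formula))    # the single string that goes into the database
restored = from_dict(json.loads(stored))  # what readFormula would hand back
```

With this shape, an edit in the GUI simply produces a new tree that replaces the stored string; nothing is updated incrementally.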

CakePHP virtual HABTM table/relation, something like fixture

First of all, I'd like to tell you that you're a terrific audience.
I'm making an application where I have a model Foo with a table Foos. I'd like to give Foo another parameter, a HABTM parameter, let's say Bar. But I'd rather not create a table for Bar, because Bar will have about 5 entries at the start, and in 5 years it will grow to maybe 7 entries, or not at all. So I don't see a need to create another table and make CakePHP look into that table with another SELECT. Does anyone have an idea how this can be achieved?
One solution I can think of is making a fixture for the Bars table and adding only the Bars_Foos table for real (it won't be big anyway). But I can't find a way to use test fixtures in a normal Controller.
A second solution is to save JSON or a serialized array in one field of Foo and move the logic to the model, but I don't know if that is the best solution. Something like a virtual field.
Real life example:
So I have, say, Bikes, and every Bike has its main_type, which for now is one of {"MTB","Road","Trekking","City","Downhill"}. I know that in the long run this list will not grow much, maybe by 2 to 5 entries in a few years. It will still be relatively short.
(For those who say that there may be a hundred specialized bike types: I have another parameter column, specialized_type.)
It needs to be a HABTM relation, but the main_types table will be very small, so I'd like to avoid creating it and find a simpler solution.
Because
It bothers MySQL for such a small amount of data
It complicates MySQL queries
I have to make an additional model for MainType
I have more models to unbind when I don't need most of the data and would like to use recursive
Insert here anything you'd like...

Judging from your real-life example, I'd say you're on the wrong track. The queries won't be complicated: CakePHP uses additional queries for HABTM relations, so it would be just one additional query, which shouldn't be very costly, and it's very easy to leave it out by using the Containable behaviour. And if you really need to use recursive only (for whatever reason), then it's just one additional model to unbind, which doesn't seem like overkill to me.
This might not be what you wanted to hear, but I really think a proper database solution is better than trying to hack in "virtual data". Also note that fixtures, as used in tests, only define data that is written to the database on the fly when running the test, so that would definitely be more costly than using data that already exists in the database.
Maybe you'll get a small performance boost for selects that do not query the main type when using an additional column to store the data, but you'll definitely lose all the flexibility that the RDBMS has to offer, including faster selects using proper indexing, affecting multiple records by updating a single related value, etc. That doesn't sound like a good trade-off to me. Think about it: how would you select all Downhill Trekking bikes when this information is stored as a string in a single column? You would probably end up using ugly LIKE selects.
Now wait, there's a SET data type in MySQL that can hold multiple values. Right, and it looks easier and less complex. But in the background it isn't: while a complex-looking join query can be pretty fast with proper indexing, the query for the SET type will have to scan every single row, since the data stored in the column cannot be indexed appropriately for more specific selects.
In the end it probably depends on your data, so I'd suggest testing both methods in your specific environment and see how they compare under workload.
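For illustration, here is roughly what the join-table variant looks like, sketched with Python's built-in sqlite3 module instead of MySQL and CakePHP so that it runs on its own. The table names bikes, main_types and bikes_main_types, the index name, the sample bikes and the helper bikes_with_all_types are all assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bikes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE main_types (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- Join table for the HABTM relation.
    CREATE TABLE bikes_main_types (
        bike_id INTEGER NOT NULL REFERENCES bikes(id),
        main_type_id INTEGER NOT NULL REFERENCES main_types(id),
        PRIMARY KEY (bike_id, main_type_id)
    );
    -- Index so lookups by type do not have to scan the whole join table.
    CREATE INDEX idx_bmt_type ON bikes_main_types (main_type_id, bike_id);
""")
conn.executemany(
    "INSERT INTO main_types (id, name) VALUES (?, ?)",
    [(1, "MTB"), (2, "Road"), (3, "Trekking"), (4, "City"), (5, "Downhill")],
)
conn.executemany(
    "INSERT INTO bikes (id, name) VALUES (?, ?)",
    [(1, "Alpha"), (2, "Bravo"), (3, "Charlie")],
)
conn.executemany(
    "INSERT INTO bikes_main_types (bike_id, main_type_id) VALUES (?, ?)",
    [(1, 5), (1, 3), (2, 5), (3, 3), (3, 2)],
)


def bikes_with_all_types(conn, type_names):
    """Return names of bikes that have every one of the given main types."""
    placeholders = ", ".join("?" for _ in type_names)
    query = f"""
        SELECT b.name
        FROM bikes b
        JOIN bikes_main_types bmt ON bmt.bike_id = b.id
        JOIN main_types mt ON mt.id = bmt.main_type_id
        WHERE mt.name IN ({placeholders})
        GROUP BY b.id, b.name
        HAVING COUNT(DISTINCT mt.id) = ?
        ORDER BY b.name
    """
    return [row[0] for row in conn.execute(query, (*type_names, len(type_names)))]


downhill_trekking = bikes_with_all_types(conn, ["Downhill", "Trekking"])
```

The query looks longer than a LIKE on a string column, but it matches exact type values and can use the index on the join table.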

How can I standardize user-entered data?

I have a table of data that I'm trying to "standardize". The data entered into the table wasn't static or standardized (like with drop-down lists of answers), leaving me with multiple variations of answers where I want a static, universal answer.
For instance, let's say that there's a column in the database called "Type of pet". Because user input wasn't standardized, people could enter variations of a specific type of pet rather than the generalized form of the pet. So instead of just "Dog", there are different versions of dogs like "Collie", "Mutt", "Labrador", etc.
How do I go about transcribing these answers into their generalized form -- replacing Collie/Mutt/Labrador/etc answers in the table with just "Dog" (or "Cat", or "Bird", etc.)?
I realize there needs to be some form of a manually-entered "translation" function. My gut reaction is that a long-spanning list of stacked if-statements would be inefficient, as well as being tedious to control and expand.
Is there some kind of process or system for doing something like this? Like some type of lookup table system/matrix?
I'm assuming a foreach loop to iterate through the array of records would be most appropriate. And then within each iteration of the foreach loop, you'd have it do a test/comparison of the pet variable against some type of list (that I would have created manually) -- but what would you use for this lookup table/list? Or this step of the process? Would you have it as some type of a SQL database/table, an array, a CSV file, etc.?
Then, once this comparison is completed and the "translated" equivalent of the type of pet is determined, the foreach loop would update that specific row of the record, either overwriting the old non-standardized value, or perhaps just tacking on the new standardized equivalent into a new column (for later verifying).

"My gut reaction is that a long-spanning list of stacked if-statements would be inefficient, as well as being tedious to control and expand."
100% correct, and because of this you really only have one option: manually go through the database and clean it up. Once that is done, you will need to restrict user input using drop-down lists rather than raw text input.
Depending on your users you might want to look at how Stackoverflow does tags - essentially allowing anyone to do the cleanup for you.

But if you have something like 150000 records, an SQL find-and-replace query might help clean up the data to start.
This sounds like a data normalization project to me. I don't have a lot of experience with it in practice, but in theory you start with how the data is entered. For instance, free-text fields allow users to enter anything they want; you'd want to change that after scrubbing the data. It also pays to know how the data got in in the first place: was it free text, a bullet, a drop-down menu, etc.?
You'd also want to create a data dictionary of all the standardized terms that will replace the multitude of variations.
Then you could create an update query with wildcards that goes through the old data and replaces it with the new.
https://support.office.com/en-us/article/Use-the-Find-and-Replace-dialog-box-to-change-data-2eee8d02-5a40-4328-ba56-ec0406865680
This could be a more automated way of scrubbing the data than find and replace.
-Al
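As a rough sketch of the lookup-table idea discussed above, in Python: the data dictionary is a plain mapping from known variations to the standardized term, and the loop writes the result into a new column instead of overwriting the original, so it can be verified later. The names PET_LOOKUP, standardize, pet_type and pet_type_standardized, as well as the sample entries, are made up for this example. In practice the dictionary could just as well live in a database table or a CSV file.

```python
# Hypothetical data dictionary: lower-cased variation -> standardized term.
PET_LOOKUP = {
    "dog": "Dog",
    "collie": "Dog",
    "mutt": "Dog",
    "labrador": "Dog",
    "siamese": "Cat",
    "parakeet": "Bird",
}


def standardize(records, lookup, source="pet_type", target="pet_type_standardized"):
    """Add a standardized column to each record and collect values with no match."""
    cleaned = []
    unmatched = set()
    for record in records:
        raw = (record.get(source) or "").strip()
        standard = lookup.get(raw.lower())
        if standard is None and raw:
            # Unknown value: leave the new column empty and flag it for manual review.
            unmatched.add(raw)
        cleaned.append({**record, target: standard})
    return cleaned, sorted(unmatched)


records = [
    {"id": 1, "pet_type": "Collie"},
    {"id": 2, "pet_type": " mutt "},
    {"id": 3, "pet_type": "Iguana"},
    {"id": 4, "pet_type": "dog"},
]
cleaned, unmatched = standardize(records, PET_LOOKUP)
```

The unmatched list is what still needs a manual decision; each decision becomes a new entry in the dictionary, so the manual work shrinks with every pass.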

How to design a database for User Defined Fields(UDF)?

I am working on an application which will require one or more additional fields to be added to a table in order to track user-defined information. This additional info is only used for reporting purposes (Crystal Reports) and will have no effect on the behavior of the application. The data for these fields is populated from an outside application.
What would be the best way to handle this additional information? Here are some options based off of other SO answers:
Entity-Attribute-Value (would this be overkill? Seems like there are many critics of EAV)
Add additional column to table (not sure how Entity Framework would like this)
Create a new table for each UDF and use the primary key of the parent table to link

If I understand the requirement correctly, you need a data point to save information that comes from an external application and whose structure is undefined at design time. If that is correct, then I would suggest using an XML data type. By choosing this, you will not need to redesign your database in the future when new key-value pairs are inserted. Crystal Reports should easily be able to include an XSL for this column.
Hope this helps, and good luck.
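To show what such an XML value might look like, here is a small Python sketch that packs arbitrary key-value pairs into one XML string and reads them back. The element names udf and field, the helper names, and the sample keys are assumptions for the example, not something Crystal Reports or any particular database requires.

```python
import xml.etree.ElementTree as ET


def fields_to_xml(fields):
    """Serialize user-defined key-value pairs into one XML string for a single column."""
    root = ET.Element("udf")
    for name, value in fields.items():
        field = ET.SubElement(root, "field", {"name": name})
        field.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_fields(xml_text):
    """Read the key-value pairs back; all values come back as strings."""
    root = ET.fromstring(xml_text)
    return {field.get("name"): field.text or "" for field in root.findall("field")}


udf = {"region": "EMEA", "priority": "3"}
xml_value = fields_to_xml(udf)
round_trip = xml_to_fields(xml_value)
```

Adding a new user-defined field then only means adding another field element to the stored value; the table schema stays untouched.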

Should I make two separate tables for two similar objects?

I want to store "Tweets" and "Facebook Statuses" in my app as part of a "Status collection", so every status collection will have a bunch of Tweets or a bunch of Facebook Statuses. For Facebook I'm only interested in text, so I won't store videos/photos for now.
I was wondering what the best practice for DB design is. Is it better to have one table (with the maximum status length set to 420 to cover both the Facebook and Twitter limits) with a "Type" column that determines what kind of status it is, or is it better to have two separate tables? And why?

Strictly speaking, a tweet is not the same thing as a FB update. You may be ignoring non-text for now, but you may change your mind later and be stuck with a model that doesn't work. As a general rule, objects should not be treated as interchangeable unless they really are. If they are merely similar, you should either use 2 separate tables or use additional columns as necessary.
All that said, if it's really just text, you can probably get away with a single table. But this is a matter of opinion and you'll probably get lots of answers.

I would put the messages into one table and have another that defines the type:
SocialMediaMessage
------------------
id
SocialMediaTypeId
Message
SocialMediaType
---------------
Id
Name
They seem similar enough that there is no point in separating them. It will also make your life easier if you want to query across both social networking sites.
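A minimal sketch of that two-table layout, using Python's built-in sqlite3 module so it runs standalone. The column types, the sample rows and the helper messages_by_type are assumptions added for illustration; only the table and column names come from the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SocialMediaType (
        Id INTEGER PRIMARY KEY,
        Name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE SocialMediaMessage (
        Id INTEGER PRIMARY KEY,
        SocialMediaTypeId INTEGER NOT NULL REFERENCES SocialMediaType(Id),
        Message TEXT NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO SocialMediaType (Id, Name) VALUES (?, ?)",
    [(1, "Twitter"), (2, "Facebook")],
)
conn.executemany(
    "INSERT INTO SocialMediaMessage (SocialMediaTypeId, Message) VALUES (?, ?)",
    [(1, "first tweet"), (2, "a status update"), (1, "second tweet")],
)


def messages_by_type(conn, type_name=None):
    """One query serves both networks; pass a type name to filter, or None for all."""
    query = """
        SELECT t.Name, m.Message
        FROM SocialMediaMessage m
        JOIN SocialMediaType t ON t.Id = m.SocialMediaTypeId
    """
    params = ()
    if type_name is not None:
        query += " WHERE t.Name = ?"
        params = (type_name,)
    query += " ORDER BY m.Id"
    return conn.execute(query, params).fetchall()


all_messages = messages_by_type(conn)
tweets = messages_by_type(conn, "Twitter")
```

Supporting another network later would then be one more row in SocialMediaType rather than a new table.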

It's probably easier to use one table and use the type to identify them. You will only need one query/stored procedure to access the data, instead of one query for each type when you have multiple tables.