I am working on a database structure now and I've run into a tricky point. I bet it's a common thing, but so far I cannot decide which way to go.
I've got three types of clients - companies, self-employed and retail clients.
All of them have their own fields, like firstname, lastname, ... for people and fullname, shortname for companies.
I am puzzled whether I should pack everything into one collection, clients, where some fields won't apply to some clients, or keep them in separate collections.
The pitfall is that if I push them all into one collection it gets messy, and whenever I fetch data I need to pass it to a helper function which detects the type and handles the raw data. That's going to happen on many pages.
Or, if I keep them in different collections, I first need to detect where to look for the data and make two or three requests.
I believe that's an ordinary situation and there must be a good practice for it.
I would appreciate any thoughts and ideas!
Keeping them in different collections would be better; as you said, pushing them all into one collection would get messy.
You can still use $unionWith to search those collections in a single request.
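To make the $unionWith suggestion concrete, here is a minimal sketch that only builds the aggregation pipeline as plain Python data. The collection names (companies, self_employed, retail_clients), the email search field, and the clientType tag are all assumptions for illustration, not something from the question. $unionWith needs MongoDB 4.4 or later.

```python
from typing import Any


def build_union_pipeline(collections: list[str], match: dict[str, Any]) -> list[dict[str, Any]]:
    """Build one pipeline that searches several collections with the same filter.

    The first name is the collection the aggregation is run on; every other
    name is appended with a $unionWith stage. Each document is tagged with a
    hypothetical clientType field so the caller knows where it came from.
    """
    first, *rest = collections

    def branch(name: str) -> list[dict[str, Any]]:
        # Filter first, then tag the surviving documents with their source.
        return [{"$match": match}, {"$addFields": {"clientType": name}}]

    # Stages placed before $unionWith only see documents of the first collection.
    pipeline = branch(first)
    for name in rest:
        pipeline.append({"$unionWith": {"coll": name, "pipeline": branch(name)}})
    return pipeline


pipeline = build_union_pipeline(
    ["companies", "self_employed", "retail_clients"],  # assumed collection names
    {"email": "someone@example.com"},                  # assumed search field
)
# With pymongo this would be run as: db["companies"].aggregate(pipeline)
```

The filter is repeated inside every branch on purpose: a $match placed after the $unionWith stages would still work, but it would run on the combined stream instead of letting each collection use its own index.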
For quite a while I have been struggling with how to save custom user-specific arrays of data in Mailchimp.
Simple example: I want to save the project ids for a user in Mailchimp and, in the best case, be able to use them there properly as well. Let's say user fritz#frey.com has the 5 project ids 12345, 25345, 21342, 23424 and 48935. Why is there no array merge field that lets me save this array of project ids to a user?! (Or is there one and I am just blind...)
I know I can use drop-down fields to put users in multiple groups, like types of projects for example, but the solution can hardly be a drop-down with all (several thousand) project ids where I check the ones the user is a part of (and I doubt that Mailchimp would support that solution for a large number of group items anyways).
Oh, and of course I could make the field myself by abusing a string field and joining the project ids with commas or as a JSON string, but that seems neither like a clean solution nor could I use the data properly in Mailchimp (as far as I know).
I googled quite a bit and couldn't find anything helpful sadly... :(
So? Can anybody enlighten me? :)
Thanks for all your help!
It sounds like you have already arrived at the correct answer: there is no "array" type, other than the interests type, which is global and not quite the same as an array.
The best solution here depends on your data. If each project ID will have many different subscribers attached to it, and there won't be too many of them active at any given time, I'd just use interests. If you think there may be dozens of project ids active simultaneously, I'd not store this data on the subscribers at all; instead I'd build static segments for each project and add users to them.
If projects won't have a bunch of subscribers associated, I'd store the data on your end and/or continue using the comma-separated string field.
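For the comma-separated fallback, a small sketch of the round trip on your own side might look like this. The helper names and the max_len default of 255 are assumptions; check the real length limit of the field you use.

```python
def encode_project_ids(ids: list[int], max_len: int = 255) -> str:
    """Join project ids into one comma-separated string for a text field.

    max_len is an assumed field limit, not a documented constant.
    """
    value = ",".join(str(i) for i in sorted(set(ids)))
    if len(value) > max_len:
        raise ValueError(f"encoded ids need {len(value)} chars, limit is {max_len}")
    return value


def decode_project_ids(value: str) -> list[int]:
    """Parse the string back into a list of ints, ignoring empty pieces."""
    return [int(part) for part in value.split(",") if part.strip()]


field_value = encode_project_ids([12345, 25345, 21342, 23424, 48935])
project_ids = decode_project_ids(field_value)
```

Sorting and de-duplicating before joining keeps the stored value stable, so the field is only rewritten when the set of projects actually changes.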
I'm going to write a simple news site on Redis with support for followers.
I can't figure out how to organize user timelines like Twitter does. I read about Retwis ( http://redis.io/topics/twitter-clone ), but its method of creating feeds seems stupid. What if I want to remove entries? I would have to remove all references to the entry from the followers' feeds. What if I no longer follow some users?
There are several ways to attack what you describe using a bit of imagination; here are some examples that address your questions:
What if I want to remove entries?
One could maintain a set such as post:$postid:users for each post, holding all the userids that may have the post in their feeds. When the post is to be deleted, one just has to extract all members from this set and iterate through the ids to remove the post from each uid:$userid:posts set. Speaking of which, you would have to turn that last one into a set instead of a list (as the original article suggests) in order to be able to extract and remove individual items, but that is trivial; the logic is pretty similar.
What if I no longer follow some users?
When the feed is being generated for an individual user, you necessarily have to iterate and read each post:$postid key, which gives you the author's userid. So before showing the post, look this id up in the uid:$userid:following set: if it's there, show the post; if it's not, delete it from uid:$userid:posts and don't show it.
In a nutshell, this is what you have to keep in mind in order to build this kind of logic in redis:
You'll need many commands, but that's ok; Redis is supposed to be fast enough to handle it well.
Data will repeat, but that is also ok; it may look insane to someone with a relational DBMS background to store a set of users for each post if each user already has a set with their posts, but this is the only way of building relationships in a non-relational data store like Redis.
Generally speaking, think of sets and sorted sets when designing something relational in Redis.
With Redis you get to do everything yourself, but once you get your head around it, it's actually pretty powerful.
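The two ideas above (a reverse set per post for deletion, and a follow check at read time) can be sketched without a server. The class below is an in-memory stand-in that uses Python sets under the same key names. FeedStore and its methods are hypothetical, and against a real Redis the operations would map to SADD, SREM, SMEMBERS and SISMEMBER.

```python
from collections import defaultdict


class FeedStore:
    """In-memory stand-in for the Redis sets described above (not a Redis client)."""

    def __init__(self) -> None:
        self.sets: defaultdict[str, set] = defaultdict(set)  # key -> set
        self.post_author: dict[int, str] = {}  # stand-in for the post:$postid key

    def follow(self, user: str, target: str) -> None:
        self.sets[f"uid:{user}:following"].add(target)

    def unfollow(self, user: str, target: str) -> None:
        self.sets[f"uid:{user}:following"].discard(target)

    def publish(self, postid: int, author: str, followers: list[str]) -> None:
        self.post_author[postid] = author
        for uid in followers:
            self.sets[f"uid:{uid}:posts"].add(postid)    # the follower's feed
            self.sets[f"post:{postid}:users"].add(uid)   # reverse index for deletion

    def delete_post(self, postid: int) -> None:
        # Walk the reverse index and remove the post from every feed holding it.
        for uid in self.sets.pop(f"post:{postid}:users", set()):
            self.sets[f"uid:{uid}:posts"].discard(postid)
        self.post_author.pop(postid, None)

    def feed(self, user: str) -> list[int]:
        following = self.sets[f"uid:{user}:following"]
        visible = []
        # sorted() copies the ids, so the set can be pruned inside the loop.
        for postid in sorted(self.sets[f"uid:{user}:posts"]):
            if self.post_author.get(postid) in following:
                visible.append(postid)
            else:
                # Author no longer followed (or post gone): prune lazily.
                self.sets[f"uid:{user}:posts"].discard(postid)
        return visible
```

Pruning at read time means an unfollow costs a single set removal; the stale posts are cleaned out of the feed the next time it is generated.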
I have the following models: Client, Device and Revision.
Client has many Devices, and Device has many Revisions.
I want to retrieve all the latest revisions from a Client. I know I can use recursive=2 in client and I will get revisions like this:
$client['Client']['Device'][$i]['Revision'] //(array of revisions)
What I actually want is to get the latest revisions like this:
$client['Client']['Revision'][$i] //(array of revisions)
$client['Client']['Revision'][$i]['Device'] //and I may see the device like this. I know this is duplicated info, but the order is pretty much different.
I know there are plenty of ways of doing it. Most of the ones I can think of involve using direct SQL or processing the arrays, but is there actually a method that can do this just by passing parameters?
I also thought of first getting all the device_ids of a certain client, and then finding all revisions where device_id IN (device1_id, device2_id, ...), but I don't really like this, though I'm not sure if there is a better way.
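As a framework-neutral illustration of the query shape (this is not CakePHP code), the two-step "collect device ids, then IN (...)" idea can be collapsed into a single join. The sketch below uses Python's sqlite3 with assumed table and column names (devices.client_id, revisions.device_id, revisions.created) and reshapes each row so the device hangs off the revision, as in the desired output.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with assumed table and column names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE devices (id INTEGER PRIMARY KEY, client_id INTEGER, name TEXT);
        CREATE TABLE revisions (id INTEGER PRIMARY KEY, device_id INTEGER, created TEXT);
        INSERT INTO devices VALUES (1, 10, 'router'), (2, 10, 'switch'), (3, 20, 'printer');
        INSERT INTO revisions VALUES
            (1, 1, '2024-01-01'), (2, 2, '2024-02-01'),
            (3, 1, '2024-03-01'), (4, 3, '2024-04-01');
    """)
    return conn


def revisions_for_client(conn: sqlite3.Connection, client_id: int) -> list[dict]:
    """Return a client's revisions, newest first, each with its device attached."""
    rows = conn.execute(
        """
        SELECT r.id, r.created, d.id, d.name
        FROM revisions AS r
        JOIN devices AS d ON d.id = r.device_id
        WHERE d.client_id = ?
        ORDER BY r.created DESC
        """,
        (client_id,),
    ).fetchall()
    return [
        {"id": rid, "created": created, "Device": {"id": did, "name": dname}}
        for rid, created, did, dname in rows
    ]
```

The join filters on the device's client_id directly, so there is no intermediate list of ids to build and pass back to the database.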
I want to store "Tweets" and "Facebook Statuses" in my app as part of a "Status collection", so every status collection will have a bunch of Tweets or a bunch of Facebook Statuses. For Facebook I'm only interested in text, so I won't store videos/photos for now.
I was wondering about best practice for DB design. Is it better to have one table (with the max status length set to 420 to cover both the Facebook and Twitter limits) with a "Type" column that determines what kind of status it is, or is it better to have two separate tables? And why?
Strictly speaking, a tweet is not the same thing as a FB update. You may be ignoring non-text for now, but you may change your mind later and be stuck with a model that doesn't work. As a general rule, objects should not be treated as interchangeable unless they really are. If they are merely similar, you should either use 2 separate tables or use additional columns as necessary.
All that said, if it's really just text, you can probably get away with a single table. But this is a matter of opinion and you'll probably get lots of answers.
I would put the messages into one table and have another that defines the type:
SocialMediaMessage
------------------
id
SocialMediaTypeId
Message

SocialMediaType
---------------
Id
Name
They seem similar enough that there is no point in separating them. It will also make your life easier if you want to query across both social networking sites.
It's probably easier to use one table and use a type column to identify them. You will only need one query/stored procedure to access the data instead of one query for each type when you have multiple tables.
I haven't been able to find an answer to my question so far, and I suppose I have to ask my first question some time. Here goes.
I have a Data Access Layer that's responsible for interacting with various data storage elements and returns POCOs or collections of POCOs when querying out things.
I have a Business Layer that sits on top of this and is responsible for implementing business rules on objects returned from the Data Access Layer.
For instance, I have a SQL table of Dogs; my data access layer can return that list of dogs as a collection of Dog objects. My business layer then would do things like filter out dogs below a certain age, or any other filtering or transformation that has to happen based on business rules.
My question is this. What's the best way to handle filtering objects based on related records? Let's say I want all the people who have Cats. Right now my data access layer can return all the cats, and all the people, but doesn't do any filtering for me.
I can implement the filtering via a different data access method (e.g. DAO.GetCatPeople()), but this could get complicated if I have a multitude of related properties or relationships to handle.
I can return everything from both sides and do the matching myself in the business layer, which seems like a lot of extra work and doesn't fully utilize the SQL server.
I can write a data filtration interface, but if my data access layer changes, this layer would have to change as well.
Are there any known best practices here that I could benefit from?
The view I take is that there are two "reasons" why you'd access data: data centric and use case centric.
Data centric is stuff like CRUD and other common / obvious stuff that is a no-brainer.
"Use case" centric is where you define an interface and matching POCOs for a specific purpose. [It's possible I'm missing some common terminology here, but use case centric is what I mean]
I think both types are valid. For the use case driven ones it's going to be mostly driven by business focused use cases, but I can see edge cases where they could be more technically driven - I'd say that was ok as long as they didn't violate any business rules or pervert your domain model.
Should Cats and Dogs know about each other? If they exist within the same domain model, and have established relationships within that model - then yes of course you should be able to make queries like GetCatPeople().
As far as managing the complexity goes, rather than GetCatPeople() you could have a more generic method that took an attribute as a parameter: GetPeopleByAnimal(animal).
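A rough sketch of such a generic method, written in Python with sqlite3 rather than the asker's stack. The people and pets tables, their columns and the function name get_people_by_animal are assumptions for illustration. The point is that the database does the matching and the animal is just a query parameter.

```python
import sqlite3


def make_pets_db() -> sqlite3.Connection:
    """Create an in-memory database with assumed people and pets tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE pets (
            id       INTEGER PRIMARY KEY,
            owner_id INTEGER NOT NULL REFERENCES people(id),
            species  TEXT NOT NULL
        );
        INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO pets VALUES (1, 1, 'cat'), (2, 1, 'dog'), (3, 2, 'dog'), (4, 1, 'cat');
    """)
    return conn


def get_people_by_animal(conn: sqlite3.Connection, animal: str) -> list[str]:
    """Return the names of people owning at least one pet of the given species."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.name
        FROM people AS p
        JOIN pets AS a ON a.owner_id = p.id
        WHERE a.species = ?
        ORDER BY p.name
        """,
        (animal,),
    )
    return [name for (name,) in rows]
```

One parameterized method replaces GetCatPeople(), GetDogPeople() and so on, and the business layer never has to load both tables to do the matching itself.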