Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
I am creating a REST API and I have, for example, Authors and Posts.
So I can get the posts published by an author with:
/authors/123/posts
Or
/posts?authorId=123
I think building a flexible API might be a better option. To get posts by author I would do:
/posts?authorId=123&published=true&sort=created&expand=tags,category
So in this case I am getting all posts from authorId=123 that are published sort them by created date and also get the tags and category of each post.
Basically, I create a query language that kind of maps to each database table.
Then for common queries I create specific endpoints:
/posts/recent
Would return recent posts independently of the author ...
I think /authors/123/posts might become complex when using many levels.
What do you think?
UPDATE
After a few answers my idea is the following:
When Posts and Authors are two resources I would have (example for posts):
GET /posts?authorId=123&published=true
POST /posts
PUT /posts
DELETE /posts/123
If there is hierarchical dependency between Posts and Authors and I often need posts by author I would also add the following:
GET /authors/123/posts&published=true
If posts would not exist outside of resource Author then the 2 previous options would be replaced by:
GET /authors/123/posts?published=true
POST /authors/123/posts
PUT /authors/123/posts
DELETE /authors/123/posts/123
What do you think?
There is a myth that each resource must have only one path. Both of your examples are correct.
With your first example:
/authors/123/posts.
It makes sense when there is a hierarchical relationship between the two, if your current context is an author, it's no problem accessing posts of that author.
With your second example:
/posts?authorId=123&published=true&sort=created&expand=tags,category.
It makes sense to query the entire repository of posts distributed in many facets. For example, on your application, you may have a search page to search posts, it's the perfect use case for that.
Speaking in terms of model-view. Our resource is like model, our urls are like views. Nothing stops us from using many views for the same model. You just need to ensure you use the same backing code to access your stored data. There are different ways (views) to look at your data.
Summary: You can use both urls in your project. For example, use the /authors/123/posts on your author page to list posts of an author and use the /posts?authorId=123&published=true&sort=created&expand=tags,category on your post search page.
You could also have a look at this post for a similar opinion: What are best practices for REST nested resources
The rule of thumb we use on our projects: use a url path, e.g. /authors/123/posts if there is a clear hierarchical relationship between the two, and the resource you're visiting cannot live outside the context of the other one. For example, when retrieving order lines, it is straightforward to use /orders/123/lines, because an orderline has no meaning outside the context of the order itself.
In your case, posts can clearly 'live' outside the context of their author. If you can search posts by author, date, subject,... it makes sense to pass these values as a query string. So in my opinion, /posts?authorId=123 is the way to go here.
Related
I am making a simple app where one can post any thing and anyone can comment on that post.
Do I need to create an entity group making every post as parent and comments as children?If I am wrong how to model the data having post and comment as entities? Documentation says there a limit of 1 write/sec If my app has high traffic like many people are commenting on the same post at the same time there will be a problem of transaction failure. How to solve this ?
Before you start coding on a new platform, first do a bit of research and try to fully understand all the features and nuances first.
To answer your question, YES, you can create an Entity Group with the original post as parent and then each comment as a child, but WHY?
What are you trying to achieve? Having a Post with comments attached to it, right? Why not just save all of them in ONE entity?
Entities have attributes (fields); so you can have POST, COMMENTS as field types of the Entity Thread. The COMMENTS attribute/field can be an array of strings or text or objects as you see fit.
So when you are displaying the data, fetch the Thread entity, and display its POST field (the original post), then underneath it, display the COMMENTS field (extract strings from the string array then display them in order).
Basically, I've implemented the HABTM successfully in CakePHP, but the trouble is, I don't understand why it works.
The thing I hate about the CakePHP cookbook is that is tells you what to do but make very little effort to explain the underlying segments of their code.
Essentially, my data model is like this.
Task HABTM Question
I don't understand this code fragment.
$this->set('questions', $this->Task->Question->find('list'))
In particular, what is $this->Task->Question supposed to accomplish?
Also how is the above code link to this code fragment in the view?
echo $this->Form->input('Question');
One thing that is very peculiar is that with the above code fragment, I get a multiple select option.
However, if I change the code to this,
echo $this->Form->input('question');
I get a single select drop down list.
I scoured the entire documentation and still cannot find a satisfactory explanation to my doubts.
Would really appreciate if anyone can clarify this issue for me.
1. Model chaining
When a model has an association to another model (like in your example an HABTM one) then you can call methods of the associated model by chaining it to the current model. This is explained early in Associations and an example of exactly how it works is given at the end of the first section.
When you are someplace in your TasksController normally you would expect that only your Task model would be available. Instead any association described in the Task model is chained to that model in the form of $this->Model1->Model2.
So $this->set('questions', $this->Task->Question->find('list')) means:
From current model Task that you know about, access the associated model Question and then call its find('list') method. Then $this->set the results to the view as variable questions.
2. FormHelper Conventions
When you use a CamelCased single name for field input, like in $this->Form->input('Question'); you are saying to FormHelper that the data contained in the questions variable come from a model named Question with a HABTM association, therefore they should be handled as a multiple select (as HABTM points to such an association).
With a field name of model_id, like in this example question_id, you're asking for a single select (select a single id of the connected model).
With anything else, FormHelper looks at the field definition and takes the decision itself, but of course your can override any default behavior you want using options.
This is explained in detail and I'm surprised you missed both. CakePHP has one of the best documentations available, almost everything you need is there.
I'm using CakePHP2.3 and my app has many associations between models. It's very common that a controller action will involve manipulating data from another model. So I start to write a method in the model class to keep the controllers skinny... But in these situations, I'm never sure which model the method should go in?
Here's an example. Say I have two models: Book and Author. Author hasMany Book. In the /books/add view I might want to show a drop-down list of popular authors for the user to select as associated with that book. So I need to write a method in one of the two models. Should I...
A. Write a method in the Author model class and call that method from inside the BooksController::add() action...
$this->Author->get_popular_authors()
B. Write a method in the Book model class that instantiates the other model and uses it's find functions... Ex:
//Inside Book::get_popular_authors()
$Author = new Author();
$populars = $Author->find('all', $options);
return $populars;
I think my question is the same as asking "what is the best practice for writing model methods that primarily deal with associations between another model?" How best to decide which model that method should belong to? Thanks in advance.
PS: I'm not interested in hearing whether you thinking CakePHP sucks or isn't "true" MVC. This question is about MVC design pattern, not framework(s).
IMHO the function should be in the model that most closely matches the data you're trying to retrieve. Models are the "data layer".
So if you're fetching "popular authors", the function should be in the Author model, and so on.
Sometimes a function won't fit any model "cleanly", so you just pick one and continue. There are much more productive design decisions to concern yourself with. :)
BTW, in Cake, related models can be accessed without fetching "other" the model object. So if Book is related to Author:
//BooksController
$this->Book->Author->get_popular_authors();
//Book Model
$this->Author->get_popular_authors();
ref: http://book.cakephp.org/2.0/en/models/associations-linking-models-together.html#relationship-types
Follow the coding standards: get_popular_authors() this should be camel cased getPopularAuthors().
My guess is further that you want to display a list of popular authors. I would implement this using an element and cache that element and fetching the data in that element using requestAction() to fetch the data from the Authors controller (the action calls the model method).
This way the code is in the "right" place, your element is cached (performance bonus) and reuseable within any place.
That brings me back to
"what is the best practice for writing model methods that primarily
deal with associations between another model?"
In theory you can stuff your code into any model and call it through the assocs. I would say common sense applies here: Your method should be implement in the model/controller it matches the most. Is it user related? User model/controller. Is it a book that belongs to an user? Book model/controller.
I would always try to keep the coupling low and put the code into a specific domain. See also separation of concerns.
I think the key point to answer your question is defined by your specifications: "... popular authors for the user to select as associated with that book.".
That, in addition to the fact that you fetch all the authors, makes me ask:
What is the criteria that you will use to determine which authors are popular?
I doubt it, but if that depends on the current book being added, or some previous fields the user entered, there's some sense in adopting solution B and write the logic inside the Book model.
More likely solution A is the correct one because your case needs the code to find popular authors only in the add action of the Book controller. It is a "feature" of the add action only and so it should be coded inside the Author model to retrieve the list and called by the add action when preparing the "empty" form to pass the list to the view.
Furthermore, it would make sense to write some similar code inside the Book model if you wanted, e.g., to display all the other books from the same author.
In this case you seem to want popular authors (those with more books ?), so this clearly is an "extra feature" of the Author model (That you could even code as a custom find method).
In any case, as stated by others as well, there's no need to re-load the Author model as it is automatically loaded via its association with Books.
Look out for Premature Optimization. Just build your project till it works. You can always optimize your code or mvc patterns after you do a review of your code. And most important after your project is done most of the time you will see a more clear or better way to do it faster/smarter and better than you did before.
You can't and never will build a perfect mvc or project in one time. You need to find yourself a way of working you like or prefer and in time you'll learn how to improve your coding.
See for more information about Premature Optimization
I was wondering if someone could point me in the right direction.
Here is my problem: I have a large form/checklist that I would like to make digital for ease of use.
Thoughts: I would like to use existing tools that would be easy to integrate. My first option is Access 2010.
My question: I would like to enter the questions into a database and then use those entries to auto generate a form that can be used to allow the user to input the actual data into the database. An example would be I have 11 Sections of questions and under each section I have sub-sections that can contain anywhere from 1-... how many every questions we need.
Is it possible to use data stored in an Access database to generate a form with Checkboxes that can be used to input data?
Please point me in the right direction. Obviously there is the option of just creating multiple forms or one big form, but I would like this form to easily be changed etc... Less work more automation.
Thanks,
Alex
Depending on the requirements of your project, this may be quite possible. If you want to use Access as both the back-end and front-end, then you'll need to work within a few limitations:
Because Access combines the user interface and design interface into the same screen, it requires a certain amount of trust that the user can't or won't try to get too creative with changing the data, seeing everyone else's data, changing the design of your form because they are bored, etc. There are ways around these problems, but they can get complicated.
Will all your users be using Window's machines with Access 2010 installed and with the original default settings? If so, good. If not, there may be ways that this could still work.
(There's more, but that's all I can think of right now)
To get started, here's a broad outline:
Make a table for your questions. This table would just have the questions.
Make a form using that question table as the source. Leave the checkboxes and other answer fields unbounded. Include a 'submit' button at the bottom.
The submit button will create a sql query to insert the user's answers into a 2nd table.
If you have any specific questions, we here at SO will be glad to answer them.
In order to dynamically and easily change the number of questions in the sections, what I would do is:
In the main Questions table, add a field called Section to allocate the questions into diffferent ones, and another one Yes/No field to select those that are included (you may also exclude them by leaving the section field empty, as you wish). This will solve the problem of changing the design easily. you probaly will need an admin form in order to do this changes, to avoid touching the tables directly, but this is your decision.
Secondly, in order to allow the users to efectively answer the generated form, you have to ask yourself if you want to accumulate the answer sets, and if you are going to control who answers
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I am interested in recommendation engines these days and I want to improve myself in this area. I am currently reading "Programming Collective Intelligence" I think this is the best book about this subject, from O'Reilly. But I don't have any ideas how to implement engine; What I mean by "no idea" is "don't know how to start". I have a project like Last.fm in my mind.
Where do (should be implemented on database side or backend side) I start creating
recommendation engine?
What level of database knowledge will be needed?
Is there any open source ones that can be used for help or any resource?
What should be the first steps that I have to do?
Presenting recommendations can be split up in to two main sections:
Feature extraction
Recommendation
Feature extraction is very specific to the object being recommended. For music, for example, some features of the object might be the frequency response of the song, the power, the genre, etc. The features for the users might be age, location, etc. You then create a vector for each user and song with the various elements of the vector corresponding to different features of interest.
Performing the actual recommendation only requires well thought out feature vectors. Note that if you don't choose the right features your recommendation engine will fail. This would be like asking you to tell me my sex based on my age. Of course my age may provide a bit of information, but I think you could imagine better questions to ask. Anyways, once you have your feature vectors for each user and song, you will need to train the recommendation engine. I think the best way to do this would be to get a whole bunch of users to take your demographic test and then tell you specific songs that they like. At this point you have all the information you need. Your job is to draw a decision boundary with the information you have. Consider a simple example. You want to predict whether or not a user likes AC/DC's "Back in Black" based on age and sex. Imagine a graph showing 100 data points. The x axis is age, the y axis is sex (1 is male, 2 is female). A black mark indicates that the user likes the song while a red mark means they don't like the song. My guess is that this graph might have a lot of black marks corresponding to users that are male and between the ages of 12 and 37 while the rest of the marks will be red. So, if we were to manually select a decision boundary, it'd be a rectangle around this area holding the majority of the black marks. This is called the decision boundary because, if a completely new person comes to you and tells you their age and sex, you only have to plot them on the graph and ask whether or not they fall within that box.
So, the hard part here is finding the decision boundary. The good news is that you don't need to know how to do that. You just need to know how to use some of the common tools. You can look into using neural networks, support vector machines, linear classifiers, etc. Again, don't let the big names fool you. Most people can't tell you what these things are really doing. They just know how to plug things in and get results.
I know it's a bit late, but I hope this helps anyone that stumbles on this thread.
I've built up one for a video portal myself. The main idea that I had was about collecting data about everything:
Who uploaded a video?
Who commented on a video?
Which tags where created?
Who visited the video? (also tracking anonymous visitors)
Who favorited a video?
Who rated a video?
Which channels was the video assigned to?
Text streams of title, description, tags, channels and comments are collected by a fulltext indexer which puts weight on each of the data sources.
Next I created functions which return lists of (id,weight) tuples for each of the above points. Some only consider a limited amount of videos (eg last 50), some modify the weight by eg rating, tag count (more often tagged = less expressive). There are functions that return the following lists:
Similar videos by fulltext search
Videos uploaded by the same user
Other videos the users from these comments also commented on
Other videos the users from these favorites also favorited
Other videos the raters from these ratings also rated on (weighted)
Other videos in the same channels
Other videos with the same tags (weighted by "expressiveness" of tags)
Other videos played by people who played this video (XY latest plays)
Similar videos by comments fulltext
Similar videos by title fulltext
Similar videos by description fulltext
Similar videos by tags fulltext
All these will be combined into a single list by just summing up the weights by video ids, then sorted by weight. This works pretty well for around 1000 videos now. But you need to do background processing or extreme caching for this to be speedy.
I'm hoping that I can reduce this to a generic recommendation engine or similarity calculator soon and release as a rails/activerecord plugin. Currently it's still a well integrated part of my project.
To give a small hint, in ruby code it looks like this:
def related_by_tags
tag_names.find(:all, :include => :videos).inject([]) { |result,t|
result + t.video_ids.map { |v|
[v, TAG_WEIGHT / (0.1 + Math.log(t.video_ids.length) / Math.log(2))]
}
}
end
I would be interested on how other people solve such algorithms.
This is really a very big question you are asking, so even if I could give you a detailed answer I doubt I'd have the time.... but I do have a suggestion, take a look at Greg Linden's blog and his papers on item-based collaborative filtering. Greg implemented the idea of recommendations engines at Amazon using the item based approach, he really knows his stuff and his blog and papers are very readable.
Blog: http://glinden.blogspot.com/
Paper: http://www.computer.org/portal/web/csdl/doi/10.1109/MIC.2003.1167344 (I'm afraid you need to log in to read it in full, as you are a CS student this should be possible).
Edit
You could also take a look at Infer.Net, they include an example of building a recommender system for movie data.
I have a 2 part blog on collaborative filtering based recommendation engine for implementation in Hadoop.
http://pkghosh.wordpress.com/2010/10/19/recommendation-engine-powered-by-hadoop-part-1/
http://pkghosh.wordpress.com/2010/10/31/recommendation-engine-powered-by-hadoop-part-2/
Here is the github repository for the open source project
https://github.com/pranab/sifarish
Feel free to use if you like it.
An example recommendation engine that is open source (AGPLv3-licensed) has been published by Filmaster.com recently. It's written in C++ and uses best practices from the white papers produced as part of the Netflix challange. An article about it can be found at: http://polishlinux.org/gnu/open-source-film-recommendation-engine/
and the code is here: http://bitbucket.org/filmaster/filmaster-test/src/tip/count_recommendations.cpp