Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I'm working on an ontology and I'm having an issue regarding the best approach for defining some concepts. To make my question easier to express, I'll take an example.
Suppose that, while defining the concept Football, I want to state that it requires 2 teams. I see 2 approaches:
1. Define a hasTeam object property and a Team class, and make Football a subclass of:
   hasTeam exactly 2 Team
2. Define a teamCount data property and make Football a subclass of:
   teamCount value 2
What are the advantages of each, and which is the better approach when defining an ontology?
The first solution allows you to specify which teams are involved in Football (a football match, I assume), while the second does not: it is just a restriction over the integer data range saying that the only admissible value for your property is 2.
I would go for the first solution. The second one basically reduces the data property to a marker: since there is only one possible value, its presence is equivalent to saying that the individual it is applied to belongs to a class, so it allows less information to be modeled.
But it really depends on the rest of your requirements.
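As a rough analogy only, the difference between the two approaches can be sketched in Python. The class names, the `satisfies_restriction` method and the team names are hypothetical, and OWL reasons under the open-world assumption, so this closed-world check illustrates what information each model carries rather than how a reasoner behaves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Team:
    name: str


@dataclass
class MatchWithTeams:
    """Approach 1: the match is linked to its Team individuals."""
    teams: List[Team] = field(default_factory=list)

    def satisfies_restriction(self) -> bool:
        # Mirrors "hasTeam exactly 2 Team": two distinct teams.
        return len(set(self.teams)) == 2


@dataclass
class MatchWithCount:
    """Approach 2: the match only carries a number."""
    team_count: int = 2

    def satisfies_restriction(self) -> bool:
        # Mirrors "teamCount value 2": the only admissible value is 2.
        return self.team_count == 2


match_a = MatchWithTeams([Team("Reds"), Team("Blues")])
match_b = MatchWithCount()

# Only the first model can answer "which teams are playing?".
team_names = sorted(t.name for t in match_a.teams)
```

Both objects satisfy their restriction, but only `match_a` keeps the teams themselves, which is the extra information the first approach buys.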
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
In our application, we have to filter some data. We are using DynamoDB. Our team is divided on whether to apply the filter with a DynamoDB filter expression or at the application level. I want to know what is commonly done in the industry. Please let me know if you know of any good blogs on this.
Consider this scenario: we have deal templates in an active state, which the user can deactivate. In the get-list call, we want to return only the active templates.
Dynamo:
filterCondition := expression.Name(activeColumn).Equal(expression.Value(true))
Application:
// getTemplate returns every DealTemplate, active or not
templates := getTemplate()
for _, template := range templates {
    if template.isActive {
        // process only the active templates
    }
}
This may be getting close to the line on opinion-based questions.
But the best solution is to structure your data so you don't have to filter anything out at all. Use either a Local Secondary Index (LSI, which shares the table's throughput) or a Global Secondary Index (GSI, which has its own throughput and adds cost).
This way you don't pay to read data that gets thrown away.
Otherwise, use a filter expression. You still pay to read the data, but you don't pay to transfer it back: that saves real money if the client is outside AWS, and overhead either way.
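The cost difference can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not DynamoDB code: `Template`, `query_with_filter` and `query_active_index` are hypothetical names, and item counts stand in for read capacity, which DynamoDB actually meters by the size of the data read.

```python
from dataclasses import dataclass


@dataclass
class Template:
    template_id: str
    is_active: bool


def query_with_filter(items):
    """Toy model of a filter expression: every item is read, then filtered."""
    items_read = len(items)  # what you are billed for
    returned = [t for t in items if t.is_active]  # what comes back
    return returned, items_read


def query_active_index(index_items):
    """Toy model of querying an index that holds only active templates."""
    return list(index_items), len(index_items)


table = [Template(f"t{i}", is_active=(i % 4 == 0)) for i in range(100)]
# Hypothetical index contents: only the active templates.
active_index = [t for t in table if t.is_active]

filtered, filter_reads = query_with_filter(table)
indexed, index_reads = query_active_index(active_index)
```

Both calls return the same 25 templates, but the filtered query reads all 100 items while the index query reads only the 25 it returns.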
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I keep running into the question of whether to include a categorical variable that really does have some impact on the prediction.
I want to know whether we should include a categorical variable with around 43 levels when building a model.
category | category_level
I want to build a model for a binary classification problem. I have already tried LabelEncoder, OneHotEncoder, etc. from scikit-learn.
But nothing works, and I don't know how to handle this categorical feature.
We can use categorical variables in predictions. If you have around 43 levels, as you mentioned, you can club (group) similar levels into a single category. This can be a business decision, or you can look at how the different categories in that variable relate to the output variable. This brings the number of levels down from 43 to a smaller number. Then create the dummy variables on those clubbed categories.
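A minimal sketch of clubbing followed by dummy variables, in plain Python. Grouping by frequency is an assumption made here for illustration (the grouping could equally come from business knowledge), and `club_rare_levels`, `one_hot`, `min_count` and the sample data are hypothetical.

```python
from collections import Counter


def club_rare_levels(values, min_count, other_label="other"):
    """Replace levels seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


def one_hot(values):
    """Return (sorted levels, rows of 0/1 dummy variables)."""
    levels = sorted(set(values))
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


raw = ["a", "a", "a", "b", "b", "c", "d", "a", "b", "e"]
clubbed = club_rare_levels(raw, min_count=2)  # c, d and e become "other"
levels, dummies = one_hot(clubbed)
```

Here five raw levels shrink to three dummy columns (`a`, `b`, `other`); with 43 levels the same step keeps the encoded matrix much narrower.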
Another way to do this is to use ANOVA (Analysis of Variance) to see how different the various categories in that variable are. If they are not significantly different, you can club them into one category. I will share an example to explain the same.
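The following is not the answerer's example but a minimal sketch of the idea: a one-way ANOVA F statistic computed by hand in Python for a numeric outcome grouped by category level. The function name and the sample data are hypothetical, and only the F statistic is computed; judging significance needs the F distribution (for instance through a statistics library).

```python
from collections import defaultdict


def one_way_anova_f(levels, outcomes):
    """Return the one-way ANOVA F statistic of outcomes grouped by level."""
    groups = defaultdict(list)
    for level, y in zip(levels, outcomes):
        groups[level].append(y)

    n = len(outcomes)
    k = len(groups)
    grand_mean = sum(outcomes) / n

    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    ss_within = sum(
        (y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys
    )
    # Assumes at least two groups and some within-group variation.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


levels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
outcomes = [1, 2, 3, 2, 3, 4, 6, 7, 8]
f_stat = one_way_anova_f(levels, outcomes)
```

A large F suggests the levels differ in their mean outcome; levels whose pairwise comparison gives a small F are candidates for clubbing.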
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am developing an app that needs to generate an id for each new user. I want to use the smallest number of characters that still allows 100 billion different ids. How should I do that, and how do I avoid giving two users the same id? Should I check whether the id already exists? Should I use a random id generator, or hand out ids in order, like 001, 002 and so on?
This depends entirely on what kind of functionality you expect from this id. Do you intend these ids to correlate with persisted data, such as rows in a database? If so, it might be more prudent to let the database handle unique id generation for you. Otherwise, sequential values such as 1, 2, 3, ... would probably be ideal. Note that a 32-bit unsigned integer only covers about 4.29 billion users, which is short of 100 billion; a 64-bit unsigned integer covers about 1.8 × 10^19, so you are unlikely to need to rethink your data storage because of the id range.
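The "smallest number of characters" part is plain arithmetic: with an alphabet of `base` symbols, you need the smallest length such that `base ** length` is at least 100 billion. A minimal sketch, assuming a 62-symbol alphabet (digits plus upper- and lower-case letters) and a sequential counter supplied by something that increments atomically, such as a database sequence; `min_length` and `encode` are hypothetical helpers.

```python
import string

# 62 symbols: 0-9, A-Z, a-z
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def min_length(id_count, base):
    """Smallest number of characters such that base**length >= id_count."""
    length = 1
    while base ** length < id_count:
        length += 1
    return length


def encode(counter, width):
    """Encode a non-negative sequential counter as a fixed-width base-62 id."""
    chars = []
    for _ in range(width):
        counter, digit = divmod(counter, len(ALPHABET))
        chars.append(ALPHABET[digit])
    if counter:
        raise ValueError("counter does not fit in the requested width")
    return "".join(reversed(chars))


TARGET = 100_000_000_000  # 100 billion ids, as in the question
width = min_length(TARGET, len(ALPHABET))
first_id = encode(0, width)
last_id = encode(TARGET - 1, width)
```

Under these assumptions, digits alone need 11 characters, a 36-symbol alphabet needs 8, and the 62-symbol alphabet needs 7. Because each counter value maps to exactly one string, sequential ids never collide and no existence check is needed; random ids would need one.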
The question is very broad.
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Is NSOrderedSet faster than NSSet? Has anyone done any tests showing where one is better than the other? If not, why was NSOrderedSet introduced in the first place?
NSOrderedSet
The point of using an ordered set is that it can be traversed in a defined order (typically the order in which items were added to it), and querying whether an object is contained is faster than for an array. Its "contains" operation is typically hash-based as well, but keeping the order costs extra memory and bookkeeping compared with a plain unordered set.
Unordered set
The point of a set is that it allows a best-case O(1) "contains" query time. Of these two, it is the data structure to use when you need the fastest possible "contains" time and do not need to retrieve the items in any particular order.
It is internally probably implemented as a hash map, although it's not pointed out in the Foundation documentation.
I'd advise reading this great blog post regarding the different uses of the different Foundation data structures.
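As a Python analogy only, the sketch below shows how one structure can offer both insertion-order traversal and hash-based membership tests. It says nothing about how Foundation implements NSOrderedSet; the `OrderedSet` class is hypothetical and simply leans on the fact that Python dicts preserve insertion order.

```python
class OrderedSet:
    """Toy ordered set: unique items, insertion order, hash-based lookup."""

    def __init__(self, items=()):
        # Python dicts preserve insertion order (3.7+); values are unused.
        self._items = dict.fromkeys(items)

    def add(self, item):
        # An existing item keeps its original position.
        self._items.setdefault(item, None)

    def __contains__(self, item):
        return item in self._items  # average O(1) hash lookup

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


names = OrderedSet(["carol", "alice", "bob", "alice"])
names.add("dave")
names.add("carol")  # duplicate, ignored
ordered = list(names)
```

Iterating yields `carol`, `alice`, `bob`, `dave` in that order, while `"bob" in names` does not scan the items the way a list or array lookup would.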
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I received a list from a customer that uses bullet points and sub-bullet points. What is the best way to store these in a Postgres database? An example would be great.
Thanks!
Structure of it is something similar to this:
Defect1
possible instance of defect1
another possible instance of defect1
Defect2
possible instance of defect2
another possible instance of defect2...
For indented lists you're basically talking about a tree structure. There are many ways to store hierarchies. See this answer for a comparison.
Design Relational Database - Use hierarchical datamodels or avoid them?
Depending on how you want to use the data, i.e., if you're just going to spit it back out as it came in, you may be able to skip the hierarchy aspect in this particular use case and just store each line in sequence with an indentation field. It won't do nearly what can be done with a tree, but it may be all that's needed in your particular case.
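The "sequence plus indentation field" idea can be sketched as follows. This uses Python's built-in `sqlite3` as a stand-in so the example runs anywhere; the table name `list_line`, its columns and the four-space indent are assumptions, and the same three-column table could be created in Postgres.

```python
import sqlite3

RAW_LIST = """\
Defect1
    possible instance of defect1
    another possible instance of defect1
Defect2
    possible instance of defect2
    another possible instance of defect2
"""


def parse(text, indent_width=4):
    """Yield (position, depth, label) for each non-empty line."""
    for position, line in enumerate(filter(str.strip, text.splitlines())):
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // indent_width
        yield position, depth, stripped


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list_line ("
    " position INTEGER PRIMARY KEY,"
    " depth INTEGER NOT NULL,"
    " label TEXT NOT NULL)"
)
conn.executemany("INSERT INTO list_line VALUES (?, ?, ?)", parse(RAW_LIST))

# Reading the rows back in order is enough to rebuild the list as it came in.
rows = conn.execute(
    "SELECT depth, label FROM list_line ORDER BY position"
).fetchall()
rebuilt = "\n".join(" " * 4 * depth + label for depth, label in rows) + "\n"
```

This round-trips the list exactly, but questions like "all instances of Defect2" require scanning by position; a parent reference or another tree model would answer those directly.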