SQL Query to Count Duplicate Values in String - sql-server

I've searched for an answer to this here on the boards and elsewhere - I'm guessing this issue has already been resolved, but for some reason, I'm not finding what I need... so, I'll ask and hope this isn't a duplicated question.
What I have is a list of students and employees in a MSSQL database. One of the columns contains a string with the current term (semester) with an identifier for each class that the student is registered for, delimited by ';' for each entry.
What I'm trying to figure out is how to return a value of students who are signed up for more than 4 classes for the current semester. To get a count of all students registered for the current semester, the query is simple:
SELECT COUNT(*) AS Current_Students
FROM UserData
WHERE StuTermClassString LIKE '%2163%'
This works fine to return the total number of students, but I need a way to return a value of the full-time students (those that have more than 11 class hours per semester, which is typically 4+ classes). So I need a way to determine when the count of classes with '%2163%' for a record is > 4. If I haven't explained this well enough, please let me know and I'll expand on it more. Thanks!

Unfortunately handling pseudo-arrays packed into a string in SQL happens.
You will need to create a string splitting function to parse out the individual items in the list string and present them as a table variable.
See: http://sqlperformance.com/2012/07/t-sql-queries/split-strings
There are many ways to do this, but you should end up with a table variable that can use the standard built-in set oriented SQL functions to find what you need.

Direct answer to your question:
WHERE StuTermClassString LIKE "%2163%;%;%;%;%"<-- assuming the semester indicator is before the class names, and each class name has a semicolon after it.
But, might I suggest a more reasonable DB structure? Instead of having a column in your Student table with a string of ;-delimited class names, you could have a second table of Registrations, where each row has a Student ID (referring to a row in your Student table), a Semester Number, and a single class name. This way you could query for students with more than a certain number of classes by doing:
SELECT student_id FROM Registrations WHERE semester_number = "2163" GROUP BY student_id HAVING COUNT(student_id) > 3

Related

Avoiding database queries in Anylogic

I would like to avoid costly repeated data base queries in Anylogic. I have seen the following thread in Stack Overflow What is the fastest way to look up continuous data on Anylogic (Java, SQL) where a simple three step answer is provided but I'm not sure what the second point of the three actually means:
Save all rows as instances of that class at model start-up into a map - you can use Origin/Destination as the key (use Anylogic's Pair object) and the class instance as the value
I have created a class that takes as inputs the information from each column of my database. I would now like to save each row as an instance of that class - is there an easy way to do this? I may be missing something simple as I'm new to Anylogic.
I'm also unsure of how to create a mapping, if anyone could add more detail to point 2 above I'd be very grateful!
this is effectively the best advice, you created the class, which is a great step, but now, one element of that class, will be used as the key... for example the name... for instance if your class has firstName as one variable and lastName as another variable, you will use a string that is the concatenation of firstName and lastName as your key. Of course any key is fine, assuming that it is unique for all your table. Also an integer as an id is ok too.
create a collection of type LinkedHashMap
Create a class (you did that)
Your collection will take as the key a String (first + last name) and as the value of the elment the class...
now, when you read your database you will have something like this:
for(Tuple t : yourQueryResults){
YourClass yc=new YourClass(t.get(db.var1),t.get(db.var2));
String totalName=t.get(db.first_name)+"_"+t.get(db.last_name);
yourCollection.put(totalName,yc);
}
Now every time you want to find someone with the a name, for example "John Doe", instead of making a query, you will do
yourCollection.get("John_Doe").theVarYouWant;
if you use an id instead of the name, you can set an int as the key, and then you will just do yourCollection.get(theId).theVarYouWant

How to find a MoveTo destination filled by database?

I could need some help with a Anylogic Model.
Model (short): Manufacturing scenario with orders move in a individual route. The workplaces (WP) are dynamical created by simulation start. Their names, quantity and other parameters are stored in a database (excel Import). Also the orders are created according to an import. The Agent population "order" has a collection routing which contains the Workplaces it has to stop in the specific order.
Target: I want a moveTo block in main which finds the next destination of the agent order.
Problem and solution paths:
I set the destination Type to agent and in the Agent field I typed a function agent.getDestination(). This function is in order which returns the next entry of the collection WP destinationName = routing.get(i). With this I get a Datatype error (while run not compiling). I quess it's because the database does not save the entrys as WP Type but only String.
Is there a possiblity to create a collection with agents from an Excel?
After this I tried to use the same getDestination as String an so find via findFirst the WP matching the returned name and return it as WP. WP targetWP = findFirst(wps, w->w.name == destinationName);
Of corse wps (the population of Workplaces) couldn't be found.
How can I search the population?
Maybe with an Agentlink?
I think it is not that difficult but can't find an answer or a solution. As you can tell I'm a beginner... Hope the description is good an someone can help me or give me a hint :)
Thanks
Is there a possiblity to create a collection with agents from an Excel?
Not directly using the collection's properties and, as you've seen, you can't have database (DB) column types which are agent types.1
But this is relatively simple to do directly via Java code (and you can use the Insert Database Query wizard to construct the skeleton code for you).
After this I tried to use the same getDestination as String an so find via findFirst the WP matching the returned name and return it as WP
Yes, this is one approach. If your order details are in Excel/the database, they are presumably referring to workplaces via some String ID (which will be a parameter of the workplace agents you've created from a separate Excel worksheet/database table). You need to use the Java equals method to compare strings though, not == (which is for comparing numbers or whether two objects are the same object).
I want a moveTo block in main which finds the next destination of the agent order
So the general overall solution is
Create a population of Workplace agents (let's say called workplaces in Main) from the DB, each with a String parameter id or similar mapped from a DB column.
Create a population of Order agents (let's say called orders in Main) from the DB and then, in their on-startup action, set up their collection of workplace IDs (type ArrayList, element class String; let's say called workplaceIDsList) using data from another DB table.
Order probably also needs a working variable storing the next index in the list that it needs to go to (so let's say an int variable nextWorkplaceIndex which starts at 0).
Write a function in Main called getWorkplaceByID that has a single String argument id and returns a Workplace. This gets the workplace from the population that matches the ID; a one-line way similar to yours is findFirst(workplaces, w -> w.id.equals(id)).
The MoveTo block (which I presume is in Main) needs to move the Order to an agent defined by getWorkplaceByID(agent.workplaceIDsList.get(nextWorkplaceIndex++)). (The ++ bit increments the index after evaluating the expression so it is ready for the next workplace to go to.)
For populating the collection, you'd have two tables, something like the below (assuming using strings as IDs for workplaces and orders):
orders table: columns for parameters of your orders (including some String id column) other than the workplace-list. (Create one Order agent per row.)
order_workplaces table: columns order_id, sequence_num and workplace_id (so with multiple rows specifying the sequence of workplace IDs for an order ID).
In the On startup action of Order, set up the skeleton query code via the Insert Database Query wizard as below (where we want to loop through all rows for this order's ID and do something --- we'll change the skeleton code to add entries to the collection instead of just printing stuff via traceln like the skeleton code does).
Then we edit the skeleton code to look like the below. (Note we add an orderBy clause to the initial query so we ensure we get the rows in ascending sequence number order.)
List<Tuple> rows = selectFrom(order_workplaces)
.where(order_workplaces.order_id.eq(id))
.orderBy(order_workplaces.sequence_num.asc())
.list();
for (Tuple row : rows) {
workplaceIDsList.add(row.get(order_workplaces.workplace_id));
}
1 The AnyLogic database is a normal relational database --- HSQLDB in fact --- and databases only understand their own specific data types like VARCHAR, with AnyLogic and the libraries it uses translating these to Java types like String. In the user interface, AnyLogic makes it look like you set the column types as int, String, etc. but these are really the Java types that the columns' contents will ultimately be translated into.
AnyLogic does support columns which have option list types (and the special Code type column for columns containing executable Java code) but these are special cases using special logic under the covers to translate the column data (which is ultimately still a string of characters) into the appropriate option list instance or (for Code columns) into compiled-on-the-fly-and-then-executed Java).
Welcome to Stack Overflow :) To create a Population via Excel Import you have to create a method and call Code like this. You also need an empty Population.
int n = excelFile.getLastRowNum(YOUR_SHEET_NAME);
for(int i = FIRST_ROW; i <= n; i++){
String name = excelFile.getCellStringValue(YOUR_SHEET_NAME, i, 1);
double SEC_PARAMETER_TO_READ= excelFile.getCellNumericValue(YOUR_SHEET_NAME, i, 2);
WP workplace = add_wps(name, SEC_PARAMETER_TO_READ);
}
Now if you want to get a workplace by name, you have to create a method similar to your try.
Functionbody:
WP workplaceToFind = wps.findFirst(w -> w.name.equals(destinationName));
if(workplaceToFind != null){
//do what ever you want
}

Show UniData SELECT results that are not record keys

I'm looking over some UniData fields for distinct values but I'm hoping to find a simpler way of doing it. The values aren't keys to anything so right now I'm selecting the records I'm interested in and selecting the data I need with SAVING UNIQUE. The problem is, in order to see what I have all I know to do is save it out to a savedlist and then read through the savedlist file I created.
Is there a way to see the contents of a select without running it against a file?
If you are just wanted to visually look over the data, use LIST instead of SELECT.
The general syntax of the command is something like:
LIST filename WITH [criteria] [sort] [attributes | ALL]
So let's say you have a table called questions and want to look over all the author for questions that used the tag unidata. Your query might look something like:
LIST questions WITH tag = "unidata" BY author author
Note: The second author isn't a mistake, it's the start of the list of attributes you want displayed - in this case just author, but you might want the record id as well, so you could do #ID author instead. Or just do ALL to display everything in each record.
I did BY author here as it will make spotting uniques easier, but you can also use other query features like BREAK.ON to help here as well.
I don't know why I didn't think of it at the time but I basically needed something like SQL's DISTINCT statement since I just needed to view the unique values. Replicating DISTINCT in UniData is explained here, https://forum.precisonline.com/index.php?topic=318.0.
The trick is to sort on the values using BY, get a single unique value of each using BREAK-ON, and then suppress everything except those unique values using DET-SUP.
LIST BUILDINGS BY CITY BREAK-ON CITY DET-SUP
CITY.............
Albuquerque
Arlington
Ashland
Clinton
Franklin
Greenville
Madison
Milton
Springfield
Washington

Django Query Optimisation

I am working currently on telecom analytics project and newbie in query optimisation. To show result in browser it takes a full minute while just 45,000 records are to be accessed. Could you please suggest on ways to reduce time for showing results.
I wrote following query to find call-duration of a person of age-group:
sigma=0
popn=len(Demo.objects.filter(age_group=age))
card_list=[Demo.objects.filter(age_group=age)[i].card_no
for i in range(popn)]
for card in card_list:
dic=Fact_table.objects.filter(card_no=card.aggregate(Sum('duration'))
sigma+=dic['duration__sum']
avgDur=sigma/popn
Above code is within for loop to iterate over age-groups.
Model is as follows:
class Demo(models.Model):
card_no=models.CharField(max_length=20,primary_key=True)
gender=models.IntegerField()
age=models.IntegerField()
age_group=models.IntegerField()
class Fact_table(models.Model):
pri_key=models.BigIntegerField(primary_key=True)
card_no=models.CharField(max_length=20)
duration=models.IntegerField()
time_8bit=models.CharField(max_length=8)
time_of_day=models.IntegerField()
isBusinessHr=models.IntegerField()
Day_of_week=models.IntegerField()
Day=models.IntegerField()
Thanks
Try that:
sigma=0
demo_by_age = Demo.objects.filter(age_group=age);
popn=demo_by_age.count() #One
card_list = demo_by_age.values_list('card_no', flat=True) # Two
dic = Fact_table.objects.filter(card_no__in=card_list).aggregate(Sum('duration') #Three
sigma = dic['duration__sum']
avgDur=sigma/popn
A statement like card_list=[Demo.objects.filter(age_group=age)[i].card_no for i in range(popn)] will generate popn seperate queries and database hits. The query in the for-loop will also hit the database popn times. As a general rule, you should try to minimize the amount of queries you use, and you should only select the records you need.
With a few adjustments to your code this can be done in just one query.
There's generally no need to manually specify a primary_key, and in all but some very specific cases it's even better not to define any. Django automatically adds an indexed, auto-incremental primary key field. If you need the card_no field as a unique field, and you need to find rows based on this field, use this:
class Demo(models.Model):
card_no = models.SlugField(max_length=20, unique=True)
...
SlugField automatically adds a database index to the column, essentially making selections by this field as fast as when it is a primary key. This still allows other ways to access the table, e.g. foreign keys (as I'll explain in my next point), to use the (slightly) faster integer field specified by Django, and will ease the use of the model in Django.
If you need to relate an object to an object in another table, use models.ForeignKey. Django gives you a whole set of new functionality that not only makes it easier to use the models, it also makes a lot of queries faster by using JOIN clauses in the SQL query. So for you example:
class Fact_table(models.Model):
card = models.ForeignKey(Demo, related_name='facts')
...
The related_name fields allows you to access all Fact_table objects related to a Demo instance by using instance.facts in Django. (See https://docs.djangoproject.com/en/dev/ref/models/fields/#module-django.db.models.fields.related)
With these two changes, your query (including the loop over the different age_groups) can be changed into a blazing-fast one-hit query giving you the average duration of calls made by each age_group:
age_groups = Demo.objects.values('age_group').annotate(duration_avg=Avg('facts__duration'))
for group in age_groups:
print "Age group: %s - Average duration: %s" % group['age_group'], group['duration_avg']
.values('age_group') selects just the age_group field from the Demo's database table. .annotate(duration_avg=Avg('facts__duration')) takes every unique result from values (thus each unique age_group), and for each unique result will fetch all Fact_table objects related to any Demo object within that age_group, and calculate the average of all the duration fields - all in a single query.

Google App Engine query optimization

I have a Google App Engine datastore that could have several million records in it and I'm trying to figure out the best way to do a query where I need get records back that match several Strings.
For example, say I have the following Model:
String name
String level
Int score
I need to return all the records for a given "level" that also match a list of "names". There might be only 1 or 2 names in the name list, but there could be 100.
It's basically a list of high scores ("score") for players ("name") for a given level ("level"). I want to find all the scores for a given "level" for a list of players by "name" to build a high score list that include just your friends.
I could just loop over the list of "names" and do a query for each their high scores for that level, but I don't know if this is the best way. In SQL I could construct a single (complex) query to do this.
Given the size of the datastore, I want to make sure I'm not wasting time running python code that should be done by the query or vise-versa.
The "level" needs to be a String, not an Int since they are not numbered levesl but rather level names, but I don't know if that matters.
You can use IN filter operator to match property against a list of values (user names):
scores = Scores.all().filter('level ==', level).filter('user IN', user_list)
Note that under the hood this performs as much queries as there are users in user_list.
players = Player.all().filter('level =', level).order('score')
names = [name1, name2, name3, ...]
players = [p for p in players if p.name in names]
for player in players:
print name, print score
is this what you want?
...or am i simplifying too much?
No you can not do that in one pass.
You will have to either query the friends for the level one by one
or
make a friends score entity for each level. Each time the score changes check which friends list he belongs to and update all their lists. Then its just a matter or retrieving that list.
the first one will be slow and the second costly unless optimized.

Resources