IBM Watson Retrieve and Rank: Ranker training failure

I'm following this tutorial and have successfully completed Stage 3. In Stage 4, I downloaded the sample ground truth and tried to create the ranker using the REST API. The ranker is created with the status Training. When I fetch that ranker by its ID after some time, I receive the following error:
{
"ranker_id": "1eec74x28-rank-4116",
"name": "First Training Data",
"created": "2017-04-05T09:26:43.925Z",
"url": "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/rankers/1eec74x28-rank-4116",
"status": "Failed",
"status_description": "Error encountered during training: Training data quality standards not met: invalid header (duplicate feature names). Row 1 of input data."
}
When I tried to identify the problem I found this link. However, I'm unable to identify the issue here, and I do not understand the minimum quality standards for the training data.
Please guide me toward corrective measures.

For anyone looking here, a response to this question was posted on the IBM developerWorks forum: https://developer.ibm.com/answers/questions/366976/watson-retrieve-and-rank-failing-to-train-the-rank.html
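The status description points at row 1 of the input data, so one thing worth checking before resubmitting is whether any column name in the header row occurs more than once. The following is a minimal sketch, assuming the training data is a comma-separated file whose first row holds the feature names; `findDuplicateHeaders` is a hypothetical helper (not part of any Watson SDK) and the sample column names are made up.

```javascript
// Hypothetical helper: report header names that occur more than once.
// ASSUMPTION: the first line is a plain comma-separated header row
// without quoted commas.
function findDuplicateHeaders(csvText) {
  const headerLine = csvText.split(/\r?\n/)[0];
  const seen = new Set();
  const duplicates = new Set();
  for (const rawName of headerLine.split(",")) {
    const name = rawName.trim();
    if (seen.has(name)) {
      duplicates.add(name);
    }
    seen.add(name);
  }
  return [...duplicates];
}

// Made-up sample in which the column "f1" appears twice.
const sampleTrainingData = "question_id,f0,f1,f1,ground_truth\nq1,0.2,0.5,0.1,3\n";
console.log(findDuplicateHeaders(sampleTrainingData));
```

If the helper reports any names, the file that produced the header is the place to look; this sketch only detects the duplication, it does not say how it got there.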

Get Display Name for License SKU in Microsoft Graph

I am trying to use Microsoft Graph to capture the products which we have licenses for.
While I can get the skuPartNumber, that name is not exactly display-friendly.
I have come across DisplayName as a datapoint in almost all the API calls that give out an object with an id.
I was wondering if there was a DisplayName for the skus, and where I could go to get them via the graph.
For reference, the call I made was on the https://graph.microsoft.com/v1.0/subscribedSkus endpoint following the doc https://learn.microsoft.com/en-us/graph/api/subscribedsku-list?view=graph-rest-1.0
The following is what's returned (after filtering out things I don't need), and as mentioned before, while I have a unique identifier which I can use via the skuPartNumber, that is not exactly PRESENTABLE.
You might notice that for some of the SKUs, it is difficult to figure out what the name refers to when compared with the names in the image of the Licenses page posted after the output.
[
{
"capabilityStatus": "Enabled",
"consumedUnits": 0,
"id": "aca06701-ea7e-42b5-81e7-6ecaee2811ad_2b9c8e7c-319c-43a2-a2a0-48c5c6161de7",
"skuId": "2b9c8e7c-319c-43a2-a2a0-48c5c6161de7",
"skuPartNumber": "AAD_BASIC"
},
{
"capabilityStatus": "Enabled",
"consumedUnits": 0,
"id": "aca06701-ea7e-42b5-81e7-6ecaee2811ad_df845ce7-05f9-4894-b5f2-11bbfbcfd2b6",
"skuId": "df845ce7-05f9-4894-b5f2-11bbfbcfd2b6",
"skuPartNumber": "ADALLOM_STANDALONE"
},
{
"capabilityStatus": "Enabled",
"consumedUnits": 96,
"id": "aca06701-ea7e-42b5-81e7-6ecaee2811ad_0c266dff-15dd-4b49-8397-2bb16070ed52",
"skuId": "0c266dff-15dd-4b49-8397-2bb16070ed52",
"skuPartNumber": "MCOMEETADV"
}
]
Edit:
I am aware that I can get "friendly names" of SKUs at the following link:
https://learn.microsoft.com/en-us/azure/active-directory/users-groups-roles/licensing-service-plan-reference
The problem is that it contains ONLY the 70ish most COMMON SKUs (in the last financial quarter), NOT ALL.
My organization alone has 5 SKUs not present on that page, and some of the clients for whom we are an MSP also have a few. In that context, the link does not really solve the problem, since it is neither reliable nor updated fast enough for new SKUs.

You can find a matching list in "Product names and service plan identifiers for licensing".
Please note that:
the table lists the most commonly used Microsoft online service
products and provides their various ID values. These tables are for
reference purposes and are accurate only as of the date when this
article was last updated. Microsoft does not plan to update them for
newly added services periodically.
Here is an extra list which may be helpful.

There is a CSV download available of the data on the "Product names and service plan identifiers for licensing" page now.
For example, the current CSV (as of the time of posting this answer) is located at https://download.microsoft.com/download/e/3/e/e3e9faf2-f28b-490a-9ada-c6089a1fc5b0/Product%20names%20and%20service%20plan%20identifiers%20for%20licensing%20v9_22_2021.csv. This can be downloaded, cached and parsed in your application to resolve the product display name.
This is just a CSV format of the same table that is displayed on the webpage, which is not comprehensive, but it should have many of the products listed. If you find one that is missing, you can use the "Submit feedback for this page" button on the bottom of the page to create a GitHub issue. The documentation team usually responds in a few weeks.
Microsoft may provide an API for this data in the future, but it's only in their backlog. (source)
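As a rough illustration of the "download, cache and parse" approach, the sketch below builds a lookup from skuPartNumber to display name out of the CSV text. The column names `String_Id` and `Product_Display_Name` are assumptions about the CSV header and should be checked against the file you actually download; `parseCsvLine`, `buildSkuNameMap` and `displayNameFor` are hypothetical helper names. Downloading and caching the file are left out.

```javascript
// Minimal CSV line parser that understands double-quoted fields
// (fields containing commas, and "" as an escaped quote).
// It does not handle quoted fields that span several lines.
function parseCsvLine(line) {
  const fields = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Build a Map from skuPartNumber to display name.
// ASSUMPTION: the header contains columns named "String_Id" and
// "Product_Display_Name"; pass other names if your file differs.
function buildSkuNameMap(csvText, idColumn = "String_Id", nameColumn = "Product_Display_Name") {
  const lines = csvText
    .replace(/^\uFEFF/, "") // drop a leading byte order mark, if any
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "");
  const header = parseCsvLine(lines[0]);
  const idIdx = header.indexOf(idColumn);
  const nameIdx = header.indexOf(nameColumn);
  if (idIdx === -1 || nameIdx === -1) {
    throw new Error("Expected columns not found in CSV header");
  }
  const nameMap = new Map();
  for (const line of lines.slice(1)) {
    const row = parseCsvLine(line);
    // A product may occupy several rows; keep the first name seen.
    if (!nameMap.has(row[idIdx])) {
      nameMap.set(row[idIdx], row[nameIdx]);
    }
  }
  return nameMap;
}

// Fall back to the raw skuPartNumber when the SKU is not in the table.
function displayNameFor(sku, nameMap) {
  return nameMap.get(sku.skuPartNumber) || sku.skuPartNumber;
}
```

The fallback matters because, as noted above, the table is not comprehensive: SKUs that are missing from the CSV are still shown, just under their part number.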

How to find last message in chat conversation between users in mongo db

I am building a chat system. Here are the chat_history documents.
{
toUser: 123, // Int32
fromUser: 456, // Int32
message: "message 1", // String
timeStamp: "2019-10-09 16:39:14:1414"
},
{
toUser: 456,
fromUser: 123,
message: "Man super man ",
timeStamp: "2019-10-09 16:43:09:0909 PM +05:30"
},
{
toUser: 101,
fromUser: 123,
message: "last",
timeStamp: "2019-10-09 16:43:09:0909 PM +05:30"
},
{
toUser: 123, // Int32
fromUser: 456, // Int32
message: "message 2",
timeStamp: "2019-10-11 16:39:14:1414"
}
Above are sample documents from the collection that holds all chat messages between all users. I need to find the last message, based on the timestamp, between a particular user (for example 123) and each of the other users that 123 has chatted with.
I am new to MongoDB and need a query for this.
Can anybody please help me with the query?
Thanks in advance.

You can get the answer with an aggregation for this document model, but for high performance I suggest the following:
Save the timestamp as a Date object and query as follows. This returns the single most recent message involving user 123.
db.chat_history.find({$or: [{fromUser: 123}, {toUser: 123}]}).sort({timeStamp: -1}).limit(1)
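For the aggregation route mentioned above, a sketch of a pipeline that returns the last message per conversation partner (rather than one message overall) could look like the following. It assumes `timeStamp` is stored as a Date (or at least in a format that sorts chronologically); the function name and the output field names `lastMessage` and `lastTimeStamp` are made up for illustration.

```javascript
// Builds an aggregation pipeline that returns, for one user, the most
// recent message exchanged with each other user.
// ASSUMPTION: timeStamp sorts chronologically (ideally a Date object).
function lastMessagePerPartnerPipeline(userId) {
  return [
    // Keep only messages sent or received by the user.
    { $match: { $or: [{ fromUser: userId }, { toUser: userId }] } },
    // Newest first, so that $first below picks the latest message.
    { $sort: { timeStamp: -1 } },
    {
      $group: {
        // The partner is whichever side of the message is not the user.
        _id: { $cond: [{ $eq: ["$fromUser", userId] }, "$toUser", "$fromUser"] },
        lastMessage: { $first: "$message" },
        lastTimeStamp: { $first: "$timeStamp" }
      }
    },
    // Order the conversations by their latest activity.
    { $sort: { lastTimeStamp: -1 } }
  ];
}

// Usage in the mongo shell:
// db.chat_history.aggregate(lastMessagePerPartnerPipeline(123))
```

Each result document has the partner's ID in `_id`, which is what a conversation list for user 123 would be keyed on.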

How to print the count of array elements along with another variable in MongoDB

I have a data collection which contains records in the following format.
{
"_id": 22,
"title": "Hibernate in Action",
"isbn": "193239415X",
"pageCount": 400,
"publishedDate": ISODate("2004-08-01T07:00:00Z"),
"thumbnailUrl": "https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.book-thumb-images/bauer.jpg",
"shortDescription": "\"2005 Best Java Book!\" -- Java Developer's Journal",
"longDescription": "Hibernate practically exploded on the Java scene. Why is this open-source tool so popular Because it automates a tedious task: persisting your Java objects to a relational database. The inevitable mismatch between your object-oriented code and the relational database requires you to write code that maps one to the other. This code is often complex, tedious and costly to develop. Hibernate does the mapping for you. Not only that, Hibernate makes it easy. Positioned as a layer between your application and your database, Hibernate takes care of loading and saving of objects. Hibernate applications are cheaper, more portable, and more resilient to change. And they perform better than anything you are likely to develop yourself. Hibernate in Action carefully explains the concepts you need, then gets you going. It builds on a single example to show you how to use Hibernate in practice, how to deal with concurrency and transactions, how to efficiently retrieve objects and use caching. The authors created Hibernate and they field questions from the Hibernate community every day - they know how to make Hibernate sing. Knowledge and insight seep out of every pore of this book.",
"status": "PUBLISH",
"authors": ["Christian Bauer", "Gavin King"],
"categories": ["Java"]
}
I want to print the title and the author count for books where the number of authors is greater than 4.
I used the following command to extract the records that have more than 4 authors.
db.books.find({authors:{$exists:true},$where:'this.authors.length>4'},{_id:0,title:1});
But I was unable to print the number of authors along with the title. I also tried the following command, but it returned only the list of titles.
db.books.find({authors:{$exists:true},$where:'this.authors.length>4'},{_id:0,title:1,'this.authors.length':1});
Could you please help me to print the number of authors here along with the title?

You can use the aggregation framework's $project with $size to reshape your data, and then $match to apply the filtering condition:
db.collection.aggregate([
{
$project: {
title: 1,
authorsCount: { $size: "$authors" }
}
},
{
$match: {
authorsCount: { $gt: 4 }
}
}
])
Mongo Playground
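One caveat: $size raises an error when its argument is not an array, so a document without an authors array would make the pipeline fail. A guarded variant might look like the sketch below, which treats a missing or non-array `authors` field as an empty array. The function name `authorCountPipeline` is hypothetical, and the collection name `books` in the usage line is taken from the question.

```javascript
// Builds a pipeline that outputs title and author count for books with
// more than minAuthors authors, tolerating documents whose "authors"
// field is missing or not an array.
function authorCountPipeline(minAuthors) {
  return [
    {
      $project: {
        _id: 0,
        title: 1,
        authorsCount: {
          // Fall back to an empty array so that $size never sees a non-array.
          $size: { $cond: [{ $isArray: "$authors" }, "$authors", []] }
        }
      }
    },
    { $match: { authorsCount: { $gt: minAuthors } } }
  ];
}

// Usage in the mongo shell:
// db.books.aggregate(authorCountPipeline(4))
```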

Microsoft Graph - Filtering users by X500 proxyAddress

Is it possible to query for users, filtered by an X500 proxy address?
Using the following query which filters by an SMTP address, I can return all of my proxy addresses:
/v1.0/users/?$filter=proxyAddresses/any(x:x eq 'smtp:me@here.com')&$select=proxyAddresses
However, if I take one of the X500 addresses that was returned in the above query and try and filter by that:
/v1.0/users/?$filter=proxyAddresses/any(x:x eq 'x500:/o=ExchangeLabs/ou=Exchange Administrative Group (blahblah)/cn=Recipients/cn=trimmed')&$select=proxyAddresses
then I get a 400:
{
"error": {
"code": "Request_UnsupportedQuery",
"message": "Unsupported or invalid query filter clause specified for property 'proxyAddresses' of resource 'User'.",
"innerError": {
"request-id": "adcdefg",
"date": "2019-01-01T01:01:01"
}
}
}
I've tried URL encoding the address, and also tried with and without the "X500:" scheme.
Is filtering by X500 address supported?

I am able to use X500 addresses as filters, without any modification to the address, from a clone of Graph Explorer. The following queries both return the correct user record:
https://graph.microsoft.com/v1.0/users/?$filter=proxyAddresses/any(x:x eq 'x500:/o=Company Exchange/ou=First Administrative Group/cn=Recipients/cn=UIDHere')&$select=proxyAddresses
and
https://graph.microsoft.com/v1.0/users/?$filter=proxyAddresses/any(x:x eq 'X500:/o=Company Exchange/ou=External (FYDIBOHF25SPDLT)/cn=Recipients/cn=z804261192zc46c4az4f6032z322540z')&$select=proxyAddresses

As Lisa found, this is not about parentheses. I have "any" lambda queries on proxyAddresses using X500 addresses containing parentheses that work just fine in Graph Explorer.
I suspect that the issue is actually the size of the search string. I can reproduce the error if the search string is longer than 120 characters.
I'm following up with the engineering team.
In the meantime, Paul, as a workaround (and excuse my lack of X500 knowledge), is there a way to query using the shortest X500 string?
Hope this helps.

As Dan Kershaw answered, this does seem to be a hard-coded limit of 120 characters on the email address being filtered on.
A simple workaround is to trim the email address (including the scheme, "x500:" or "smtp:") to 120 characters, and search using "startswith":
/v1.0/users/?$filter=proxyAddresses/any(x:startswith(x, 'x500:/o=ExchangeLabs/ou=Exchange Administrative Group (blahblah)/cn=Recipients/cn=trimmed'))&$select=proxyAddresses
This may return more than one match, so it's then a case of going through each returned user and checking their "proxyAddresses" collection to see which one matches the original untrimmed email address being searched for.
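The trim-then-verify workaround can be sketched as two small helpers. This is only an illustration: the 120-character limit is the value reported in this thread rather than a documented constant, the function names are hypothetical, and the case-insensitive comparison in the second step is an assumption (the scheme appears as both "x500:" and "X500:" above). Sending the request and handling authentication are left out.

```javascript
// Limit reported in this thread; not a documented value.
const MAX_FILTER_LENGTH = 120;

// Step 1: build the request path using a startswith filter on the
// first 120 characters of the address (scheme included).
function buildProxyAddressQuery(address) {
  const prefix = address.slice(0, MAX_FILTER_LENGTH);
  // OData string literals escape a single quote by doubling it.
  const literal = prefix.replace(/'/g, "''");
  const filter = `proxyAddresses/any(x:startswith(x, '${literal}'))`;
  return `/v1.0/users?$filter=${encodeURIComponent(filter)}&$select=proxyAddresses`;
}

// Step 2: from the users returned (the "value" array of the response),
// keep only those whose proxyAddresses contain the full, untrimmed address.
// ASSUMPTION: addresses are compared case-insensitively.
function pickExactMatches(users, address) {
  const wanted = address.toLowerCase();
  return users.filter((user) =>
    (user.proxyAddresses || []).some((proxy) => proxy.toLowerCase() === wanted)
  );
}
```

Note that an address containing single quotes grows when they are doubled, so in that edge case the escaped literal can exceed the limit again and the prefix would need to be shortened further.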

I can confirm that this is still an issue as of today's date.
I'm actually using the AzureAD PowerShell cmdlets, which leverage the Graph API.
I couldn't figure out why my query was failing until I found this thread, so thanks for that.
I was getting essentially the same error message in PowerShell:
"Unsupported or invalid query filter clause specified for property 'proxyAddresses' of resource 'Group'."
When I took a substring of the first 120 characters and ran a startsWith, it worked fine.
It's a shame that this issue still hasn't been resolved.

Schema for chat application using mongodb

I am trying to build a schema for a chat application in MongoDB. I have 2 types of user models: Producer and Consumer. A Producer and a Consumer can have conversations with each other. My ultimate goal is to fetch all the conversations for any producer or consumer and show them in a list, just like messaging apps (e.g. Facebook) do.
Here is the schema I have come up with:
Producer: {
_id: 123,
'name': "Sam"
}
Consumer:{
_id: 456,
name: "Mark"
}
Conversation: {
_id: 321,
producerId: 123,
consumerId: 456,
lastMessageId: 1111,
lastMessageDate: 7/7/2018
}
Message: {
_id: 1111,
conversationId: 321,
body: 'Hi'
}
Now I want to fetch, say, all the conversations of Sam. I want to show them in a list just like Facebook does, grouped by Consumer and sorted by time.
I think I need to do following queries for this:
1) Get all Conversations where producerId is 123 sorted by lastMessageDate.
I can then show the list of all Conversations.
2) If I want to know all the messages in a conversation, I make query on Message and get all messages where conversationId is 321
Now, for each new message, I also need to update the conversation with the new message ID and date every time. Is this the right way to proceed, and is it optimal considering the number of queries involved? Is there a better way to proceed? Any help would be highly appreciated.

Design:
I wouldn't say it's bad. For the case you've described, it's actually pretty good. Denormalizing the last message date and ID like this is great, especially if you plan a view with a list of all conversations: you get the last message date in the same query. Maybe go one step further and add the last message text, if it's applicable in this view.
You can read more on the pros and cons of denormalization (and schema modeling in general) on the MongoDB blog (parts 1, 2 and 3). It's not that fresh, but it's not outdated.
Also, if such multi-document updates scare you because of possible inconsistencies, MongoDB v4 has you covered with transactions.
Querying:
On one hand, you can use multiple queries, and that's not bad at all (especially when a few of them are easily cacheable, like the producer or consumer data). On the other hand, you can use aggregations to fetch all these things at once if needed.
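As an illustration of the aggregation option, the sketch below builds a pipeline that fetches a producer's conversation list together with each consumer's name and the body of the last message. The collection names `conversations`, `consumers` and `messages`, the function name, and the output field names are assumptions for this example; the field names inside the stages follow the schema in the question.

```javascript
// Builds a pipeline for the conversation list of one producer.
// ASSUMPTION: collections are named "conversations", "consumers" and
// "messages"; adjust the "from" values to your own names.
function conversationListPipeline(producerId) {
  return [
    { $match: { producerId: producerId } },
    // Sorting right after $match lets an index on
    // { producerId: 1, lastMessageDate: -1 } serve both stages.
    { $sort: { lastMessageDate: -1 } },
    {
      $lookup: {
        from: "consumers",
        localField: "consumerId",
        foreignField: "_id",
        as: "consumer"
      }
    },
    {
      $lookup: {
        from: "messages",
        localField: "lastMessageId",
        foreignField: "_id",
        as: "lastMessage"
      }
    },
    // $lookup yields arrays; unwind them into single embedded documents.
    { $unwind: "$consumer" },
    { $unwind: "$lastMessage" },
    {
      $project: {
        consumerName: "$consumer.name",
        lastMessageBody: "$lastMessage.body",
        lastMessageDate: 1
      }
    }
  ];
}

// Usage in the mongo shell:
// db.conversations.aggregate(conversationListPipeline(123))
```

If the last message text is denormalized into the conversation document as suggested above, the second $lookup and its $unwind become unnecessary.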