Watson Alchemy - How to Understand Sentiment Ratings - ibm-watson

Watson Alchemy returns "sentiment ratings" of ".85 anger" for example, but I cannot understand ON WHAT SCALE... Is that 85% - (I don't think so). Does anyone have information on how to read the data? Thanks!

Yes it means 85%. Here is the API reference link. Scroll a bit down to the response section of Emotion Analysis request and you'll see the docs.
docEmotions - Object containing emotion keys and score values (0.0 to
1.0). If a score is above 0.5, then the text can be classified as conveying the corresponding emotion.
An example:
"docEmotions": {
"anger": "0.639028",
"disgust": "0.009711",
"fear": "0.037295",
"joy": "4e-05",
"sadness": "0.002552"
}

Related

in Watson Discovery News Feed API, limiting articles returned by date

Per the API documentation, manipulating the vales of "start" and "end" will result in different data sets being returned. Strangely, changing the values of start and end resulted in the same result being returned. What am I missing? Thanks!
qopts = {'query': '/automotive and vehicles',
'aggregation' : '[term(yyymmdd).term(docSentiment.type,count:3)]',
'return': 'docSentiment.type,yyyymmdd',
'count': '50',
'start': 'now-2w',
'end' : 'now-1w',
'offset': my_offset}
my_query = discovery.query(my_disc_environment_id, my_disc_collection_id, qopts)
I hope it is helpful for you.
I am not sure if this is right answer for you because I have limited information.
First of all, please check the number of return-sets. if the return-set has more then 50 dataset, result could be same. (might count param. -1 = unlimited in the rest API of WCA (Watson Context Analytics)
Second, if you can check the log from the server side, you can see the full query which manipulated from WATSON engine.
Last, I am not really sure that watson REST-API can recognize 'now-2w' style start-end form. Would you please link the tutorial? In my previous project, I wrote the start-end date by Y-M-D form.
Good Luck

how to use arima.rob

does anyone use arima.rob() function described by Eric Zivot and Jiahui Wang in { Modelling Financial Time Series with S-PLUS } ?
I have a question about it:
I used a dataset of network traffic flows that has anomaly, and I tried to predict the last part of dataset by robust ARIMA method (Arima.rob() function) .I compare this model with arima.mle of S-PLUS. But Unexpectedly, arima.rob’s prediction did not better than that.
I’m not sure my codes are correct and may be the reason of fault is my codes.
Please, help me if I used Arima.rob inappropriately?
tmp.rr<-arima.rob((tmh75)~1,p=2,d=1,q=2,freq=24,maxiter=4,max.fcal=80000)
tmp.for<-predict(tmp.rr,n.predict=10,newdata=df1,se=T)
plot(tmp.for,tmh75)
summary(tmp.for)
my code for classic arima:
`model <- list(list(order=c(2,1,2)),list(order=c(3,1,2),period=24))
fith <- arima.mle(tmh75-mean(tmh75),model=model)
foreh <- arima.forecast(tmh75,n=25,model=fith$model)
tsplot(tmh75,foreh$mean,foreh$mean+foreh$std.err,foreh$mean-foreh$std.err)
`

Parsing Solr Results - javabin format

I am trying to integrate solr with java using solrj. The result retrieved are of the format
{
numFound=3,
start=0,
docs=[
SolrDocument{
id=IW-02,
name=iPod&iPodMiniUSB2.0Cable,
manu=Belkin,
manu_id_s=belkin,
cat=[
electronics,
connector
],
features=[
carpoweradapterforiPod,
white
],
weight=2.0,
price=11.5,
price_c=11.50,
USD,
popularity=1,
inStock=false,
store=37.7752,
-122.4232,
manufacturedate_dt=TueFeb1418: 55: 59EST2006,
_version_=1452625905160552448
}
Now this is the javabin format. How do I extract results from this? Have heard that solrj does convert the results to objects by itself. But cant figure out how.
Thanks for the help in advance.
Let solrReply be the response object. The you can access different parts of the result using appropriate params. Say you want docs, you can do:
docs = solrReply['docs']
if you want the first result you could do:
first = solrReply['docs'][0]
Within a result you can access each field in the same way.

Using PyMEL to set the "Alpha to Use" attribute in an object of class psdFileTex

I am using Maya to do some procedural work, and I have a lot of textures that I need to load into Maya, and they all have transparencies (alpha channels). I would very much like to be able to automate this process. Using PyMEL, I can create my textures and hook them up to a shader, but the alpha doesn't set properly by default. There is an attribute in the psdFileTex node called "Alpha to Use", and it must be set to "Transparency" in order for my alpha channel to work. My question is this - how do I use PyMEL scripting to set the "Alpha to Use" attribute properly?
Here is the code I am using to set up my textures:
import pymel.core as pm
pm.shadingNode('lambert', asShader=True, name='myShader1')
pm.sets(renderable=True, noSurfaceShader=True, empty=True, name='myShader1SG')
pm.connectAttr('myShader1.outColor', 'myShader1SG.surfaceShader', f=True)
pm.shadingNode('psdFileTex', asTexture=True, name='myShader1PSD')
pm.connectAttr('myShader1PSD.outColor', 'myShader1.color')
pm.connectAttr('myShader1PSD.outTransparency', 'myShader1.transparency')
pm.setAttr('myShader1ColorPSD.fileTextureName', '<pathway>/myShader1_texture.psd', type='string')
If anyone can help me, I would really appreciate it.
Thanks
With any node, you can use listAttr() to get the available editable attributes. Run listAttr('myShaderPSD'), note in it's output, there will be two attributes called 'alpha' and 'alphaList'. Alpha, will return you the current selected alpha channel. AlphaList will return you however many alpha channels you have in your psd.
Example
pm.PyNode('myShader1PSD').alphaList.get()
# Result: [u'Alpha 1', u'Alpha 2'] #
If you know you'll only ever be using just the one alpha, or the first alpha channel, you can simply do this.
psdShader = pm.PyNode('myShader1PSD')
alphaList = psdShader.alphaList.get()
if (len(alphaList) > 0):
psdShader.alpha.set(alphaList[0])
else:
// No alpha channel
pass
Remember that lists start iterating from 0, so our first alpha channel will be located at position 0.
Additionally and unrelated, while you're still using derivative commands of the maya.core converted for Pymel, there's still some commands you can use to help make your code read nicer.
pm.setAttr('myShader1ColorPSD.fileTextureName', '<pathway>/myShader1_texture.psd', type='string')
We can convert this to pymel like so:
pm.PyNode('myShader1ColorPSD').fileTextureName.set('<pathway>/myShader1_texture.psd')
And:
pm.connectAttr('myShader1PSD.outColor', 'myShader1.color')
Can be converted to:
pm.connect('myShader1PSD.outColor', 'myShader1.color')
While they may only be small changes, it reads just the little bit nicer, and it's native PyMel.
Anyway, I hope I have helped you!

Plotting a word-cloud by date for a twitter search result? (using R)

I wish to search twitter for a word (let's say #google), and then be able to generate a tag cloud of the words used in twitts, but according to dates (for example, having a moving window of an hour, that moves by 10 minutes each time, and shows me how different words gotten more often used throughout the day).
I would appreciate any help on how to go about doing this regarding: resources for the information, code for the programming (R is the only language I am apt in using) and ideas on visualization. Questions:
How do I get the information?
In R, I found that the twitteR package has the searchTwitter command. But I don't know how big an "n" I can get from it. Also, It doesn't return the dates in which the twitt originated from.
I see here that I could get until 1500 twitts, but this requires me to do the parsing manually (which leads me to step 2). Also, for my purposes, I would need tens of thousands of twitts. Is it even possible to get them in retrospect?? (for example, asking older posts each time through the API URL ?) If not, there is the more general question of how to create a personal storage of twitts on your home computer? (a question which might be better left to another SO thread - although any insights from people here would be very interesting for me to read)
How to parse the information (in R)? I know that R has functions that could help from the rcurl and twitteR packages. But I don't know which, or how to use them. Any suggestions would be of help.
How to analyse? how to remove all the "not interesting" words? I found that the "tm" package in R has this example:
reuters <- tm_map(reuters, removeWords, stopwords("english"))
Would this do the trick? I should I do something else/more ?
Also, I imagine I would like to do that after cutting my dataset according to time (which will require some posix-like functions (which I am not exactly sure which would be needed here, or how to use it).
And lastly, there is the question of visualization. How do I create a tag cloud of the words? I found a solution for this here, any other suggestion/recommendations?
I believe I am asking a huge question here but I tried to break it to as many straightforward questions as possible. Any help will be welcomed!
Best,
Tal
Word/Tag cloud in R using "snippets" package
www.wordle.net
Using openNLP package you could pos-tag the tweets(pos=Part of speech) and then extract just the nouns, verbs or adjectives for visualization in a wordcloud.
Maybe you can query twitter and use the current system-time as a time-stamp, write to a local database and query again in increments of x secs/mins, etc.
There is historical data available at http://www.readwriteweb.com/archives/twitter_data_dump_infochimp_puts_1b_connections_up.php and http://www.wired.com/epicenter/2010/04/loc-google-twitter/
As for the plotting piece: I did a word cloud here: http://trends.techcrunch.com/2009/09/25/describe-yourself-in-3-or-4-words/ using the snippets package, my code is in there. I manually pulled out certain words. Check it out and let me know if you have more specific questions.
I note that this is an old question, and there are several solutions available via web search, but here's one answer (via http://blog.ouseful.info/2012/02/15/generating-twitter-wordclouds-in-r-prompted-by-an-open-learning-blogpost/):
require(twitteR)
searchTerm='#dev8d'
#Grab the tweets
rdmTweets <- searchTwitter(searchTerm, n=500)
#Use a handy helper function to put the tweets into a dataframe
tw.df=twListToDF(rdmTweets)
##Note: there are some handy, basic Twitter related functions here:
##https://github.com/matteoredaelli/twitter-r-utils
#For example:
RemoveAtPeople <- function(tweet) {
gsub("#\\w+", "", tweet)
}
#Then for example, remove #d names
tweets <- as.vector(sapply(tw.df$text, RemoveAtPeople))
##Wordcloud - scripts available from various sources; I used:
#http://rdatamining.wordpress.com/2011/11/09/using-text-mining-to-find-out-what-rdatamining-tweets-are-about/
#Call with eg: tw.c=generateCorpus(tw.df$text)
generateCorpus= function(df,my.stopwords=c()){
#Install the textmining library
require(tm)
#The following is cribbed and seems to do what it says on the can
tw.corpus= Corpus(VectorSource(df))
# remove punctuation
tw.corpus = tm_map(tw.corpus, removePunctuation)
#normalise case
tw.corpus = tm_map(tw.corpus, tolower)
# remove stopwords
tw.corpus = tm_map(tw.corpus, removeWords, stopwords('english'))
tw.corpus = tm_map(tw.corpus, removeWords, my.stopwords)
tw.corpus
}
wordcloud.generate=function(corpus,min.freq=3){
require(wordcloud)
doc.m = TermDocumentMatrix(corpus, control = list(minWordLength = 1))
dm = as.matrix(doc.m)
# calculate the frequency of words
v = sort(rowSums(dm), decreasing=TRUE)
d = data.frame(word=names(v), freq=v)
#Generate the wordcloud
wc=wordcloud(d$word, d$freq, min.freq=min.freq)
wc
}
print(wordcloud.generate(generateCorpus(tweets,'dev8d'),7))
##Generate an image file of the wordcloud
png('test.png', width=600,height=600)
wordcloud.generate(generateCorpus(tweets,'dev8d'),7)
dev.off()
#We could make it even easier if we hide away the tweet grabbing code. eg:
tweets.grabber=function(searchTerm,num=500){
require(twitteR)
rdmTweets = searchTwitter(searchTerm, n=num)
tw.df=twListToDF(rdmTweets)
as.vector(sapply(tw.df$text, RemoveAtPeople))
}
#Then we could do something like:
tweets=tweets.grabber('ukgc12')
wordcloud.generate(generateCorpus(tweets),3)
I would like to answer your question in making big word cloud.
What I did is
Use s0.tweet <- searchTwitter(KEYWORD,n=1500) for 7 days or more, such as THIS.
Combine them by this command :
rdmTweets = c(s0.tweet,s1.tweet,s2.tweet,s3.tweet,s4.tweet,s5.tweet,s6.tweet,s7.tweet)
The result:
This Square Cloud consists of about 9000 tweets.
Source: People voice about Lynas Malaysia through Twitter Analysis with R CloudStat
Hope it help!

Resources