how to use arima.rob - forecasting

Does anyone use the arima.rob() function described by Eric Zivot and Jiahui Wang in Modelling Financial Time Series with S-PLUS?
I have a question about it:
I used a dataset of network traffic flows that contains anomalies, and I tried to predict the last part of the dataset with the robust ARIMA method (the arima.rob() function). I compared this model with arima.mle from S-PLUS, but unexpectedly, arima.rob's predictions were no better.
I'm not sure my code is correct; the fault may lie there.
Please tell me if I have used arima.rob inappropriately:
tmp.rr <- arima.rob(tmh75 ~ 1, p=2, d=1, q=2, freq=24, maxiter=4, max.fcal=80000)
tmp.for <- predict(tmp.rr, n.predict=10, newdata=df1, se=T)
plot(tmp.for,tmh75)
summary(tmp.for)
My code for the classic ARIMA:
model <- list(list(order=c(2,1,2)), list(order=c(3,1,2), period=24))
fith <- arima.mle(tmh75 - mean(tmh75), model=model)
foreh <- arima.forecast(tmh75, n=25, model=fith$model)
tsplot(tmh75, foreh$mean, foreh$mean + foreh$std.err, foreh$mean - foreh$std.err)
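One way to make "no better" concrete is to score both forecasts against the held-out observations, e.g. by mean squared error. A minimal sketch: the holdout series tmh75.holdout and the values component of the robust forecast object are assumed names, not taken from the code above.
actual <- tmh75.holdout # assumed name: the observations held out for prediction
mse.rob <- mean((tmp.for$values - actual)^2) # robust ARIMA forecast errors (component name assumed)
mse.mle <- mean((foreh$mean - actual)^2) # classical ARIMA forecast errors
mse.rob
mse.mle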

Related

Want to pass a string array into WS.sendRequest using Groovy

I am new to API testing and am using Katalon to develop the tests. I've Googled every question I could think of and couldn't find anything to answer mine.
We have an API with the following Body
[
"${idValue1}",
"${idValue2}",
"${idValue3}",
"${idValue4}",
"${idValue5}",
]
I believe the purpose of this one is to delete multiple records at once by id. The script that we have in the step definition is:
response = WS.sendRequest(findTestObject('EquipmentAPI/data-objects/DELETE Equipment By Ids', [('host') : GlobalVariable.host, ('idValues') : GlobalVariable.equipId]))
GlobalVariable.equipId = WS.getElementPropertyValue(response, 'data[0].id')
There are other step definitions that run before this one to set the Global Variables for use. I was able to generate the string array without issue.
Is this something that's possible? Please help!
Please let me know if further information is needed. Thanks.
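One hedged approach (not from the original thread): serialize the ids into a JSON array string inside the test script and substitute it as a single variable, assuming the stored request body is reduced to the lone placeholder ${idValues}. The id values below are hypothetical.
import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import com.kms.katalon.core.webservice.keyword.WSBuiltInKeywords as WS
import groovy.json.JsonOutput
import internal.GlobalVariable

// Hypothetical: the ids collected by the earlier step definitions
List<String> ids = ['1001', '1002', '1003']
// Produces the JSON array text: ["1001","1002","1003"]
String idArray = JsonOutput.toJson(ids)

// Assumes the test object's body is just ${idValues}, so the whole
// array is substituted in as one value
def response = WS.sendRequest(findTestObject('EquipmentAPI/data-objects/DELETE Equipment By Ids',
    [('host') : GlobalVariable.host, ('idValues') : idArray]))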

Appending values to DataSet in Apache Flink

I am currently writing a (simple) analysis code to sum time-connected power readings. Since the data is presumably raw (e.g. disturbances from the measuring device have not been filtered out), I have to account for disturbances by calculating the mean of the first one thousand samples. The calculation of the mean itself is not a problem; I am only unsure how to generate the appropriate DataSet.
For now it looks roughly like this:
DataSet<Tupel2<long,double>>Gyrotron_1=ECRH.includeFields('11000000000'); // obviously the line to declare the first gyrotron, continues for the next ten lines, assuming separation of not occupied space
DataSet<Tupel2<long,double>>Gyrotron_2=ECRH.includeFields('10100000000');
DataSet<Tupel2<long,double>>Gyrotron_3=ECRH.includeFields('10010000000');
DataSet<Tupel2<long,double>>Gyrotron_4=ECRH.includeFields('10001000000');
DataSet<Tupel2<long,double>>Gyrotron_5=ECRH.includeFields('10000100000');
DataSet<Tupel2<long,double>>Gyrotron_6=ECRH.includeFields('10000010000');
DataSet<Tupel2<long,double>>Gyrotron_7=ECRH.includeFields('10000001000');
DataSet<Tupel2<long,double>>Gyrotron_8=ECRH.includeFields('10000000100');
DataSet<Tupel2<long,double>>Gyrotron_9=ECRH.includeFields('10000000010');
DataSet<Tupel2<long,double>>Gyrotron_10=ECRH.includeFields('10000000001');
for (int=1,i<=10;i++) {
DataSet<double> offset=Gyroton_'+i+'.groupBy(1).first(1000).sum()/1000;
}
It's the part in the for-loop I'm unsure of. Does anybody know if it is possible to append values to DataSets and, if so, how?
In case of doubt, I could always put the values into an array, but I do not know if that would be the wise thing to do.
This code will not work, for many reasons. I'd recommend looking into the fundamentals of Java and its basic data structures, and also into Flink.
It's really hard to understand what you are actually trying to achieve, but this is the closest I came up with:
String[] codes = { "11000000000", ..., "10000000001" };
DataSet<Tuple2<Long, Double>> result = null;
for (final String code : codes) {
    DataSet<Tuple2<Long, Double>> codeResult = ECRH.includeFields(code)
        .first(1000)   // take the first one thousand samples
        .sum(1)        // sum the power readings in field 1
        .map(new MapFunction<Tuple2<Long, Double>, Tuple2<Long, Double>>() {
            @Override
            public Tuple2<Long, Double> map(Tuple2<Long, Double> sum) {
                return new Tuple2<>(sum.f0, sum.f1 / 1000d); // mean of those samples
            }
        });
    // DataSets are immutable; "appending" is expressed by unioning them
    result = (result == null) ? codeResult : result.union(codeResult);
}
result.print();
But please take the time to understand the basics before delving deeper. I also recommend using an IDE like IntelliJ, which would point out at least six issues in your code.

(C) - How would one compare 2 txt files REQUESTS.txt and AVAILABLE.txt, separating each str read into a (STR6, STR3, STR3, INT) formatted Structure?

I have been working on this program for over a week with no breakthrough. The question states as follows:
A disc file ‘REQUESTS.TXT’ contains airline flight data formatted (STR6, STR3, STR3, INT).
Example:
AA1011SFxLAx34 (American Airlines 1010, SF to LA, 34 seats)
W0924DNVDFW101 (Western 0924, DNV to DFW, 101 seats)
Another file ‘AVAILABL.TXT’ contains an unspecified number of reservation request records formatted identically as described above, except the Seats Available field is a Seats Requested field.
Guidelines:
Read reservation flights and process requests. If a request can be fulfilled (i.e. it is in AVAILABL and REQUESTS), then print "Reservation Processed"; otherwise print "Reservation Denied".
Print out flight data file before and after reservations are processed, ordered by flight ID in a four(4) column format.
Print an overall outcome report for all requests processed. (Present totals for the number of requests satisfied and denied.)
I have tried a few different approaches. I tried to split up the first STR6 by isalpha/isdigit and combine the pieces to make the flight ID (AA + 1011). I then tried to split up the remaining characters between the two STR3 fields via isalpha and a for loop. Lastly, I tried to take the trailing digits in each iteration and multiply the leading digit by 100 (for a 3-digit value) or 10 (for a 2-digit value), adding the results to a running total for availSeats (INT). This, at least I thought, would produce:
AA+1011 = AA1011(STR6) // W+0924 = W0924(STR6)
SFx(STR3) // DNV(STR3)
LAx(STR3) // DFW(STR3)
(3*10)+(4*1) = 34(INT) // (1*100)+(0*10)+(1*1) = 101(INT)
All of this stored within a Struct Array.
i.e.:
FlightData Flight;                 FlightData Flight;
Flight[0].flightID = AA1011;       Flight[1].flightID = W0924;
Flight[0].fromCity = SFx;          Flight[1].fromCity = DNV;
Flight[0].toCity = LAx;            Flight[1].toCity = DFW;
Flight[0].seatsAvail = 34;         Flight[1].seatsAvail = 101;
I am really at a loss right now and have no way to progress other than searching for different techniques/methods to make this work. I am clearly a beginner and will continue to practice and progress in C, but if anyone could give me a push in the right direction on how to read a .txt file into a struct, that would be amazing. Also, if anyone has another method they used to solve this problem, I would love to analyze it. Thanks!
(This is my first post. I spent a lot of time formatting it to be clear on Stack Overflow, so if I messed up in areas, some constructive criticism would be useful! This applies to my posting and my coding practices. Thanks again!)
EDIT: The question I am asking here is how to successfully take a string such as AA1011SFxLAx34 and turn it into a structure like the diagram above. It must also work for the second string, W0924DNVDFW101, which has only one character at the start of its ID (rather than two, as in AA1011). I'm not sure what else I am supposed to edit after reading the guidelines.
I consider this a homework question, so I answer according to:
How do I ask and answer homework questions?
Find a tutorial on C, work through it.
Then take a HelloWorld, modify it in small steps to approach your goal in steps from working program to working program. This way you should at least get to being able to read text from a file and print it.
Then learn to store parts of what you print into basic variables.
Then learn about structures.
And so on.
This way you will get quite close to the solution.
If it is not completely what you need, show the code you have here at that point and ask a specific question about the first problem, explaining what you suspect the problem to be. Show code which has exactly that one problem, makes it visible, and has no other warnings (compiling with at least e.g. gcc -Wall mycode.c).
Fix it with the help of the comments/answers you receive, and repeat.
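Once you have worked through those steps, a minimal parsing sketch might look like the following. This is one possible approach, not the assignment's prescribed method; the struct and function names are made up for illustration.
#include <stdio.h>

/* Hypothetical record layout: 1-2 letters plus 4 digits (STR6),
   two 3-character city codes (STR3, STR3), then the seat count (INT). */
struct FlightData {
    char flightID[7];
    char fromCity[4];
    char toCity[4];
    int  seatsAvail;
};

static int parse_record(const char *line, struct FlightData *f)
{
    char letters[3] = "", digits[5] = "";
    /* %2[A-Z] reads the 1-2 leading capitals, %4[0-9] the flight number,
       each %3c exactly three characters, %d the trailing seat count */
    if (sscanf(line, "%2[A-Z]%4[0-9]%3c%3c%d",
               letters, digits, f->fromCity, f->toCity, &f->seatsAvail) != 5)
        return 0;
    f->fromCity[3] = '\0';    /* %c does not add a terminator */
    f->toCity[3] = '\0';
    snprintf(f->flightID, sizeof f->flightID, "%s%s", letters, digits);
    return 1;
}

int main(void)
{
    struct FlightData f;
    if (parse_record("W0924DNVDFW101", &f))
        printf("%s %s %s %d\n", f.flightID, f.fromCity, f.toCity, f.seatsAvail);
    return 0;
}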

converting matlab code to c code readiness error

I want to convert my MATLAB code to C code, but when I try to, lots of warnings come up in the code readiness report. Although I have sorted out a lot of issues, there are still some functions which are not supported, and I need alternatives.
Kindly suggest alternative code for the errors.
Let me be more specific: these are the lines of the function that are causing issues:
fn = fieldnames(pars); % current parameters' fieldnames
main_fn = {'algorithm'};
pm_fn = {'pm_tau','pm_window','pm_keypoints','pm_harris_prctile','pm_searchrange'};
m_fn = {'window','range','tau','epsilon','growing','init_seeds_accept', ...
    'seeds_accept','searchrange','searchrangeV', ...
    'max_candidates','vis_step','mu','csbeta','csalgorithm','grow_version'};
Now it is saying that code generation only supports cell operations for varargin and varargout.
[error screenshot]
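One hedged workaround, assuming the parameter fields are fixed and known at compile time: replace the unsupported fieldnames(pars) call with a constant cell array of the known names, so no dynamic cell operation is generated. This is a sketch, not a verified fix for this specific report.
% Sketch: the field set is known, so list it as a compile-time constant
% instead of computing it with fieldnames(pars) at run time.
fn = {'algorithm', 'pm_tau', 'pm_window', 'pm_keypoints', ...
      'pm_harris_prctile', 'pm_searchrange'};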

Plotting a word-cloud by date for a twitter search result? (using R)

I wish to search Twitter for a word (let's say #google), and then be able to generate a tag cloud of the words used in tweets, but according to dates (for example, having a moving window of an hour that moves by 10 minutes each time, and shows me how different words became more frequently used throughout the day).
I would appreciate any help on how to go about doing this regarding: resources for the information, code for the programming (R is the only language I am apt in using), and ideas on visualization. Questions:
How do I get the information?
In R, I found that the twitteR package has the searchTwitter command. But I don't know how big an "n" I can get from it. Also, it doesn't return the dates on which the tweets originated.
I see here that I could get up to 1500 tweets, but this requires me to do the parsing manually (which leads me to step 2). Also, for my purposes, I would need tens of thousands of tweets. Is it even possible to get them in retrospect? (For example, requesting older posts each time through the API URL?) If not, there is the more general question of how to create a personal archive of tweets on your home computer (a question which might be better left to another SO thread, although any insights from people here would be very interesting for me to read).
How do I parse the information (in R)? I know that R has functions that could help, from the RCurl and twitteR packages. But I don't know which, or how to use them. Any suggestions would be of help.
How do I analyse it? How do I remove all the "not interesting" words? I found that the "tm" package in R has this example:
reuters <- tm_map(reuters, removeWords, stopwords("english"))
Would this do the trick? Should I do something else/more?
Also, I imagine I would like to do that after cutting my dataset according to time, which will require some POSIX-like functions (I am not exactly sure which would be needed here, or how to use them).
And lastly, there is the question of visualization. How do I create a tag cloud of the words? I found a solution for this here; any other suggestions/recommendations?
I believe I am asking a huge question here, but I tried to break it into as many straightforward questions as possible. Any help will be welcomed!
Best,
Tal
Word/Tag cloud in R using "snippets" package
www.wordle.net
Using the openNLP package, you could POS-tag the tweets (POS = part of speech) and then extract just the nouns, verbs, or adjectives for visualization in a word cloud.
Maybe you can query Twitter and use the current system time as a time-stamp, write to a local database, and query again in increments of x secs/mins, etc.
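Extending that idea, here is a minimal R sketch for the windowing step, assuming a data frame like the tw.df produced by twListToDF further down (its created column holds the tweet timestamps); fixed hourly bins stand in for the overlapping 10-minute windows for simplicity:
# Hedged sketch: bucket tweets into hourly windows by their 'created' time
tw.df$created <- as.POSIXct(tw.df$created, tz = "UTC")
tw.df$window <- cut(tw.df$created, breaks = "1 hour")
# One character vector of tweet texts per window, ready for a corpus builder
texts.by.window <- split(as.character(tw.df$text), tw.df$window)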
There is historical data available at http://www.readwriteweb.com/archives/twitter_data_dump_infochimp_puts_1b_connections_up.php and http://www.wired.com/epicenter/2010/04/loc-google-twitter/
As for the plotting piece: I did a word cloud here: http://trends.techcrunch.com/2009/09/25/describe-yourself-in-3-or-4-words/ using the snippets package, my code is in there. I manually pulled out certain words. Check it out and let me know if you have more specific questions.
I note that this is an old question, and there are several solutions available via web search, but here's one answer (via http://blog.ouseful.info/2012/02/15/generating-twitter-wordclouds-in-r-prompted-by-an-open-learning-blogpost/):
require(twitteR)
searchTerm='#dev8d'
#Grab the tweets
rdmTweets <- searchTwitter(searchTerm, n=500)
#Use a handy helper function to put the tweets into a dataframe
tw.df=twListToDF(rdmTweets)
##Note: there are some handy, basic Twitter related functions here:
##https://github.com/matteoredaelli/twitter-r-utils
#For example:
RemoveAtPeople <- function(tweet) {
  gsub("@\\w+", "", tweet)
}
#Then, for example, remove @names
tweets <- as.vector(sapply(tw.df$text, RemoveAtPeople))
##Wordcloud - scripts available from various sources; I used:
#http://rdatamining.wordpress.com/2011/11/09/using-text-mining-to-find-out-what-rdatamining-tweets-are-about/
#Call with eg: tw.c=generateCorpus(tw.df$text)
generateCorpus= function(df,my.stopwords=c()){
#Install the textmining library
require(tm)
#The following is cribbed and seems to do what it says on the can
tw.corpus= Corpus(VectorSource(df))
# remove punctuation
tw.corpus = tm_map(tw.corpus, removePunctuation)
#normalise case
tw.corpus = tm_map(tw.corpus, tolower)
# remove stopwords
tw.corpus = tm_map(tw.corpus, removeWords, stopwords('english'))
tw.corpus = tm_map(tw.corpus, removeWords, my.stopwords)
tw.corpus
}
wordcloud.generate=function(corpus,min.freq=3){
require(wordcloud)
doc.m = TermDocumentMatrix(corpus, control = list(minWordLength = 1))
dm = as.matrix(doc.m)
# calculate the frequency of words
v = sort(rowSums(dm), decreasing=TRUE)
d = data.frame(word=names(v), freq=v)
#Generate the wordcloud
wc=wordcloud(d$word, d$freq, min.freq=min.freq)
wc
}
print(wordcloud.generate(generateCorpus(tweets,'dev8d'),7))
##Generate an image file of the wordcloud
png('test.png', width=600,height=600)
wordcloud.generate(generateCorpus(tweets,'dev8d'),7)
dev.off()
#We could make it even easier if we hide away the tweet grabbing code. eg:
tweets.grabber=function(searchTerm,num=500){
require(twitteR)
rdmTweets = searchTwitter(searchTerm, n=num)
tw.df=twListToDF(rdmTweets)
as.vector(sapply(tw.df$text, RemoveAtPeople))
}
#Then we could do something like:
tweets=tweets.grabber('ukgc12')
wordcloud.generate(generateCorpus(tweets),3)
I would like to answer your question about making a big word cloud.
What I did was:
Run s0.tweet <- searchTwitter(KEYWORD, n=1500) once a day for 7 days or more.
Combine the results with this command:
rdmTweets = c(s0.tweet, s1.tweet, s2.tweet, s3.tweet, s4.tweet, s5.tweet, s6.tweet, s7.tweet)
The result: a square cloud consisting of about 9000 tweets (image not included here).
Source: People voice about Lynas Malaysia through Twitter Analysis with R CloudStat
Hope it helps!
