algorithmic trading simulator/benchmark data [closed] - benchmarking

I am interested in playing with algorithmic trading strategies. Does anyone know of a simulator or benchmark data I could play with offline (without actually making any investments)?

This question is pretty broad; as it stands, there is no mention of which instruments you're interested in. For instance:
Stocks
Bonds
Commodities
Forex
Derivatives (like Futures or Options)
For the moment, I will assume you're interested in stocks... if so, take a look at NinjaTrader, which offers simulation features for free. You can get free end-of-day stock data from Yahoo Finance, which is sufficient for longer-term trading timeframes; keep in mind that the shorter the trading cycle, the higher your data resolution needs to be.
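As an illustration, pulling free end-of-day bars takes a few lines of Python. This sketch assumes the third-party yfinance package (not mentioned above) as a modern stand-in for Yahoo's old feeds:

    # Hedged sketch: free end-of-day OHLCV bars via the yfinance package.
    # Daily resolution is adequate for swing/position timeframes.
    import yfinance as yf

    bars = yf.download("SPY", start="2020-01-01", end="2021-01-01", interval="1d")
    print(bars.tail())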
If you are willing to put several thousand dollars into a trading account, any number of brokers will be happy to send you live intra-day market feeds; but you don't need money in an account for paper trading (at least with my broker). I think the broker that is most flexible for programmers is Interactive Brokers. You can get sub-second data from them via an API; just understand that they won't give you tick-level granularity, because they summarize their feeds. The specific details vary, so it's best to talk to them if you have tight granularity requirements. As for offline simulation, you can do that with NinjaTrader, Interactive Brokers, and many other online brokers (see What online brokers offer APIs?).
BONUS MATERIAL
Since you're offering +200, I will share a bit more that you might be able to use... keep it or toss it in the trash for whatever value it brings.
Trading Timeframe
As a general rule, the shorter the timeframe, the more difficult the trades, and the harder it is to consistently make money. If you're unsure where to start, look at trading cycles of days or weeks at a time; then, if you see too many opportunities passing you by, refine your system toward a shorter timeframe. The other thing to consider is how often you want to touch this code and tweak algorithms. The general rule is that as trading cycles get shorter, you do more calibration and maintenance on your algorithms. It is not that unusual to find an algorithmic trader who wrote a good swing-trading platform that has worked as-is for the last decade. Day-trading algorithms, on the other hand, tend to require more care and feeding as market conditions change.
Trading Style
Closely related to timeframes are your trading tactics. Are you:
Trend-Following
Swing-Trading
Day Trading
Scalping
Sub-pennying
Using Statistical Arbitrage
Playing Options
Trade Management / Mindset
Trade management is a rather big topic, and one you'll find addressed in depth if you lurk around trader boards like Elite Trader. While it may sound out of place to discuss some of this in a thread about automated trading platforms, I'm sure you'd agree that your assumptions and attitude have insidious ways of leaching into your code. I will share a few things off the top of my head:
Success is primarily about preventing losing trades. Good trades take care of themselves.
Always trade with a stop-loss. Conventional wisdom is, "Your first loss is your smallest loss." If things start going south, figure out a way to consistently get out while keeping most of your previous profits; holding losers is the fast path to becoming a boiled frog. (A minimal trailing-stop sketch appears after this list.)
There is no such thing as "too high" or "too low". The market moves with a herd mentality and doesn't care what you think it should be doing.
Closely related to the previous point: trade with the long-term trend. Fighting the trend (affectionately known as "counter-trending") may sound attractive to natural contrarians, but you need to be very good to do it well. Trading is enough work without trying to counter-trend.
Trading within the hour after a Federal Reserve market announcement is very difficult; I think it's just better to be out of the market. The quick profits can look seductive, but this is where the pros love to eat the amateur traders; brutal reversals can develop within a couple of minutes.
Avoid trading on margin unless you have a proven technique that you have backtested with at least a couple years of data.
The first thirty minutes and last hour of regular trading can see rapid changes in volatility.
Regarding profit taking: "Bulls get fed, hogs get slaughtered."
If you find that you are not making profits, consider evaluating your trading frequency; reducing your trades to a minimum is key to success, because otherwise slippage, commissions, and fees on junk trades will eat your profits.
Due to computational delays, processing time, and partial order fills, limit orders are rather challenging to manage and pretty close to minutiae; algorithmic traders have much bigger fish to fry.
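As promised above, here is a minimal trailing-stop sketch in Python; the field names, prices, and the 2% trail are invented for illustration, not a recommendation:

    # Minimal trailing stop: track the high-water mark since entry and
    # exit once price falls a fixed fraction below it.
    def should_exit(position, price, trail_pct=0.02):
        # Ratchet the high-water mark up, never down.
        position["high_water"] = max(position.get("high_water", price), price)
        return price <= position["high_water"] * (1.0 - trail_pct)

    position = {"entry": 100.0}
    for price in [100.0, 101.5, 103.0, 102.2, 100.8]:
        if should_exit(position, price):
            print("Exit at", price)  # fires at 100.80 in this example
            break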
Coding
Log every data point and decision you make in your code; three logging levels work for me. Trading is an inexact task, and tiny changes can blow up your previously profitable algorithm. If something blows up, you need a way to compare against what was working.
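As a sketch of the "three levels" idea using Python's standard logging module (the level assignments and spread threshold here are my own choices, not a prescription):

    import logging

    # DEBUG = raw data points, INFO = decisions, WARNING = anomalies.
    logging.basicConfig(
        filename="algo.log",
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    log = logging.getLogger("algo")

    def on_tick(symbol, bid, ask):
        log.debug("tick %s bid=%.2f ask=%.2f", symbol, bid, ask)
        if ask - bid > 0.50:
            log.warning("wide spread on %s: %.2f", symbol, ask - bid)

    def on_signal(symbol, action, size):
        log.info("decision: %s %s x%d", action, symbol, size)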
Prototype in a scripting language; if something is slow, you can always offload it to compiled code later. I think Python is fantastic for quantitative finance... mature unit testing, C/C++ integration, numpy, pyplot and pandas are winners for me.
More pandas plugs... (pandas video), also see: Compute a compounded return series in Python
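For flavor, a compounded return series in pandas is only a couple of lines, along the lines of the linked question (the prices here are made up):

    import pandas as pd

    prices = pd.Series([100.0, 102.0, 101.0, 105.0])
    returns = prices.pct_change().dropna()        # simple period returns
    compounded = (1.0 + returns).cumprod() - 1.0  # cumulative compounded return
    print(compounded)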
I started off with plain old CSV, but I'm in the process of migrating to HDF5 for tick-data archives.
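A sketch of what that migration can look like with pandas (requires the PyTables package; the file, key, and column names are hypothetical):

    import pandas as pd

    ticks = pd.read_csv("ticks_2013-01-02.csv", parse_dates=["timestamp"])
    with pd.HDFStore("ticks.h5", complevel=9, complib="blosc") as store:
        # 'table' format supports appending and on-disk queries.
        store.append("ES", ticks, format="table", data_columns=["timestamp"])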
Trade simulation is deceptive: simulated trades don't have problems filling due to low liquidity or high demand for the instrument; depending on market conditions, my real trades can see two or three seconds of delay from the time I send the order to the time I get a fill. Simulated trades also don't get data blackouts; be sure to include sudden data loss (and how to recover from it) in your plans. Lower-cost brokers tend to suffer more blips and blackouts, but if your timeframe is longer, it may be something you can ignore.
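One cheap way to keep a simulator honest about fills is to model latency and slippage explicitly. The toy below is illustrative only, with invented numbers:

    import random

    def pessimistic_fill(order_price, side, latency_s=2.5, tick=0.01):
        # Random drift while the order is in flight, plus one tick of
        # slippage against us: buys fill higher, sells fill lower.
        drift = random.gauss(0.0, tick * latency_s)
        slip = tick if side == "buy" else -tick
        return order_price + drift + slip

    random.seed(1)
    print(pessimistic_fill(100.00, "buy"))  # usually worse than 100.00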
Legal
The information set forth herein has been obtained or derived from sources believed by the author to be reliable. However, the author does not make any representation or warranty, express or implied, as to the information’s accuracy or completeness, nor does the author recommend that the attached information serve as the basis of any investment decision. This data has been provided to you solely for information purposes and does not constitute an offer or solicitation of an offer, or any advice or recommendation, to purchase any securities or other financial instruments, and may not be construed as such. By using any of this information, you expressly agree that all risks associated with the performance and quality of the information are assumed solely by you. The author shall not be liable for any direct, indirect, incidental, special or consequential damages arising out of the use of or inability to use the information, even if the author has been advised of the possibility of such damages. The information is made available by the author "as is" and "with all faults".

Is this the kind of data you're looking for?
Ohio Department of Finance, Financial Data Finder
This page might also be of help:
Yahoo Finance Download, HTTP Interface parameters
And this one:
mathfinance.cn: Free financial resources
Not offline, but still a good link to give to other readers:
Google Finance API
I hope I understood the question correctly.

I use AMIBroker.
It's primarily used for backtesting algorithmic trading strategies. It is very fast and can download data from a variety of free sources.

Try cloud-based backtesting engines like QuantConnect and Quantopian.
Quantopian is a Python-based IDE where you can write strategies and backtest online. You can do simulated trades as well as real trades with Interactive Brokers.
QuantConnect is similar to Quantopian, but it uses a .NET-based IDE, and you can run simulated trades and live trading with the discount broker Tradier.

QuantGo lets you rent high-frequency data sets rather than buying them. I really like it because it costs only a couple hundred dollars per month instead of thousands.
Quandl has some good free data sets if you are only interested in trading at longer time intervals. Their stock data API is pretty slick (link).
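For example, with Quandl's Python package (the free WIKI end-of-day dataset existed at the time of writing; dataset codes and availability may have changed):

    import quandl

    # Daily bars for AAPL from the free WIKI dataset.
    aapl = quandl.get("WIKI/AAPL", start_date="2015-01-01", end_date="2015-12-31")
    print(aapl["Adj. Close"].head())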

One alternative would be PyAlgoTrade (http://gbeced.github.io/pyalgotrade/). It's an open-source Python library for backtesting trading strategies.
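A minimal strategy in the style of PyAlgoTrade's own tutorial looks roughly like this (the CSV file name is a placeholder; check the library's docs for the current API):

    from pyalgotrade import strategy
    from pyalgotrade.barfeed import yahoofeed
    from pyalgotrade.technical import ma

    class SMACross(strategy.BacktestingStrategy):
        def __init__(self, feed, instrument, period=15):
            super(SMACross, self).__init__(feed)
            self.__instrument = instrument
            self.__sma = ma.SMA(feed[instrument].getPriceDataSeries(), period)
            self.__position = None

        def onBars(self, bars):
            if self.__sma[-1] is None:
                return  # not enough bars to compute the SMA yet
            price = bars[self.__instrument].getPrice()
            if self.__position is None and price > self.__sma[-1]:
                self.__position = self.enterLong(self.__instrument, 10)
            elif self.__position is not None and price < self.__sma[-1]:
                self.__position.exitMarket()
                self.__position = None

    feed = yahoofeed.Feed()
    feed.addBarsFromCSV("orcl", "orcl-2000.csv")  # Yahoo-format daily bars
    strat = SMACross(feed, "orcl")
    strat.run()
    print("Final portfolio value: $%.2f" % strat.getBroker().getEquity())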

Related

AI for MMORTS game

I'm not sure if this is the right place to ask this, but here goes.
I have been a programmer for about 12 years now, with experience in PHP, Java, C#, VB.NET and ASP. I have always been rather intrigued by Artificial Intelligence. I think it is really the ultimate challenge for any developer.
I have written many simple scripts to play games, but nothing compared to what I want to do next. I want to write an AI program that will play an MMORTSG (Massively Multiplayer Online Real-Time Strategy Game). I have been searching through many AI techniques, but none seem to tackle the problems that I know I will face:
Problems I can foresee:
The game has no "win situation"; instead, the best strategy is the one that produces the greatest growth compared to that of other players. Growth is determined by 3 factors: economy, military and research.
Parts of the game state are unpredictable. Other players can attack me at random.
The game is time-based and actions take time, i.e., building a new building may take several hours. While that building is being built, no other buildings can be built.
All the AI systems I have researched require some sort of "winning function" to test whether the AI has reached an end state, whereas in my situation it would more likely be something like "I have X, Y, Z options; the best one is X".
P.S. Sample code would be awesome. Even pseudocode would be great.
I've seen a few applications of Artificial Intelligence in the gaming area, but most were for FPS, MMORPG and RTS games. The genre you appear to be describing sounds similar to 'Clash of Clans', where research, military and economy grow, and random attacks occur, over an endless stretch of time.
It seems that a model would be used at key points in the game (a building is finished, research becomes available, or the castle is full) to make strategic decisions for progression. Perhaps a Genetic Algorithm could be applied at key moments to determine a suitable sequence of future steps. A modular neural network could be defined to decide which growth factor to pursue, but training such a network may be difficult because the game rules can change over time (from previously unknown resources, research options, military features, and even game updates). Scripts are quite common as well in the MMORPG genre, but defining the manual rules could also be difficult without knowing all of the available options. The fact is that there are so many ways your challenge could be addressed that it would be difficult to give a clear-cut answer, let alone code or pseudocode.
Looking briefly over the problem, the contributing factors appear to be the current economic state, current military state, current research state, time lost while saving for the next upgrade, time required to build the next upgrade, and the cost of the upgrade, as well as other unknown factors.
Given that the problem has no clear winning objective, I guess it is a matter of maintaining a healthy balance between the three growth factors. But how does one define that balance? Is research more important? Should you always have money on hand, or save just enough for the next planned upgrade? Should the military be as large as possible? One concrete framing is sketched below.
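To make that concrete, one simple framing is a utility function that scores each available action by how well it lifts the weakest growth factor; everything below (names, numbers, weights) is invented for illustration:

    def utility(state, action):
        # Project the three growth factors after taking the action.
        projected = {k: state[k] + action["effects"].get(k, 0) for k in state}
        # Reward raising the minimum factor (keeps growth balanced),
        # lightly penalized by how long the action takes.
        return min(projected.values()) - 0.1 * action["time_cost"]

    state = {"economy": 40, "military": 25, "research": 30}
    actions = [
        {"name": "build barracks", "effects": {"military": 10}, "time_cost": 4},
        {"name": "build market",   "effects": {"economy": 10},  "time_cost": 3},
        {"name": "run lab",        "effects": {"research": 10}, "time_cost": 5},
    ]
    best = max(actions, key=lambda a: utility(state, a))
    print(best["name"])  # -> "build barracks", since military lags

A genetic algorithm or planner could then search over sequences of such actions rather than single steps.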
The challenge you are placing before yourself is quite adventurous, but I would recommend taking on smaller challenges if you are not yet familiar with the models AI has to offer. There are quite a number of gaming applications of AI available to inspire your model (including ziggystar's examples noted above).

Reasons for absence of performance test for oss database systems? [closed]

I used an open-source Java implementation of the TPC-C benchmark (called TCJ, i.e. TPC-C via JDBC, created by MMatejka last year) to compare the performance of Oracle and two OSS DBMSs.
TPC-C is the standard in the proprietary sphere, and my question is:
What are the main reasons that there is no systematically implemented performance test for OSS database systems?
Firstly, I'm not sure your question is a perfect fit for SO, as it comes close to asking for opinion, so all of my answer is more opinion than fact. Most of this I have read over the years, but I would struggle to find references/proof anymore. I am not a TPC member, but I did look hard into getting a distributed column-store database tested under the TPC-H suite.
Benchmarks
These are great for testing a single feature and comparing products; unfortunately, that is nowhere near as easy as it sounds. Companies will spend large amounts of effort to get better results, sometimes (so I have heard) implementing specific functions in the source just for a benchmark. There is a lot of discussion about how reliable benchmark results are overall. Also, a benchmark may be a perfect fit for one product but not another.
Your example uses JDBC, but not every database has JDBC; worse, it may be a 'minor bolt-on' just to enable that class of application. So running benchmarks via JDBC when all the main usage will be embedded SQL may portray some solutions unfairly or poorly.
There is also an argument that benchmarks distract vendors from real priorities: they spend effort and implement features solely for benchmarks.
Benchmarks can also be very easily misunderstood; even TPC is a suite of different benchmarks, and you need to select the correct one for your needs (TPC-C for OLTP, TPC-H for DSS, etc.).
TPC
If this reads as negative toward TPC, please forgive me; I am pro-TPC.
TPC defines a very tight set of test requirements. You must follow these to the letter. For TPC-H, this is an example of what you must do:
do multiple runs, some in parallel, some single-user
use exactly the SQL provided; you must not change it at all, and if you need to because your system uses a slightly different syntax, you must get a waiver
you must use an external auditor
you may not index columns etc. beyond what is specified
for TPC-H you must do writes in a specified way (which eliminates 'single writer' style databases)
The above ensures that people reading the results can trust their integrity, which is great for a corporate buyer.
TPC is a non-profit organization and anybody can join. There is a fee, but it isn't a major barrier, except for OSS projects. You are only realistically going to pay this fee if you think you can achieve really great results, or if you need published results to bid for government contracts, etc.
The biggest problem I see with TPC for OSS is that it is heavily skewed towards relational vendors, and very few OSS solutions can meet the entry criteria with their offerings; even if they do, they may not perform well enough on every test. Doing a benchmark may also be a distraction for some teams.
Alternatives to TPC
Of course alternatives to TPC exist, but none has really gained traction yet, as far as I am aware. Major vendors often stipulate that you cannot benchmark their products and publish the results, so any new benchmark will need to be politically astute to get them on board. I agree with the vendors' stance here; I would hate for someone to mis-implement a benchmark and report my product poorly.
The database landscape has fractured a lot since TPC started, but many 'bet your business' applications still run on 'classic' databases, so those benchmarks still have a place. However, with the rise of NoSQL etc., there is room for new benchmarks, and the real question becomes what to measure. Even choosing between xyz LIKE '%kitten%' and xyz LIKE 'kitten%' will have dramatic effects on different solutions (see the sketch below). If you solve that, what common interface will you allow (ODBC, JDBC, HTTP/AJAX, embedded SQL, etc.)? Each of these interfaces affects performance greatly. What about the underlying models, such as ACID relational databases versus eventual-consistency stores? What about hardware/software solutions that use specially designed hardware?
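To illustrate the wildcard point with something runnable, here is a sketch using Python's built-in sqlite3 (note that SQLite only uses an index for a prefix LIKE when case_sensitive_like is on; other engines have their own rules, and the table and data here are invented):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE pets (name TEXT)")
    con.execute("CREATE INDEX idx_name ON pets(name)")
    con.execute("PRAGMA case_sensitive_like = ON")

    for q in ("SELECT * FROM pets WHERE name LIKE 'kitten%'",
              "SELECT * FROM pets WHERE name LIKE '%kitten%'"):
        print(q, "->", con.execute("EXPLAIN QUERY PLAN " + q).fetchall())
    # Typically the prefix pattern searches idx_name, while the
    # leading-wildcard pattern falls back to a full table scan.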
Each database has made design trade-offs for different needs, and a benchmark attempts to level the playing field, which is only really possible if the products have something in common, or if you report lots of different metrics.
One of the problems with trying to create an alternative is: who will pay? You need consensus on the type of tests to perform, and then you need to audit results for them to be meaningful. This all costs money.

Why would you not want to consolidate Mission Critical Databases?

Suppose you wanted to consolidate all of your mission-critical databases into one instance to save some licensing money. What would be the potential risks, and are there any good articles or case studies on this? I realize this is a terrible idea, but I have somebody who wants to do this and is willing to maximize the hardware resources if needed. I am trying to present him with something quantifiable, or some articles, that can steer him away from doing this.
There are three big mission-critical databases: Billing, Dynamics CRM, and an in-house application that keeps track of transactions. These are high-volume databases for a small/mid-sized company. I need something quantifiable, or good case-study support, to convince somebody that this is the wrong path. Any other advice on how I can convince this person would also be helpful.
The answer depends. At first glance it may look like a bad idea. On the other hand, if the goal is to consolidate everything on one server and then replicate that server in a remote environment, then you are on the way to a more reliable system. IT might prefer having everything in one place rather than dealing with mission-critical servers spread out over the terrain.
One major issue is the need for a larger machine. So if any of the systems use software whose license depends on the size of the machine, you might end up spending more money because you need a larger server. I've seen this happen with SAS licensing, for instance.
Perhaps the biggest issue, though, is that the different applications are probably on different development cycles, whether developed in-house or by outside vendors. So updating the hardware, operating system, or software can become a nightmare: a fix or enhanced functionality in A might require an OS patch that, in turn, has not been tested with B. This maintenance issue is the reason I would strongly advocate separate servers.
That said, mission-critical applications are exactly that: mission-critical. The driving factor should not be saving a few dollars on hardware. The driving factors should be reliability, maintenance, performance, sustainability, and recovery.
The comments made by Oded, Catcall and Gilbert are spot on.
The bank where I learnt the IT trade ran its entire core business on a single MVS (later z/OS) mainframe, which ran a single DBMS and a single transaction processor (unless you counted TSO as a transaction processor).
The transaction processor went down regularly (say, once a day). It never caused the bank to go broke because it was always up and running again in less than a minute. Mileage may vary, but losing one minute of business time in a 480-minute working day (less than 0.25%) really isn't dangerously disruptive.
The single DBMS went down too, at times (say, twice a month). I can still hear the sysprogs yelling "DBMS is down" over the fence to the helpdesk officers, meaning "expect to be getting user calls". It never caused the bank to go broke because it was always up and running again in a matter of minutes. Mileage may vary, but losing a couple of minutes of business time each month really shouldn't be dangerously disruptive.
The one time I do remember the bank coming really close to bankruptcy was when the development team had made a mess of a new project in the bank's absolute core business, and the bank was as good as completely out of business (its REAL business) for three or four days in a row. That wasn't a 0.25% loss of business time, but almost 100 times more.
Moral of my story? Two of them. (a) It's all about risk assessment (= probability assessment) and risk-weighted (= probability-weighted) cost estimation. (b) If you ask a question on SO (which implies a kind of recognition/expectation that answerers have more expertise than you on the subject matter), and people like Oded and Catcall provide you with a concise answer that is accurate and to the point, then don't ask for papers or case studies to back up their answers. If you don't want to accept the experts' expertise, then why bother asking in the first place?

Artificial Intelligence undergraduate project: help on an idea and its influence on a later masters degree [closed]

I am a Computer Science student. I want to do an AI project for my 4th year with two other students. (It's a 5-year degree in my university so I can pursue the same project for two consecutive years if I want to). Our knowledge in AI is very basic at this moment since we'll be specializing in it these coming two years, so a very advanced idea will probably be hard to accomplish. We're not expected to research new untouched soils either, so the more resources the better.
I'm interested in ideas that can benefit people, not just in applying algorithms and techniques. I want to do a masters after graduation, but I'm not sure in what field yet.
I'd love to do a medical application, or a project that is of some use to the handicapped.
Some projects already pursued at the university include one to recognize breast cancer and one to teach sign language to the deaf.
I'm wondering:
1) what other ideas we can work on in those fields?
2) how much will my choice of graduation project affect my application for a masters degree?
3) Is a stock price prediction expert system too advanced for us?
Thanks a lot.
1) what other ideas we can work on in those fields?
It's amazing to me how little imagination computer science students seem to have. Stackoverflow.com is rife with questions about first projects from beginners and students.
I think that using statistics and data in novel ways, like Peter Norvig's spell checker, would be most interesting and fruitful.
Dr. Peter Norvig is a well-known computer scientist and AI guru; he's the Director of Research at Google now. Perhaps you can mine an idea out of his writings.
2) how much will my choice of graduation project affect my application for a masters degree?
That depends on too many other factors you don't mention, like your past record as a student. Probably a minor factor, in my opinion. Nobody is admitted to a masters program on the basis of a graduation project. Neither your undergrad project nor a masters thesis is a doctoral dissertation; don't get them confused.
3) Is a stock price prediction expert system too advanced for us?
I think stock price prediction is too advanced for anybody. After years of people applying Fourier analysis, statistical models, Monte Carlo simulations, etc., if it were possible, it would have been done already.
2) how much will my choice of graduation project affect my application for a masters degree?
If you are applying for a PhD, the faculty in the prospective department tend to favor students who are interested in the research they are doing, or who have demonstrated the ability to do their own research. For a Masters these are not much of an issue, but they can make a little difference.
3) Is a stock price prediction expert system too advanced for us?
Well, if you built one that worked, you would start using it to make money; others would see what you are doing and imitate you, so that pretty soon your arbitrage opportunity would be gone.
Still, these types of systems are often built by students in machine learning classes, mostly because there is a lot of freely available, well-formatted data on stock prices, so it's easy to get started writing the program. It is a good way to get insight into machine learning algorithms.
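As a sense of scale for such a class project, here is a toy direction-prediction workflow with scikit-learn. The data is deliberately random, and out-of-sample accuracy should hover near a coin flip, which is rather the point:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    returns = rng.normal(0, 0.01, 1000)      # stand-in for daily returns
    X = np.column_stack([returns[i:i - 5] for i in range(5)])  # 5 lags
    y = (returns[5:] > 0).astype(int)        # next-day direction

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False)
    model = LogisticRegression().fit(X_tr, y_tr)
    print("out-of-sample accuracy:", model.score(X_te, y_te))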
1) What other ideas we can work on in those fields?
Find some problem that you are passionate about, will learn something from by tackling, and that is within the scope of your time, effort, and ability. Projects like this are relevant not only for grad school but also when applying for entry-level jobs (even if those are a few years off after a masters degree). It helps to pick something you can put on a resume that shows your level of accomplishment and your ability to complete a task.
2) How much will my choice of graduation project affect my application for a masters degree?
The topic choice probably won't matter significantly, except perhaps for top-tier programs or if you have notable weaknesses in other admissions criteria. In the latter case a good project may help, but even that is uncertain. Masters program admissions are, I think, generally handled by administrative staff, so they are probably more interested in whether you did a project at all than in its topic.
3) Is a stock price prediction expert system too advanced for us?
Yes. A stock price prediction system is far too difficult if you want something that actually works reasonably well on anything beyond a small training data set.
The market is neither a natural system, a machine, nor even a system of rational collective behavior. Its pricing mechanism is in general irrational: investors/traders may make transactions at prices that are reasonable for them relative to their own decision criteria, but the market as a whole is generally not rational. The market is more an aggregation of behavior rather than collective behavior.
The above alone would make for an intensely difficult problem to solve with AI methods, but beyond that there are issues of problem scale, the amount of training data needed, and so on.
There are of course a large number of Wall Street trading firms using quantitative methods for high-frequency trading, etc. They are effective, however, because they are focused on narrow problems (price trends over the next few seconds-to-minutes in highly-liquid stocks, S&P index futures, etc.), they put a lot of work into their models and generally are constantly rebuilding the latter on a daily/weekly basis, and they understand the market's nature, i.e., it's largely irrational as a whole and is a competitive, shifting landscape of exploiting the pricing inefficiencies inherent to large money flows.
I would only recommend this problem domain if you have an intense personal interest in financial markets, have already spent a lot of time studying them, are prepared to fail, and are interested in learning a lot. Working on this problem is certainly a good learning opportunity, but it will be hard to achieve any real success except on small problems unless you have many years to devote to it.
1) what other ideas we can work on in those fields?
Dr. Russel Greiner has a nice list of possible student projects in machine learning, several of which are related to medicine.
2) how much will my choice of graduation project affect my application for a masters degree?
It probably won't matter very much. However, choosing a ridiculously easy project probably won't help. I'm sure that you'll be vetting whatever you choose with your prof, so don't worry about that so much. Find a topic you're passionate about first and foremost.
3) Is a stock price prediction expert system too advanced for us?
Yes. Don't bother with that nonsense. The game of Go will be solved before anyone figures out the stock market.
1) what other ideas we can work on in those fields?
Are there any faculty members at your university who work in the field of bioinformatics? If so, talk to them and see if they can give you a suitable project idea that gets you excited. If you decide to take this path, try to enroll in an Intro to Bioinformatics course, as it will help you get familiar with the field and generally make things easier.

How to value and put a price on software (license) [closed]

In my company we often value software at about the same price as competing software on the market. While this is one way, I'm interested in other (maybe more academic) ways to value and put a price on software.
Any ideas or methods that have been successful for you?
While I'm not always a fan of Joel, this crusty old article from 2004 answers your question very well: you charge based on how many units you think you will sell at a given price, with the goal of maximizing profit.
I've done some contract work in the past, and I based my estimate on:
The cost of the man hours to produce the software, from start to finish
Potential money the client saves by using the software
Cost of any planned support for the software
Any other related costs such as installation fees, documentation, training, etc.
Then I compared that to the industry standard. In my case it was usually cheaper, and I still made money, so both the client and I were happy.
Edit:
Mind you, the above method is for a single client, with a custom software solution and a simple unlimited-use license.
If you want to know academic methods for pricing a product, I have an MBA I would like to sell you.
Seriously, it depends on a lot of things. Are you selling a service, subscription, or a "box"? Where is your desired position in the market? What do your customers have to spend?
Listen to your customers and be prepared to change your pricing strategy. If you ever hear a customer say, "Wow, I was expecting to pay more!", it may be time to raise your prices significantly.
Magsto, my bet is that this question will be closed pretty quickly as being "not programming related."
However, I will tell you that your question is quite a bit more complex than can be answered here. There are lots of factors including time in market, market size, ROI you can offer, competitive advantages or disadvantages, the structure of payments (credit card, purchase order, cash), and even time of year. I only have experience here because I run my own company.
Do you really want to ask programmers this question? I think not...
I value my product as a fraction of its value to the client. My venture sells web apps, so it's slightly different, but if a web app would streamline $75K worth of overhead out of an office's budget, I charge $25K for it.
If it's a one-time sale, you have the option of examining the client and the value the product will deliver to them. If it's a publicly sold product, the options are very different.
The basic formula is to sell it for around 30% of what it's worth to clients/end users. If you can deliver better quality than the next company, pricing in step with them is a big mistake, because you can make more, and take a better market share, by promoting the features that justify the cost.
Customers often treat price as a proxy for quality. If you want to position your product as the highest-quality option, you might consider pricing it 20% above your competitors.
The price of a product is just as much about market strategy as maximizing profit.
You might want to read this:
Securing a .NET Application
It's a bit long, but I do get into some pricing at the end.
I am an IT entrepreneur, and my venture is in the web and application development domain. When we deliver a product to our clients, we ask them:
- how much time they are able to save with our application in place;
- then we ask them to value their own time, e.g. how much they make in an hour;
- then we compute:
time saved * value of time (as per the client)
This is the value for one day; we do similar computations to demonstrate how much they save in a month and in a year.
Thereafter, depending on the client and the monthly and annual savings, we give them the final price quote.
We think this is the best way to price software, because in this process we are pricing the software according to the money it helps our client save. A worked example follows.
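The arithmetic above with invented numbers (the daily step is from the answer; the monthly/annual roll-up and the 25% quote fraction are hypothetical choices for illustration):

    hours_saved_per_day = 2.0
    client_hourly_rate = 50.0        # the client's own valuation of an hour
    working_days_per_month = 22

    daily_value = hours_saved_per_day * client_hourly_rate    # $100/day
    monthly_value = daily_value * working_days_per_month      # $2,200/month
    annual_value = monthly_value * 12                         # $26,400/year

    quote = 0.25 * annual_value  # hypothetical fraction of demonstrated saving
    print(f"annual saving ${annual_value:,.0f}; quote ${quote:,.0f}")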
I have looked around on the net, and this book was suggested on almost all the forums. I have not personally read it, but here is the download link:
Don't Just Roll the Dice
