Heavy iTunes Connect scraping

I'm looking at different options to get the sales reports and other data out of the iTunes Connect website. Since Apple doesn't provide an API, all the solutions I found are based on scraping the page.
Since I need this information for a product that we offer, I'm not keen to hand all the iTunes accounts to a third-party service. This is why I want to scrape it myself or use a product that runs on our own servers.
My questions are:
Does anyone have experience with how frequently Apple changes the web front end?
Does anyone have experience with the maximum number of requests one server can make to the site? I'm afraid of being banned by Apple.
Is there anything else I should keep in mind that could cause serious trouble?
In case anyone is interested in the tools I looked at, here is a list:
Services:
http://www.appfigures.com (has API)
http://www.itunesapis.com
http://www.appannie.com/
http://www.heartbeatapp.com
Products:
http://www.appclix.com (has an enterprise licence that runs on your own server, includes an API. Tends to be more of a mobile analytics tool in general)
http://www.ideaswarm.com/products/appviz/ (Mac end-user app)
Open Source Tools:
http://code.google.com/p/appdailysales/
http://metacpan.org/pod/WWW::iTunesConnect
http://www.rogueamoeba.com/utm/2009/05/04/itunesconnectarchiver/
http://github.com/kasatani/iphone-stats
http://bfoz.net/projects/itc/
http://sourceforge.net/projects/itunesanalytics/
UPDATE:
I started using Kirby's python script (https://github.com/kirbyt/appdailysales) and it works very well.

Does anyone have experience with how frequently Apple changes the web front end?
I can't speak for all of iTunes Connect, only downloading daily sales reports. My script was rock solid and didn't require a single change between November 2009 and September 2010. This changed in September 2010 when Apple rolled out the new web site. This broke the old script, and a new one had to be written. Since rolling out the new web site, I make changes every few days to handle the tweaks from Apple. I'm hoping the tweaks will end soon.
Take a look at the download page for appdailysales.py. The dates will give you a general idea of how often I make changes to the script.
https://github.com/kirbyt/appdailysales
Again, this is only for daily sales reports. I'm not sure how frequently other areas of iTC change.
Does anyone have experience with the maximum number of requests one server can make to the site? I'm afraid of being banned by Apple.
I've not experienced this, but my server runs the script only once a day. I frequently hit iTC when working on the script, but not enough to cause a load on Apple's servers.
Is there anything else I should keep in mind that could cause serious trouble?
I don't know what might get you in trouble with Apple, but one thing that does cause a serious headache is changes to the web site. While the new version of the web site makes screen scraping the site easier, it did involve writing a new script. Apple does not give you a heads up that they are changing something. You find out after the fact when something in your screen scraper breaks.
If you depend on the data daily, then you have to drop everything and make the necessary fixes. And there is nothing stopping Apple from rolling out another new site sometime in the future.
Hope that helps.
-KIRBY

I'm using AppSalesMobile on iPhone. It gets updated pretty quickly. Another script I use is salestrends.sh, which just downloads the reports into a folder for easy import into databases etc.
If you're also interested in finding out in which countries an app is featured, you can use my iTunesFeaturedCheck script.
Also check out this question with more links.

You might also try the Autoingestion tool from Apple. Documentation here.

appdailysales is the best tool out there that I have found.
I have modified it so that the script automatically puts the ITC data into a MySQL database instead of just saving the txt files. And as Kirby pointed out, I too only have it run once a day and everything appears to be working. Nothing has been blocked by Apple so far.
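A rough sketch of that kind of loading step, not the actual modification described above. It assumes the report is tab-delimited text with a header row, and it uses Python's built-in sqlite3 in place of MySQL so the example is self-contained. The table name, the `load_report` helper and the sample columns are all hypothetical.

```python
import csv
import io
import sqlite3


def load_report(conn, report_text, table="daily_sales"):
    """Insert every row of a tab-delimited report into `table`; return the row count."""
    reader = csv.reader(io.StringIO(report_text), delimiter="\t")
    header = next(reader)
    # Quote column names so headers containing spaces remain valid identifiers.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, columns))
    # Skip blank or truncated lines rather than failing the whole import.
    rows = [row for row in reader if len(row) == len(header)]
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows
    )
    conn.commit()
    return len(rows)


# Hypothetical two-row report; the real column names may differ.
SAMPLE = "SKU\tUnits\tBegin Date\nAPP1\t3\t10/01/2010\nAPP2\t5\t10/01/2010\n"

conn = sqlite3.connect(":memory:")
inserted = load_report(conn, SAMPLE)
total_units = conn.execute(
    'SELECT SUM(CAST("Units" AS INTEGER)) FROM daily_sales'
).fetchone()[0]
```

Taking the columns from the header row means the loader does not hard-code a report layout, which seems prudent given how often the site changes.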
As for the script breaking, the one good thing is that Apple keeps daily sales reports for 14 days (last I checked). This means that if the script breaks, one has several days to fix the script and still get the daily sales reports.
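To make use of that grace period, the script can work out which days still need fetching once it is fixed. A minimal sketch, assuming a 14-day retention window that ends yesterday (the exact window is an assumption, see "last I checked" above). `missing_report_dates` is a hypothetical helper, not part of appdailysales.

```python
from datetime import date, timedelta


def missing_report_dates(downloaded, today, retention_days=14):
    """Return dates in the retention window with no downloaded report, oldest first."""
    # Reports cover completed days, so the window is assumed to end yesterday.
    window = [today - timedelta(days=offset) for offset in range(retention_days, 0, -1)]
    return [day for day in window if day not in downloaded]


# Example: reports for Oct 6 to Oct 16 were saved before the script broke,
# and it is fixed on Oct 20.
downloaded = {date(2010, 10, day) for day in range(6, 17)}
to_fetch = missing_report_dates(downloaded, today=date(2010, 10, 20))
```

Here `to_fetch` holds Oct 17, 18 and 19, the three reports that are still available and not yet saved.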
Good luck.
Kevin


How to create Salesforce incremental package.xml automatically?

Has anyone experimented with creating a Salesforce package.xml automatically for continuous integration? If there is any script or idea, please share.
An incremental package.xml helps to deploy only the modified files, rather than using a complete package.xml that also redeploys unmodified files, which takes a lot of time.
Thanks in advance!
Tricky, and not really a programming-related problem. Consider cross-posting this to https://salesforce.stackexchange.com/ or maybe even https://devops.stackexchange.com/
I don't think there's a clear answer; you'll have to experiment. Especially since you tagged "migration tool" (so the old-school, battle-tested but lower-priority Metadata API; it seems that all the focus is now on SFDX-style deployments). Do you use any version control (ideally Git) or do you hope to somehow compare source & target org, figure out the deltas and deploy only them?
Remember that SF often gets better at detecting "no changes" with every release (how old is your migration tool's jar file?). For example, when I deploy my current project to an empty sandbox (exact copy of prod, no custom objects, code etc. yet) the initial deploy takes ~7 minutes. But any subsequent deploy with the same content or slight changes takes just 3-4 minutes. So try to calculate the time lost in the grand scheme of things and decide what gains you want to see / how much time you want to spend on experimenting and tweaking the solution.
You could look into dedicated deployment solutions such as Gearset, Autorabit, Odaseva (I'm not affiliated with any of them and this list is not exhaustive). They are often capable of running a comparison for you.
There are several projects that try to compose package.xml based on the Git diff between two commits. Of course, you need to have a repo first and some discipline around it:
https://github.com/cloudsandbox/sfdx-gen-pack I saw a presentation about it at Cloudforce London 2019
https://github.com/Accenture/sfpowerkit seems to have a "diff" command (disclaimer: I used to work for Accenture but not affiliated now, haven't worked on the tool, haven't used it personally)
https://cumulusci.readthedocs.io/en/latest/ this seems to be interesting and mature. Built by SF employees, not an official tool but used to CI deploy the non-profit packages they build (maybe you heard about Non Profit Starter Pack, especially if you ever considered enabling Person Accounts). I'm not sure if they do delta deployments as such but there seems to be a command that updates package.xml with files in repository so it's a start? https://cumulusci.readthedocs.io/en/latest/tutorial.html#part-4-running-tasks
I'm not saying CumulusCI will be a silver bullet, but of these three it seems to be the most actively maintained ;) It sounds like you'd have to get familiar with SFDX, though (if not the whole thing, then at least the commands to convert the project back and forth between the "source" (SFDX) structure and the Metadata API structure).
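To give an idea of what the Git-diff-based tools do under the hood, here is a simplified sketch. It does not reproduce any of the projects above. It assumes a Metadata API style layout (e.g. src/classes/Foo.cls), a list of changed paths such as `git diff --name-only` would produce, and a small, non-exhaustive folder-to-type map. `build_package_xml` and the default API version are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Folder -> metadata type for a few common Metadata API folders (not exhaustive).
FOLDER_TYPES = {
    "classes": "ApexClass",
    "triggers": "ApexTrigger",
    "pages": "ApexPage",
    "components": "ApexComponent",
    "objects": "CustomObject",
}


def build_package_xml(changed_paths, api_version="45.0"):
    """Build a package.xml string listing only the components behind changed files."""
    members = defaultdict(set)
    for path in changed_paths:
        parts = path.split("/")
        # Ignore files that are not inside a known metadata folder.
        if len(parts) < 2 or parts[-2] not in FOLDER_TYPES:
            continue
        name = parts[-1]
        # Foo.cls and Foo.cls-meta.xml both refer to the component "Foo".
        if name.endswith("-meta.xml"):
            name = name[: -len("-meta.xml")]
        name = name.rsplit(".", 1)[0]
        members[FOLDER_TYPES[parts[-2]]].add(name)

    package = ET.Element("Package", xmlns="http://soap.sforce.com/2006/04/metadata")
    for type_name in sorted(members):
        types = ET.SubElement(package, "types")
        for member in sorted(members[type_name]):
            ET.SubElement(types, "members").text = member
        ET.SubElement(types, "name").text = type_name
    ET.SubElement(package, "version").text = api_version
    return ET.tostring(package, encoding="unicode")


changed = [
    "src/classes/Foo.cls",
    "src/classes/Foo.cls-meta.xml",
    "src/triggers/Bar.trigger",
    "README.md",
]
package_xml = build_package_xml(changed)
```

This only covers added or modified files in flat folders. Deleted components, nested types and folder-based metadata need extra handling, which is a large part of what the dedicated tools take care of.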
Answering my own question: I found that git diff master feature/vat | force-dev-tool changeset create vat works!
Thanks to Roman, who answered in https://salesforce.stackexchange.com/questions/184332/is-there-a-pre-build-solution-for-generating-a-package-xml-from-a-git-repo

React Native production app remote diagnostics / user assistance

I have published an app built with React Native. Currently it's iOS only, but eventually may be released for Android as well. I'd like a cross-platform solution to remotely assist customers that run into bugs, crashes or any unexpected behavior. While the app could continuously log everything to a server, I've found that that's not very helpful since customers usually have very specific points in time that they need help with. Sifting through continuous logs is time consuming and generally a waste of resources.
My hope is to give the user the ability to press a button to send the stack trace, the last N minutes worth of logs, etc directly to me. This wouldn't work in the case of a hard crash of course. The vast majority of the time the app is functional when there's something they need help with.
A pie-in-the-sky idea would be to let the user share their screen with me.
Found this related question but it doesn't fully encompass what I'm trying to accomplish:
Release mode diagnostics in React Native
BugSnag looks promising. It's a paid service.
https://www.bugsnag.com/platforms/react-native-error-reporting
I tried BugSnag and a few other services. In the end, Sentry has the most reliable and simplest RN library. It's also free for the Developer plan (5k errors per month is plenty enough for us, and supports multiple apps).
https://sentry.io/pricing/

ClickOnce deployment strategy: update for specific users for beta testing

Question is plain and simple:
I want to publish my latest version to selected users for beta testing. Is there any quick and dirty way to do this?
Here is a link that I found on MSDN. It is a bit old and suggests an approach, but I don't think I need to put in this much effort. Rather, I will just use a different installer for beta testing.
https://msdn.microsoft.com/en-us/library/aa480721.aspx
I think you should create a new site with a different publish URL. It is the most manageable option I can think of. Users connect to that site and download the beta version and could run it in parallel with another version for comparison purposes.
You are going to have two versions of the client application so you are going to have two installers; I don't think you can get around that. Having multiple publish URLs means people don't have to uninstall/re-install the app to get the "right version".
Embrace MAGE and script it out.

How healthy is the LucidDB project?

I am working on a project that would greatly benefit from a column store database on the backend. I was attracted to LucidDB since the feature set seems perfect, and I cannot commit to the cost of a commercial solution like Infobright or Vertica until the project has shown value.
The problem is, I am concerned about the health of the LucidDB project. The internal wiki hasn't been updated in more than a month, and the website is full of broken links. DynamoBI dying does not help the case.
Is there anyone who knows the state of the project, and how comfortable you'd be with production code relying on this database?
LucidDB is no longer supported by DynamoBI, as they are closing up shop.
http://www.nicholasgoodman.com/bt/blog/2012/10/08/dynamobi-is-dead/
Dr.Bharatheesh Jaysimha

How to check if a visitor is using the latest version of his/her browser?

Is there a simple and automatic way of checking if a visitor to my website (written in asp.net) is using the latest version of his browser? This would allow me to display a message to inform them that they're running an old version and that they might want to upgrade.
My website is tested on most browsers but I don't test old versions (such as Internet Explorer 6 etc). When one of my visitors is using such an old version, I would like to encourage (not force) them to upgrade.
Of course I could do this myself by getting the version of the browser and looking it up in my database, but I don't want to have to maintain a 'browser version' database myself.
Any ideas?
Speaking as a user of websites, if I come across a site that advised me to upgrade my browser then that would be an immediate black mark against that site.
I might not be able to upgrade (if I'm accessing from a corporate network for example); I might have a specific reason for using a particular version (if I'm a web developer wanting to ensure compatibility with my user community for example).
So personally, I would say that a blanket disclaimer that you don't test this site on earlier versions would be the way to go. That's quite apart from the technical challenge of what you want to do.
Edit: as Yeti points out, however valid my concerns, I don't answer the question directly. This is done in Pace's answer, and the w3schools resource he points to gives you what you need to do this on the client side.
