Mobile advertising - how are invalid or fraud clicks determined?

For mobile advertising, how are invalid or fraudulent clicks determined?
For web traffic, one of the filters is based on IP address; however, this does not apply to mobile traffic, since mobile devices may share IPs or change IPs during a single session.
Thanks.

We can only speculate about how they detect fraudulent/invalid clicks.
The only things I can think of are detection of your IP address and cookies.

It could also be linked to your IMEI/UDID/unique device ID. It's in the network's best interest not to reveal how its fraud detection works.

Some of the obvious indicators could be (a minimal sketch of these checks follows the list):
You are seeing thousands or millions of clicks/installs from the same device (device identifiers - UUID, IFA, etc.) - it's impossible for a human to deliver more than a couple dozen clicks a day.
The CTR/IR for a campaign is too high. I would consider a CTR above 45% fraudulent; in practice it is far lower, around 15%.
You are seeing a very high CTR but very low conversion (e.g. converted installs) - that is also an indicator of fraudulent clicks.
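Here is a minimal sketch of how such threshold checks might look. The thresholds and field names are my own assumptions for illustration, not any network's actual rules:

    # Illustrative thresholds only - real networks keep theirs secret.
    MAX_DAILY_CLICKS = 100     # humans can't plausibly click more per day
    MAX_PLAUSIBLE_CTR = 0.45   # CTR above this is treated as fraud
    MIN_CONVERSION = 0.01      # very high CTR + almost no installs is suspect

    def looks_fraudulent(impressions: int, clicks: int, installs: int) -> bool:
        ctr = clicks / impressions if impressions else 0.0
        conversion = installs / clicks if clicks else 0.0
        if clicks > MAX_DAILY_CLICKS:    # thousands of clicks from one device
            return True
        if ctr > MAX_PLAUSIBLE_CTR:      # implausibly high click-through rate
            return True
        return ctr > 0.15 and conversion < MIN_CONVERSION

    print(looks_fraudulent(impressions=1000, clicks=600, installs=1))  # True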

Should I specify regional NTP pool or just pool.ntp.org?

From the NTP Pool Project (https://www.ntppool.org/en/use.html):
Looking up pool.ntp.org (or 0.pool.ntp.org, 1.pool.ntp.org, etc) will
usually return IP addresses for servers in or close to your country.
For most users this will give the best results.
...
You can also use the continental zones (For example europe,
north-america, oceania or asia.pool.ntp.org), and a country zone (like
ch.pool.ntp.org in Switzerland) - for all these zones, you can again
use the 0, 1 or 2 prefixes, like 0.ch.pool.ntp.org. Note, however,
that the country zone might not exist for your country, or might
contain only one or two timeservers.
I've read this several times and it's not obvious whether I should specify the regional pool or not. I know which country my machine is in, of course, so I'm happy to specify the pool. But it sounds like it doesn't necessarily give better results.
Since the pool option means the protocol will select the best server after statistical analysis, I would have expected that specifying a regional pool would give a better distribution of servers from which to choose the winner. Any insights appreciated.
The country zone will contain a few servers that are only in the country you pick.
The region zone will contain more servers across an entire region (individual servers may be within any of its countries).
The general pool will return lots of servers from anywhere.
There may be fewer country-specific than region-specific servers: for example, the UK may have 20 servers, while Europe may have 100 servers across all the countries that make up the region.
The best thing you can do is test what works best for your setup and locations. Run a test deployment for a while and gather some stats. If you are using ntpd, you can run ntpq -pcrv, which will give you all the info you need.
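For example, a country-zone configuration in /etc/ntp.conf might look like the sketch below (swap ch for your own country code; on older ntpd versions that lack the pool directive, use server lines instead):

    # /etc/ntp.conf - try the country zone first; fall back to the
    # continental zone if your country zone is small or missing.
    pool 0.ch.pool.ntp.org iburst
    pool 1.ch.pool.ntp.org iburst
    pool 2.ch.pool.ntp.org iburst
    # pool europe.pool.ntp.org iburst

Then, after it has run for a while, check the peer statistics with ntpq -pcrv.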
You can also refer to my answer here as to what all the information means and how to interpret it.

Sloot Digital Coding System

" In the late 1990s, a Dutch electronics technician named Romke Jan Berhnard Sloot announced the development of the Sloot Digital Coding System, a revolutionary advance in data transmission that, he claimed, could reduce a feature-length movie down to a filesize of just 8KB. The decoding algorithm was 370MB, and apparently Sloot demonstrated this to Philips execs, dazzling them by playing 16 movies at the same time from a 64KB chip. After getting a bunch of investors, he mysteriously died on September 11, 1999"
Is it possible, or is it just a story?
There are two views on the story of the Sloot Digital Coding System. They are incompatible: In one view it is impossible, in the other it is possible.
What is impossible?
To store every possible movie down to a file size of just 8KB. This boils down to the Pigeonhole principle.
A key of a limited length (whether it is a kilobyte or a terabyte) can
only store a limited number of codes, and therefore can only
distinguish a finite number of movies. However, the actual number of
possible movies is infinite. For, suppose it were finite; in that case
there would be a movie that is the longest. By just adding one extra
image to the movie, I would have created a longer movie, which I
didn't have before. Ergo, the number of possible movies is infinite.
Ergo, any key of limited length cannot distinguish every possible
movie.
The SDCS is only possible if keys are allowed to become infinite, or
the data store is allowed to become infinite (if the data store
already contains all movies ever made, a key consisting of a number
can be used to select the movie you want to see -- however, in that
case it is impossible to have keys for movies that have not been made
yet at the time the data store was constructed). This would, of
course, make the idea useless.
Pieter Spronck
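To make the counting argument concrete: an 8KB key holds 8 × 1024 × 8 = 65,536 bits, so there are exactly 2^65536 distinct keys. That is an astronomically large number, but still finite, while the set of possible movies is unbounded.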
What is possible?
To store or load a finite number of feature-length movies on a device and be able to unlock them with an 8KB key.
Then it is not so much about compression as about encoding/databases/data transmission. This is a change in distribution model: why ship software/data at a later time over telephone or DVD, when you can pre-store it during fabrication, or pipe it all at once at intervals? This model is pretty close to how phones come with pre-loaded apps, or how some games let you unlock new game elements by entering a key.
The Sloot patents never claim feature-length movie -> 8KB data compression. They claim an 8x compression rate.
It is not about compression. Everyone is mistaken about that. The principle can be compared with a concept as Adobe-postscript, where sender and receiver know what kind of data recipes can be transferred, without the data itself actually being sent.
- Roel Pieper
In this view, SDCS is a primitive form of DRM that would reduce the bandwidth needed to get access to a certain piece of pre-stored data to an 8KB key.
Imagine storing that month's popular movies by bringing your device to your local video store. Then, when you want to see an available movie, you just call for your key, or buy a chipcard at the gas station. Now we have enough bandwidth for streaming Netflix, but back in the late 90s we were on dial-up and there was a billion-dollar data transmission industry (DVDs, CDs, video tapes, floppies, hard disks).
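A toy sketch of that distribution model (my own construction, not Sloot's actual system): the movies are already on the device, and the short key only selects one of them, so the transmission cost is the size of the key rather than the movie:

    # Content pre-stored on the device at fabrication time (illustrative).
    PRELOADED = {
        "key-0001": b"<full bytes of movie #1>",
        "key-0002": b"<full bytes of movie #2>",
    }

    def unlock(key: str) -> bytes:
        # Only the key travels over the wire; the movie data never does.
        return PRELOADED[key]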
Was playing 16 movies at once possible?
This is unverified, though many investors claim to have seen the demonstration. These people worked for respected companies like Philips, Oracle, Endemol, and Kleiner Perkins Caufield & Byers. I'd say it is not impossible, but I await more verification.
A very interesting concept. Conceptually, the Sloot encoding premise seems to be that the "receiver" would have a heavy, data-rich (DRM-like) program, with large pre-programmed capabilities, ready and able to execute complex programming tasks with minimal data instruction.
I am not a programmer, but at present, data transfer challenges seem to focus on the "transmission" of data (dense and voluminous) rather than on the capability of the receiving program/hardware. With Sloot, the emphasis is instead on the pre-loading of such data (with hardware/software that has much higher capabilities built in). I hope I'm not stating the obvious here.
As an example, using sound files for simplicity: rather than sending a complex sound file, say an MP3 of Vivaldi's The Four Seasons, the coding just tells the receiver the "musical notes" of the composition, and the system is pre-programmed to play those notes. Obviously there is more to it than that, but the concept makes sense. In other words, rather than transmitting a data-rich "Vivaldi" signal, send simpler instructions to a "Vivaldi"-trained receiver. Don't send the composer; send the instructions to a composer who is already there.
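A rough illustration of that size gap, with my own toy numbers (this is essentially the MIDI-versus-audio trade-off):

    # One second of a melody as note instructions vs. raw CD-quality audio.
    notes = [("E5", 0.5), ("E5", 0.25), ("E5", 0.25)]  # (pitch, seconds)

    SAMPLE_RATE = 44_100                           # CD-quality samples/second
    seconds = sum(dur for _, dur in notes)
    audio_bytes = int(SAMPLE_RATE * seconds) * 2   # 16-bit mono samples
    note_bytes = len(notes) * 8                    # roughly 8 bytes per event

    print(audio_bytes, "bytes of audio vs ~", note_bytes, "bytes of instructions")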
Yes, movies contain billions of units of instructional data under the current system (and that of 1999); however, could beefing up the abilities, the pre-programmed functions, of the receiver achieve what Sloot had figured out?
Currently, the data stream carries the load, where instead the receiver should, as Sloot suggested. So, does it make more sense to send the music composer by train to the concert hall across the country, or to send the music notes to another composer who is already there? This is not to be confused with pre-loaded movies being "unlocked"; rather, the idea is that the movie player has such extensive built-in abilities that simple coding can direct it with an order of magnitude less data.
Just some random thoughts from a layman.

Are there data sources for high fidelity cloud cover data?

Is there anything near the resolution of Himawari-8 for the entire globe?
The only one I know of is http://neo.sci.gsfc.nasa.gov/view.php?datasetId=MODAL2_D_CLD_FR&date=2016-01-01; however, there are missing stripes and the granularity is daily.
Do you mean global data for cloud coverage? Himawari-8 is a geostationary satellite, so it can observe its full-disk area continuously every 10 minutes, but it only covers one part of the Earth.
So if you want a global cloud coverage product from geostationary satellites, you may look for combined data from several geo-satellites, like GOES-R (US), MSG (Europe), Himawari-8 (Japan), etc.
Previously there was the CLAUS dataset, developed by the BADC in the UK, but I believe the data only runs until 2008 (need to check).
Hope this helps.

Database Synchronization Algorithm Advice

I'm working on an application that needs an algorithm for data synchronization to be implemented.
We'd have a main server and multiple subordinate devices, which would need to be synced together.
Now, I have three algorithms and I'd like advice on which one would be best. I'd really appreciate your opinions.
1. A description of the algorithm can be found here. It's a scientific research paper by Sang-Wook Kim, Division of Information and Communications, Hanyang University, Korea: http://goo.gl/yFCHG
2. This algorithm involves maintaining a record of timestamps and version numbers of databases.
If, for instance, one has version v10 on one's mobile device and the server has v12, then (assuming the current timestamp on the mobile device is older than the timestamp on the server) the mobile needs the net changes introduced by v11 and v12.
If we denote a deletion by -, an insertion by +, and a change by ~,
And the following change logs are associated with a few versions :
v11: +r(44), ~r(45), -r(46)
v12: -r(44), ~r(45), +r(47)
then the overall change in the database is: ~r(45) (from v12), +r(47), -r(46).
Hence it can be seen that the record r(44) wasn't needed, even though it was added and subsequently deleted, so no redundant data needs to be transferred.
The whole algorithm can be found here (I've put it up in a PDF): http://goo.gl/yPC7A
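A minimal sketch of that log compaction, under my own assumptions about data shapes (each version's log is a list of (op, record_id) pairs; this is not the linked paper's code):

    # Ops: '+' insert, '~' update, '-' delete, applied oldest version first.
    def compact(change_logs):
        net = {}  # record id -> net operation to transfer
        for version in change_logs:
            for op, rid in version:
                prev = net.get(rid)
                if prev is None:
                    net[rid] = op
                elif prev == '+' and op == '-':
                    del net[rid]          # added then deleted: send nothing
                elif prev == '+':
                    net[rid] = '+'        # '+' then '~' is still an insert
                elif prev == '-' and op == '+':
                    net[rid] = '~'        # deleted then re-added: a change
                else:
                    net[rid] = op         # e.g. '~' then '~', or '~' then '-'
        return net

    v11 = [('+', 44), ('~', 45), ('-', 46)]
    v12 = [('-', 44), ('~', 45), ('+', 47)]
    print(compact([v11, v12]))  # {45: '~', 46: '-', 47: '+'}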
3. This algorithm keeps a table that records the last-change timestamp for each record, and keeps rows sorted by timestamp. It synchronizes only those rows that have changed. The only con I see here is re-sorting the table by timestamp each time.
Here's a link http://goo.gl/8enHO
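A minimal sketch of option 3 using SQLite, with my own table and column names; an index on the timestamp column avoids the re-sorting cost mentioned above:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, data TEXT,"
                 " last_modified REAL)")
    # The index keeps rows retrievable in timestamp order without re-sorting.
    conn.execute("CREATE INDEX idx_lm ON records (last_modified)")

    def changed_since(last_sync: float):
        # Only rows touched after the device's last sync are transferred.
        return conn.execute(
            "SELECT id, data, last_modified FROM records"
            " WHERE last_modified > ? ORDER BY last_modified",
            (last_sync,),
        ).fetchall()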
Thanks a ton for your opinions! :D
I have not been involved in this directly myself, but I have been around when people were working on this sort of thing. Their design was driven not by algorithm analysis or a search for performance, but by hours spent talking to representatives of the end users about what to do when conflicting update requests were received. You might want to work through some use cases with users. It is even possible that users will want different sorts of conflict resolution for different sorts of data in different places.
All the designs here save bandwidth by propagating changes. If something ever causes one side to stop being an exact copy of the other, this inconsistency can persist indefinitely. You could at least detect such a problem by exchanging checksums (SHA-2 or SHA-3 if you are worried enough). One idea is to ask the recipient system for a checksum and then select a package of updates based on that checksum.
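For instance, both sides could hash a canonical dump of their rows and compare (a sketch with assumed row shapes; SHA-256 is the SHA-2 family member used here):

    import hashlib

    def table_checksum(rows):
        # rows: iterable of (id, data, last_modified) tuples; sort them so
        # both sides hash identical byte streams regardless of row order.
        h = hashlib.sha256()
        for row in sorted(rows):
            h.update(repr(row).encode("utf-8"))
        return h.hexdigest()

    # If the server and device digests differ, fall back to a full resync.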

Is the CELL-ID stored in the HLR database? How to get the location of a Cell-ID

Is it easy to get the current list of mobile phone users under a particular tower (cell ID) from the Home Location Register? And do network operators or service providers keep a mapping of location information, like latitude and longitude, to a particular cell?
To get the actual cell ID of a mobile phone from the HLR/VLR, you should use the SS7 signaling commands sendRoutingInfoForSM and provideSubscriberInfo.
To get a cell ID's latitude/longitude, you can use services like opencellid.org or locationapi.org.
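A sketch of such a lookup over HTTP; the endpoint and parameter names here are assumptions for illustration, so check the current opencellid.org API docs before relying on them:

    import requests

    # Hypothetical request shape - verify against the real OpenCelliD docs.
    resp = requests.get(
        "https://opencellid.org/cell/get",
        params={
            "key": "YOUR_API_KEY",   # placeholder
            "mcc": 262,              # mobile country code
            "mnc": 2,                # mobile network code
            "lac": 434,              # location area code
            "cellid": 23456,
        },
    )
    print(resp.json())  # expected to include lat/lon for the cell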
You will not get this information from the HLR.
Obviously operators know where their own cells are but it's often considered business intelligence and not public information.
There are some commercial and non-commercial lists of cell towers and their GPS locations, but that information may not be updated in real time and may misguide you, because of continuous operational tasks carried out by MNOs, such as moving cell stations to other locations, or installing or decommissioning them.
There are several ways to get the current cell ID of a mobile number, but all require integration with the MNO's equipment, using either the SS7 protocol suite or other available proprietary interfaces:
Sending the AnyTimeInterrogation MAP operation to the HLR and processing its response
Getting Location Update dumps by means of a core-network-specific interface, or by port mirroring on the HLR's L3 switches for all SIGTRAN links between the HLR and MSSs
Using the MNO's existing SMLC platform
But that alone would not be enough to build an up-to-date geolocation database. In any case, you would need to regularly get the CELL ID/LAC/GPS location list as an exportable dump from the MNO you intend to work with.
