Take this site, for example:
https://www.textmagic.com/free-tools/carrier-lookup
It provides reverse carrier lookup for mobile numbers.
Could anyone explain how this works?
It looks as though there is a tool like "nslookup" behind it.
Anyone can say "we have a DB and serve it via the web."
My question is how the process behind it works.
If it is just crawling, what is the source?
The very page you linked asks and answers that question:
How does the phone carrier lookup tool work?
Our software uses two
databases: one for landline numbers, with their corresponding data,
and one for mobile numbers, with their corresponding data. You can
find any number from the United States, United Kingdom, Canada and
other countries. When you enter a phone number in the carrier lookup
service, our software compares it with the information in our database
and extracts information on it instantly.
Now, you might say that only begs the question and then want to know where the data in the database comes from. But at the very least, it answers your suspicion that there's an nslookup-like tool for phone numbers. According to the web site, they created the tool, and it looks like the web page is the only way you can access it.
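To make the "compares it with the information in our database" step concrete, here is a minimal sketch in Python. It assumes the two databases are keyed by number prefix and that a lookup is a longest-prefix match; the site does not say how its data is organised, so the table layout, the prefixes, and the carrier names below are all made up for illustration.

```python
# Hypothetical data: two tiny "databases" keyed by leading digits.
# The prefixes and carrier names are invented, not real assignments.
LANDLINE_DB = {"1212": "Example Landline Co"}
MOBILE_DB = {"1212555": "Example Mobile", "447700": "Sample UK Mobile"}


def lookup_carrier(number):
    """Return (line_type, carrier) for the longest matching prefix, or None."""
    # Keep only the digits, so "+1 (212) 555-0100" becomes "12125550100".
    digits = "".join(ch for ch in number if ch.isdigit())
    # Try the longest prefix first, then progressively shorter ones.
    for end in range(len(digits), 0, -1):
        prefix = digits[:end]
        if prefix in MOBILE_DB:
            return ("mobile", MOBILE_DB[prefix])
        if prefix in LANDLINE_DB:
            return ("landline", LANDLINE_DB[prefix])
    return None


print(lookup_carrier("+1 (212) 555-0100"))
print(lookup_carrier("+1 212 444 0000"))
```

A static prefix table like this cannot account for numbers that have been ported to another carrier, so a real service would need more than this.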
My team and I will be developing a POS system for a restaurant chain.
In addition to a Windows application for the POS, the main idea is to make a native (not HTML5) mobile application that will help in tracking orders.
The mobile app should do this -
1. Waiters will take and track orders.
2. The manager should be able to check employees' leave, timings, etc.
On the POS part -
We are thinking to make a particular restaurant work with its local database that will get synced with the central database.
Are we on the right track?
Can HTML5 help us in the mobile application?
How much are you willing to pay me to give you the architecture? Just kidding.
Think of a retailer that:
Has only 10 kinds of items
Sells only 5 of each item per day
Has only one staff member, who is family
Accepts only cash, not electronic payments
Do you think that kind of retailer needs a mobile application that does POS with cloud computing support? Of course not. He/she may only need a manually written ledger with simple debit/credit accounting to track income and expenses. But a big company like W*lmart will.
If you want to build an application, make sure you have the requirements. Why bother creating something that is not needed?
If you want to build a general POS application, collect the basic requirements first: what do they need? Will they enter items by typing the item code or by scanning a barcode? Will they take payment right after the items are entered (typical retail system), or after the customer asks for the bill (typical restaurant system)? The answers will later decide your architecture.
Now, to your questions:
In addition to a windows application for the POS, the main idea is to make a native (not HTML5) mobile application that will help in tracking orders
As I said before, the architecture will depend on the devices used and on how the process is carried out.
The mobile app should do this -
Waiters will take and track orders.
The manager should be able to check employees' leave, timings, etc.
The first point is a genuine requirement, but the second? Tracking employees' leave, timings, etc. belongs in an HR module, not a POS module. Keep your application's scope as small as possible and make sure each module covers a single topic.
On the POS part -
We are thinking to make a particular restaurant work with its local database that will get synced with the central database.
Try it and you will find out whether this distributed database design is good or not. Search for articles about the pros and cons of this design, and make sure you understand the consequences before you start designing.
Are we on the right track? Can HTML5 help us in the mobile application?
It is like asking: can a keyboard help us type? HTML5 is a markup language used for web front ends; it is essentially a rendering/layout technology. If you are targeting modern browsers, such as the latest version of Firefox, then go for it. But really, do you know how HTML5 can help you, and do you know the downsides of using it?
Since this app will only be used by employees of the restaurant
(not by customers), it would be safe to target a specific mobile
platform, Android or iOS.
Since the inventory, employees' leave, etc. are tracked locally in
each restaurant, the idea of using a local DB which syncs with
the remote DB seems the right approach.
The menu of items and their codes should be stored on the central DB
server and fetched from there.
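The local-DB-plus-central-DB idea can be sketched roughly as follows. This is only an illustration in Python with in-memory SQLite standing in for both databases; the table names, the columns, the `synced` flag and the `restaurant_id` value are all assumptions, and a real deployment would also have to handle conflicts, retries and network failures.

```python
import sqlite3


def make_local():
    """Per-restaurant database: orders are written here first."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE menu (item_code TEXT PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE orders (id TEXT PRIMARY KEY, item_code TEXT, "
        "qty INTEGER, synced INTEGER DEFAULT 0)"
    )
    return db


def make_central():
    """Central database: owns the menu and collects orders from all sites."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE menu (item_code TEXT PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE orders (restaurant_id TEXT, id TEXT, item_code TEXT, "
        "qty INTEGER, PRIMARY KEY (restaurant_id, id))"
    )
    return db


def pull_menu(local, central):
    """Replace the local menu with the central one."""
    rows = central.execute("SELECT item_code, name FROM menu").fetchall()
    with local:  # commits on success, rolls back on error
        local.execute("DELETE FROM menu")
        local.executemany("INSERT INTO menu VALUES (?, ?)", rows)


def push_orders(local, central, restaurant_id):
    """Copy not-yet-synced local orders to central; return how many were sent."""
    rows = local.execute(
        "SELECT id, item_code, qty FROM orders WHERE synced = 0"
    ).fetchall()
    with central:
        # INSERT OR IGNORE makes a repeated push of the same order harmless.
        central.executemany(
            "INSERT OR IGNORE INTO orders VALUES (?, ?, ?, ?)",
            [(restaurant_id, *row) for row in rows],
        )
    with local:
        local.executemany(
            "UPDATE orders SET synced = 1 WHERE id = ?", [(row[0],) for row in rows]
        )
    return len(rows)


local, central = make_local(), make_central()
with central:
    central.execute("INSERT INTO menu VALUES ('B01', 'Burger')")
pull_menu(local, central)
with local:
    local.execute("INSERT INTO orders (id, item_code, qty) VALUES ('o-1', 'B01', 2)")
sent = push_orders(local, central, "restaurant-7")
```

Because orders are written locally first and only flagged as synced after the central insert, the restaurant can keep taking orders while the connection to the central server is down.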
I feel like I should almost give a friggin synopsis to this/these lengthy question(s)..
I apologize if all of these questions have been answered specifically in a previous question/answer post, but I have been unable to locate any that specifically addresses all of the following queries.
This question involves data extraction from the web (i.e. web scraping, data mining, etc.). I have spent almost a year doing research into these fields and how they can be applied to a certain industry. I have also familiarized myself with PHP and MySQL/phpMyAdmin.
In a nutshell, I am looking for a way to extract information from a site (probably several gigs' worth) as fast and efficiently as possible. I have tried web scraping programs like Scrapy and WebHarvey. I have also experimented with programs like HTTrack. All have their strengths and weaknesses. I have found that WebHarvey works pretty well, yet it has its limitations when scraping images that are stored in gallery widgets. I also find that many of the sites I am extracting from use other methods to make mining data a pain. It would take months to extract the data using WebHarvey, which I can't complain about, given that I'd be extracting millions of rows of data exported in CSV format into Excel. But again, images and certain AJAX widgets throw the program off when trying to extract image files.
So my questions are as follows:
Are there any quicker ways to extract said data?
Is there any way to get around the WebHarvey image limitations (i.e. only being able to extract one image within a gallery widget / not being able to follow sub-page links on sites that embed their crap funny and try to get cute with coding)?
Are there any ways to bypass site search form parameters that limit the number of search results (i.e. obtaining all business listings within an entire state instead of being limited to a county per search form restrictions)?
Also, this is public information, so it cannot be copyrighted; anybody can take it :) (case in point: Feist Publications v. Rural Telephone Service). Extracting information is extracting information. It's legal to extract as long as we are talking facts/public information.
So with that said, wouldn't the most efficient method (grey area here) of extracting this "public" information (assuming vulnerabilities existed), be through the use of sql injection?... If one was so inclined? :)
As a side question, just how effective is Tor at obscuring one's IP address? Lol
Any help, feedback, suggestions or criticism would be greatly appreciated. I am by no means an expert in any of the above mentioned fields. I am just a motivated individual with a growing interest in programming and automation who has a lot of crazy ideas. Thank you.
You may be better off writing your own Linux command-line scraping program using either a headless browser like PhantomJS (JavaScript), or a test framework like Selenium WebDriver (Java).
Once you have your scraping program completed, you can then scale it up by installing it on a cloud server (e.g. Amazon EC2, Linode, Google Compute Engine or Microsoft Azure) and duplicating the server image as many times as required.
I am looking for a method of dynamically linking product information based on the name of the product.
For example: User types in "Playstation 3", the site would then go out and grab any information it can, such as picture, retail price, etc. Ideally, it would let you choose the correct item (returns both ps3 controller and ps3 console, user can choose which). It would then use this information in a product listing.
The easiest way I can think of to implement this is to use the existing API of a major retailer such as Amazon. I have a couple of completely different ideas for sites, one of which would involve selling from Amazon (which I would assume they would be OK with) and another which would only be data mining the information. I am concerned they would not take it very kindly if I was just stealing their images and descriptions.
Is there another, maybe less "sneaky", way to accomplish this that wouldn't be legally frowned upon?
Many web-commerce companies expose their data through an API: eBay, Etsy, and Amazon all have API feeds for their products. If you can convince the company to allow you access to their API (usually they will give you a key/password), then you can query their product data through it, typically at the read-only level. Depending on the company, you can just write to them directly to ask for access.
You are correct when you say that most companies wouldn't take kindly to someone web-scraping their product directory and re-using it. That is unethical, and could lead to big trouble with larger companies with a significant legal presence.
On the other hand, there is nothing to prevent you from cobbling together several API feeds into a Mash-Up - try Yahoo Pipes! to learn the basics of API/Mash-Up integration:
Yahoo Pipes:
http://pipes.yahoo.com/pipes/
Here is the link to Amazon's Product Advertising API program:
https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html
Good luck, and happy development!
Many online retailers provide a product feed - either well-publicized (William M-B has listed some examples), or sorta-kinda hidden, for the purposes of affiliate marketing. They usually have terms of use around those product feeds, describing in detail what you're allowed to do with them, and exactly how many of your limbs are at risk if you don't play by their rules.
However, the mechanism you're describing sounds remarkably similar to a search engine; there's a well-established precedent for search engines indexing sites, and using their content to reason about the underlying site. Get a lawyer to validate this, but there's a good chance that your intended purpose falls under "fair use".
I'm a representative of http://aerse.com.
We are building a service that does the following:
searches for products by name, for example: galaxy s3, galaxy s 3 or galaxy sIII
returns technical specifications (CPU, RAM, etc.) and product images (thumbnails and high-res images)
provides an API: http://aerse.com/p
deals with legal issues, provides licenses, etc.
I'm interested in finding songs based on attributes (minor key tonality, etc.). These are the things listed in the details of why Pandora picks songs, but when using Pandora, I have to give it songs/artists.
Is there any way to get the Music Genome database (or something similar) so I can search for songs based on attributes (that someone else has already cataloged)?
You can use Gracenote's Global Media Database and search with Track-level attributes.
"Gracenote's Media Technology Lab scientists and engineers take things further by utilizing technologies like Machine-Listening and Digital Signal Processing to create deep and detailed track level descriptors such as Mood and Tempo."
I don't think there is any way to access this proprietary data, something I asked them about long ago. It seems to me they want to protect this unique part of their system; after all, they've paid for the man hours to label each song. Even if Pandora releases a developer API, which they've hinted at, I doubt it will provide access to the Music Genome information.
Give Echo Nest a shot!
To add to the above answers, Pandora's statement (as viewed using the above link in combination with the Internet Archive) was:
"A number of folks also asked about the prospect for an open API, to allow individual developers to start building on the platform. We're not there yet, but it's certainly food for thought."
Given that this was seven years ago, I think their decision is pretty clear.
I'm in the planning stages of building a SQL Server DataMart for mail/email/SMS contact info and history. Each piece of data is located in a different external system. Because of this, email addresses do not have account numbers and SMS phone numbers do not have email addresses, etc. In other words, there isn't a shared primary key. Some data overlaps, but there isn't much I can do except keep the most complete version when duplicates arise.
Is there a best practice for building a DataMart with this data? Would it be an acceptable practice to create a key table with a column for each external key? Then, a unique primary ID can be assigned to tie this to other DataMart tables.
Looking for ideas/suggestions on approaches I may not have yet thought of.
Thanks.
The email address or phone number itself sounds like a suitable business key. Typically a "staging" database is used to load the data from multiple sources and then assign surrogate keys and do other transformations.
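As a rough illustration of the staging step, the sketch below keeps one key table with a column per external key and hands out a surrogate ID the first time a business key is seen. It uses Python with in-memory SQLite purely for demonstration (the question is about SQL Server), and the table name, column names and normalization rules are assumptions, not a recommended schema.

```python
import sqlite3

KEY_COLUMNS = ("email", "sms_number", "account_no")


def make_key_table():
    db = sqlite3.connect(":memory:")
    # One row per contact; each external system fills in its own column.
    db.execute(
        "CREATE TABLE contact_key ("
        "contact_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "email TEXT UNIQUE, sms_number TEXT UNIQUE, account_no TEXT UNIQUE)"
    )
    return db


def normalize(column, value):
    """Clean a business key so trivially different spellings match."""
    if column == "email":
        return value.strip().lower()
    if column == "sms_number":
        return "".join(ch for ch in value if ch.isdigit())
    return value.strip()


def get_or_create(db, column, value):
    """Return the surrogate key for a business key, creating it if needed."""
    if column not in KEY_COLUMNS:  # whitelist before building the SQL text
        raise ValueError(f"unknown key column: {column}")
    key = normalize(column, value)
    row = db.execute(
        f"SELECT contact_id FROM contact_key WHERE {column} = ?", (key,)
    ).fetchone()
    if row:
        return row[0]
    with db:
        cur = db.execute(f"INSERT INTO contact_key ({column}) VALUES (?)", (key,))
    return cur.lastrowid


db = make_key_table()
first = get_or_create(db, "email", " Ann@Example.com ")
again = get_or_create(db, "email", "ann@example.com")
phone = get_or_create(db, "sms_number", "+1 (555) 010-0000")
```

With no shared key between the sources, the email row and the phone row stay separate contacts here; merging them would need some additional matching information.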
Are you familiar with data warehouse methods and design patterns? If you don't have previous knowledge or experience then consider hiring some help. BI / data warehouse projects have a very high failure rate and mistakes can be expensive.
Found more information here:
http://en.wikipedia.org/wiki/Extract,_transform,_load#Dealing_with_keys
Well, with no other information to tie the disparate pieces together, your datamart is going to be pretty rudimentary. You'll be able to get the types of data (sms, email, mail), metrics for each type over time ("this week/month/quarter/year we averaged 42.5 sms texts per day, and 8000 emails per month! w00t!"). With just phone numbers and email addresses, your "other datamarts" will likely have to be phone company names, or internet domains. I guess you could link from that into some sort of geographical information (internet provider locations?), or maybe financial information for the companies. Kind of a blur if you don't already know which direction you want to head.
To be honest, this sounds like someone high-up is having a knee-jerk reaction to the "datamart" buzzword coupled with hearing something about how important communication metrics are, so they sent orders on down the chain to "get us some datamarts to run stats on all our e-mails!"
You need to figure out what it is that you or your employer is expecting to get out of this project, and then figure out if the data you're currently collecting gives you a trail to follow to that information. Right now it sounds like you're doing it backwards ("I have this data, what's it good for?"). It's entirely possible that you don't currently have the data you need, which means you'll need to buy it (who knows if you could) or start collecting it, in which case you won't have nice looking graphs and trend-lines for upper-management to look at for some time... falling right in line with the warning dportas gave you in his second paragraph ;)