Amazon MWS: Convert ASIN to EAN/UPC

I have a list of ASINs and need to get the corresponding EAN/UPC values.
I am aware this is possible using AWSECommerceService and ItemLookup call. However, my application already uses MWS, and I'd like to avoid using two APIs, two access keys, etc.
The most similar API call in MWS is GetMatchingProduct. However, the returned data does not include an EAN/UPC. I would be astonished if this were impossible with MWS, but I can't see any way to get the EAN/UPC.
Any suggestions appreciated,
Paul

I don't think there is a call that does what you want. There is a call that does the opposite, if that is of any help: GetMatchingProductForId will return the ASIN for a given EAN or UPC. Why the result from this call (and from GetMatchingProduct) does not include EANs etc. is beyond me.
If you already have items listed through MWS, the _GET_MERCHANT_LISTINGS_DATA_ report might help.
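If you go the report route, here is a minimal sketch of pulling an ASIN-to-identifier map out of a downloaded report. It assumes the report is tab-delimited text with a header row, and that the relevant columns are called asin1, product-id and product-id-type. Those column names are assumptions on my part, so check them against a real report. The function name and the sample row are made up for illustration.

```python
import csv
import io

def asin_to_product_ids(report_text, asin_col="asin1",
                        id_col="product-id", type_col="product-id-type"):
    """Map each ASIN in a tab-delimited report to (id type, id value).

    The default column names are assumptions; pass your own if the
    header row of your report differs.
    """
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    mapping = {}
    for row in reader:
        asin = (row.get(asin_col) or "").strip()
        if asin:
            id_type = (row.get(type_col) or "").strip()
            id_value = (row.get(id_col) or "").strip()
            mapping[asin] = (id_type, id_value)
    return mapping

# Hypothetical sample data, not real report output.
sample = (
    "seller-sku\tasin1\tproduct-id-type\tproduct-id\n"
    "SKU-1\tB000000001\t4\t4006381333931\n"
)
print(asin_to_product_ids(sample))
```

Whether the identifier column actually holds an EAN/UPC rather than the ASIN itself depends on how the listing was created, so treat this as a starting point only.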

Just answering this question for my own amusement and because I might need it in the future when I have forgotten I previously looked at this.
Amazon apparently consider EAN for ASIN/SellerSKU proprietary information which is why their standard seller APIs don't return it. This doesn't make a huge amount of sense to me personally because you would think that it would at least return them for your own products (when specifying your own sku and authentication information.)
I've combed the documentation, mws forums and also asked Amazon directly but it looks like it's not available through standard APIs.
I've read somewhere that it may be possible via APIs available to associates, but that's not me, so it remains a rumour.

Related

How to not expose base64 encoded UUIDs

I have a doubt regarding the exposure of internal database primary keys.
I have decided to use UUIDs in place of auto-increment longs (see here for details). This way, among other things, people cannot discover the relative size of my data or their growth over time.
Now, the UUID doesn't provide any internal information but it is not very URL friendly, although it is URL safe. Furthermore if long PKs shouldn't be exposed, then UUIDs shouldn't either.
Usually to make UUIDs more user friendly, people base64 encode them.
Example:
- UUID: 7b3149e7-bdab-4895-b659-a5f5b0d0
- base64: ezFJ572rSJW2WQAApfWw0A
My point is: anyone could still take that base64 string from the URL and decode it to obtain the original UUID. This means that even in this case the UUID would end up being exposed as well.
Should I use another type of encoding? Is there something well known out there already, or should I create my own custom encoding? If so, should I follow any guidelines?
Thank you
At first glance, to give those identifiers a small degree of secrecy you could use a one-way hash function such as SHA-2 (which is a cryptographic function, not an encoding). However, this buys you no real security advantage.
If you are relying only on object reference IDs for access control and trying to keep them secret, then I suggest you think twice about your access control and authorization model.
It is good to have random, non-guessable, collision-free object reference IDs. However, if you rely on the secrecy of a reference ID for security, that is a big flaw (in older OWASP Top 10 lists this was called Insecure Direct Object References, and in the OWASP Top 10 2017 it falls under Broken Access Control). You need to consider the full AAA chain: authentication, authorization, and audit/accountability. Access should rely on a random, unique token with a short validity period that is tied to a subject; the system then uses that token to decide the subject's authorization and access levels and to permit interaction only with the objects the subject is entitled to.
The reason you aren't supposed to expose PKs is that they may (a) leak information and (b) allow people to guess other values. Neither is true of UUIDs (at least v3/4/5), which is one of the main reasons to use them in the first place. The human factor you mention is why so many folks use base64 (or other) encoding; it's not for security.
That said, you should never rely on URL secrecy as security; there are far too many ways that URLs leak, and your users may even do it intentionally--but they'd be very upset if sending a link to their friend meant that friend had full access to their account.
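To illustrate that base64 is only a more compact spelling of the same 128 bits, here is a small sketch using nothing but the standard library. The helper names are mine; the encoded string is the one from the question.

```python
import base64
import uuid

def uuid_to_b64(u):
    """Encode a UUID as unpadded URL-safe base64 (22 characters)."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")

def b64_to_uuid(s):
    """Reverse uuid_to_b64: restore the padding, then decode the 16 bytes."""
    padded = s + "=" * (-len(s) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))

# Anyone who sees the URL can do this; no key or secret is involved.
recovered = b64_to_uuid("ezFJ572rSJW2WQAApfWw0A")
print(recovered)

# Round trip with a freshly generated random UUID.
fresh = uuid.uuid4()
print(b64_to_uuid(uuid_to_b64(fresh)) == fresh)
```

So the encoding helps with URL length and readability, and access control still has to happen on the server.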

google app engine computedProperty: when to use? When not to use?

When would using a ComputedProperty (ndb) in google app engine give you a distinct advantage over just computing when needed, on the backend (such as in a handler), without the datastore being involved?
Everything I'm reading seems to indicate that it's mostly useless, and would just slow queries down (at least the put operation if nothing else).
Thoughts?
I did see this:
"Note: Use ComputedProperty if the application queries for the computed value. If you just want to use the derived version in Python code, define a regular method or use Python's @property built-in."
but that doesn't really explain any advantage (why query if you can derive?)
The documentation is quite clear in that regard, and I'll cite it again for reference, from the Computed Properties section:
Note: Use ComputedProperty if the application queries for the computed value. If you just want to use the derived version in Python code, define a regular method or use Python's @property built-in.
When to use it? When you need to query some derived data, it needs to be written to the datastore so it gets indexed.
First example that came to mind: you're already storing the birthday of a user, but you also need to filter by actual age. Adding a property that derives that value might be the easiest and most efficient solution:
age = ndb.ComputedProperty(lambda self: calc_age(self.birthday))
Of course you could just have a function that returns the age, but that's only useful after you get the entity; you can't use it in queries.
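The line above calls calc_age, which isn't defined in the answer. A minimal sketch of what it could look like, assuming birthday is a datetime.date (the optional today parameter is my addition, to make it easy to test):

```python
from datetime import date

def calc_age(birthday, today=None):
    """Return the age in whole years on the given day (default: today)."""
    today = today or date.today()
    # Has this year's birthday already happened (or is it today)?
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)

print(calc_age(date(1990, 6, 15), today=date(2020, 6, 14)))
print(calc_age(date(1990, 6, 15), today=date(2020, 6, 15)))
```

One caveat, as far as I understand it: the value queries see is the one stored when the entity was last written, so a time-dependent value like age can go stale until the entity is put again.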

Understanding vCloud statuses

I'm trying to wrap my mind around the statuses that vCloud returns in their SDK, but there seems to be very light documentation on them. There are a few whose meaning I don't understand, and in practice I'm only seeing POWERED_ON, POWERED_OFF, and SUSPENDED. The only documentation on the statuses that I can find is here:
http://www.vmware.com/support/vcd/doc/rest-api-doc-1.5-html/operations/GET-VApp.html
What confuses me are things like "what is an 'entity'? And what does it mean when it's 'resolved'?" When I go to provision a VM and monitor its state, it starts at POWERED_OFF and goes to POWERED_ON, when I would expect to see some intermediary statuses while it's in the process of provisioning. Does anyone know where I can go to find out more about this?
This page from the vCD 5.1 documentation shows the possible values of the status field for various entities. The current doc uses numerical values but the API also has a few spots where string values are returned instead. The reference you found from the 1.5 API includes some of them; I think as part of the 5.1 doc update the string values were dropped from the schema reference.
An entity in the vCloud API is very similar to the likewise-named notion in database modeling. Wikipedia provides a fair definition of the term from entity-relationship modeling:
An entity may be defined as a thing which is recognized as being
capable of an independent existence and which can be uniquely
identified.
The RESOLVED (numerical value 1) state means that most of the parts of the entity are present, but it isn't fully constructed yet. You typically see it when uploading an OVF: all of the bits have been transferred to vCD, but stuff is still happening in the background before it becomes usable.

Best approach for retrieving data in a nested 1-to-many relationship

I have the following models: Client, Device and Revision
Client has many Device, and Device has many Revision
I want to retrieve all the latest revisions from a Client. I know I can use recursive=2 in client and I will get revisions like this:
$client['Client']['Device'][$i]['Revision'] //(array of revisions)
What I actually want is to get the latest revisions like this:
$client['Client']['Revision'][$i] //(array of revisions)
$client['Client']['Revision'][$i]['Device'] //and I may see the device like this. I know this is duplicated info, but the order is pretty much different.
I know there are plenty of ways of doing it. Most of the ones I can think of involve using direct SQL or processing the arrays, but is there actually a method that can do this just by passing parameters?
I also thought of first getting all the device_ids of a certain client, and then finding all revisions where device_id IN (device1_id, device2_id, ...), but I don't really like this, though I'm not sure if there is a better way.

What are some techniques for stored database keys in URL

I have read that using database keys in a URL is a bad thing to do.
For instance,
My table has 3 fields: ID:int, Title:nvarchar(5), Description:Text
I want to create a page that displays a record. Something like ...
http://server/viewitem.aspx?id=1234
First off, could someone elaborate on why this is a bad thing to do?
and secondly, what are some ways to work around using primary keys in a url?
I think it's perfectly reasonable to use primary keys in the URL.
Some considerations, however:
1) Avoid SQL injection attacks. If you just blindly accept the value of the id URL parameter and pass it into the DB, you are at risk. Make sure you sanitise the input so that it matches whatever format of key you have (e.g. strip any non-numeric characters).
2) SEO. It helps if your URL contains some context about the item (e.g. "big fluffy rabbit" rather than 1234). This helps search engines see that your page is relevant. It can also be useful for your users (I can tell from my browser history which record is which without having to remember a number).
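A minimal sketch of consideration 1), using Python's built-in sqlite3 just so it runs anywhere; the table and function names are made up. Rather than stripping characters, it rejects anything that isn't an integer, and it passes the value as a bound parameter instead of building the SQL string by hand.

```python
import sqlite3

def get_item(conn, raw_id):
    """Look up one item by the id taken from the URL, or return None."""
    try:
        item_id = int(raw_id)  # reject anything that is not an integer
    except (TypeError, ValueError):
        return None
    # The "?" placeholder keeps the value out of the SQL text entirely.
    cur = conn.execute("SELECT id, title FROM items WHERE id = ?", (item_id,))
    return cur.fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO items VALUES (1234, 'big fluffy rabbit')")

print(get_item(conn, "1234"))
print(get_item(conn, "1234 OR 1=1"))
```

The same idea applies with any database driver that supports parameterized queries; only the placeholder syntax changes.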
It's not inherently a bad thing to do, but it has some caveats.
Caveat one is that someone can type in different keys and maybe pull up data you didn't want / expect them to get at. You can reduce the chance that this is successful by increasing your key space (for example making ids random 64 bit numbers).
Caveat two is that if you're running a public service and you have competitors they may be able to extract business information from your keys if they are monotonic. Example: create a post today, create a post in a week, compare Ids and you have extracted the rate at which posts are being made.
Caveat three is that it's prone to SQL injection attacks. But you'd never make those mistakes, right?
Using IDs in the URL is not necessarily bad. This site uses it, despite being done by professionals.
How can they be dangerous? When users are allowed to update or delete entries belonging to them, developers implement some sort of authentication, but they often forget to check if the entry really belongs to you. A malicious user could form a URL like "/questions/12345/delete" when he notices that "12345" belongs to you, and it would be deleted.
Programmers should ensure that a database entry with an arbitrary ID really belongs to the current logged-in user before performing such operation.
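One way to sketch that check is to make ownership part of the statement itself, so a row is only touched when it belongs to the logged-in user. This uses sqlite3 for the sake of a runnable example; the table, columns and function name are hypothetical.

```python
import sqlite3

def delete_question(conn, question_id, current_user_id):
    """Delete the question only if it belongs to current_user_id.

    Returns True if a row was deleted, False otherwise.
    """
    cur = conn.execute(
        "DELETE FROM questions WHERE id = ? AND owner_id = ?",
        (question_id, current_user_id),
    )
    conn.commit()
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, owner_id INTEGER)")
conn.execute("INSERT INTO questions VALUES (12345, 1)")

print(delete_question(conn, 12345, current_user_id=2))  # someone else's entry
print(delete_question(conn, 12345, current_user_id=1))  # the real owner
```

The current user id has to come from the server-side session, never from the request itself.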
Sometimes there are strong reasons to avoid exposing IDs in the URL. In such cases, developers often generate random hashes that they store for each entry and use those in the URL. A malicious person tampering in the URL bar would have a hard time guessing a hash that would belong to some other user.
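A rough sketch of the random-value approach, assuming you store the generated token in an extra column next to the real key. Here a dict stands in for that column, and the function name is mine. Strictly speaking this is a random token rather than a hash of anything.

```python
import secrets

def new_public_token(existing_tokens):
    """Return a random URL-safe token that is not already in use."""
    while True:
        token = secrets.token_urlsafe(16)  # 16 random bytes, 22 characters
        if token not in existing_tokens:
            return token

# Stand-in for a table with a unique "public token" column.
token_to_id = {}
for internal_id in (1, 2, 3):
    token_to_id[new_public_token(token_to_id)] = internal_id

some_token = next(iter(token_to_id))
print(some_token, "->", token_to_id[some_token])
```

Guessing another user's token is impractical with 128 random bits, but this still doesn't replace the ownership check, since URLs get shared and leaked.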
Security and privacy are the main reasons to avoid doing this. Any information that gives away your data structure is more information that a hacker can use to access your database. As mopoke says, you also expose yourself to SQL injection attacks, which are fairly common and can be extremely harmful to your database and application. From a privacy standpoint, if you are displaying any information that is sensitive or personal, anybody can just substitute a number to retrieve information, and if you have no mechanism for authentication, you could be putting your information at risk. Also, if it's that easy to query your database, you open yourself up to Denial of Service attacks, with someone just looping through URLs against your server since they know each one will get a response.
Regardless of the nature of the data, I tend to recommend against sharing anything in the URL that could give away anything about your application's architecture; it seems to me you are just inviting trouble (I feel the same way about hidden fields, which aren't really hidden).
To get around it, we usually encrypt the parameters before passing them. In some cases, the encrypted URL also includes some form of verification/authentication mechanism so the server can decide if it's OK to process.
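As an illustration of the verification part only, here is a sketch that signs the id with an HMAC from the standard library. Note this is signing, not encryption: the id stays readable, but the server can tell whether the value was tampered with. The key, the "id-signature" format and the function names are all assumptions for the example.

```python
import hashlib
import hmac

# Assumption: in a real application this comes from configuration.
SECRET_KEY = b"replace-with-a-real-secret"

def sign_id(item_id):
    """Return 'id-signature' for use as a URL parameter."""
    msg = str(item_id).encode("ascii")
    tag = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return f"{item_id}-{tag}"

def verify_signed_id(value):
    """Return the integer id if the signature matches, else None."""
    raw_id, sep, tag = value.partition("-")
    if not sep or not (raw_id.isascii() and raw_id.isdigit()):
        return None
    expected = hmac.new(SECRET_KEY, raw_id.encode("ascii"),
                        hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking how much of the tag matched.
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return int(raw_id)
    return None

signed = sign_id(1234)
print(verify_signed_id(signed))
print(verify_signed_id(signed.replace("1234", "1235", 1)))
```

If the id itself must be hidden, you would need real encryption from a vetted library on top of (or instead of) this.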
Of course every application is different and the level of security you want to implement has to be balanced with functionality, budget, performance, etc. But I don't see anything wrong with being paranoid when it comes to data security.
It's a bit pedantic at times, but you want to use a unique business identifier for things rather than the surrogate key.
It can be as simple as ItemNumber instead of Id.
The Id is a db concern, not a business/user concern.
Using integer primary keys in a URL is a security risk. It is quite easy for someone to post using any number. For example, through normal web application use, the user creates a user record with an ID of 45 (viewitem/id/45). This means the user automatically knows there are 44 other users. And unless you have a correct authorization system in place, they can see other users' information by creating their own URL (viewitem/id/32).
2a. Use proper authorization.
2b. Use GUIDs for primary keys.
Showing the key itself isn't inherently bad because it holds no real meaning, but showing the means to obtain access to an item is bad.
For instance, say you had an online store that sold stuff from 2 merchants. Merchant A had items (1, 3, 5, 7) and Merchant B had items (2, 4, 5, 8).
If I am shopping on Merchant A's site and see:
http://server/viewitem.aspx?id=1
I could then try to fiddle with it and type:
http://server/viewitem.aspx?id=2
That might let me access an item that I shouldn't be accessing, since I am shopping with Merchant A and not B. In general, allowing users to fiddle with stuff like that can lead to security problems. Another brief example is employees who can look at their own personal information (id=382) but type in someone else's ID to go directly to that person's profile.
Now, having said that.. this is not bad as long as security checks are built into the system that check to make sure people are doing what they are supposed to (ex: not shopping with another merchant or not viewing another employee).
One mechanism is to store information in sessions, but some do not like that. I am not a web programmer so I will not go into that :)
The main thing is to make sure the system is secure. Never trust data that came back from the user.
Everybody seems to be posting the "problems" with using this technique, but I haven't seen any solutions. What are the alternatives? There has to be something in the URL that uniquely defines what you want to display to the user. The only other solution I can think of would be to run your entire site off forms, and have the browser post the value to the server. This is a little trickier to code, as all links need to be form submits. Also, it's only minimally harder for users of the site to put in whatever value they wish. And it wouldn't allow the user to bookmark anything, which is a major disadvantage.
@John Virgolino mentioned encrypting the entire query string, which could help with this process. However, it seems like going a little too far for most applications.
I've been reading about this, looking for a solution, but as @Kibbee says there is no real consensus.
I can think of a few possible solutions:
1) If your table uses integer keys (likely), add a check-sum digit to the identifier. That way, (simple) injection attacks will usually fail. On receiving the request, simply remove the check-sum digit and check that it still matches; if it doesn't, then you know the URL has been tampered with. This method also hides your "rate of growth" (somewhat).
2) When storing the DB record initially, save a "secondary key" or value that you are happy to be a public id. This has to be unique and usually not sequential - examples are a UUID/Guid or a hash (MD5) of the integer ID e.g. http://server/item.aspx?id=AbD3sTGgxkjero (but be careful of characters that are not compatible with http). Nb. the secondary field will need to be indexed, and you will lose benefits of clustering that you get in 1).
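For option 1), a small sketch using the Luhn algorithm as the check digit. That choice is mine; any check-digit scheme would do, and the function names are made up.

```python
def luhn_check_digit(number):
    """Compute the Luhn check digit for a non-negative integer."""
    total = 0
    # Walk the digits right to left, doubling every second one,
    # starting with the rightmost digit of the payload.
    for i, ch in enumerate(reversed(str(number))):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10

def add_check_digit(internal_id):
    """Turn an internal id into the public id shown in URLs."""
    return internal_id * 10 + luhn_check_digit(internal_id)

def strip_check_digit(public_id):
    """Return the internal id, or None if the check digit is wrong."""
    if public_id < 0:
        return None
    internal_id, digit = divmod(public_id, 10)
    return internal_id if luhn_check_digit(internal_id) == digit else None

public = add_check_digit(1234)
print(public, strip_check_digit(public), strip_check_digit(public + 1))
```

Keep in mind that anyone who works out the scheme can compute valid check digits, so this catches typos and casual fiddling with the URL; it is not a substitute for authorization checks.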

Resources