Do MWS API report time limits apply to data being added or data within the report? - amazon-mws

I’m looking specifically at the FBA Customer Shipment Sales Report, but I believe the question applies more generally to most reports.
One of the columns in the report is the “Shipment Date”. When I request this report via the MWS API, I can specify a StartDate and an EndDate. Do these dates filter on the “Shipment Date” column, or do they instead filter based on the date that the data was added to the report?
For example, if an order ships at 2019-07-29T12:00:00Z, but Amazon doesn’t actually add it to the report until an hour later at 2019-07-29T13:00:00Z, then if I generate this report with an EndDate of 2019-07-29T12:00:00Z, will this shipment appear in the report? Or will it only appear if the EndDate is greater than or equal to 2019-07-29T13:00:00Z since that’s the time the shipment was actually added to the report?
I understand that this report is generally near real-time, so it may not matter 99% of the time, but I’m concerned about the rare cases where the data may be delayed coming into the report. I want to make sure I will still be able to see the new data based on my date filters.

I think I found my answer here: https://sellercentral.amazon.com/gp/help/200453120?language=en_US&ref=ag_200453120_cont_201074420
It says:
The report contains all completed shipments reported to FBA during the specified time period. This may not include all items that were shipped during that time frame if they have not yet been reported to our system. Those items will be reported in a future time period. This ensures that the report data will always be consistent for any given date range.
And:
Shipment dates are based on when the shipment was reported to the system, which is generally a few hours after the actual ship date. Other reports may calculate shipment dates differently.
So the answer is that the "Shipment Date" is the date the shipment was reported to the system and added to the report, which is not necessarily the same as the date and time the shipment actually took place.

Related

Add conditional columns and custom value columns to database-linked query

TL;DR version: I have a query linked to a database. I need to add some columns to it for data that isn't linked to the database, but don't know how.
I'm quite new to SQL and Access (got a reasonable grasp of Excel and VBA though) and have a pretty complex reporting task. I've got halfway (I think) but am stuck.
Purpose
A report showing how many (or what percentage of) delivery lines were late in a time period, with reasons for their being late from a set list, and analysis of what's the biggest cause of lateness.
Vague Plan
Create a table/query showing delivery lines with customer, required date and delivery date, plus a column to show whether they were on time, plus another to detail lateness reason. Summaries can be done afterwards in Excel. I'd like to be able to cycle through said table in form view entering lateness reasons (they'll be from a linked table, maybe 4 or 5 options).
Sticking Point
I have the data, but not the analysis. I've created the original data output query; it's linked to a live SQL database (Orderwise), so it keeps updating. However, I don't know:
how to add extra columns and lookups to it to work out / record whether each line is on time, and its lateness reason
how I'll be able to cycle through the late ones in form view to add reasons
How do I structure the access database so it can do this please?

Database Design: How do I handle tracking goals vs. actuals over time?

This isn't exactly a programming question, as I don't have an issue writing the code, but a database design question. I need to create an app that tracks sales goals vs. actual sales over time. The thing is that a person's goal can change (let's say daily, at most).
Also, a location can have multiple agents with different goals that need to be added together for the location.
I've considered basically running a timed task to save daily goals per agent into a field. It seems that over the years that will be a lot of data, but it would let me simply query a date range and add up all the daily goals to get a goal for that date range.
Otherwise, I guess I could simply record changes (e.g. March 2nd: 15 sales/week; April 12th: 16 sales/week), which would be less data, but much more programming work to figure out goals for a given time query.
I'm assuming there is probably a best practice for this - anyone?
Put a date range on your goals. The start of the range is when you set that goal. The end of the range starts off as max-collating date (often 9999-12-31, depending on your database).
Treat this as "until forever" or "until further notice".
When you want to know what goals were in effect as of a particular date, you would have something like this in your WHERE clause:
...
WHERE effective_date <= #AsOfDate
AND expiry_date > #AsOfDate
...
When you change a goal, you need two operations: first, update the existing record (if it exists) and set its expiry_date to the new as-of date; then insert a new record with an effective_date of the new as-of date and an expiry_date of forever (e.g. '9999-12-31').
This gives you the following benefits:
Minimum number of rows
No scheduled processes to take daily snapshots
Easy retrieval of effective records as of a point in time
Ready-made audit log of changes
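A minimal sketch of the effective/expiry-date pattern described above, using SQLite for illustration (the table name, column names, and FOREVER sentinel are assumptions, not from the original answer):

```python
import sqlite3

FOREVER = "9999-12-31"  # max-collating "until further notice" date

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE goals (
        agent_id       INTEGER,
        weekly_goal    INTEGER,
        effective_date TEXT,
        expiry_date    TEXT
    )
""")

def set_goal(conn, agent_id, weekly_goal, as_of):
    # Step 1: close off the currently open record, if any.
    conn.execute(
        "UPDATE goals SET expiry_date = ? WHERE agent_id = ? AND expiry_date = ?",
        (as_of, agent_id, FOREVER),
    )
    # Step 2: insert the new record, open-ended until the next change.
    conn.execute(
        "INSERT INTO goals VALUES (?, ?, ?, ?)",
        (agent_id, weekly_goal, as_of, FOREVER),
    )

def goal_as_of(conn, agent_id, as_of):
    # Point-in-time lookup: effective_date <= date < expiry_date.
    row = conn.execute(
        "SELECT weekly_goal FROM goals "
        "WHERE agent_id = ? AND effective_date <= ? AND expiry_date > ?",
        (agent_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None

set_goal(conn, 1, 15, "2023-03-02")
set_goal(conn, 1, 16, "2023-04-12")
print(goal_as_of(conn, 1, "2023-03-15"))  # 15
print(goal_as_of(conn, 1, "2023-05-01"))  # 16
```

Note that ISO-8601 date strings compare correctly as text, which is why the '9999-12-31' sentinel works even without a native DATE type.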

Hit Counter: Separate Date + Time Fields vs one DateTime2 field

I am planning out a hit counter, and I plan to make many report queries to show number of hits total in a day, the past week, the past month, etc, as well as one that would feed a chart that shows what time of day was most popular, within a specific date range, for a specific page.
With this in mind, would it be beneficial to store the DATE in a separate field from the TIME that the hit occurred, then add indexes? I would be using a WHERE clause with a range (greater than x and less than y) for some of these queries. I do expect to have queries that ask about both the date and the time, such as "within the past 6 months, show me the number of hits grouped per hour of the day."
Am I over complicating it? should I just use a single DateTime2(0) field or is there some advantage to using two fields for this?
I think you are bordering on premature optimization with this approach.
Use DateTime. In due time (i.e. after your application has reached production and you have a better idea of the actual requirements and how it performs), you can, for example, introduce views to aggregate your data in a way that proves more useful for any reporting/querying you have to perform frequently.
In the most extreme case you can even refactor your schema and migrate everything from Datetime to two distinct fields, but I doubt this will prove necessary.
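To illustrate why a single datetime column is usually enough: even the "both date and time" query from the question (hits per hour of day over a date range) works against one column. A sketch using SQLite, whose strftime stands in for SQL Server's DATEPART; the table and column names are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (page TEXT, hit_at TEXT)")
conn.executemany("INSERT INTO hits VALUES (?, ?)", [
    ("/home", "2023-06-01 09:15:00"),
    ("/home", "2023-06-01 09:45:00"),
    ("/home", "2023-06-02 17:30:00"),
])

# Range filter on the single column (index-friendly), then group by hour.
rows = conn.execute("""
    SELECT strftime('%H', hit_at) AS hour_of_day, COUNT(*) AS hits
    FROM hits
    WHERE hit_at >= '2023-06-01' AND hit_at < '2023-07-01'
    GROUP BY hour_of_day
    ORDER BY hour_of_day
""").fetchall()
print(rows)  # [('09', 2), ('17', 1)]
```

The range predicate stays sargable because the WHERE clause never wraps the column in a function; only the GROUP BY expression does.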

How to design a schema for periods of dates with exceptions?

The site is about special discount events. Each event contains a period of time (dates, to be more precise) during which it is valid. However, there will often be a constraint that the deal is not valid on, say, Saturdays and Sundays (or even on a specific day).
Currently my rough design would be to have two tables:
Event table store EventID, start and end date of the duration and all other things.
EventInvalidDate table stores EventID and the specific dates on which the deals are not valid. This requires the application code to calculate invalid dates upfront.
Does anyone know of a better pattern for this requirement, or possible pitfalls in my design? This requirement is like a subset of a general calendar model, because it does not require events repeating indefinitely into the future (i.e. each event has a definite duration).
UPDATE
My co-worker suggested to have a periods table with start and end dates. If the period is between 1/Jan and 7/Jan, with 3/Jan being an exception, the table would record: 1/Jan~2/Jan, 4/Jan~7/Jan.
Does anyone know whether this is better than, or the same as, the answer's approach in terms of SQL performance? Thanks.
Specifying which dates are not included might keep the number of database rows down, but it makes calculations, queries and reports more difficult.
I'd turn it upside down.
Have a master Event table that lists the first and last date of the event.
Also have a detail EventDates table that gets populated with all the dates where the event is available.
Taking this approach makes things easier to use, especially when writing queries and reports.
Update
Having a row per date allows you to do exact joins on dates to other tables, and allows you to aggregate per day for reporting purposes.
select ...
from sales
inner join eventDates
on sales.saleDate = eventDates.date
If your eventDates table uses start and end dates, the joins become harder to write:
select ...
from sales
inner join eventDates
on sales.saleDate >= eventDates.start and sales.SaleDate < eventDates.finish
Exact-match joins are definitely done by index, if available, in every RDBMS I've checked; range matches, as in the second example, I'm not sure about. They're probably OK from a performance perspective, unless you end up with a metric ton of data.
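A runnable sketch of the row-per-date approach above, using SQLite (the sample event, dates, and IDs are made up for illustration). Excluding a day is just a matter of not inserting its row, so the join needs no special-casing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eventDates (eventId INTEGER, date TEXT);
    CREATE TABLE sales (saleId INTEGER, saleDate TEXT);
""")

# Event 1 runs 1 Jan - 4 Jan with 3 Jan excluded: simply omit that row.
conn.executemany("INSERT INTO eventDates VALUES (1, ?)",
                 [("2023-01-01",), ("2023-01-02",), ("2023-01-04",)])
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(100, "2023-01-02"), (101, "2023-01-03")])

# The exact-match join from the answer: only sales on valid event days match.
rows = conn.execute("""
    SELECT s.saleId
    FROM sales s
    INNER JOIN eventDates e ON s.saleDate = e.date
""").fetchall()
print(rows)  # [(100,)] -- the 3 Jan sale falls on the excluded day
```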

Date / Time reference table needed for Analytic?

Is it better to keep day of month, month, year, day of week, and week of year as separate reference tables or in a common answer table? The goal is to allow user content searches and action analytics to be filtered by all the various date-time values (there will be custom reporting for users based on their shared content). I am trying to ensure data accuracy by using IDs, and also to report on the number of shares, etc. by time and date for system reporting, comparing various user groups. If we keep these in separate tables, what about time? Would a table with each hour, minute, and second also be needed?
Most databases support some sort of TIMESTAMP data type plus associated DAY(), MONTH(), DAYOFWEEK() functions.
The only valid reason for separate DAY or HOUR columns in a separate table is if you have precomputed totals and averages for each timeslot.
Even then, it's only worth it if you expect a lot of filtering based on these values, as the cost of building these tables is high, and, for most queries, the standard SQL "GROUP BY ... HAVING ..." will perform well enough.
It sounds like you may be interested in a "star schema" (see Wikipedia), a common method in data warehousing to speed up queries -- but be warned: designing and building a star schema is not a trivial exercise.
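To show why reference tables are usually unnecessary, here is a sketch of filtering and grouping by date-time parts using only built-in functions. SQLite's strftime is used (its %w gives 0 = Sunday); other databases offer DAY()/MONTH()/DAYOFWEEK() equivalents. The shares table and timestamps are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shares (shared_at TEXT)")
conn.executemany("INSERT INTO shares VALUES (?)", [
    ("2023-06-05 10:00:00",),  # a Monday
    ("2023-06-05 11:30:00",),
    ("2023-06-06 10:15:00",),  # a Tuesday
])

# Shares per day of week, computed on the fly -- no day-of-week lookup table.
rows = conn.execute("""
    SELECT strftime('%w', shared_at) AS dow, COUNT(*) AS n
    FROM shares
    GROUP BY dow
    ORDER BY dow
""").fetchall()
print(rows)  # [('1', 2), ('2', 1)]
```

The same pattern extends to hour (%H), month (%m), or week of year (%W), so no hour/minute/second tables are needed either.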