MS SQL Server, arrays, looping, and inserting qualified data into a table

I've searched around for an answer and I'm not sure how best to frame the question since I'm rather new to SQL Server.
Here's what I got going on: I get a weekly report detailing the products that have been sold and the quantity of each. This data needs to go into a yearly totals table. In this table the first column is the product_id and the next 52 columns are week numbers, 1-52.
There's a JOIN that runs on the product_id of both the weekly and yearly tables. That finds the proper row and column in which to put the weekly quantity data for that product.
Here's where I'm not sure what to do. In 2019 there are no product_ids in that column yet, so there's nothing to JOIN on. Those product_ids need to be added weekly if they aren't already there. I need to take the weekly report of product_ids and quantities and check each product_id to see if it's in the yearly table. If not, I need to add it.
If I had it my way I'd create an array of the product_id numbers from the weekly data and loop through each one creating a new record in the yearly table for any product_id that is not already there. I don't know how best to do that in SSMS.
I've searched around and have found different strategies for this, but nothing strikes me as a perfect solution: creating a #temp table or table variable, an EXCEPT (exclusion) query to get just those that aren't in the table, and a WHILE loop. Any suggestions would be helpful.

I ended up using a MERGE to solve this. I create a table WeeklyParts to dump the weekly data into. Then I do a MERGE with the yearly table, inserting only the rows where there is no match. Works well.
-- Merge the PartNos so that only unique ones are added to the yearly table
MERGE INTO dbo.WeeklySales2018 AS yearly
USING dbo.WeeklyParts AS weekly
    ON (yearly.PartNo = weekly.PartNo)
WHEN NOT MATCHED THEN
    INSERT (PartNo) VALUES (weekly.PartNo);
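For the other half of the weekly load (the JOIN described above that puts each part's quantity into the right week column), here is a minimal sketch of the join-based update. The Week23 and Qty column names are assumptions for illustration, not taken from the original post:
-- Hypothetical example: write week 23's quantities into the yearly table.
-- Week23 and Qty are assumed column names.
UPDATE yearly
SET yearly.Week23 = weekly.Qty
FROM dbo.WeeklySales2018 AS yearly
INNER JOIN dbo.WeeklyParts AS weekly
    ON yearly.PartNo = weekly.PartNo;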

Related

How to create a table with days column in react js?

I am creating an attendance management system right now. I am stuck at one point: I need to display every employee's attendance report in a single table, and in that table I need to display name, days, and total hours of the month columns. I have tried a few ways to create the days columns but failed to find a proper way to build this table, so please help me with how I can create it.
I want the final output to look like this:

How to replace a column's value with another column's value in SQL Server? And how to save average of a column in another column?

I have a table called Teams, which lists teamname and teamid, and another table called Players that is related to Teams through the teamname primary key. However, I want to replace the teamname column in the Players table with teamid, as that would be more performant.
How could I manage to do it?
Also, each player has an overall column. I want to calculate the average overall of a team (the sum of the overalls of the players belonging to that team, via a join, divided by the number of players) and save it in the Teams table for each team. I searched on the internet but did not find whether it's possible in SSMS.
Thank you so much for taking your time to help me:) I understand most of select, update, delete statements and inner joins but could not find a way to save the information in a table for each team.
Simple! Make a new column team_id and then UPDATE using a join between teams and players.
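A minimal sketch of that approach, assuming the Players table currently stores teamname and the Teams table has teamid and teamname (the exact column names are guesses based on the question):
-- Add the new column first (run the ALTER in its own batch), then fill it via a join on the team name.
ALTER TABLE Players ADD team_id INT;
GO

UPDATE p
SET p.team_id = t.teamid
FROM Players AS p
INNER JOIN Teams AS t
    ON t.teamname = p.teamname;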
You could use a "computed column" for this: https://learn.microsoft.com/en-us/sql/relational-databases/tables/specify-computed-columns-in-a-table?view=sql-server-ver15
I finally figured out how to do the second part. I had to use GROUP BY to be able to do the average for each team separately.
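A minimal sketch of that GROUP BY approach, assuming an average_overall column has been added to Teams and that Players has overall and team_id columns (names are assumptions):
-- Average each team's player overalls and store the result on the Teams row.
UPDATE t
SET t.average_overall = a.avg_overall
FROM Teams AS t
INNER JOIN (
    SELECT team_id, AVG(CAST(overall AS FLOAT)) AS avg_overall
    FROM Players
    GROUP BY team_id
) AS a
    ON a.team_id = t.teamid;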

MS Excel - Find Missing Date values between two dates in a column full of dates

I have a master list of SKUs in one table set as a column with a different SKU listed in each row.
I have another table with a daily snapshot of the items I have in stock between 7/1/17 and 7/31/17, one row per item per day. The table shows the item SKU in one column, the warehouse holding quantity in another column, and the quantity available within that warehouse in a third column. There can be multiple occurrences of a SKU on one date if there is quantity in multiple warehouses. The table only lists a SKU on a date when there is quantity in some warehouse; if a SKU has no quantity in any of the warehouses on a date, it will not be listed for that date.
In my table with the master list of SKUs I want to create a column that shows the number of days within a range of dates (7/1/17 to 7/8/17) on which there was no quantity of the referenced SKU available in any warehouse.
To show a more precise idea of what it is I am trying to do, I have posted a youtube video here: http://youtu.be/6CWLN6wzaWQ?hd=1
Have a look at the following URLs - I feel partially similar cases are discussed and resolved there.
They should give you a lead for further work.
How to return multiple values between two dates in excel?
https://answers.microsoft.com/en-us/office/forum/office_2007-excel/finding-missing-dates-in-column/8c49800a-6997-4585-b1f4-abdeaa64e718

ORA-01858 while inserting data from another table with the same schema in the same database

I am an application programmer, but currently I have a situation in which I need to copy a huge amount of data collected over 1 month, approximately 653 GB, from a table in one database to an identical table in another database (both Oracle 11g). Each row is approximately 150 bytes, so the number of rows is roughly 4,000 million. I am not joking.
I have to do this. The table which holds this data (the source table) is partitioned on a date column, so there is a partition for each day of the month and hence 31 partitions in total for December.
The table in the target db is partitioned by month, so there is a single partition in the target db for the whole of December.
I have chosen to copy the data over a db link, and with the help of the DBAs I created a db link between these 2 databases.
I have a stored procedure in the target db which accepts (date, tablename) as input parameters. This procedure creates a temporary table in the target db with the name tablename and copies all the data for the given date from the source db into this temporary table in the target database. I have done this successfully for 2-3 days of data. Now I want to insert the data from the temporary table into the actual table in the same target database. For that I executed the following query:
insert into schemaname.target_table select * from schemaname.temp_table;
But I am getting the following ORA error.
ORA-01858: a non-numeric character was found where a numeric was expected
Both tables have exactly the same definition. I searched the internet for ways to copy data and found the above insert to be the simplest. But I don't understand the error. Searching for this error suggests it has something to do with a date column. But shouldn't it work, since both tables have the same structure?
Data types used in the table are varchar2(x), date, number(x,y), char(x).
Please help me to get over this error. Let me know if any other information is required.
It means your schemaname.temp_table has some non-numeric value which cannot be inserted into your new table. Was schemaname.temp_table populated by a script or some automated tool? There is a possibility that an empty space or junk character was inserted into schemaname.temp_table. Kindly check once again using any SQL tool.
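Since SELECT * maps columns by position rather than by name, one thing worth ruling out (this is only a guess at the cause) is a positional mismatch between the two definitions. Listing the columns explicitly on both sides avoids that; the column names below are placeholders, not the real ones:
-- Hypothetical column names; an explicit list guarantees each source column
-- lands in the intended target column instead of relying on column position.
INSERT INTO schemaname.target_table (id, part_code, created_date, amount)
SELECT id, part_code, created_date, amount
FROM schemaname.temp_table;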

data synchronization from unreliable data source to SQL table

I am looking for a pattern, framework or best practice to handle the generic problem of application-level data synchronisation.
Let's take an example with only 1 table to make it easier.
I have an unreliable data source for a product catalog. Data can occasionally be unavailable, incomplete or inconsistent (the issue might come from manual data entry errors, ETL failures...).
I have a live copy in a MySQL table in use by a live system, let's say a website.
I need to implement safety mechanisms when updating the MySQL table to "synchronize" with the original data source. Here are the safety criteria and the solutions I am suggesting:
avoid deleting records when they temporarily disappear from the data source => use a "deleted" boolean/date column or an archive/history table.
check for inconsistent changes => configure rules per column, such as: should never change, should only increment.
check for integrity issues => (standard problem, no point discussing the approach)
ability to roll back the last sync => restore from a history table? use a version increment/date column?
What I am looking for are best practices and patterns/tools to handle such a problem. Even if you are not pointing to THE solution, I would be grateful for any keyword suggestions that would help me narrow down which field of expertise to explore.
We have the same problem importing data from web analytics providers - they suffer from the same issues as your catalog. This is what we did (a minimal schema sketch follows after the rollback steps below):
Every import/sync is assigned a unique id (auto_increment int64)
Every table has a history table that is identical to the original, but has an additional column "superseded_id" which gets the import id of the import that changed the row (deletion is a change), and whose primary key is (row_id, superseded_id)
Every UPDATE copies the row to the history table before changing it
Every DELETE moves the row to the history table
This makes rollback very easy:
Find out the import_id of the bad import
REPLACE INTO main_table SELECT <everything but superseded_id> FROM history_table WHERE superseded_id=<bad import id>
DELETE FROM history_table WHERE superseded_id>=<bad import id>
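A minimal sketch of the history-table layout and the copy-before-change step, in MySQL, with table and column names that are illustrative only (a simple made-up products table stands in for the real one):
-- History table: same columns as the main table plus superseded_id,
-- with the primary key widened to (product_id, superseded_id).
CREATE TABLE products_history (
    product_id    BIGINT       NOT NULL,
    price         DECIMAL(10,2),
    description   VARCHAR(255),
    superseded_id BIGINT       NOT NULL,   -- id of the import that changed/deleted the row
    PRIMARY KEY (product_id, superseded_id)
);

-- Before an UPDATE or DELETE of a products row, copy it here, stamped with the import id.
INSERT INTO products_history (product_id, price, description, superseded_id)
SELECT product_id, price, description, 42   -- 42 = id of the import making the change (illustrative)
FROM products
WHERE product_id = 123;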
For databases where performance is a problem, we do this in a secondary database on a different server, then copy the found-to-be-good main table into a new table main_table_$id in the production database ($id being the highest import id), and have main_table be a trivial view doing SELECT * FROM main_table_$someid. Now, by redefining the view to SELECT * FROM main_table_$newid, we can atomically switch the table.
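A minimal sketch of that atomic switch in MySQL, assuming the verified data has already been loaded into a snapshot table in the production database (the table name and import id are illustrative):
-- Repointing the view switches all readers to the new snapshot in one statement.
CREATE OR REPLACE VIEW main_table AS
SELECT * FROM main_table_1042;   -- 1042 = the import id of the verified snapshot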
I'm not aware of a single solution to all this - probably because each project is so different. However, here are two techniques I've used in the past:
Embed the concept of version and validity into your data model
This is a way to deal with change over time without having to resort to history tables; it does complicate your queries, so you should use it sparingly.
For instance, instead of having a product table as follows
PRODUCTS
Product_ID primary key
Price
Description
AvailableFlag
In this model, if you want to delete a product, you execute "delete from products where product_id = ..."; modifying the price would be "update products set price = 1 where product_id = ...".
With the versioned model, you have:
PRODUCTS
product_ID primary key
valid_from datetime
valid_until datetime
deleted_flag
Price
Description
AvailableFlag
In this model, deleting a product requires you to update products set valid_until = getdate() where product_id = xxx and valid_until is null, and then insert a new row with the "deleted_flag = true".
Changing price works the same way.
This means that you can run queries against your "dirty" data and insert it into this table without worrying about deleting items that were accidentally missed off the import. It also allows you to see the evolution of the record over time, and roll-back easily.
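A rough sketch of a "delete" under the versioned model, using the column names above and the getdate() syntax from the answer (note that, for several versions of the same product to coexist, the primary key would need to include valid_from rather than product_ID alone):
-- "Delete" product 42: close out the current version, then insert a row marked deleted.
UPDATE products
SET valid_until = getdate()
WHERE product_id = 42
  AND valid_until IS NULL;

INSERT INTO products (product_id, valid_from, valid_until, deleted_flag, Price, Description, AvailableFlag)
VALUES (42, getdate(), NULL, 1, NULL, NULL, NULL);

-- A price change follows the same pattern: close the old version, insert a new one with the new price.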
Use a ledger-like mechanism for cumulative values
Where you have things like "number of products in stock", it helps to create transactions to modify the amount, rather than take the current amount from your data feed.
For instance, instead of having an amount_in_stock column on your products table, have a "product_stock_transaction" table:
product_stock_transactions
product_id (FK) | transaction_date | transaction_quantity | transaction_source
1               | 1 Jan 2012       | 100                  | product_feed
1               | 2 Jan 2012       | -3                   | stock_adjust_feed
1               | 3 Jan 2012       | 10                   | product_feed
On 2 Jan, the quantity in stock was 97; on 3 Jan, 107.
This design allows you to keep track of adjustments and their source, and is easier to manage when moving data from multiple sources.
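A minimal sketch of the ledger table and of how the current stock level falls out of it (names follow the example above; the exact data types are illustrative):
CREATE TABLE product_stock_transactions (
    product_id           INT          NOT NULL,   -- FK to the products table
    transaction_date     DATE         NOT NULL,
    transaction_quantity INT          NOT NULL,   -- positive receipt or negative adjustment
    transaction_source   VARCHAR(50)  NOT NULL
);

-- The current quantity in stock per product is simply the sum of its transactions.
SELECT product_id, SUM(transaction_quantity) AS quantity_in_stock
FROM product_stock_transactions
GROUP BY product_id;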
Both approaches can create large amounts of data - depending on the number of imports and the amount of data - and can lead to complex queries to retrieve relatively simple data sets.
It's hard to plan for performance concerns up front - I've seen both "history" and "ledger" work with large amounts of data. However, as Eugen says in his comment below, if you get to an excessively large ledger, it may be necessary to clean up the ledger table by summarizing the current levels and deleting (or archiving) old records.
